DIAGNOSING CONSTRUCTION PERFORMANCE BY USING CAUSAL MODELS

by

MINGEN LI

B.A.Sc., Tsinghua University, China, 1999
M.A.Sc., Tsinghua University, China, 2002

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE FACULTY OF GRADUATE STUDIES (Civil Engineering)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

September 2009

© Mingen Li, 2009

Abstract

To date, most of the research that has addressed construction performance diagnosis has focused on identifying important factors having an impact on construction performance and on establishing related performance models. The majority of these models were developed from a predictive perspective, not an explanatory one. The general goal of this research is to develop a construction performance diagnostic approach capable of assisting in identifying likely actual causes along with supporting evidence, and of capturing and modeling experience-based diagnostic knowledge for current and future project use.

The diagnostic approach implemented is based on a holistic, structured, causal-model-based diagnostic process which is applicable to key project performance measures. The approach is comprised of three layers: (1) a performance measure layer to determine if there exists a performance deviation to explain; (2) a quantitative causal models layer that makes use of quantitative causal relationships to identify causal variable variances; and (3) a user-defined, experience-based causal models layer that makes use of experience-based knowledge to help further explain the reasons (causal factors) for the causal variable variances.

The design of the diagnostic approach involves five connected components: an integrated information platform that treats the heterogeneous data collected in support of different construction management functions; a component related to making use of quantitative causal models; two components related to an experience-based causal modeling approach that allows the flexible formulation, automatic selection and use of experience-based causal models to help further explain performance variances; and a component responsible for searching and reporting evidence with the guidance of the experience-based causal models.

A realistically-sized building project was used to demonstrate the workability of the diagnostic approach for time performance as the representative measure studied in this thesis. The incremental value of the approach compared with current diagnostic practice was demonstrated through an experiment involving individuals with knowledge of construction. The approach was also assessed in terms of some tests formulated to assess the fit of a diagnostic approach with the construction industry context, which is important if the research findings are to have any impact on practice.

Table of Contents

Abstract
Table of Contents
List of Tables
List of Figures
Abbreviations
Acknowledgments
Dedication

1 Introduction
1.1 Chapter Overview
1.2 Research Background and Motivation
1.3 Research Objectives
1.4 Research Scope
1.5 Key Research Assumptions
1.6 Research Roadmap
1.7 Thesis Outline
1.8 Summary of Validation and Contributions

2 Construction Control Practice and Literature Review
2.1 Chapter Overview
2.2 Construction Control Practice
2.2.1 Controls in Construction Project Management
2.2.2 Characteristics and Issues of Time Control
2.2.3 Overview of State-of-the-Art Construction Planning and Control Tools
2.3 Overview of Reviewed Academic Literature
2.4 Research on Identifying Critical Factors
2.4.1 Importance Index Method
2.4.2 Statistical Analysis
2.4.3 Others (Case Study, AHP, Etc.)
2.5 Research on Developing Predictive Construction Performance Models
2.5.1 Statistical Regression
2.5.2 Neural Networks
2.5.3 Mathematical Models
2.5.4 Expert Systems and Fuzzy Logic
2.6 Research on Developing Explanatory Construction Performance Models
2.6.1 Conceptual Framework
2.6.2 Bayesian Networks
2.6.3 Decision Support System
2.7 Summary
2.7.1 Findings on Identified Critical Factors
2.7.2 Findings on Established Construction Performance Models

3 Overview of Proposed Diagnostic Approach
3.1 Chapter Overview
3.2 Desired Properties of a Versatile Diagnostic Approach
3.2.1 Comparison of Techniques Used to Develop Construction Performance Models
3.2.2 Comparison of Decision Support System Based Explanatory Models
3.3 Data Integration for Performance Diagnosis
3.3.1 General Approaches for Data Integration
3.3.2 Integrated Information Platform Selection
3.4 Performance Diagnosis Knowledge and Management
3.4.1 Knowledge and Its Management
3.4.2 Knowledge Representation
3.5 Causal Models
3.5.1 Quantitative Causal Models
3.5.2 Experience-based Causal Models
3.5.3 Overall Causal Diagram for Diagnosing Performance
3.6 Schema of Proposed Diagnostic Approach
3.6.1 Integrated Information Platform (Component 1, Figure 3.7)
3.6.2 Quantitative Causal Model Analysis (Component 2, Figure 3.7)
3.6.3 Knowledge Base Composed of Standard Experience-based Causal Models (Component 3, Figure 3.7)
3.6.4 Hypothesis Generation for Specific Experience-based Causal Models (Component 4, Figure 3.7)
3.6.5 Search and Report Data Evidence (Component 5, Figure 3.7)

4 Quantitative Causal Models in Performance Variance Analysis
4.1 Chapter Overview
4.2 Introduction
4.3 Quantitative Causal Models in Cost Variance Analysis
4.3.1 Prepare Cost Performance Baseline and Measure Actual Cost
4.3.2 Causal Diagram for Analyzing Cost Variance
4.4 Challenges for Cost Variance Analysis
4.5 Quantitative Causal Models in Schedule Variance Analysis
4.5.1 Preparation of As-planned and As-built Schedule
4.5.2 Causal Diagram for Analyzing Schedule Variance
4.6 Challenges for Schedule Variance Analysis
4.7 Implementation of the Component for Making Use of Quantitative Causal Models
4.7.1 Selecting the Target Schedule to Evaluate Current Time Performance
4.7.2 Selecting Activities to Diagnose
4.7.3 Determining the Variance of Schedule Causal Variables
4.8 Summary

5 Experience-based Causal Models in Performance Variance Analysis
5.1 Chapter Overview
5.2 Causal Factors Affecting Construction Time Performance
5.3 Standard Experience-based Causal Models
5.3.1 Creating and Organizing Standard Experience-based Causal Models
5.3.2 Defining Causal Factors
5.4 Use of User-defined Experience-based Causal Models for a Specific Project
5.4.1 Retrieve and Modify User-defined Experience-based Causal Models
5.4.2 Apply Causal Models to Explain Construction Performance Variances
5.4.3 Report Data Evidence
5.5 Summary

6 Conclusions
6.1 Chapter Overview
6.2 Summary of the Research
6.3 Validation
6.3.1 Apply the Diagnostic Approach to Upper Crust Manor Building Project
6.3.2 What Does the Example Demonstrate?
6.3.3 Experiment and Results
6.3.4 Feedback from Industry Personnel
6.4 Research Contributions
6.5 Recommendations for Future Research
6.5.1 Single-layer Experience-based Causal Model Format
6.5.2 Apportioning Variance to Causal Factors
6.5.3 Enhancing Insights from Data Search Results
6.5.4 Extending the Functionality of the Prototype

References
Appendix A – Planned/Actual Dates, Daily Status, and Schedule Variance Report for the Example Project
A.1 Base Schedule of the Example Project
A.2 Activity Daily Status Between 29-Dec-2006 and 31-Jan-2007
A.3 Updated Dates at Progress Date of 31-Jan-2007
A.4 Schedule Variance Analysis Report (Base Schedule and Schedule as of 31-Jan-2007)
A.5 Activity Daily Status Between 1-Feb-2007 and 28-Feb-2007
A.6 Updated Dates at Progress Date of 28-Feb-2007
A.7 Activity Daily Status Between 1-Mar-2007 and 27-Mar-2007
A.8 Updated Dates at Progress Date of 27-Mar-2007
A.9 Schedule Variance Analysis Results (Target: 31Jan07 Schedule, Active: 28Feb07 Schedule)
A.10 Schedule Variance Analysis Results (Target: Base Schedule, Active: 27Mar07 Schedule)
A.11 Schedule Variance Analysis Results (Target: 31Jan07 Schedule, Active: 27Mar07 Schedule)
A.12 Schedule Variance Analysis Results (Target: 28Feb07 Schedule, Active: 27Mar07 Schedule)

Appendix B – Causal Model Analysis Report for the Example Project

Appendix C – Planned/Actual Dates, Daily Status, and Causal Model Analysis Reports for the Upper Crust Manor Project
C.1 Base Schedule of the Upper Crust Manor Project
C.2 Activity Daily Status Between 20-Oct-2003 and 28-Nov-2003
C.3 Updated Activity Dates at Progress Date of 28-Nov-03
C.4 Causal Model Analysis Results (Target: Base Schedule, Active: 28Nov03 Schedule)
C.5 Activity Daily Status Between 29-Nov-2003 and 31-Dec-2003
C.6 Updated Activity Dates at Progress Date of 31-Dec-03
C.7 Causal Model Analysis Results (Target: Base Schedule, Active: 31Dec03 Schedule)
C.8 Causal Model Analysis Results (Target: 28Nov03 Schedule, Active: 31Dec03 Schedule)
C.9 Activity Attribute Values

List of Tables

Table 1.1 Pre-existing and Newly Implemented Features in the REPCON
Table 2.1 Categorization of Reviewed Articles
Table 3.1 Comparison of Techniques Used to Develop Performance Models in terms of Identified Properties
Table 3.2 Comparison of DSS Based Explanatory Models in terms of Identified Properties
Table 4.1 Different Classes in Cost Estimating (AACE 2007)
Table 4.2 Cost Variance Analysis Conceptual Example
Table 4.3 Pseudo Code for Identifying Activities in Defined Time Window
Table 4.4 Pseudo Code for Determining Finish Date Variance
Table 4.5 Pseudo Code for Determining Start Date Variance
Table 4.6 Pseudo Code for Determining Duration Variance
Table 4.7 Pseudo Code for Determining Finish Predecessor Variance
Table 4.8 Pseudo Code for Determining Start Predecessor Variance
Table 4.9 Pseudo Code for Determining Implicit Start Predecessor Variance
Table 4.10 Pseudo Code for Determining Idle Time as part of Duration Variance
Table 4.11 Pseudo Code for Determining Working Time Variance
Table 5.1 Some Identified Causal Factors Having Impact on Time Performance
Table 5.2 Extracting Precipitation Data for Different Activities
Table 5.3 Example Showing Difference Between Working Dates and Calendar Dates
Table 5.4 How Causal Factor Example of Problem Code Works on Project Side
Table 5.5 Work Type as Context Information to Identify Causal Model
Table 5.6 Construction Phase as Context Information to Identify Causal Model
Table 5.7 Pseudo Code for Finding Relevant Experience-based Causal Models
Table 5.8 Pseudo Code for Finding Associated Evidence of Recorded Problem Codes
Table 5.9 Causal Factor Examples for Explaining Time Variances
Table 6.1 Causal Factor Definitions for Upper Crust Manor Project

List of Figures

Figure 1.1 Research Roadmap
Figure 2.1 Construction Project Management
Figure 2.2 Variance Analysis Capabilities in MS Project 2003
Figure 2.3 Variance Analysis Capabilities in Primavera Project Planner (P3)
Figure 2.4 General Structure of a Neural Network
Figure 2.5 Performance Analysis Flowchart (Modified from Maloney 1990)
Figure 2.6 Example Bayesian Network (Modified from McCabe et al. 1998)
Figure 3.1 General Integration Approaches (Modified from Ziegler and Dittrich 2004)
Figure 3.2 Cause-Effect Diagram Example
Figure 3.3 An Example of Multiple-Layer Causal Diagram
Figure 3.4 Reciprocal, Loop and Dependent Causal Relations
Figure 3.5 Single Layer Experience-based Causal Diagram
Figure 3.6 Overall Causal Diagram for Diagnosing Performance
Figure 3.7 Overall Schema of Proposed Diagnostic Approach
Figure 3.8 PCBS in Physical View
Figure 3.9 Drawing Control System in the Physical View
Figure 3.10 Project Phases, Resources in the Process View
Figure 3.11 Activity List Interface in the Process View
Figure 3.12 Organizational/Contractual View
Figure 3.13 Records Management in the As-built View
Figure 3.14 Daily Site Data in the As-built View
Figure 4.1 Component Responsible for Quantitative Causal Model Analysis
Figure 4.2 Cost and Schedule Control System Criteria
Figure 4.3 Causal Diagram Composed of Quantitative Causal Models for Cost Performance Variance Analysis
Figure 4.4 Causal Diagram Composed of Quantitative Causal Models for Schedule Variance Analysis
Figure 4.5 Shift of Critical Activity
Figure 4.6 Implicit Predecessors
Figure 4.7 Example Project for Illustrating the Quantitative Causal Model Variance Analysis for Time Performance and Implementation Details
Figure 4.8 Menu Options to Access Schedule Variance Analysis Function
Figure 4.9 Window for Selecting Target Project
Figure 4.10 Suggestive Information for Selecting Right Target Schedule
Figure 4.11 Comparison Bar Chart Between Active and Target Schedule
Figure 4.12 Filters for Selecting Activities of Interest
Figure 4.13 Activity Profile Filter
Figure 4.14 Different Scenarios of Activities in Target and Active Schedule
Figure 4.15 Result of Applying Filters to the Example Project
Figure 4.16 Options for Finding Schedule Variances
Figure 4.17 Relation Among Start, Finish and Duration Variances
Figure 4.18 Schedule Variance Analysis Result with Option of "Variance = Active Schedule - Target Schedule" (Active Schedule: Progress Date of 31-Jan-07; Target Schedule: Original Base Schedule)
Figure 4.19 Possible Comparison Scenarios Between Active and Target Schedule
Figure 4.20 Schedule Variance Analysis Result with Option of "Variance = Active Schedule - Target Schedule" (Active Schedule: Progress Date of 28-Feb-07; Target Schedule: Original Base Schedule)
Figure 5.1 Components Responsible for Making Use of Experience-based Causal Models
Figure 5.2 Accessing the Standard Experience-based Causal Models Component
Figure 5.3 Standard Experience-based Causal Models Component
Figure 5.4 Interface for Creating/Deleting/Editing Standard Experience-based Causal Models
Figure 5.5 Specifying Attributes for a Standard Experience-based Causal Model
Figure 5.6 User Interface for Specifying Project Types
Figure 5.7 User Interfaces for Defining and Editing Standard Phases and Work Types
Figure 5.8 Adding/Deleting/Editing Causal Factors in a Standard Causal Model
Figure 5.9 User Interface Showing Causal Factor Definition
Figure 5.10 User Interface for Defining Causal Factor State
Figure 5.11 User Interface for Selecting Desired Data Field
Figure 5.12 User Interface for Defining Filters
Figure 5.13 Using +/- to Specify Condition Value for Filter Key in Date Type
Figure 5.14 Multiple Filters Defined for Data Field Selected
Figure 5.15 Using Multiple Data Fields to Define a Causal Factor
Figure 5.16 Define Factor's Attribute Association and Activity's Attribute Values
Figure 5.17 Comments Tab in Causal Factor Window
Figure 5.18 To Enter Into Project Experience-based Causal Models Component
Figure 5.19 Project Experience-based Causal Models Component
Figure 5.20 User Interface for Choosing Type of Project of Interest
Figure 5.21 Standard Causal Model Examples to Copy Over to Project Side
Figure 5.22 Menu Options to Add/Delete/Edit Causal Factors in a Project Causal Model
Figure 5.23 User Interfaces for Editing Causal Factors at Project Side
Figure 5.24 Interface for Developing Standard Daily Site Problem Codes
Figure 5.25 Interfaces for Developing and Using Project Side Problem Codes
Figure 5.26 Warning Information for Modifying Problem Code Categories
Figure 5.27 Menu Option of Causal Model Analysis
Figure 5.28 User Interface for Selecting Variance(s) to Explain
Figure 5.29 Experience-based Causal Models to Explain Variances of Activities
Figure 5.30 Causal Model Analysis Results
Figure 5.31 Selection of Report Content Profile to Report Results
Figure 5.32 Defining a Report Content Profile
Figure 5.33 Causal Model Analysis Report Example
Figure 6.1 Elevations of Upper Crust Manor Project
Figure 6.2 Upper Crust Manor Project's Foundation Plan View
Figure 6.3 Physical Components of Upper Crust Manor Project
Figure 6.4 List of Activity Planning Structures of Upper Crust Manor Project
Figure 6.5 Participants of Upper Crust Manor Project
Figure 6.6 Schedule Variance Analysis Results for Upper Crust Manor Project (Schedule as of 28-Nov-03 with Base Schedule)
Figure 6.7 User-defined Causal Models for Upper Crust Manor Project
Figure 6.8 Causal Models Automatically Selected to Explain Schedule Variances
Figure 6.9 Sample of Causal Model Analysis Results for Upper Crust Manor Project
Figure 6.10 Schedule Variance Analysis Results for Upper Crust Manor Project (Schedule as of 31-Dec-03 with Base Schedule)
Figure 6.11 Causal Models Automatically Filtered to Explain Schedule Variances
Figure 6.12 Schedule Variance Analysis Results (as of 31Dec03 Schedule (Active) with as of 28Nov03 Schedule (Target))
Figure 6.13 Causal Models Automatically Selected to Explain Schedule Variances
Figure 6.14 Answers to the Experiment Questions

Abbreviations

CPM – Critical Path Method
P3 – Primavera Project Planner
SDK – Software Development Kit
AHP – Analytic Hierarchy Process
DSS – Decision Support System
AACE – Association for the Advancement of Cost Engineering
BCWS – Budgeted Cost of Work Scheduled
BCWP – Budgeted Cost of Work Performed
ACWP – Actual Cost of Work Performed
AV – Accounting Variance
CV – Cost Variance
SV – Schedule Variance
C/SCSC – Cost and Schedule Control System Criteria

Acknowledgments

After a long, challenging but not lonely journey, I have now completed my Ph.D. studies at the University of British Columbia. Honestly, without the great help of the following people, I cannot imagine how I could have reached the end and achieved this.

First and most importantly, I would like to sincerely thank my research advisor, Professor Alan Russell. I learnt from him not only how to do research work, develop research skills, and gain deep insights into research problems, but also a very rigorous attitude toward research. I was also greatly moved to learn that he was still helping me revise my thesis during his recovery from serious surgery.
His excellent guidance, valuable advice, consistent encouragement, and great support were, are and will be greatly appreciated and remembered forever. The programming work conducted by William Wong to help implement my research ideas is very much appreciated. I would also like to express my thanks to my committee members, Dr. Thomas Froese, Dr. Sheryl Staub-French, and Dr. Scott Dunbar, for their advice and guidance throughout my research work. My sincere gratitude also goes to the following graduate students, Ryan Jian Zhang, Ismail Ibrahim, Xiao Zong, Ngoc Tran, Jehan Zeb, and Madhav Nepal, for their support of my research work, and to the people from industry, Arda Cicek, Daniel Grigg, and Steven Jukes, for their valuable feedback on my research results.

Finally, I am very grateful for the selfless support of my wife Ming Yang, and of my parents Ai-Zhu Liu and Guang-Rong Li. Their encouragement, patience and endless love have been invaluable and indispensable for me during my Ph.D. journey, which I will remember forever.

Dedication

To my wife Ming Yang
and
my parents Guang-Rong Li and Ai-Zhu Liu

1 Introduction

1.1 Chapter Overview

The primary focus of this chapter is on:
(a) providing the motivation for undertaking research on a holistic, causal-model-based performance-diagnostic approach;
(b) identifying important characteristics of the construction industry that should be reflected in the formulation of a performance-diagnostic approach, along with several tests expressed as questions to help judge the fit or match of a diagnostic approach with these characteristics;
(c) articulating an overall research goal and specific research objectives;
(d) describing key assumptions and a roadmap for the research work described in this thesis; and
(e) providing an overview of the structure of the thesis.

1.2 Research Background and Motivation

The construction industry plays a vital role in support of worldwide economic growth. World construction spending in 2004 was 4.5 trillion US dollars, and is projected to reach 7.2 trillion US dollars in 2010 (Global Insight 2005). In Canada, as one of the ten largest industry sectors, the construction industry contributes more than 100 billion Canadian dollars annually to the economy (National Research Council Canada 2005). At the same time, construction projects are increasing rapidly in scale and complexity as economies grow and become more technologically sophisticated. However, the goal faced by construction management personnel remains unchanged: to deliver each project in line with prescribed objectives of time, cost, scope, quality and safety by effectively and efficiently making use of information, financial, material and human resources. In order to achieve this goal, a general project management process, which can be described simply as a cycle of plan-execute-control-correct, has been widely accepted and applied in practice in a manner which reflects each project's characteristics and context.
Nevertheless, many construction projects are completed with time and/or budget overruns, and sometimes with unsatisfactory quality and safety performance. And, even when project control approaches are used for which a performance diagnostic function is a critical component, it is still difficult and time consuming to determine whether inaccurate planning, poor execution, events beyond the contractors' control, wrong corrective measures, or some combination thereof were the source of the undesired results (Navon 2005, Staub-French 2002). Therefore, a construction performance diagnostic approach which is able to identify problem causes and search out supporting evidence in an ongoing and timely manner from a project's database during the course of construction is desired. Using such an approach, it is believed that construction management personnel could better identify and take effective, proactive corrective actions to minimize the impact of undesired causes, allowing the project to remain on or get back on track. Only with deep insight into the performance deviations observed, which could be obtained by using an effective diagnostic approach, can management resources be applied efficiently to attack the problems in near real time. Otherwise, limited resources could be wasted and the window of opportunity to bring the project back on track could disappear. However, based on an extensive review of the literature, it appears that past research focused on assisting practitioners in capturing and using their experience-based knowledge to explain unsatisfactory construction performance on an ongoing basis is limited. It is observed that the power of current computer technology, combined with the advantages associated with state-of-the-art research ideas such as data integration and knowledge management, has yet to be fully explored in these research efforts.

Hardware and software technologies permit more and more construction data to be electronically collected, saved, retrieved and analyzed. But the more data collected, including greater heterogeneity of data type, the more difficult it is to extract useful information from large and ever-growing volumes of data. This has led to increased interest in data integration by researchers. The benefits of data integration are often described by the research community in terms of increased "information sharing". An integrated construction information environment is thought to be able to not only reduce errors and inefficiencies resulting from inaccurate, untimely, redundant or missing information, but also help foster better coordination and cooperation amongst the highly fragmented construction participants (Russell and Froese 1997). To the extent that some diagnostic approaches have previously been proposed for construction time and cost performance (e.g. Abu-Hijleh and Ibbs 1993, Diekmann and Al-Tabtabai 1992), they were not built on an integrated construction information environment. As a result, only data used for the time and cost management functions was accessed and assessed, and no causes from other management functions were considered. In reality, a large number of events such as accidents, change orders, labor unavailability or inclement weather could be the root causes of undesired construction performance. Having considered this, a diagnostic approach that is based on an integrated construction information platform, in which data supporting different ongoing management functions can be easily accessed, is desired. This will facilitate digging out as complete a set of causes for project performance as possible, along with supporting data.
Although the construction industry is a mature one, many problems encountered in it are still poorly defined and structured, in part because of the ever-changing context of projects and the unique elements of each. As a result, a substantial amount of expertise and knowledge in the minds of seasoned practitioners is heavily relied on to identify and analyze the problems encountered. In order to prevent invaluable knowledge from being lost after the resignation or retirement of these people, and to make it a valuable asset for future use, knowledge management, a relatively hot research topic in the construction management domain, has generated considerable interest in many large, geographically dispersed construction companies. For these companies, there is a need to tap into the knowledge and expertise of employees, regardless of location (Carrillo and Chinowsky 2006). A survey of United Kingdom project-based organizations shows that about 50% of the respondents (the majority of whom were from the construction industry) noted that knowledge management would result in new technologies and new processes that will benefit the organization (Carrillo and Chinowsky 2006). Some attempts have been made to build construction performance diagnostic approaches with rule-based knowledge bases (e.g. Diekmann and Al-Tabtabai 1992), but they provide very little flexibility for the individual construction user to make use of his/her own accumulated knowledge. We assert that to date a practical knowledge capture and validation tool for diagnosing construction performance has yet to be fully developed. Thus, the need still exists for research focused on formulating a diagnostic approach capable of allowing the user to not only express his/her own experience-based knowledge, but also test the validity of that knowledge for a specific case, and save it for future re-use if validated.

In general, the nature of academic research can be: (i) problem driven, i.e. find a solution to an identified problem of importance, which could be a desired improvement in an area of interest; or (ii) tool/technology driven, i.e. explore how well a technology adopted/adapted from another domain could improve upon existing processes. Further, researchers can choose to focus on one or a few parts of a problem or adopt a holistic approach. As to diagnosing construction performance, we believe seeking a structured holistic approach or framework is possible and would be more beneficial than adding parts to existing approaches that treat a subset of performance measures. In past research efforts, several frameworks for diagnosing construction performance have been proposed, but they are not sufficiently general to be applicable to all key performance measures such as time, cost, quality, and safety (e.g. the framework of Suraji et al. (2001) was specifically directed at explaining safety performance), and some others, although claimed as being general, neglected the difference in characteristics associated with various performance measures (e.g. Maloney (1990)'s general performance analysis flowchart). Thus, research focused on formulating a structured holistic approach or framework on which a diagnostic approach could be based is still needed. As to technology/tool driven research, which forms the majority of past research, the problem and user context must be examined to ensure the suitability of the choices made.
It is observed that adopting a technology/tool from another field can involve considerable complexity, and in the academic sphere such complexity may be prized even if there is not a real fit with the problem and user context. Such a lack of fit could result in the approach developed having little impact on practice, i.e. it would not be accepted by practitioners. This appears to be the case for the construction industry, and that is definitely not what we want in this research. It is fair to say that the construction industry has its own significant characteristics and context, different from other industries, which in our view must be reflected in solutions proposed for improved diagnostic practices.

Firstly, the construction industry is fragmented (Flood et al. 2002) and its organization structure is very decentralized (Brockmann and Birkholz 1999). With such a decentralized organization structure, the power of control is usually dispersed among departments or individuals (Naoum 2001). For a construction company, that means its individual project teams usually work independently of each other, thus demanding decentralization of decision making by the various specialists working on different projects (Naoum 2001). This observation has implications for how knowledge is stored, shared, and adapted to a specific project context.

Secondly, construction products and processes are usually unique in terms of design, participants, construction methods, and social and environmental surroundings. Construction projects are also full of uncertainties and easily subject to the impact of many project-context-related factors (e.g. composition of the labor force, material availability) and of changes in the outside physical and social environment (Oglesby et al. 1988, Levitt 1987), which are very likely to differ from project to project, activity to activity, and at different points in time. Thus, some techniques (e.g. statistical analysis) that are mostly suitable for identifying general causal factors or patterns across multiple projects are, in fact, not very appropriate for figuring out specific, non-general causes for the performance deviations detected for individual projects or activities. In response to the complexity of project management (i.e. the inability to definitively evaluate the effects of events or actions because too many variables or factors interact) and its ambiguity (i.e. events or causality being unknown or uncertain), Pich et al. (2002) argued that a combination of two fundamental strategies, learning and selectionism, is required. Learning requires that one actively incorporate new information and respond to new events during the course of a project, which cannot be planned in advance. It requires that the management team or individual be very flexible in problem solving. Selectionism, the pursuit of multiple independent attempts or approaches at the same time and choosing the best one ex post facto, is preferred if signals during the course of the project cannot be perceived by the project management team and/or are not sufficiently rich to learn from, or if learning is costly or difficult. Thus, faced with many influencing factors and the uncertainties inherent in construction project management, learning from new information and observations for the project at hand, and being flexible in problem solving and decision-making, is very important. Also, pursuing multiple parallel independent hypotheses proposed by different seasoned practitioners to help explain construction performance deviations is helpful.
Also, pursuing multiple  5  parallel independent hypotheses proposed by different seasoned practitioners to help explain construction performance deviations is helpful. Thirdly, for the construction industry, heavy reliance is placed on the accumulated knowledge, experience, and judgement of the individual (Kazi 2005). This is in accord with the observation of Halpin (1998) that usually construction contractors work intuitively based on experience with similar jobs and situations, and most construction contractors feel that analytical tools restrict use of their intuitive approach to problem solving. Levitt (1987) also agreed that the construction industry runs on conventional knowledge and experience-based judgement. If the results obtained from an approach are generally not consistent with their experience-based perceptions, they would very likely doubt the validity of the approach, especially when it is based on some black box technique which cannot let users know clearly how the results were obtained. In this sense, techniques with black box characteristics are seen as undesirable for embeddeding in a diagnostic approach. That might be one primary reason why many of the research efforts (discussed in detail in chapter 2) conducted over the past 25 years have had limited impact on real diagnosis practices, although some useful insights have resulted from them. Fourthly, a reality faced by the industry is that during the execution phase of a project, the resources available for an in-depth diagnosis of performance to date are very limited. Levitt (1987) agreed that in the industry in many situations, there is not enough time to make detailed analysis of all the influencing factors, and decisions often have to be made in the midst of ongoing processes. Managers also rarely have the extra human resources necessary to collect and examine a significant amount of extra data, e.g. data from many previously finished projects, a requirement for some techniques such as statistical analysis and neural networks, to help explain the current project performance at hand. So, thus a diagnostic approach should maximize the use of data already collected for the project at hand. Finally, frontline construction management personnel, who are assumed to be the primary users of the diagnostic approach, usually are not technocrats versed in complex analysis tools (e.g. neural nets, regression analysis, etc.). And seldom if ever are they supported by special analysts except for the largest of firms. That means they have to use  6  the diagnostic approach and interpret the results by themselves without the assistance of specialists from other domains. Thus, diagnostic approaches based on techniques, which are unfamiliar to practitioners and not simple to use are not seen as a “fit” with the industry. From what has been introduced, several important tests expressed as questions can be formulated to help to test the fit or match of a diagnostic approach with the industry context. Since in the academic literature reviewed no model or approach was found that satisfies the majority of these tests, a research contribution could be claimed if the diagnostic approach developed in this thesis meets these tests. The tests for assessing the fit of a diagnostic approach and hence its potential for workability are:  Generality Is it based on a framework general enough to be applicable to all key performance measures (e.g. time, cost, quality)? 
• Is it applicable to projects of different kinds and work/activities of different types?

Integration & Data
• Is it embedded in an integrated construction information environment that makes it possible to search out data from different construction management functions as performance diagnostic evidence?
• Does it make use of data already collected for the project at hand, without a need to collect extra data for the current project or to assemble and analyze data from past projects?

Transparency & Ease of Use
• Does it build on known and generally accepted quantitative modeling paradigms and relationships?
• Does it employ a transparent and easily understandable diagnostic process?
• Is it relatively simple to use and a self-sufficient approach, meaning construction practitioners can use it and interpret the results without assistance from specialists in other knowledge domains?

Flexibility & Customization
• Does it adapt to semi-autonomous, decentralized decision making, i.e. is it able to make use of available explanatory knowledge and reflect the judgments or hypotheses of individual users working on isolated projects as well as the accumulated knowledge of the firm?
By experience-based causal model it means that the causal relationship between the variables/factors cannot be quantitatively expressed without dispute and usually proposed based on experience for different scenarios (e.g. productivity is affected by weather conditions and work face access); Employ and refine existing quantitative casual models for the performance measures of interest and the structured relationships between them, which can help narrow the focus to different performance variances causally contributing to project level performance deviations; Formulate an experience-based causal modeling approach that allows users to construct personal hypotheses in the form of causal models, in which causal factors and their threshold states should be able to be flexibly proposed and specified, to help explain performance parameter variances. The experience-based causal models and factors should also be organized as a function of project and activity contexts to assist in efficient and transparent performance reasoning. Use should be made of search engine guided by the causal models to find and report supporting data evidence, if any exists, in support of the user specified knowledge/experience-based hypotheses; and Demonstrate the workability of the concept and operation of the structured diagnostic process, thereby validating to some extent the effectiveness of a holistic diagnostic approach prototype and its responsiveness to the tests set out for an approach reflective of industry needs.  While some research conducted by others before has pursued performance diagnostic approaches as discussed in Chapter 2, what mainly differentiates the research described  9  herein from them is the comprehensive exploration of a unified approach with causal model (quantitative and experience-based) based reasoning to help explain different project performance measures. Of itself this is also a valid research objective, and mirrors previous work by others which has involved the exploration of the usefulness of a specific technology for the diagnosis problem (e.g. neural nets, Bayesian networks, fuzzy logic). Motivation for this objective stems from the belief that causal model based reasoning is a good fit with industry characteristics, and hence its use has the potential to enhance industry performance.  1.4  Research Scope  Given finite resources, including time, the scope of this thesis has been bounded in the following manner:  The focus of the diagnostic approach is on the construction phase, including execution and post-project analysis; The primary perspective is that of the general contractor or construction manager. Other beneficiaries could be the project client and sub-trades; The diagnostic approach is meant to be applicable for a number of performance measures including time, cost, quality, and safety. However, the primary focus in this thesis is time performance, although some discussion of cost performance is presented to demonstrate the generality of the approach and that various measures may share the same experience-based causal models; The diagnostic approach is meant to be applicable to any type of construction project (e.g. building, infrastructure projects). 
However, to demonstrate workability of the approach and to validate or confirm its response to the properties desired of a practical diagnostic approach, an example of a building project is used; and, The diagnostic approach primarily aims to identify causal factors responsible for the variances observed, but not to quantitatively apportion the variance observed to these factors.  10  Other observations that affect the scope of work are as follows. The reality is that construction projects are undertaken in complex contexts. Factors that affect performance can affect each other. Seeking ‘Root Cause’ factors is an endless exercise because no matter how deep one goes, there is always at least one more cause you can look for (Bellinger 2004). Therefore, to what extent causes can be figured out is an important scope related question. In reality, the amount of information available to project participants is limited, often making the identification of “true” root causes difficult if not impossible. For example, it is impossible for the approach to assist in identifying the fundamental causes for poor drawing quality if there is no access to the designer’s confidential information such as the detailed design process used, individual designer competency and training, etc. Further, data that provides direct measurement of some causal factors may not be collected in support of management functions, limiting the completeness of a causal model or requiring the use of a surrogate data item. For example, drug taking or alcohol abuse by one or more workers can impair labor productivity. But seldom does or can a contractor collect such data, and hence a surrogate measure of absenteeism or amount of rework may have to be used. Facing these realities, the diagnostic approach is meant to identify only causal factors for which direct or surrogate evidence data can be found.  1.5  Key Research Assumptions  The use of realistic assumptions helps to provide a research project with a solid foundation. Two important assumptions, which relate to planned performance metrics and causal relationships, are made. First, a successful construction project generally means that planned or expected outcomes, such as cost budgets, planned schedules, safety and quality requirements, are met. Performance deviations are usually detected by comparing expected results with actual ones. But mistakes incurred in the planning phase could be the real cause of deviations in the execution phase. Staub-French (2002) found that cost overruns are a problem on many construction projects. Nevertheless it is difficult to determine if the cost overruns result from inaccurate cost estimates, poorly managed projects, or procurement  11  problems, or some combination thereof. Navon (2005) also argued that the reasons for performance deviation can be schematically divided into two groups (a) unrealistic target setting (i.e. planning) or (b) causes originating from the actual construction. Notwithstanding the foregoing, an assumption underlying the approach formulated herein is that planned performance outcomes are without errors. It offers the diagnostic approach a rational foundation for assessing variances and identifying probable causal factors that occur during the construction phase and contribute to the variances. Second, in order to infer the causal relationship between two variables X and Y, e.g. one factor and one observed performance deviation, there must be a concomitant variation between them (Selltiz et al. 1959). 
In reality, many factors can contribute to performance deviations, such as inclement weather, poor labor skills, etc. In the planning stage it is almost impossible to take all of these factors into consideration and accurately estimate their expected states. Without the expected baseline states, it is difficult to determine whether one or more of the factors experienced a variation or not, and further, it is often not possible to definitively prove whether they are the cause of a particular variance. Therefore, as a compromise measure in this research for the factors hypothesized as possible determinants of performance, the corresponding threshold states specified by the user are assumed to be an upper or lower bound, as the case may be, on a baseline value that does not contribute to measurable performance variance. In other words, expected performance results as planned should be achieved given that factor values fall below the threshold state specified. Conversely, any exceedance of the threshold states is likely to result in performance deviations.

1.6 Research Roadmap

Figure 1.1 shows the research roadmap used to guide the research work described in this thesis. Personal research interest is the best motivator and is indispensable for the success of a research project, but external constraints such as time limitations and other resource availability also play an important role in selecting the research topic for a doctoral study. After discussing with the author's supervisor, who has a great amount of experience doing research and directing graduate students, and more importantly has deep insights into the knowledge needs of the construction industry, the author decided to research the topic of diagnosing construction performance.

[Figure 1.1 Research Roadmap — boxes: Personal Interest; External Constraints; Observation of Current Practice; Thorough Literature Review; Research Topic; Detailed Research Objectives; Detailed Research Proposal; Implement Proposed Approach; Results Validation; Claim Contributions]

Several rounds of literature review were then conducted in order to assess the state-of-the-art and set up detailed research objectives suitable for a doctoral study, and which, if met, should advance the state-of-the-art. During this review process, past research focused on developing construction performance models was searched out and reviewed carefully along with research directed at identifying important factors affecting various construction performance measures. The review results confirmed that the majority of the attention of past work was focused on predictive models more than on explanatory models, and a diagnostic approach capable of taking advantage of seasoned practitioners' knowledge to search out reasons for performance deviations along with supporting evidence had yet to be found. Also, as part of the literature review, technologies and techniques used to develop construction performance models were examined. The strengths and weaknesses associated with them were identified and compared with each other in terms of their applicability to the research domain of interest.
In addition to an extensive literature review, it was believed that only after knowing more about the status of current industry practice in terms of software tools used and data collected could the diagnostic approach developed be made compatible with this practice in terms of data already collected while at the same time advancing current capabilities of reasoning about construction performance. Thus the construction control practice and the capabilities and deficiencies of currently available leading commercial construction planning and control  13  systems (Microsoft Project and Primavera Project Planner) were examined as well. This also helped to identify opportunities for improvement and properties desired for a diagnostic approach. Following the foregoing, a detailed research proposal was brought forward, which included a conceptual framework for explaining various construction performance measures. Properties desired for the diagnostic approach, such as making use of data collected in support of various day-to-day management functions and making the techniques adopted readily compatible with the skill set of technically trained/educated construction personnel, were fully taken into consideration when constructing the framework. This framework then became the foundation and the point of departure for the next step--implementing the diagnostic approach proposed. The implementation work was carried out within the REPCON research system, which has been developed at the University of British Columbia. It was selected as an implementation platform because the REPCON system offers an integrated construction information environment in which a great amount of data supporting different daily construction management functions can be easily accessed, it permits relatively rapid prototyping using existing interfaces or adaptations thereof, and programming assistance is available through the senior programmer, William Wong, who has been involved within the overall research program for many years. However, it is important to state that the author has been responsible for the thesis concepts, algorithms, interface designs, testing, debugging, generating examples, and validating the work. Pre-existing and newly implemented features in REPCON, which are relevant to this research, are summarized in Table 1.1. It should be noted, however, that while the performance diagnostic approach was implemented in REPCON, the concepts are applicable to any system environment which affords access to the data involved within a construction project. After implementing the approach, it was applied to a reasonably comprehensive example to demonstrate its full dimensionality and the caliber of its diagnostic capabilities. Finally, the features of the approach and observations on its use were compared with the research objectives and responsiveness tests in order to define the contributions of this doctoral research work and to identify potential improvements and lines of inquiry which could be pursued in  14  future research work. Table 1.1 Pre-existing and Newly Implemented Features in the REPCON Pre-existing features •  •  Various data fields in different views in the REPCON system. For more detailed information, please refer to the section of integrated information platform in Chapter 3. Scheduling function to obtain activity’s planned early/late start/finish dates and floats.  New features implemented • •  • •  1.7  A quantitative causal model based time variance analysis function. 
An experience-based causal modeling function with a standard/project side framework to facilitate the direct expression of causal model hypotheses by construction users. A data evidence search function guided by project side experience-based causal models, and an evidence reporting function Minor changes to problem codes in the asbuilt view (add a data field of performance affected to help search right daily recorded problem codes as supporting evidence)  Thesis Outline  The thesis is organized in six chapters including this one, which first presents the research background followed by defining research objectives and boundaries, key assumptions and research roadmap. Chapter 2 mainly addresses the first specific research objective, i.e. to identify the strengths and weaknesses of past research by others focused on identifying important casual factors and developing construction performance models. This chapter first summarizes general construction control practice, followed by a more detailed discussion about the issues of time control - the representative performance measure studied in this research. A brief overview of industrially accepted construction planning and control applications (Microsoft Project and Primavera Project Planner) is then given. Then the extensive literature reviewed is categorized into four dimensions and discussed in detail in terms of its strengths and weaknesses for explaining construction performance. The general findings given at the end of this chapter on established performance models and identified factors that influence performance provide a solid foundation for this research and observations made throughout. These findings can be viewed as an additional contribution to help direct future relevant research.  15  The primary goal of Chapter 3 is to articulate a conceptual framework reflecting a desired holistic structured diagnostic process based on causal model reasoning which is applicable to different key performance measures. That is, this chapter is focused on the second specific research objective. Considering the literature review findings and construction industry characteristics, this chapter first presents a set of desired properties for a construction performance diagnostic approach which were then used to evaluate work done by others to date. Then key concepts relevant to data integration and knowledge management are introduced since they are believed to be central to a diagnostic approach. Two types of casual models (quantitative and experience-based), which play different roles in performance reasoning, are discussed followed by emphasizing the differences between rule-based and model-based knowledge representing and reasoning. Some emphasis is placed on the careful use of terminology in this section. In the last section of this chapter, the conceptual framework with five main components responsible for dealing with different functions of the whole structured model-based diagnostic process is presented, which provides the foundation for implementing the diagnostic approach. Chapter 4 focuses primarily on the third research objective, that is the exploration of existing quantitative causal models and structured relationships among them for the performance measures of interest. 
In other words, with the diagnostic approach conceptual framework at hand, this chapter goes deeper to discuss the implementation details and relevant issues for the component in the framework which is responsible for making use of quantitative causal models to do performance variance analysis. In order for the reader to have a complete view of the role of quantitative causal models and to see the generality of the whole diagnostic process for different performance measures, quantitative causal models for an arbitrary performance measure are first discussed, followed by a discussion of cost as a performance measure, which is then followed by a detailed exposition of the quantitative causal models for time performance – the representative measure in this thesis. In the next section of this chapter, a small example project along with a series of screen captures is used to illustrate the implementation of the quantitative causal model component for time performance within the REPCON system. The ability to identify time variances such as idle time, implicit predecessor and extended working time, not addressed before in the literature, is seen as one of the contributions of this research; however, it is observed that the generality of the structured diagnostic process is the primary contribution.

Chapter 5 is focused on the fourth research objective, i.e. to formulate an experience-based causal modeling approach which allows users to make use of available experience-based diagnostic knowledge to construct personal experience-based hypotheses to help further explain detected performance variances. Three components of the diagnostic approach conceptual framework are addressed: (i) formulating and saving standard experience-based causal models; (ii) customizing standard models on the project side for explaining identified variances; and (iii) searching and reporting evidence found in support of the causal model hypotheses.

Chapter 6 is the concluding chapter of the thesis. A summary of the research work is given first. The validation section then follows. A detailed description of primary and secondary research contributions is then given and the thesis concludes with recommendations for future work.

1.8 Summary of Validation and Contributions

As to the validation work, a relatively comprehensive hypothetical building project is first used to demonstrate the workability of the implemented holistic diagnostic approach capabilities, to see if it meets the test questions discussed before and can correctly identify performance variances and likely actual causes. The project was developed at arm's length from the author by his supervisor and is of sufficient generality to be able to represent a large class of building projects along with the kinds of time performance problems encountered in practice. The benefits of an integrated project information environment for diagnosing performance and the kinds of insights into performance deviations that can be obtained are demonstrated. In the second part of the validation section, results of an experiment are presented. The experiment was conducted in three phases, each of which has two comparable parts, and on participants with different backgrounds. The results of the experiment lend support to the assertion of the effectiveness of the diagnostic approach as compared with current practice.

Primary research contributions are claimed as: 1.
development of a holistic structured model-based performance diagnostic process which is based on both quantitative and experience-based causal model reasoning and which has general applicability to key performance measures; and 2. implementation of an experience-based causal modeling approach that allows the flexible formulation by users of causal models and their automatic selection according to project and activity context in order to search out supporting data evidence to explain construction performance. The secondary contributions from the research include: 1. a diagnostic approach that is responsive to the test questions set out for an approach reflective of industry needs and distinct characteristics; and 2. a thorough assessment of research to date which focused on identifying factors that have impact on different performance measures and on developing approaches for forecasting likely performance as a function of factor values or explaining performance achieved to date. The findings from the thorough literature review provide a useful point of departure for those wishing to contribute to the topic of construction performance diagnosis.  18  2  2.1  Construction Control Practice and Literature Review  Chapter Overview  The primary focus of this chapter is on: (a) summarizing the characteristics of construction planning and control processes that are relevant to the formulation of a holistic structured performance diagnostic process; (b) reviewing two prominent planning and control applications that reflect the current commercial state-of-the-art of performance diagnostic capabilities; and (c) reviewing the academic literature to determine findings to date with respect to critical factors having impact on various performance measures, and explanatory and predictive performance diagnosis models.  The general goal of this thesis is to develop a holistic construction performance diagnostic approach as part of an overall control process. Thus, studying construction control process and reviewing the planning and control applications widely accepted by the industry has the purposes of (1) understanding the characteristics and context of current industry control practices, which can help make the diagnostic approach developed responsive to industry realities, and (2) identifying deficiencies in current construction performance diagnostic practices and the popular commercial planning and control applications, which we believe can be improved through the use of computerbased technologies such as an integrated data/information platform and knowledge management, and formulation of a structured holistic diagnostic process which includes an allowance for the user to incorporate their experience-based knowledge. The review of academic literature then can help the author identify strengths and weaknesses of approaches or techniques explored to date and opportunities for improvements in the construction performance diagnosis research field. Thus, the thorough literature review provides, in part, a point of departure for pursuing an alternative approach to performance diagnosis.  19  2.2 2.2.1  Construction Control Practice Controls in Construction Project Management  A project is a temporary endeavor undertaken to create a unique product, service, or result (Project Management Institute 2004). To manage a project is to meet ‘project objectives’ by applying ‘management processes’ to manage ‘project resources’ within the existing ‘project context’ (Froese 2004). 
From Figure 2.1 it can be seen that for a construction project, from the contractor's perspective, the project context usually refers to the project characteristics and the external physical and social environment in which the project is to be constructed. Project resources encompass the various physical, financial, human and information resources required to build the project. The desired construction results, i.e. project objectives, are expressed in terms of metrics dealing with time, cost, scope, quality and safety. The last, but most crucial, part for successfully managing a construction project is the management processes component, which mainly includes activities relevant to planning, execution and control. It is the management processes that integrate the project resources and context to achieve the project objectives.

[Figure 2.1 Construction Project Management — the project context (physical environment: location, weather, etc.; social environment: regulations, industry culture, etc.; project characteristics: type, size, complexity, etc.) and the project resources (physical: material, tool, equipment; financial: money, insurance, bonds; human: labor, supervisor, manager; information: data, knowledge, etc.) are integrated by the management processes (plan, execute, control) to deliver the project objectives (time, cost, scope, safety and quality).]

Within the management processes, the role of the control process is to monitor and evaluate progress on an ongoing basis to determine if execution results to date match planned objectives or not, and if not, why not, i.e. there is a need to identify actual causal factors. From Figure 2.1 it can be seen that causes for performance deviations could be from various sources, which might be resource related problems (e.g. unavailable or low quality resources), unexpected actual project context (e.g. inclement weather, regulation change), and/or management process related problems (e.g. poor communication, too many change orders). A more specific description of the control process followed with varying degrees of rigour in the construction industry involves: (1) measuring the work finished to date, (2) comparing these measurements with planned objectives, (3) diagnosing deviations from planned objectives to determine causes for the deviations, (4) identifying corrective actions to offset the deviations found, and (5) forecasting revised objectives for the work yet to be done. Normally, such a general control process will be executed repeatedly during the whole construction phase for different performance measures at different project definition levels. The frequency of running the control process spans from one week to several months in terms of current industry practice, depending on project size and complexity, and on the project definition level at which the performance measure of interest is to be controlled.

2.2.2 Characteristics and Issues of Time Control

In this section more detailed observations on control practices are presented to help understand the characteristics of construction control, with emphasis on time performance as it is the representative performance measure studied in depth in this thesis. For time control, as-planned schedules and associated methods provide the benchmarks (i.e. planned objectives) for actual time and productivity performance.
Understanding how the as-planned schedule was developed, including underlying assumptions can help in identifying deficiencies from the performance diagnosis perspective. For each activity in the schedule, taking into account the quantity to be installed and the corresponding production or productivity rate which assumes a specific  21  resource assignment and method of construction, the planned duration is obtained. The activities with planned durations are then assigned responsibilities, applicable calendars, necessary date constraints and various material, human and equipment resources compatible with the production or productivity rates assumed. Logic relationships (i.e. start to start, start to finish, finish to start and finish to finish) among the activities are then determined. Once the foregoing is done, schedule computations are made to determine early and late start/finish dates and free and total floats for all activities. Various alterations can then be made to satisfy resource constraints, intermediate milestone requirements, etc. As to the aforementioned expected productivity or production rates used to determine the planned duration of each activity, they are usually derived from the experience of seasoned personnel or statistical analysis of previous project data. They are obviously dependent on the expected states of many factors, such as labour skill, working environment, weather conditions and so on. But in actual practice no clear description can be found in as-planned schedules as to what factors have been taken into consideration in determining the planned production rates and what the assumed or expected states are for the factors. Thus, when deviations from planned production rates are observed, which happens quite often, there is no expected factor states that one can check against. As to planned logic relationships among the activities, it is also noticed that implicit predecessors, which are necessary conditions (e.g. the availability of construction drawings, material test results, suitable weather conditions, access to the work face, power being available, etc.) for the start of an activity and which are always consciously or unconsciously assumed to be fulfilled, are not included or explicitly expressed in the formal as-planned schedules. If any implicit predecessor related problem happens, an all too common occurrence, personnel have no plan to check against and must identify what the implicit predecessor problems are according to their own experience. Overall, this is a distinguishing characteristic of the construction industry and construction projects – many of the assumptions behind scheduling and estimating are not documented in explicit form. As a result, when deviations from planned objectives are observed personnel won’t have a plan with recorded detailed assumptions to help figure  22  out what the actual causal factors are. Accurate measurement of the actual performance to date is essential for a successful control process. The measurement work aims to obtain a representation of the current performance status by processing data collected on site or in support of other management functions such as drawing control and change order management. Obviously, without the timely and effective storage of data recorded daily in support of various management functions (e.g. daily site reporting, drawing control, change order management), it is not possible to explain performance variances later on. 
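To make the schedule development arithmetic just described concrete, the following minimal sketch (in Python, and not part of the REPCON implementation) derives planned durations from installed quantities and assumed production rates, then performs the forward and backward passes of the critical path method on a tiny finish-to-start network. The activity names, quantities and rates are hypothetical.

```python
from math import ceil

# Hypothetical activities: installed quantity, production rate (units/day),
# and finish-to-start predecessors. All names and numbers are illustrative.
activities = {
    "FormColumns":  {"qty": 400, "rate": 50, "preds": []},
    "PourColumns":  {"qty": 120, "rate": 40, "preds": ["FormColumns"]},
    "StripColumns": {"qty": 120, "rate": 60, "preds": ["PourColumns"]},
}

# Planned duration: quantity divided by the assumed production rate, rounded up to whole days.
for a in activities.values():
    a["dur"] = ceil(a["qty"] / a["rate"])

order = ["FormColumns", "PourColumns", "StripColumns"]  # topological order

# Forward pass: early start/finish in working days, project start at day 0.
for name in order:
    a = activities[name]
    a["es"] = max((activities[p]["ef"] for p in a["preds"]), default=0)
    a["ef"] = a["es"] + a["dur"]

project_finish = max(a["ef"] for a in activities.values())

# Backward pass: late finish/start and total float.
for name in reversed(order):
    a = activities[name]
    succs = [s for s in order if name in activities[s]["preds"]]
    a["lf"] = min((activities[s]["ls"] for s in succs), default=project_finish)
    a["ls"] = a["lf"] - a["dur"]
    a["tf"] = a["ls"] - a["es"]

for name in order:
    a = activities[name]
    print(f"{name}: dur={a['dur']}d ES={a['es']} EF={a['ef']} TF={a['tf']}")
```

Note that nothing in this computation records the factor states (weather, crew skill, work face access, etc.) assumed behind each production rate, which is precisely the gap discussed above.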
As to time performance, after recording each activity’s actual start and actual or projected finish date, the schedule must be recomputed to determine the project’s overall time performance to date, because there are complex logic relationships among the activities, and any advance or delay in one activity’s planned start or finish date doesn’t necessarily indicate that the whole project is ahead of the planned schedule or delayed. The new project expected finish date obtained from recomputing the schedule can then be compared with the planned one to determine whether the whole project would be completed on time or not without taking any corrective actions. Following the measurement work, the next step is to compare actual with planned performance to determine the magnitude of any deviation between them. Deviations that exceed a user specified threshold need to be diagnosed to determine causes for them. For performance measures of cost and time, the total deviation can be decomposed into sub deviations. For instance, total cost deviation can be decomposed into the deviations associated with labor cost, equipment cost, overheads, etc. Such is not the case for quality, safety and to a lesser extent productivity measures. As to time performance, from the diagnosis perspective sub deviations or start/finish date variances associated with actual critical activities should be the focus if undesired time deviations at the project definition level are detected. However, it is noted that the critical path(s) could be different in the updated schedule from the one(s) in the planned one because of the change in activity durations (also work could be done out of sequence), and more importantly actual critical activities might not be the critical activities in the planned schedule. Thus, focusing on critical activities alone in the planned schedule could be misleading. Having said this, identifying actual critical activities is desirable, but due to the most basic tenet of CPM  23  (Critical Path Method), namely “you only really know the actual criticality of any activity on the data date (i.e. progress date) of the schedule” (Winter 2004), it is found that identifying the actual critical path is difficult without real-time updating and rescheduling schedule that is almost impossible to implement in actual practice. Facing this issue, as an alternative way to go, planned critical activities and near critical activities which have relatively small total float usually become the focus of management personnel. Challenges that exist in making comparisons between planned and actual performance also involve consistent use of measurements and comparable levels of detail between data describing as-planned and as-built performance or a means to derive comparable levels of detail. It is noticed that planned schedules are developed at different levels of detail, usually from coarse grained to fine grained representations, but there are practical bounds on the level of detail possible or describable. This means that even for an activity in a very detailed as-planned schedule, it still might represent several tasks. For instance, concrete column construction might be the finest level of detail in a schedule, which embraces the tasks of form erection, reinforcing, concrete pouring, curing and stripping. Thus, planned task duration is not available and logic relationships among these tasks are intentionally ignored. 
As a result, baseline task-detail level will not be available to compare with actual time performance of tasks, i.e. if any activity has experienced start, finish and/or duration variance, it is difficult to further attribute the variances to tasks and figure out which tasks were critical and how they contributed to the time variances observed at the activity-detail level. Further, from the performance diagnosis perspective knowing only the value of sub deviations or variances is far from sufficient. Identifying the factors causing the variances is essential in order to determine which participant, if any, should be held responsible for the variances experienced, and what management actions should be carried out to prevent such factors from arising again and affecting upcoming construction work. It is observed that current industry control practice relies heavily on practitioners’ experience-based knowledge to identify the causal factors that ultimately produced the variances observed. Due to the large volumes of heterogeneous and often incomplete data collected during the construction phase, and its distributed nature (i.e. lack of one central repository), the burden on practitioners to search out the desired data evidence to prove their subjective  24  explanations is very heavy and time consuming. Considering this characteristic of construction control practice, the diagnostic approach discussed in this thesis aims to tackle this very real problem. Identifying corrective actions and forecasting revised objectives are another two necessary steps in control practice, but they are beyond the scope of this research.  2.2.3  Overview of State-of-the-Art Construction Planning and Control Tools  In this section the author briefly overviews the current state-of-the-art planning and control tools. The focus is on to what extent these tools are able to help construction practitioners diagnose construction performance, whether there is room for improvement in terms of better satisfying the needs of practitioners in explaining poor construction performance, and if yes, whether these tools are suitable to be selected as a platform in this thesis to implement the diagnostic approach and validate relevant research ideas. Currently, Microsoft Project (MS Project) and Primavera Project Planner (P3) are the most popular and industry accepted planning and scheduling tools in the construction industry, and hence the author’s attention was directed at these two tools in particular. They provide very substantial assistance in developing as-planned schedules at any desired level of detail and monitoring actual time and cost performance. After reading through some P3 and MS Project related reference books (Primavera Systems Inc. 1999, Harris 1999, Chatfield and Johnson 2000) and personally using the tools, it is found that in terms of the capability to do performance causal analysis these tools can help users identify some kinds of variances. As shown in Figure 2.2 and 2.3, basing on selected target/baseline project and active project both MS Project and P3 can identify activity start, finish and duration variances. Some other variances, like cost variance and schedule variance measured in terms of earned value can be figured out as well. However, no tool offers help in further explaining why such variances occurred in a specific project, e.g. 
whether observed start date variance is caused by implicit predecessors, if yes, what they are, and whether duration variance is caused by extended working time or just composed of idle time, and what actual causal factors are behind extended working time and/or idle time. More importantly, these tools don’t have functions which allow practitioners to  25  capture and reuse their experience-based knowledge to identify actual causal factors (e.g. inclement weather, change orders, drawings unavailable, etc.) contributing to the identified variances. Obviously, such diagnostic capability would be useful for practitioners, and thus room for improvement is still there to explore.  Figure 2.2 Variance Analysis Capabilities in MS Project 2003 It is observed that both P3 and MS Project have the potential to act as platforms to implement extended functions by using their corresponding Software Development Kit (SDK). But from the perspective of performance diagnosis, it is noted that these tools have some limitation, e.g. a considerable amount of physical component related data (e.g. drawings, planned and actual attribute values of physical components constructed) and as-built data (e.g. daily construction site status, various daily site records) are not available in these tools. These data which provide essential support for different daily management functions are very important for helping diagnose and prove why deviations in construction performance occurred. Such limitations prevented these tools from being selected as a platform to implement and test the diagnostic approach in this thesis.  26  Figure 2.3 Variance Analysis Capabilities in Primavera Project Planner (P3) 2.3  Overview of Reviewed Academic Literature  To date, significant research efforts have been carried out by the academic community to identify critical factors that are usually regarded as the causes for unsatisfactory construction performance. Research on predictive or explanatory construction performance models has been pursued as well, in which the critical factors play the role of independent variables and the performance measures studied are the dependent variables. For purposes of this thesis, a predictive model means that given an estimate of the likely state of a set of critical factors, the established model can be used to provide or predict a priori an accurate estimate of performance measure achievements for part or all 27  of a project. The state values of the critical factors or variables, which are believed to have impact on the performance measure of interest, are assumed to represent an average of the conditions forecast to be encountered. Such models may be useful in providing the benchmarks required for project control and reminding people in advance what critical factors should be given priority attention in order to prevent actual performance deviation from occurring. However, research remains to be done on determining the accuracy of such predictive models by applying them to projects independent of the ones already used in developing the predictive models. Another important question to answer is whether the factors involved in these models are generally critical or only critical for the projects studied? 
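As a concrete illustration of the variance reporting just described, the sketch below reproduces the start, finish and duration variance arithmetic that tools such as MS Project and P3 carry out once a baseline is compared with actuals. The dates are invented, and calendar days are used instead of working days purely for brevity.

```python
from datetime import date

# Hypothetical baseline (planned) versus actual dates for one activity.
baseline = {"start": date(2009, 5, 4), "finish": date(2009, 5, 15)}
actual   = {"start": date(2009, 5, 6), "finish": date(2009, 5, 22)}

start_variance    = (actual["start"]  - baseline["start"]).days    # +2 days
finish_variance   = (actual["finish"] - baseline["finish"]).days   # +7 days
planned_duration  = (baseline["finish"] - baseline["start"]).days  # 11 days
actual_duration   = (actual["finish"]   - actual["start"]).days    # 16 days
duration_variance = actual_duration - planned_duration             # +5 days

print(f"start variance:    {start_variance:+} days")
print(f"finish variance:   {finish_variance:+} days")
print(f"duration variance: {duration_variance:+} days")

# The commercial tools stop at these numbers: they do not indicate whether the
# late start stems from an implicit predecessor, or whether the duration growth
# is idle time versus extended working time, nor which causal factors were at
# work -- the attribution gap that the diagnostic approach in this thesis targets.
```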
Different from a predictive model, an explanatory model means that after observing a deviation of actual performance from expected performance during the construction phase, the model can help construction personnel identify what the most plausible causes are for the deviation based on an examination of relevant project data. Thus, this type of model is useful in helping personnel diagnose actual construction performance and offering a basis for identifying potential corrective actions to bring a construction project back on track. The extensive literature review conducted has provided the author with deep insights into the current state-of-the-art of research relevant to this topic, and more importantly, has assisted in identifying research gaps and the strengths and weaknesses of techniques explored to date to improve construction performance. A systematic search was conducted for relevant research articles, which resulted in the identification of 144 articles published over the past 25 years, primarily in construction management related academic journals such as Journal of Construction Engineering and Management, Journal of Management in Engineering, International Journal of Project Management, Construction Management and Economics, AACE International Transactions, and Canadian Journal of Civil Engineering. Some articles studied performance measures of interest from different project participant perspectives simultaneously (e.g. Kumaraswamy and Chan 1998, Chua et al. 1999). The findings obtained from the contractor’s perspective are of central interest here, and thus were examined in the most detail. Although the representative project performance measure selected for study in this research is time, the articles that focused on productivity, cost,  28  safety, and quality performance measures were reviewed as well since the structured diagnostic process pursued is desired to be applicable to all key construction performance measures. The 144 articles were first categorized in terms of four dimensions: 1.Research aim (to identify critical factors only, to develop predictive performance models, or to develop explanatory performance models); 2.Performance measure studied (productivity, time, cost, quality, safety, overall success and others such as partnering (Chan et al. 2004a), interactions among participants (Pocock et al. 1996) and overall motivation level (Borcherding et al. 1980); 3.Project definition level of interest (activity, trade or overall project); and, 4.Research technique used (Statistical Regression-SR, Neural NetworksNN, Mathematical Model-MM, Fuzzy Logic-FL, Decision Support System-DSS, CFConceptual Framework, Bayesian Networks-BN, Importance Index-II, Statistical Analysis-SA, and Others-OT such as analytic hierarchy process (AHP) and case study etc.). The result of categorizing the articles is shown in Table 2.1, in which the numbers correspond to the articles in the reference list at the end of this thesis. In some articles, two or more performance measures were studied simultaneously (e.g. Maloney 1990, Chua et al. 1999), so the same article may appear in more than one performance measure category. A similar observation applies to techniques used, but for this case the primary technique was used to categorize such articles. Details on the strengths and weaknesses of the techniques for performance diagnosis are discussed in the following sections along with the primary findings from the literature reviewed.  
2.4 Research on Identifying Critical Factors

From Table 2.1 (third column) it can be seen that a significant number of the articles only focused on identifying critical factors, which sometimes are also referred to as important factors (e.g. Aibinu and Odeyinka 2006) or important attributes (e.g. Iyer and Jha 2006). For such articles, quantitative cause-effect relationships between the factors identified and the performance measures studied were not pursued by the authors. Another observation is that, independent of the performance measures studied, almost all of the reviewed research in this category was carried out at the project definition level, i.e. a highly aggregated treatment of a project. Thus, it is debatable whether the findings at the project definition level can be applied directly to individual construction trades or activities with different characteristics. Techniques typically used for the foregoing type of investigation included importance index, simple statistical analysis, case study and AHP method.

[Table 2.1 Categorization of Reviewed Articles — a matrix of performance measure (productivity, time, cost, quality, safety, project success, other) and project definition level (activity, trade, project) against research aim (critical factors, predictive models, explanatory models); each cell lists the reviewed article numbers, keyed to the reference list, grouped by technique abbreviation (SR, NN, MM, FL, DSS, CF, BN, II, SA, OT).]

2.4.1 Importance Index Method

Generally speaking, the importance index method is used to analyze seasoned practitioners' subjective views on the importance or criticality of factors assumed to have impact on a specific performance measure, with such views usually being obtained through questionnaire surveys. Besides asking respondents questions as to their working experience and relevant backgrounds in such questionnaire surveys, their subjective views on the relative importance and frequency of occurrence of the factors listed are extracted by using Likert scale values (e.g. rarely, sometimes, often and always, corresponding to numeric values from 1 to 4). An index function is then applied to aggregate these subjective values to obtain the importance index value for each factor. Factors with a high importance index value are then asserted to be the primary causes of unexpected performance results. For example, Assaf et al. (1995) used this method to study the causes of delay in large construction projects located in the eastern province of Saudi Arabia.
73 potential causes of delay classified into 9 groups (project, owner, contractor, consultant, design, materials, equipment, labor, external) were listed in the questionnaire. The importance index function used is [F.I (%)*S.I. (%)]/100, where F.I. indicates frequency index and is equal to ∑a(n/N)*100/4 (a is the constant expressing weighting given by each response (ranges from 1 for rarely up to 4 for always), n is the frequency of the responses, and N is total number of responses), and S.I. indicates severity index and has the similar function as the frequency index, but in which a is from 1 for little up to 4 for severe. The most common cause of delay identified in this research is ‘change order’. Other important causes from the perspective of contractors include ‘delay in progress payments by owner’, and ‘delay in reviewing and approving design documents’.  31  The importance index method was also used by Aibinu and Odeyinka (2006) to identify the causative factors for construction delays for projects in Nigeria. 44 factors classified in 8 categories respectively related to client, quantity surveyor, architect, structural engineer, contractor, sub-contractor, supplier and external were evaluated in that research, but the frequency of occurrence of these factors was not taken into consideration. The importance index function made use of is ∑Wi/(A*n), where ∑Wi is the total score assigned to a factor by the respondents, A is the highest importance weight being able to assign to a factor by one respondent (5 in that study) and n is the number of respondents (100 in that research). The top important causative factors identified are ‘contractors’ financial difficulties’, ‘client’s cash flow problem’, and ‘architects’ incomplete drawing’. For other performance measures like productivity, cost and quality, importance index method was also made use of to figure out what factors are important or critical in terms of causing variances from planned results (e.g. Fazio et al. 1984, Elinwa and Buba 1993, Arditi and Gunaydin 1998). Overall, the findings obtained by using this method provide its users with insights into the various possible causes for undesired performance outcomes, which can assist them, to some extent, to formulate hypotheses to explain a particular performance deviation observed. From a performance diagnosis perspective, however, this method cannot be used to prove an explanatory hypothesis. It is also observed that the various questionnaire surveys were carried out at different geographic locations, the subjective views of respondents might be based on different types of projects undertaken by them, and, the researchers carrying out the various surveys didn’t use consistent Likert scale values and/or importance index functions. Therefore, it was found that even for the same construction performance measure, the consistency of the critical factors identified and their relative ranks amongst the various studies conducted tended to be very low. Furthermore in some of the literature, no clear discussion was found as to the desired state for the factors identified, which in some cases led to ambiguity of the meaning of the factors. For instance, ‘client’s cash flow problem’ was identified as a critical factor for construction delay by Aibinu and Odeyinka (2006), but without giving a clear variable state that describes the nature of the problem, the intended meaning of this factor may be misunderstood.  
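To make the index arithmetic described above concrete, the following small worked example applies the frequency index, severity index and importance index formulas reported by Assaf et al. (1995); the factor name and the response counts are invented for illustration.

```python
# Importance index worked example following Assaf et al. (1995):
# F.I.(%) = sum(a * n/N) * 100/4, S.I.(%) analogous, I.I.(%) = F.I. * S.I. / 100.
def weighted_index(responses):
    """responses: counts of answers on the 1..4 Likert scale (e.g. rarely..always)."""
    total = sum(responses)
    return sum(a * n / total for a, n in zip((1, 2, 3, 4), responses)) * 100 / 4

# Hypothetical survey result for the factor "change order" (50 respondents):
freq_counts     = (5, 10, 20, 15)   # rarely, sometimes, often, always
severity_counts = (2, 8, 25, 15)    # little, ..., severe

fi = weighted_index(freq_counts)      # frequency index, 72.5%
si = weighted_index(severity_counts)  # severity index, 76.5%
importance = fi * si / 100            # importance index, about 55.5%

print(f"F.I. = {fi:.1f}%, S.I. = {si:.1f}%, I.I. = {importance:.1f}%")
```

Factors are then ranked by the resulting index value; as discussed below, the ranking reflects aggregated opinion rather than evidence from any particular project.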
32  2.4.2  Statistical Analysis  Statistical analysis is another method applied to identify critical factors that affect construction performance. As used herein, statistical analysis only refers to the calculation of mean, standard deviation, and correlation values plus significance tests, but does not include statistical regression analysis which will be discussed later. For this research category, use was often made of a questionnaire to collect the opinions of practitioners, but sometimes statistical analysis was also conducted on data collected on construction sites leading to more objective findings. For example, Thomas and Raynar (1997) studied the impact of scheduled overtime on labor productivity. In contrast to ‘spot or unscheduled overtime’, ‘scheduled overtime’ refers to a planned decision to accelerate construction progress by scheduling more than 40 working hours per week for an extended period of time. Thomas and Raynar (1997) collected data on electrical and piping crafts from four construction projects, and then analyzed it. The analysis results confirmed that scheduled overtime schedule can result in a loss of productivity, and hence should be considered as a factor in explaining productivity loss. As more days per week were worked, on a scheduled overtime basis, additional difficulties were encountered in providing resources, i.e., materials, equipment, tools and information, which explain in part the reduction in productivity. Thus, there can be a causal relationship amongst factors relevant to productivity. Choromokos and McKee (1981) conducted a survey of the top 400 Engineering News-Record contractors to identify the factors perceived by construction executives as having the greatest impact on productivity. The respondents were asked to rate each factor as high, medium or low for potential impact on productivity. Arditi (1985) used the same questionnaire and statistical analysis method to survey the top 400 contractors again a few years later, and further compared the results obtained with those from the previous one. It was found that practitioner opinions on the relative importance of some of the factors listed in the questionnaire (e.g. labor availability, equipment maintainability) had changed, as expressed through statistical significance tests. Statistical analysis techniques were also used by Ahmed et al. (2003) to study the Florida construction industry in order to identify the ten most critical causes of delay, which included ‘building permits approval’, ‘change order’, ‘changes in drawings’, and 33  ‘incomplete documents’. As for other construction performance measures like cost, quality and safety, similar studies were also performed (e.g. Jahren and Ashe 1990, Arditi and Gunaydin 1998, Sawacha et al. 1999). Overall, these research findings are based on the data collected in a specific time period, from a limited range of respondents, or for specific project types and activities. As a result, the generality of the findings cannot be guaranteed and it is debatable whether they are general, and applicable to different geographic regions or are invariant with time. Even if the findings could be generalized to apply to all situations, the results obtained using statistical analysis still just tell the audience what factors could be critical ones in terms of causing various performance deviations, but cannot help people confirm which one(s) actually showed up in a specific case.  2.4.3  Others (Case Study, AHP, Etc.)  
Other methods made use of to identify critical causal factors include case study, analytical hierarchy process (AHP), and literature review. Thomas and Sanvido (2000) studied the quantitative impact of fabricator performance on construction labor efficiency by conducting case studies on three projects constructed on the Pennsylvania State University campus. ‘Late vendor deliveries’, ‘fabrication or construction errors’, ‘out-ofsequence deliveries’ and ‘material availability’ were identified as the causal factors plaguing each of the three projects. Love et al. (1999) used two case studies to study the influence of rework in construction. The analytic hierarchy process (AHP) is a technique developed by Wharton School’s professor Thomas Saaty in the 1970s. It uses people’s ability to compare multiple alternatives in terms of a single property, and then synthesizes the results (see McCaffrey 2005). Chua et al. (1999) tried this technique to identify the factors affecting performance measures of cost, time and quality at the project definition level. For time performance, the most critical factors identified in the research included ‘realistic obligations/clear objectives’, ‘frequency of schedule updates’, ‘adequacy of plans and specifications’, ‘capability of contractor key person’ and ‘constructability’, which to some extent are different from the factors influencing the other two performance measures. In essence,  34  the results obtained by using AHP are also based primarily on the respondents’ subjective opinion, which is similar to use of the importance index method. Reviewing the available literature is also regarded by some researchers as a way to identify critical factors affecting construction performance. It is found that lists of factors abound in the literature, however, no general consensus has been reached on a definitive set of factors (Chan et al 2004b).  2.5  Research on Developing Predictive Construction Performance Models  In order to not only identify critical factors, but also to explore what the achievement of a performance measure of interest would be given an estimate of the likely state of the relevant critical factors, i.e. how the factors quantitatively affect the performance measure, many researchers have tried to establish quantitative predictive construction performance models, in which independent variables are often regarded as the critical factors. Listed in the fourth column of Table 2.1 is the reviewed literature relevant to this approach. It is observed that up to now, except for researchers interested in productivity, most researchers are more interested in developing predictive models at the project definition level than at the trade or activity definition level. Whether such predictive models can forecast the performance of all of the trades or activities in a project is simply unclear. From the diagnosis perspective, although predictive models are not diagnosis-oriented and have the deficiency of not being able to help search out evidence to explain actual performance, they could provide hypotheses for explaining deviations. Statistical regression and neural networks are found to be the most popular techniques adopted to develop such predictive models. Other techniques like simple mathematical models and fuzzy logic have also been examined by researchers.  
2.5.1 Statistical Regression

Regression analysis, an important statistical method that is sometimes regarded as a data mining technique, is usually applied to curve fitting, modeling and testing of causal relationships between variables. Generally, the steps involved in statistical regression analysis include: 1. statement of the problem; 2. selection of potentially relevant variables; 3. data collection; 4. model specification; and 5. model fitting and testing (Chatterjee and Hadi 2006, Gilchrist 1984). According to the different kinds of models specified, regression analysis is usually classified into linear regression and non-linear regression, and within each there are some sub-types such as logistic regression and exponential regression. The coefficients in the models, which indicate the strength of the relationships between the dependent and independent variables, are estimated based on the data collected. Regression models are often extrapolated for values of the independent or explanatory variables outside their original data range, which can lead to erroneous predictions. As seen in Table 2.1, the statistical regression technique has been applied to study various construction performance measures. Thomas and Yiakoumis (1987) made use of multiple non-linear regression to study the effect of temperature and relative humidity on productivity. Data for this study were collected from three commercial projects in central Pennsylvania, for different kinds of activities over a period of 78 project working days. The regression model generated and tested for statistical significance is as follows:

PR' = 9.448 + 0.0518 T - 2.819 ln(T) + 3.89×10^-37 e^H

in which PR' represents the daily performance ratio (actual/expected), T is the temperature at 1:00 PM in degrees F, and H is the relative humidity at 1:00 PM expressed as a percentage. The ideal state for the temperature and humidity causal factors is 55°F for temperature and less than 80% for relative humidity. The authors believed that this methodology is fundamental and can be used to quantify the effects of other factors. Thomas et al. (1990) subsequently brought forward the following general format of a factor model with respect to construction labor productivity:

AUR = IUR(q) + ∑ a_i x_i + ∑ f(y)_j

where AUR is the actual (or predicted) crew productivity, IUR is the ideal productivity under standard conditions and is a function of the number of quantities installed q, a_i is a constant representing the increase or decrease in productivity caused by a corresponding one-zero (denoting presence or absence) causal factor x_i, while other integer and continuous variable factors, such as weather, crew size, absenteeism and so forth, are included in the sub-models f(y)_j in which y represents the causal factors. Obviously, using regression techniques to determine the coefficients in such factor models for different types of activities requires a huge amount of data from several, if not many, construction projects, which in real practice is very difficult, if not impossible, to achieve. At the project definition level, Jaselskis and Ashley (1991) made use of the logistic regression technique, which utilizes the logit as its specified function, to study the factors affecting construction project cost, time and overall performance. The logit function is expressed as e^y/(1+e^y), in which y = ∑ a_i x_i, where x_i represents the causal factors and a_i is the corresponding coefficient determined by running the regression analysis.
The critical factors identified in that research for time performance include ‘project manager education (number of years after high school)’, ‘constructability program (yes or no)’, ‘team turnover (%/year)’ and ‘budget updates (number per year)’. Note that no factors regarding project context, complexity, designer competence or client capability and personality are included, making the results somewhat counter-intuitive. The same technique was also applied by Hanna and Gunduz (2005) to study the productivity performance of 116 construction projects. The critical factors included in their final logistic regression equation are ‘did owner and contractor work together before?’, ‘percentage of design complete at the start of the construction’, and ‘as a project manager, total number of projects of this construction type and size’. As to other performance measures like cost and quality, regression analysis was also found to be used (e.g. Mohsini and Davidson 1992, Pocock et al 1996, Chan et al. 2001). In comparison to previously discussed techniques applied to identify critical performance factors, regression analysis is founded more on real “objective” data rather than people’s subjective opinions. Thus, it not only can help to identify critical factors, but also establish quantitative causal relationships for the performance measure of interest, providing some predictive capabilities. Another advantage of the approach is in order to measure factor values, their meaning must be made clear for measurement purposes, a source of weakness for some other approaches. Nevertheless, the use of statistical regression is not without some deficiencies. In order to apply it, a decision has to be made as to relevant variables for which data is needed, which will then form the basis for the regression analysis to determine the most significant causal factors. In the research reviewed, potential causal variables were often obtained through surveys of seasoned personnel and/or literature reviews. The underlying assumption is thus that other factors  37  not subjectively selected cannot be significant casual factors. Then, in order to conduct a regression analysis, a functional relationship (e.g. linear or non-linear etc.) between the potential causal variables and the dependent variable must be assumed, although it is possible to experiment with different functional forms. It is clear that different functions could result in very different findings. Another observation is that although a covariance test is often conducted for the potential causal variables selected, most regression models developed are still found to be in the form of a ‘single-layer’ model, which means no further cause-effect relationships among the causal factors themselves are quantitatively expressed in the models, i.e. the factors are assumed to be causally independent with each other. As noted previously, successful application of this technique requires lots of data from different activities, trades, or projects along with specific knowledge of how to apply regression techniques, a skill set not often found in the construction industry. Therefore, without expert assistance, it is rare to find daily use of regression analysis to exploit the information content of a firm’s or a project’s database, except for very special circumstances such as a claim case (Ameen et al. 2003). 
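To illustrate how such fitted regression models are evaluated once their coefficients are known, the sketch below computes the Thomas and Yiakoumis (1987) performance ratio quoted earlier and the generic logit form used by Jaselskis and Ashley (1991). The logit coefficients and all inputs are invented for illustration only.

```python
import math

def performance_ratio(temp_f, humidity_pct):
    """Thomas and Yiakoumis (1987) daily performance ratio as a function of
    1:00 PM temperature (deg F) and relative humidity (%)."""
    return (9.448 + 0.0518 * temp_f - 2.819 * math.log(temp_f)
            + 3.89e-37 * math.exp(humidity_pct))

def logit(factors, coefficients):
    """Generic logistic form e^y / (1 + e^y) with y = sum(a_i * x_i);
    the coefficients here are invented, not those fitted by Jaselskis and Ashley."""
    y = sum(a * x for a, x in zip(coefficients, factors))
    return math.exp(y) / (1 + math.exp(y))

# Near the ideal conditions cited above (55 deg F, 60% RH) the ratio is close to 1.0;
# the exponential humidity term is negligible below roughly 80% RH but grows
# explosively above it, illustrating the extrapolation caveat noted earlier.
print(performance_ratio(55, 60))
print(performance_ratio(55, 90))

# Hypothetical two-factor logit, e.g. (years of PM education, constructability program 0/1):
print(logit(factors=(4, 1), coefficients=(0.3, 0.8)))
```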
While regression models have limitations in explaining construction performance deviations, the ones reviewed have provided some useful insights into actual construction performance outcomes.

2.5.2  Neural Networks

In order to capture quantitative causal relationships while avoiding having to assume a specific kind of function in advance as required by statistical regression analysis, neural networks, as another data mining technique, have enjoyed relatively wide use to identify critical factors and establish predictive construction performance models (refer to the fourth column of Table 2.1). The concept of artificial neural networks was first brought forward by Warren McCulloch, a neurophysiologist, and a young mathematician, Walter Pitts, in 1943. The idea was to mimic the ability of real neurons in the human brain to process information and acquire knowledge. Shown in Figure 2.4 is the general structure of an artificial neural network, in which each layer can have one or more artificial neurons; the neurons in the input layer usually represent causal or independent variables, and the ones in the output layer are dependent variables. The major components making up a neuron, no matter which layer it is located in, include 1. a weighting factor; 2. a summation function; 3. a transfer function; 4. scaling and limiting; 5. an output function; 6. an error function and back-propagated value; and 7. a learning function (Anderson and McNeill 1992, Chester 1993). A subset of the data collected is usually used to train a neural network, followed by use of the remaining data to validate the usefulness of the network model. Two different kinds of training or learning methods can be employed: supervised or unsupervised. Once a neural network model is established, the input variables are regarded as the critical factors and the output variable's state can be estimated given the state of the input variables.

[Figure 2.4 General Structure of a Neural Network: input layer, hidden layer and output layer of neurons linked by connections]

In the domain of studying construction performance, Kog et al. (1999) developed a neural network model for predicting construction time performance at the project definition level. The data used were from the previous research of Jaselskis and Ashley (1991) in which the logistic regression technique was tried. The final neural network model made use of two hidden layers, and the five critical factors identified were 'time devoted by the project manager to a specific project', 'frequency of meetings between the project manager and other project personnel', 'monetary incentives provided to the designer', 'implementation of constructability program', and 'project manager experience on projects with similar scope'. Chua et al. (1997) also applied the neural network technique to analyze the same set of data collected by Jaselskis and Ashley (1991), but they focused on cost performance. Eight critical factors were included in the model established, which are 'project management levels to craftsman', '% design complete at construction start', 'number of construction control meetings', 'number of budget updates', 'constructability program', 'team turnover rate', 'control system budget', and 'project management technical experience'.
Compared with the results obtained by Jaselskis and Ashley (1991), the interesting finding is that although the same data were analyzed by different researchers, the factors identified based on use of the neural network technique, no matter for time or cost performance, are not the same as by using the logistic regression analysis. Use has also been made of neural nets for analyzing productivity with respect to different kinds of activities, such as pile construction (Zayed and Halpin 2005), earthmoving (Shi 1999), welding and pipe installation (AbouRizk et al. 2001), and formwork, concrete pouring and finishing (Sonmez and Rowings 1998). For the foregoing research work, no consensus was found as to the best structure of neural networks selected, e.g. number of hidden layers, connection type and functions of transfer and learning. Further, there was little consensus on the critical factors identified, although it is observed that many critical factors would be involved for analysis of productivity for different types of work. And after comparing the critical factors identified with the ones identified as having impact on productivity at the overall project definition level (Lu and Hermann 2001, Zhi et al. 2002), the consensus found was very limited. In general, the neural network technique can be useful for establishing a quantitative causal relationship between input and output variables. Compared with regression analysis, pre specifying a fitting function is not necessary, but the structure of neural networks for analyzing the data collected needs to be decided in advance. Specifically, two fundamental problems still remain: 1. one must identify the factors of interest, and 2. how best to express the values or states of each factor must be determined. This is especially challenging for factors which involve a subjective assessment of value. Thus, like the regression technique, subjectively pre-selecting potentially relevant factors actually might unintentionally filter out other potential causal factors, which will result in missing some critical factors. Another disadvantage for the construction user of a neural net model is lack of transparency, i.e. the weights associated with the connections  40  amongst the artificial neurons are hidden from the user. As to the potential explanatory ability, neural network models have the same problem as regression models, i.e. the causal relationships established are not necessarily definitive. This observation also applies to other modeling paradigms, e.g. experience-based causal models. For a specific case, if the factors in the input layer of a neural network model are not causative for the unsatisfactory performance outcome, then no further diagnostic analysis can be carried out based on the established model. Other than being a source of data used for generating a neural network model, the valuable experience of a practitioner cannot be used to adapt the model to a specific project context. As a methodology, the neural network technique is thought by the author as not being capable of being conveniently applied by construction personnel in daily practice, because determining the appropriate neural network structure including transfer and learning functions requires specialized knowledge, which is rarely possessed by construction practitioners, as well as sizeable data sets. 
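As an illustration of the structure described in Figure 2.4 and of why the technique is not transparent to a construction user, the following minimal sketch performs one forward pass through a tiny feed-forward network. The weights are arbitrary placeholders, not a trained construction performance model, and the input factors are hypothetical.

```python
# Minimal sketch (illustrative only): a forward pass through a small feed-forward
# network with one hidden layer and a sigmoid transfer function. Weights below are
# arbitrary placeholders, not the result of training on construction data.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Three hypothetical input factors, e.g. % design complete, turnover rate, meetings/month
x = np.array([0.8, 0.2, 0.5])

W1 = np.array([[0.4, -0.6, 0.3],     # weights: input layer -> two hidden neurons
               [0.1,  0.9, -0.2]])
b1 = np.array([0.05, -0.1])
W2 = np.array([[0.7, -0.5]])         # weights: hidden layer -> one output neuron
b2 = np.array([0.2])

hidden = sigmoid(W1 @ x + b1)        # summation plus transfer at each hidden neuron
output = sigmoid(W2 @ hidden + b2)   # predicted performance index in (0, 1)
print("hidden activations:", np.round(hidden, 3))
print("predicted output:", round(float(output[0]), 3))
```

The connection weights carry whatever "knowledge" the model has, but they are not meaningful to a practitioner reading them, which is the transparency concern raised above.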
Nevertheless, its application by the research community has provided useful insights into which factors possibly have significant influence on performance measures of interest, which can be helpful in proposing hypotheses for explaining actual construction performance.

2.5.3  Mathematical Models

With regard to predicting productivity at the activity or project definition level, the use of mathematical models has also been explored (refer to the fourth column of Table 2.1), in which the critical factors considered are always allocated indices or weights, the state values for these factors are estimated subjectively by seasoned practitioners, and then simple mathematical equations involving only +, -, *, / and ∑ operators are used to aggregate the impact of the factors to predict performance outcomes. These mathematical equations are experience-based, and differ from regression models because their significance cannot be tested in a statistical sense without a substantial set of data. Neil and Knack (1984) proposed such a mathematical model for predicting productivity. It took the form of: Productivity Multiplier = Area Productivity Index / (1 + ∑ Adjustment Factors), where the productivity multiplier is considered as an index of productivity to be compared with a base one, the area productivity index is an index relating a new area's population group to the population group of a base area, and adjustment factors are applied to any condition which may affect the base productivity (0 means the status of a factor is expected to be moderate and have no impact on the base productivity, and the larger the value is above 0, the more negative impact the factor has on the base productivity). Any factor can be included in this equation, and the example ones given in Neil and Knack's (1984) research include 'available craft force', 'crew work space', 'overtime', 'weather', and 'material availability', factors which resonate with front-line construction personnel. Woodward (2003) developed a different mathematical model for predicting productivity, and formulated it in the form of a worksheet to enhance its practical use. The model took the following form: Productivity = P_b × (1 + ∑ W_i × (Factor_i − 1)), where Productivity refers to the predicted value after the critical factors' influence has been considered, P_b is the base productivity, and W_i and Factor_i are the weight and expected state value for factor i. Some examples of the factors presented in the research are 'site access and egress of employees', 'Altitude', and 'Design completion at bid'. The base status for each is expressed relatively clearly and simply, such as 'parking and craft facilities adjacent to site', 'near sea level', and '50-60%'. If an undesired status is expected, for instance 'facilities more than 15 minute walk from work area', '6000-7000 feet above sea level', and '20-50% design completed at bid', then the corresponding factor values, 1.10, 1.20 and 1.30, will respectively be allocated to these factors. The final productivity estimate is then computed using the mathematical model given previously. While such a mathematical model is intuitively satisfying and its features are transparent to the user, the weights used in the model were not validated. Given the availability of sufficient data, regression analysis could be useful in this regard. The model also has the potential for assisting with the task of explaining productivity variance, but little work seems to have been done on this.
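A worksheet-style model of this kind is simple enough to express in a few lines. The following minimal sketch implements the formula quoted above with hypothetical weights and factor values; the entries are placeholders, not Woodward's (2003) published worksheet.

```python
# Minimal sketch (illustrative only) of the worksheet-style model described above:
# Productivity = Pb * (1 + sum(Wi * (Factori - 1))). Weights and factor values are
# hypothetical placeholders chosen for demonstration.
base_productivity = 1.00     # Pb: productivity index under base conditions

# factor name: (weight Wi, factor value Factori); 1.0 = base condition, >1.0 = undesired
factors = {
    "site access and egress":   (0.30, 1.10),
    "altitude":                 (0.20, 1.00),
    "design completion at bid": (0.50, 1.30),
}

adjustment = sum(w * (f - 1.0) for w, f in factors.values())
predicted = base_productivity * (1.0 + adjustment)
print(f"adjustment term: {adjustment:.3f}")
print(f"predicted productivity multiplier: {predicted:.3f}")
```

Substituting actual rather than expected factor values into the same calculation is what gives this class of model its modest retrospective, explanatory potential.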
Mathematical models of similar type for productivity can also be found in other research (Dieterle and Destephanis 1992, Ovarain and Popescu 2001, Jergeas and McTague 2002), but no research of this type has been found for other performance measures. Compared with statistical regression and neural network models, one similar  42  feature mathematical models have in common is that the critical factors involved in the models are all arranged in the form of a single layer, i.e. inter-causal relationships among the critical factors are assumed to be non-existent. Although mathematical models are experience-based and subjective in terms of factors considered and functional form used, they are much easier to develop, understand and use. Typically, most of them are developed by practitioners as opposed to academic researchers. There is no limitation on the number of critical factors that can be treated in mathematical models, i.e. they are readily extendable by users in terms of adding factors according to different construction context and personal experience. From the previous discussion on Woodward’s (2003) model, it is also noted that this kind of model allows for judgments regarding the base or anticipated states of critical factors on the part of users, which in turn allows the models to analyze performance prospectively or retrospectively (e.g. Ovarain and Popescu 2001), i.e. they can be used not only for prediction, but to some extent for explaining performance by using actual factor values realized - features highly desirable from a practical diagnosis perspective.  2.5.4  Expert Systems and Fuzzy Logic  An Expert System (ES) can be regarded as knowledge-based system, which can store knowledge of a particular domain and use that knowledge to solve problems from this domain in an intelligent way (Kriz 1987, Adeli 1988). The three indispensable modules of an ES are referred to as ‘knowledge base’, ‘inference engine’ and ‘user interface’ (Jackson 1990, Marshall 1990, Adeli 1988). A majority of ES applications developed in civil engineering are founded on a rule-based paradigm (Mohan 1990, Allen 1992), which uses rules in the form of IF (antecedence/condition) THEN (consequence/action) as the knowledge representation schema. Two main inference mechanisms applied in rule-based systems are forward chaining, which is data (antecedence/condition) driven and good for tackling planning or predicting problems (e.g. Perera and Imriya 2003, Lee and Halpin 2003 – as discussed later, these authors also made use of fuzzy logic), and backward chaining, which is goal (consequence/action) driven and often employed in diagnostic expert systems (e.g. Yates 1993 – her work has been put into the category of  43  explanatory decision support system and will be discussed later). Advantages often claimed for ES are: 1. ES are not based on black-box technique and can explicitly explain its behavior and inference results, 2. ES once established are easy to reproduce and distribute throughout an organization, 3. new knowledge with consensus can be gradually added to the knowledge base over an extended period of time to improve the ES performance, and 4. ES don’t produce biased judgments or results given a uniform or consensus domain knowledge (Adeli 1988, Terry 1991, Partridge and Hussain 1992). In general, ES work best when the problem is not very large (i.e. does not involve too many variables/factors) and the facts and rules associated with the problem can be clearly stated (Kasabov 1996, Heathcote 2004). 
However, uncertainty is an intrinsic characteristic of the construction industry in which projects and construction processes are unique and easily subject to an unusually wide range of factors and disturbances (Wild 2005). For example, Chan et al. (2004b) argued that, as compared to structural engineering problems, many construction management problems involve too many variables/factors and cannot be structurally well defined. More importantly, as found in the literature review results, little consensus exists as to what factors are the most important ones affecting construction performance and consensus knowledge as to the cause-effect relationships between factors and construction performance also does not exist. Thus, development of an ES at the level of the firm for performance diagnosis may have limited appeal. It has been observed that ES performance can deteriorate rapidly near the boundaries of encoded knowledge due to limited flexibility for knowledge and information updating (Terry 1991, Adeli 1988, Yang et al. 1996, O’Brien and Marakas 2007), which for the construction performance diagnostic problem we interpret to mean the rapid adaptation of knowledge to the project context at hand. Tullet (1996) found that from a cognitive science perspective, individuals can produce very different solutions to the same problem because of inflexible thinking styles, even though they are equally capable, and each one would be quite convinced that his/her solution is the right one. Considering the construction industry characteristic that the power of construction control is very decentralized and dispersed among independent project teams who very much rely on their own collective experience-based judgement, and combined with the observation that  44  an ES with a uniform knowledge base has no or limited ability to quickly adapt to individual experience-based judgement, it is asserted that an ES based approach to construction performance diagnosis would likely be unacceptable by practitioners. This is not to say that research on the use of ES for performance diagnosis should not be pursued, especially with regard to knowledge content. The main point here is that it is worthwhile to explore other approaches that allow more direct and rapid access by users to adapt knowledge to the context of a project, basically in real time. And, the complexity of the control or inference process used should reflect the kind of information/answers sought – i.e. elegant simplicity should be a goal, especially in light of the breadth of the performance diagnostic problem – several measures, many activities and components, etc. Having said the foregoing, there can be an intersection between approaches used – e.g. knowledge representation expressed in the form of rules is not unique to ES. As a technique capable of dealing with the subjective linguistic description of states (e.g. ‘moderate’, ‘high’, ‘very high’ etc.) of critical factors assumed to have impact on construction performance, and to facilitate approximate rather than precise reasoning, fuzzy logic has been adopted for activity delay analysis (e.g. Oliveros et al. 2005) or in conjunction with expert systems for predicting construction performance (e.g. Lee and Halpin 2003, Perera and Imriya 2003). Fuzzy logic techniques are derived from fuzzy set theory first introduced by Zadeh (1965). In essence, a fuzzy logic based expert system builds on the knowledge representation structures of a traditional rule-based expert system (Kasabov 1996). 
The key distinctions from a classical rule-based system can be described under three headings (Kasabov 1996, Horstkotte 2000, MathWorks 2009):

1. Fuzzification of variables, i.e. the use of membership functions to transform explicit values into grades of membership for the variables or factors considered. The methods often used to develop the membership functions include a horizontal approach, a vertical approach, and pairwise comparison. All are based on the opinions collected from a group of interviewees or domain experts;

2. Inference process, including: (2.1) using fuzzy rules in the form of "IF-THEN" statements wherein an individual's or group's experience-based knowledge for a specific domain can be represented; (2.2) applying fuzzy operations on the variables in the antecedent part of IF-THEN rules (Minimum or Product method for an 'AND' relationship, Maximum or Probabilistic-OR method for an 'OR' relationship); (2.3) implementing an implication process (Minimum method, which truncates the output fuzzy set, or Product method, which scales the output fuzzy set) to get a fuzzy set for each linguistic value of each variable in the consequent part of the rules; and (2.4) implementing a composition process (Maximum method or Summarization method) to get an aggregated fuzzy set for each variable in the consequent part of the rules;

3. Defuzzification of results, i.e. to get a single crisp value for a fuzzy output variable. The methods widely used for defuzzification include the center of gravity method, i.e. the centroid method, and the maximum method.

In terms of specific applications of fuzzy reasoning in conjunction with the use of an expert system, Lee and Halpin (2003) formulated a fuzzy logic based system for estimating accident risk, i.e. safety performance, in the context of the utility-trenching process. The system can be used to predict the probability of accidents in a utility-trenching operation with a shielding system. Among the seven subjective factors listed as being relevant, the three critical ones identified through interviewing seasoned practitioners were 'training for trenching', 'supervision' and 'preplanning'. Linguistic values used for describing a factor state were good, moderate, and poor. The fuzzy logic technique was also applied by Perera and Imriya (2003) to develop an expert system for predicting cost and time performance at the activity level of detail, in which 15 critical factors were included such as 'loss of material', 'labour skill', and 'gang size'. Linguistic terms used corresponded to 'very high', 'high', 'medium', 'low', and 'very low'. An example of an IF-THEN rule is "IF labor productivity is very low THEN cost overrun is very high, ELSE IF plant productivity is very low THEN cost overrun is very high". Articulating membership functions for fuzzy terms is the main problem in fuzzy logic knowledge representation (Kasabov 1996). Domain experts are supposed to specify the shape of membership functions, the number of labels, and so forth, but the observation is that most often the domain experts are unfamiliar with fuzzy sets or fuzzy logic (Kasabov 1996). Thus, asking individual construction practitioners to specify from scratch or customize membership functions on an ongoing basis is not practical, especially given the breadth of the performance diagnostic problem at the project level (e.g. the ability to treat a diverse range of activity types, not just a specific type of work).
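To make the fuzzification, implication and defuzzification steps concrete, the following minimal sketch evaluates one invented rule pair relating labour productivity to cost overrun. The membership functions, rules and numbers are illustrative placeholders, not those of Lee and Halpin (2003) or Perera and Imriya (2003).

```python
# Minimal sketch (illustrative only) of the fuzzy inference steps outlined above.
# Membership functions and rules are invented for demonstration purposes.
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function peaking at b."""
    return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

# Fuzzification: degree to which observed productivity (% of plan) is 'low' / 'high'
productivity = 62.0
mu_low  = tri(productivity, 40, 60, 80)
mu_high = tri(productivity, 70, 90, 110)

# Output universe: cost overrun (%) with 'high' and 'low' fuzzy sets
overrun = np.linspace(0, 30, 301)
overrun_high = tri(overrun, 10, 20, 30)
overrun_low  = tri(overrun, 0, 5, 15)

# Implication (Minimum method truncates each output set), composition (Maximum method)
aggregated = np.maximum(np.minimum(mu_low,  overrun_high),   # IF productivity low  THEN overrun high
                        np.minimum(mu_high, overrun_low))    # IF productivity high THEN overrun low

# Defuzzification by the centre of gravity (centroid) method
crisp = np.sum(overrun * aggregated) / np.sum(aggregated)
print(f"membership low={mu_low:.2f}, high={mu_high:.2f}, estimated overrun = {crisp:.1f}%")
```

Even in this toy form, someone had to choose the triangular shapes, their breakpoints, and the implication, composition and defuzzification methods, which is exactly the burden the text above argues cannot reasonably be placed on project personnel.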
Singpurwalla and Booker (2004) also asserted that membership functions are usually very subjective, and that achieving consensus on them is very difficult. Viewed in light of the distinct characteristics of the construction industry discussed in Chapter 1, the generality or applicability of membership functions to all projects is very debatable. In addition to the foregoing, it is noted that there are several different fuzzy methods for implication, composition and defuzzification that can be used in fuzzy based systems. Freitas (2002) believed that there is no consensus on which is the "best" one, and which one should be chosen is not an easy question to answer. Results can differ depending on the choice. For example, Horstkotte (2000) noted that results can be quite different depending on the defuzzification method used. How best to make the choice for the construction performance diagnostic problem is simply not clear. Another observation for the fuzzy logic models reviewed is that they usually involve a limited number of rules and a fixed structure, and the user is rarely allowed to add or modify system rules according to his or her own experience-based knowledge and specific construction context. Almost without exception, the rules used reflect only immediate causal relationships among the subjectively selected critical factors for the performance measure studied, i.e. possible interactions among the factors themselves or causal chains are seldom expressed in the fuzzy rules. In this sense they are no different than the single layer cause-effect relationships found in the reviewed regression and neural network models (as well as the experience-based causal models described in this thesis). In essence, statistical regression, neural network, fuzzy logic expert system, and mathematical model approaches, as applied in the construction literature, all share three things in common: 1. the factors examined are subjective in that they are based on experience and/or the work of others; 2. performance models in most cases are single layer based; and 3. these models lack broad generality as they are usually derived for a specific context or type of work.

2.6  Research on Developing Explanatory Construction Performance Models

Predictive performance models can be made use of to a certain extent to help explain actual performance. But as discussed previously, it is very possible to find that for a particular performance deviation the limited number of critical factors in a predictive model cannot explain the deviation, i.e. the model is "incomplete", a reality confronting all predictive models, especially higher level ones. Thus, some researchers have focused their attention on developing special explanatory models, aimed primarily at helping people dig out the most plausible causes for a particular performance deviation observed, and for a specific type of work. Such models could be important for improving the efficiency and effectiveness of real construction control practice, provided that sufficient breadth of coverage of different project and work types could be achieved. However, based on the literature review, it is observed that the number of research efforts in this category (last column of Table 2.1) is far smaller than the number of efforts focused on developing predictive models (fourth column of Table 2.1).
To date, conceptual frameworks and Bayesian networks have been explored to establish explanatory models, but the use of decision support systems appears to be the dominant choice for establishing explanatory models. And yet, there is still much room for improvement in terms of the explanatory models developed to date, including an overall framework in which to embed performance models. Search for such improvement is the goal of this thesis.

2.6.1  Conceptual Framework

In research, a conceptual framework is often used to outline a series of courses of action or components, and the relationships amongst them are usually represented by arrows. Applying this approach to the domain of construction performance diagnosis can make the investigation process more structured and help the target audience to develop a clearer understanding of how to diagnose a particular performance deviation observed. Maloney (1990) proposed a general framework including 5 steps (Figure 2.5) to analyze various performance measures (e.g. productivity, time, cost, quality, and safety) at the activity definition level. In order to draw a conclusion for each step, the importance of finding relevant supportive data evidence was strongly emphasized by the author, who claimed that "by using this framework, the real cause of poor performance can be identified more quickly and actions to eliminate the problem can be devised".

[Figure 2.5 Performance Analysis Flowchart (Modified from Maloney 1990): a five-question decision sequence for evaluating unfinished work - 1. Does actual performance equal expected performance? 2. Is the work environment free from organizationally imposed constraints? 3. Do the workers possess the necessary knowledge, skills, and ability? 4. Do the workers possess the necessary motivation? 5. Is the estimate realistic? - with outcomes leading to actions such as improving job management, labor training, changing job context, adjusting the estimate, or changing technology and/or methods. Data collected in support of different management functions is necessary for doing these evaluations.]

As to figuring out causal factors for accidents, i.e. safety performance at the activity definition level, Suraji et al. (2001) posited a conceptual causal framework in which causal factors are classified as distal or proximal. Through the detailed study of hundreds of actual accident records, relationships among the components in the conceptual framework were validated and the detailed distal and proximal factors in each component were also identified. The authors asserted that "application of the model would help to structure future accident investigations to improve the comprehensiveness of accident causation data". Two other conceptual frameworks developed respectively for diagnosing productivity and overall project performance were also identified and reviewed (Halligan et al. 1994, Fortune and White 2006). In general, although no consensus was found regarding the structure of the reviewed conceptual frameworks, almost all of the authors emphasized the importance of seeking out data evidence in support of explanations. Under the guidance of the frameworks proposed in the literature, users are supposed to be able to systematically figure out the causes for a particular performance deviation.
However, without the support of information technology, searching out evidence in the huge amount of data that accompanies a construction project is very difficult and time consuming. In addition, a seasoned practitioner's valuable knowledge as to performance diagnosis cannot be readily or flexibly captured in the conceptual frameworks described in the literature.

2.6.2  Bayesian Networks

Another technique used by some researchers to develop explanatory construction performance models is the Bayesian network, which is also referred to as a belief network. Bayesian networks are directed acyclic graphs whose nodes represent variables (factors, performance outcomes), whose arcs signify the existence of direct causal influences between the linked variables, and the strengths of these influences are expressed by forward conditional probabilities (Pearl 1988). Such networks can be used to calculate the probability of occurrence of certain causal factors, given the absence or presence of certain performance symptoms. The calculation is primarily based on Bayes's theorem P(B|A) = P(A∩B)/P(A), where P(B|A) is the probability of occurrence of event B given the fact that A has occurred, P(A∩B) is the probability of occurrence of both A and B, and P(A) is the probability of occurrence of event A. The main steps involved in developing a Bayesian network include 1. define the relevant variables (again it is left to the user to identify the relevant factors as a function of the performance measures of interest and the particular type of work, however broadly or narrowly defined); 2. define the relationships between the variables; 3. define the states of the variables (this step requires defining the detail level of the system); and 4. define the conditional probabilities of the relationships (Poole et al. 1998). McCabe et al. (1998) applied this technique to study how to diagnose construction time and cost performance. An example of Bayesian network analysis presented in that research is to determine how much confidence the user has to conclude that there are too many trucks given the observation that the productivity is acceptable and the road surface has been damaged. The causal relationships among the considered factors in the Bayesian network are shown in Figure 2.6. The conditional probabilities associated with these factors and the detailed Bayesian network calculation process can be found in McCabe et al. (1998).

[Figure 2.6 Example Bayesian Network (Modified from McCabe et al. 1998): nodes for number of loaders, number of trucks, road surface condition, queuing time and productivity, linked by 'affects' arcs.]

The Bayesian network technique was also made use of by McCabe and AbouRizk (2001) for studying simulated construction operations. It has the potential to be good at quantitatively diagnosing performance deviations. The ability to flexibly add/delete causal factors and adjust corresponding conditional probabilities without redesigning the system is the chief advantage of belief networks over other forms of artificial intelligence, such as neural networks. However, there are also some obvious drawbacks with respect to its practical application.
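Before turning to those drawbacks, the following minimal numerical sketch illustrates the Bayes-theorem update that underlies belief-network diagnosis. The events and probabilities are hypothetical, not taken from McCabe et al. (1998); a full network would propagate updates of this kind across every node shown in Figure 2.6.

```python
# Minimal sketch (hypothetical probabilities): posterior belief in a causal factor
# given an observed symptom, computed from Bayes's theorem.
p_cause = 0.20                 # prior P(too many trucks on the haul road)
p_symptom_given_cause = 0.70   # P(road surface damaged | too many trucks)
p_symptom_given_not = 0.10     # P(road surface damaged | not too many trucks)

# Total probability of observing the symptom
p_symptom = (p_symptom_given_cause * p_cause
             + p_symptom_given_not * (1.0 - p_cause))

# Posterior belief in the cause given the symptom: P(A|B) = P(A and B) / P(B)
posterior = p_symptom_given_cause * p_cause / p_symptom
print(f"P(symptom) = {p_symptom:.2f}, P(cause | symptom) = {posterior:.2f}")
```

Even this single update depends on conditional probabilities that must be supplied from somewhere, which leads directly to the practical difficulties discussed next.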
Although expert opinion on probability can be used instead of using data from other projects to get the required conditional probabilities (virtually a necessity given the lack of detailed data from previous projects), the major disadvantage of incorporating expert opinion into such networks is the general lack of understanding of probability theory by practitioners (McCabe et al. 1998). This has prevented the practical use of Bayesian networks in the construction industry. As a further impediment to its use, it is required that all variables in Bayesian networks have at least a binary value, i.e. true or false. If variables have more than two state values, which is common for potential critical factors influencing construction performance, then many more conditional 51  probabilities are required, making the approach intractable because of the very large number of probabilities that must be estimated. Finally, although the Bayesian network model may be able to present the probability of occurrence of the casual factors hypothesized to affect performance, it does not provide any function to identify project data evidence in support of the conclusion offered.  2.6.3  Decision Support System  Use has also been made of a decision support system as a method for developing explanatory construction performance models. As a method it involves a number of techniques responsible for data input/output, data processing and validation, reasoning and suggestion offering. No one dominant technique can be singled out for categorizing the approach, hence just the broad label DSS (decision support system) is used. As noted from the last column of Table 2.1, this approach is most preferred by researchers for developing explanatory models. Roth and Hendrickson (1991) discussed the potential for including automated explanatory capabilities in a project management system, including natural language generation and graphics for presenting the explanation results. The decision support system prototype developed by them was applied to diagnose a cost deviation detected at the project definition level, which under the guidance of a cause-effect diagram allowed more detailed variances at the trade and activity definition levels to be automatically tracked down. But that system could not provide explanations for the most detailed variances identified, i.e. what actual causal factors happened and with what state value. Further, no ability was included to access data from other management functions (e.g. schedule control and change order management) to help explain the cost deviations detected. Another automated management exception reporting system was developed by AbuHijleh and Ibbs (1993) to assist in explaining deviations of labour cost and schedule from planned results by exploring quantitative causal relationships amongst variables such as unit labour cost, unit productivity, and labour unit price. The variance analysis path diagram made use of in the system also takes the form of a cause-effect diagram.  52  However, in the decision support system described by Abu-Hijleh and Ibbs (1993) only a very limited number of causal factors (crew mix and absenteeism) were included for further examination to explain the deviations observed. Neither of the two DSS systems described in the foregoing makes allowance for inserting the user’s experience based knowledge to assist in diagnosing performance. Moselhi et al. 
(2004) established a web-based decision support system for diagnosing construction cost and schedule performance. After detecting the sub variances under the guidance of a cause-effect diagram, the user is able to select causal factors (e.g. bad weather conditions, inferior labor productivity, etc.) they believe to be able to explain the variances, according to his or her own experience-based knowledge and particular construction context. However, the approach is not accompanied by a function for searching the project database to find corroborating evidence that justifies the user’s choice of causal factors. Without such a function, it is difficult to have confidence in confirming the occurrence of hypothesized causes. Yates (1993) constructed a decision support system which in essence is a rule-based expert system. A knowledge base as to what the causes could be for time delays for different types of activities was established by using a questionnaire survey and performing statistical tests on the survey data. Given that a time delay has been encountered for an activity, the DSS/ES suggests what causal factors could be the reasons for the time performance along with corresponding corrective actions to take. However, the rules in the knowledge base cannot be flexibly modified by the users to adapt their experience-based diagnostic knowledge and very different specific project contexts. Dissanayake et al. (2008) also proposed a causal-relationship performance reasoning approach which made use of AHP, fuzzy set theory and neural networks. However, these techniques are neither transparent to or within the skill set of most construction personnel, thereby severely limiting their use by most firms. As well, the approach described was very limited in scope and lacked broad applicability. In general, all of the decision support systems reviewed are supposed to be able to explain a particular performance deviation, which cannot be achieved by the predictive models discussed previously. It is also observed that use of a cause-effect diagram with quantitative causal models is widely accepted in developing explanatory construction  53  performance approaches. But besides diagnosing variances associated with various variables which usually have demonstrable causal relationships with the performance measure of interest, how best to include and diagnose the potential range of causal factors involved has yet to be fully explored, especially in terms of a general approach. Further, no explanatory model developed to date has focused research attention on how to capture and readily reuse a practitioner’s personal experience based performance diagnosis knowledge, and how to extensively and effectively search a project’s database to find supporting evidence to corroborate user-hypothesized causal explanations. Thus, despite considerable work to date by others, a pressing need still exists for developing an effective, efficient and practical performance diagnosis approach that covers the broad spectrum of activities required to realize a project.  2.7  Summary  In summary, through the observation of actual control practice, it is noted that diagnosing unsatisfactory construction performance relies strongly on a practitioners’ personal, experience-based knowledge. The burden for them to search out data evidence in support of their hypotheses to explain performance is very heavy. Currently, mainstream construction planning and control tools (e.g. 
MS Project, Primavera P3) offer very limited assistance in diagnosing performance and pinpointing reasons for it. A significant number of research efforts conducted by the academic community to identify critical factors affecting construction performance outcomes and to develop construction performance models have been carefully reviewed herein. The primary review findings are summarized as follows.

2.7.1  Findings on Identified Critical Factors

No matter what technique is made use of, as to the critical factors identified in the literature it was found that:

For any construction performance measure of interest, the number of critical factors having potential impact is large, but consensus on the factors has not been achieved. This finding is consistent with the conclusions obtained by Chan et al. (2004) and Fortune and White (2006).

The critical factors identified lack clarity in their definition. In some cases different terminology is used to express the same factor or different factors are expressed using similar terminology. Through reviewing 63 publications focusing on studying critical success factors, Fortune and White (2006) also arrived at the same conclusion: "It should be noted that in a number of articles, factor definitions were unclear".

No consensus exists as to the kind of data most suitable for expressing the state values for critical factors. The ability to clearly express the states of critical factors is necessary for confirming that one or more of them is the cause for a particular performance deviation detected. Unfortunately, most research articles only identified critical factors, but did not discuss how to express their states. In some cases, factors and their values (when available) advocated by researchers did not correspond to data collected in support of day-to-day management. This lack of correspondence may impede the practical application of the construction performance models developed by researchers.

2.7.2  Findings on Established Construction Performance Models

As to the construction performance models reviewed, be they predictive or explanatory, the primary findings are:

Predictive models are predominant compared with explanatory models in terms of number, and are more big picture in scope as opposed to considering the range of events that take place over time on an actual project. Thus, they offer limited assistance during the execution of a project.

Most models include one layer of critical factors or parameters. They either implicitly or explicitly assume independence of the factors or minimal interdependence. The topic of possible cause-effect relationships among the critical factors is rarely discussed in detail. Despite this apparent criticism, the same approach is used for formulating experience-based causal models as described later in the thesis. This is because of the almost intractable complexity of trying to formulate multi-layer models.

No consensus has emerged as to the best method for developing construction performance models, but statistical regression and neural networks are the two techniques most frequently used to develop predictive models, and use of a decision support system is the most popular mode for developing explanatory models. It is observed that cause-effect diagrams in some form have been used in almost all the explanatory models.

There is little evidence of extended practical use of the models developed to date. Several reasons for this are suggested, including 1.
many models are assumed to be research as opposed to operational tools and have not fully taken distinct construction industry characteristics into consideration, 2. there can be a conflict between the informal hypotheses about performance, which practitioners have developed based on years of experience, and the formal causal relationships embedded in the various models presented in the literature, lessening the likelihood of adoption of the models, 3. most construction practitioners are believed to not have the knowledge and resources required to make use of the aforementioned techniques or tailor them to suit their particular needs/project context effectively in real time, and 4. most models lack generality in terms of treating the spectrum of activities that make up a project. Finally, further consideration of past work in terms of forming a framework for a holistic diagnostic approach and/or a basis for one or more components of the approach (e.g. reasoning) is provided in Chapter 3. This assessment is made in terms of several of the tests set out in Chapter 1 and their elaboration in terms of desired properties in Chapter 3.

3  Overview of Proposed Diagnostic Approach

3.1  Chapter Overview

The primary focus of this chapter is on: (a) discussing a set of general properties desired for a practical performance diagnostic approach; (b) evaluating strengths and weaknesses of the primary techniques explored by researchers to date to develop construction performance models in terms of the desired properties, and also for the decision support system explanatory models reviewed; (c) introducing key concepts relevant to data integration and knowledge management; (d) discussing causal model related theories with an emphasis on the necessary conditions for inferring actual causes and how to structure causal relationships in the form of causal models for diagnosing construction performance; and (e) presenting the overall causal model based schema of the diagnostic approach proposed in this thesis, as well as emphasizing the differences between rule-based and model-based reasoning.

3.2  Desired Properties of a Versatile Diagnostic Approach

After considering: (i) the review findings listed at the end of the previous chapter; (ii) observations of industry practice and the availability of management resources in practice (e.g. personnel skills and available time); (iii) the diversity of projects and the very large amounts of data in heterogeneous format generated and collected in support of construction management functions; and (iv) a very strong desire to try to influence construction management practice in the near term, some general properties believed to be central to the development of a practical yet more effective construction performance diagnostic approach than heretofore developed were identified. These properties have already been identified briefly in Chapter 1 in the form of tests that should be applied to assess the merits of any proposed performance diagnostic approach. They are described in more detail as follows: Generality: To the extent possible, the structured diagnostic process of the approach should be equally applicable to (I) as many project performance measures as possible, and to (II) different project types and activities. Clearly, different performance measures have their own respective characteristics. For instance, ones such as cost and time are easier to measure objectively and quantitatively than others such as quality and safety.
Nevertheless, for any performance measure it is observed that there is no fixed and absolutely comprehensive set of causal relationships with which to diagnose the root causes for any deviation observed. Inevitably, it is the seasoned practitioners’ subjective experience-based knowledge which is relied heavily upon to reason the causes, aided to the extent possible by data gathered in the course of executing the project. This situation applies to all types of construction projects. The author believes that it is possible to formulate a general structured diagnostic process that can be applied to most key performance measures and different project types and activities, and has the capability to take advantage of practitioner experience-based knowledge to assist in performance diagnosis. Integration & Data: During the course of a construction project, the amount of data generated and collected for various construction management functions is very large. One of the difficult tasks currently faced by the industry is how to extract the information desired from such a large amount of dispersed data, and quickly, to help explain actual performance. Thus, (III) it is necessary to base a performance diagnostic approach on an integrated construction information environment that makes it possible to search out data from different construction management functions as performance diagnostic evidence. Further, the time available for construction practitioners to collect and process data and extract lessons learned on a daily basis is very limited. There is no appetite for collecting more data unless it can be demonstrated that the incremental value in doing so far exceeds the cost of collecting it. Thus, for a construction performance diagnostic approach to have a chance of success in practice, (IV) use should be made of data already collected in support of essential day-to-day management functions, with no requirement for the collection of data needed solely by the diagnostic approach. It is acknowledged that adopting more rigorous management processes in support of day-to-day management  58  functions would also be helpful. Meant by this is the adoption of and adherence to consistent management processes for as many construction management functions as possible. Transparency & Ease of Use: Since performance measures such as cost and time can be quantitatively calculated, so the deviation of these performance measures at a higher project definition level can be attributed to differences between the planned and actual values of the variables involved in the relevant quantitative causal relationships. A quantitative causal relationship/model refers to basic equations (e.g. activity duration = scope / (productivity • resource_usage)) or formal logic models (e.g. CPM network), with the assumption being that the basic model (e.g. CPM network) and variable values are correct. Thus, (V) full advantage should be taken of quantitative causal models to help pinpoint the variables of concern for further investigation. For instance, a project’s cost deviation must come from one or more deviations associated with variables linked to labor, equipment and material cost or scope. Then deviation in a higher level variable, e.g. labour cost, can be broken down to individual work packages or activities, and/or contributions to labour cost – e.g. man hours used, unit rate, etc. 
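To make the preceding idea concrete, the following minimal sketch uses the kind of quantitative causal relationship just mentioned to localize a time deviation before any experience-based reasoning is needed. The planned and actual values are hypothetical, not drawn from the case study in this thesis.

```python
# Minimal sketch (hypothetical numbers): localizing a deviation with a quantitative
# causal model, activity duration = scope / (productivity * resource_usage).
def duration(scope, productivity, resource_usage):
    """Workdays required: scope in units, productivity in units/crew-day, crews deployed."""
    return scope / (productivity * resource_usage)

planned = {"scope": 1200.0, "productivity": 30.0, "resource_usage": 2.0}   # 20.0 days
actual  = {"scope": 1300.0, "productivity": 24.0, "resource_usage": 2.0}   # ~27.1 days

print(f"planned duration: {duration(**planned):.1f} days")
print(f"actual duration:  {duration(**actual):.1f} days")

# Attribute the deviation by substituting one actual value at a time
for name in planned:
    trial = dict(planned, **{name: actual[name]})
    shift = duration(**trial) - duration(**planned)
    print(f"effect of actual {name}: {shift:+.1f} days")
```

In this illustration most of the slippage traces to the productivity variable, which is exactly the kind of causal variable variance that then needs further, experience-based explanation.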
Taking advantage of quantitative causal relationships helps construction personnel to narrow the focus down to what actually needs further explanation. Further, a performance diagnostic approach has to be capable of being used by management personnel on an ongoing basis. To achieve this, it means that (VI) the structured diagnosis process embedded in the overall approach should be transparent and intuitively understandable for practitioners, and (VII) no requirement should exist for the users to run complex analytical tools for which they lack the necessary training and require specialist assistance. Flexibility & Customization: It is observed that for performance measures of productivity, safety and quality, no quantitative causal relationships can be taken advantage of. Some causal variables in the quantitative causal models mentioned earlier may also be functions of complex, basically unknown or imperfectly understood causal relationships. In this situation, practitioners' experience-based knowledge plays an important role in figuring out further causal factors. Along with considering the characteristic of decentralized decision making in the construction industry, a performance diagnostic approach should (VIII) permit the capture, expression and reuse of an individual's or firm's expertise and knowledge as to the possible reasons for the performance deviations observed, and (IX) allow the flexible and easy revision or modification of such experience-based knowledge by practitioners working on isolated projects with very different construction contexts.

3.2.1  Comparison of Techniques Used to Develop Construction Performance Models

In the previous chapter, the strengths and weaknesses of the different techniques used to develop predictive or explanatory performance models formulated to date by academics and practitioners were discussed. These same techniques are compared here in terms of the properties identified in the previous section. The results of this comparison are shown in Table 3.1, from which it can be seen that all of these techniques satisfy some of the properties identified, but not one satisfies all of them. One conclusion that can be drawn from this is that no one single technique by itself is sufficient for developing a practical, user-oriented diagnostic approach. A general diagnostic framework should be able to accommodate a number of techniques.

3.2.2  Comparison of Decision Support System Based Explanatory Models

Besides the performance models developed using only one of the techniques identified in Table 3.1, some explanatory models developed are decision support system based, in which multiple techniques, but no dominant one, are involved. These explanatory models were also compared with each other in terms of the properties identified. The results are shown in Table 3.2. From an examination of the table, it can be observed that these explanatory models typically possess more than one of the foregoing properties desired, but typically not all of them. Hence the challenge remains of formulating a diagnostic approach that reflects all of the properties identified.
Table 3.1 Comparison of Techniques Used to Develop Performance Models in terms of Identified Properties

Properties compared: Generality (I. Performance measures; II. Project types and activities); Integration & Data (III. Use with an integrated data platform; IV. No need to collect extra data); Transparency & Easy Use (V. Use of quantitative causal models; VI. Transparent and intuitively understandable; VII. No need for specialist assistance to use); Flexibility & Customization (VIII. Capture experience-based knowledge; IX. Flexible and easy modification).

Statistical Regression
Generality: I. Can be used to study most key performance measures (+). II. Can be used to study different types of projects and activities (+).
Integration & Data: III. Can be embedded in an integrated data platform (+). IV. Can be used to explore data already collected (+).
Transparency & Easy Use: V. Doesn't explore quantitative causal relationships (-). VI. It is a transparent technique and the result is easy to understand (+). VII. Need statistics related knowledge to use it.
Flexibility & Customization: VIII. It can capture experience-based knowledge only in the sense that the expert selects the factors for analysis. IX. Once a statistical regression model is established, it cannot be flexibly modified.

Neural Networks
Generality: I. Can be used to study most key performance measures (+). II. Can be used to study different types of projects and activities (+).
Integration & Data: III. Can be embedded in an integrated data platform (+). IV. Can be used to explore data already collected (+).
Transparency & Easy Use: V. Doesn't explore quantitative causal relationships (-). VI. It is a black-box technique and the analysis process is not transparent. VII. Need NN related knowledge to use it.
Flexibility & Customization: VIII. It can capture experience-based knowledge only in the sense that the expert selects the factors for analysis. IX. Once a NN model is established, it cannot be flexibly modified.

Mathematical Models
Generality: I. Can be used to study most key performance measures (+). II. Can be used to study different types of projects and activities (+).
Integration & Data: III. Can be embedded in an integrated data platform (+). IV. Can be used with data collected, but cannot extensively search the data.
Transparency & Easy Use: V. Doesn't explore quantitative causal relationships (-). VI. The analysis process is transparent and the result is easy to understand (+). VII. Easy to use without needing user specific knowledge (+).
Flexibility & Customization: VIII. Can flexibly capture knowledge as to what factors are important and set corresponding thresholds (+). IX. User can flexibly modify mathematical models in terms of adding/deleting factors and changing their thresholds (+).

Fuzzy Logic
Generality: I. Can be used to study most key performance measures (+). II. Can be used to study different types of projects and activities (+).
Integration & Data: III. Can be embedded in an integrated data platform (+). IV. Need to collect information to construct fuzzy membership functions.
Transparency & Easy Use: V. Not made use of to explore quantitative causal relationships (-). VI. The reasoning process is not transparent and intuitively understandable (-). VII. Need specific knowledge to establish fuzzy membership functions.
Flexibility & Customization: VIII. Can capture experience-based knowledge in fuzzy rules (+). IX. Have low flexibility to customize fuzzy membership functions and rules once they are established.

Bayesian Networks
Generality: I. Can be used to study most key performance measures (+). II. Can be used to study different types of projects and activities (+).
Integration & Data: III. Can be embedded in an integrated data platform (+). IV. Need priori probabilities which are not collected in daily practice (-).
Transparency & Easy Use: V. Not made use of to explore quantitative causal relationships (-). VI. The analysis is transparent, but the computation process is not intuitively understandable (-). VII. Need probability related knowledge to construct and use the networks.
Flexibility & Customization: VIII. Can capture user knowledge as to what factors are important (+). IX. Not easy to customize an established network because priori probabilities are needed to change the networks.

(Note: "+" indicates the technique is suitable for achieving the corresponding desired property, and "-" means not.)
Table 3.2 Comparison of DSS Based Explanatory Models in terms of Identified Properties

Properties compared: Generality (I. Performance measures; II. Project types and activities); Integration & Data (III. Use with an integrated data platform; IV. No need to collect extra data); Transparency & Easy Use (V. Use of quantitative causal models; VI. Transparent and intuitively understandable; VII. No need for specialist assistance to use); Flexibility & Customization (VIII. Capture experience-based knowledge; IX. Flexible and easy modification).

Roth and Hendrickson (1991)
Generality: I. Performance measure studied is cost. The general structure of the system can also be applied to time performance, but not other measures like productivity and quality etc. II. There is no limitation on the project types and activities that the system can be applied to.
Integration & Data: III. The system is not built on an integrated data platform. Thus, data supporting other management functions were not accessed and explored in the system to help with a deeper diagnosis of reasons for cost variances. IV. However, use of the system doesn't require extra data that are not collected in daily construction management practice.
Transparency & Easy Use: V. Available quantitative causal relationships were explored for the performance measure of interest (cost) in the system. VI. The reasoning process embedded in the system is transparent and the results are intuitively understandable. VII. Using the system doesn't require any particular knowledge most construction personnel don't have.
Flexibility & Customization: VIII. No experience-based knowledge for reasoning deeply about causal factors can be captured in the system. Thus, no further causal factors for cost variances could be figured out. IX. Users cannot flexibly express and modify their own experience-based performance diagnostic knowledge in the system.

Abu-Hijleh and Ibbs (1993)
Generality: I. Labor cost and time performance (L$ or Mhrs) were studied. But the system's structure cannot be applied to diagnose causes for other performance measures like productivity, safety, and quality. II. There is no limitation on the project types and activities for which the system can be used to help diagnose labor cost and time performance.
Integration & Data: III. The system is not built on an integrated information platform. Thus, data supporting other management functions such as change orders were not accessed and explored in the system to help with further diagnosis of reasons for labor cost and time variances. IV. Using the system doesn't require extra data not collected in daily construction management practice.
Transparency & Easy Use: V. In the system quantitative causal relationships for labor cost and time performance were explored to narrow the focus down to what needed to be further explained. VI. The reasoning process in the system is transparent and easily understandable. VII. Using the system doesn't require knowledge about complex analysis tools.
Flexibility & Customization: VIII. No personal experience-based knowledge for reasoning about causal factors for labor cost and time performance could be captured in the system. Except crew mix and absenteeism, many relevant factors were not addressed. IX. In the system the user is not given any flexibility to modify their own experience-based performance diagnostic knowledge.
Transparent and intuitively knowledge Models activities IV. No need to collect extra data understandable IX. Flexible and easy modification VII. No need for specialist assistance to use I. In essence, this is a rule- III. The system is not built on an V. A schedule performance index along with VIII. An experience-based knowledge Yates (1993) a performance index reflecting construction base as to what possible causes could based expert system. Time integrated data platform. Only performance is the focus of limited construction management efficiency were used to indicate an activity’s be for time delays was established by using a survey and performing this system, but in general information can be examined. IV. actual time performance directly. No the structure of the system Use of the system on daily basis quantitative causal relationships were used. statistical tests on the survey data. IX. VI. Statistical analysis on a survey was used User is not allowed to flexibly can be applied to analyze does not impose additional data collection requirements. to establish the knowledge base containing hypothesize possible casual factors other key performance possible causes and corrective measures and also not be able to modify the measures. II. The project relative to type of construction, type of established knowledge base according types that the system can contract, geographical location, and dollar to his or her own experience-based be used to help diagnose volume of project. The knowledge base, performance diagnostic knowledge. time performance include which was not transparent, might be buildings, power, inconsistent with the user’s firmer personal heavy/light industrial etc. knowledge. VII. The system is easy to use and no specialist assistance is required. III. The system is built on an V. No quantitative causal relationships were VIII. The user can select causal Dissanayake et I. Productivity explored in the system since the performance factors according to personal performance was studied. integrated construction al. (2008) The framework used could management information platform. measure studied was productivity, which has experience-based knowledge, and be applied to reason about Use was made of data collected in no corresponding quantitative causal model. which correspond to data already support of different management VI. The reasoning process involves black box collected. IX. User can also flexibly deep casual factors for modify his or her own selection of the functions. IV. However, additional technique, e.g. neural nets, and is not other performance casual factors in different construction measures like safety, etc. data for AHP and fuzzy logic (e.g. transparent to the users. VII. AHP, fuzzy logic and neural networks were involved in contexts. II. There is no limitation data for constructing fuzzy membership functions) was the system, which are not within the skill set on the project types and of most construction personnel. Daily use of activities that the system required by the system. these techniques requires specialist can be used to help assistance. diagnose productivity performance.  Cont’d  63  Properties Generality Integration & Data Transparency & Easy Use Flexibility & Customization DSS-Based I. Performance Measures III. Use with an integrated data V. Use of quantitative causal models VIII. Capture experience-based Explanatory II. Project types and platform VI. Transparent and intuitively knowledge Models activities IV. No need to collect extra data understandable IX. 
Flexible and easy modification VII. No need for specialist assistance to use III. The system is not built on an V. As to performance measures of interest the VIII. Personal experience–based I. Cost and time Moselhi et al. knowledge could be captured through performances were studied. integrated data platform. Data used quantitative causal relationships that (2004) comprise an earned value system were used. allowing the user to hypothesize causal were mainly from cost (earned The system’s overall factors to explain the time and cost VI. The process for identifying performance structure could be used to value) and time management functions. The ability to access and variances and causal factors was transparent and variances observed, but no supporting reason about other performance measures like explore data collected in support of easy to understand. But fuzzy logic was involved evidence could be searched out to productivity or safety. II. other management functions was in identifying corrective actions, which requires prove or disprove such hypotheses in the system. IX. User has the flexibility There is no limitation on not provided. IV. Data used by the user to have related knowledge to ensure system are daily collected. Use of understanding the relevant reasoning process. to hypothesize different casual factors the project types and in different construction contexts. activities that the system the system does not require extra VII. The system is easy to use and no specialist assistance is required in using it to can be used to help explain data collection. help explain performance measures of cost and time performance. interest.  End  64  3.3  Data Integration for Performance Diagnosis  In practice, a construction company constantly generates a great amount of data to support its daily operations, which are distributed across various functional databases and typically maintained separately by the various knowledge specializations within a firm. As a result, data sharing can be difficult between management functions (Rujirayanyong and Shi 2005). In order to satisfy the diagnostic approach property of access to data generated/collected in support of essential daily management functions and make sure that there is data consistency amongst the management functions, an integrated information environment allowing easy access to various heterogeneous data is desired. In the past, significant research focused on data integration related topics has been carried out (Rujirayanyong and Shi 2005, Owolabi et al. 2006, Halfawy et al. 2007), but little has been written about the advantage such integration can offer to performance diagnosis work. This lack of access to data across a range of construction management functions has limited the usefulness of the various schema proposed to date (see Table 3.2).  3.3.1  General Approaches for Data Integration  Data integration is the problem of combining data residing at different sources, and providing the user with a unified view of these data (Lenzerini 2002). Computer information systems are usually constructed on a multiple layered architecture, normally including data, business specific and user interface layers. Choices exist as to at which layer or layers data integration could be sought. Ziegler and Dittrich (2004) distinguished general integration approaches according to the level of abstraction where data integration is performed. As shown in Figure 3.1, at least six different general approaches for data integration are available. 
(1) manual integration at the user's layer, in which users directly interact with different information systems with different user interfaces and manually integrate selected data;
(2) common user interface, in which users can access data in different information systems through a uniform look-and-feel user interface (e.g. a web browser), but the data is still separately presented and integration work has to be done manually by the user;
(3) integration by application, in which an integrated software application is used to access various data sources dealt with in separate information systems and return integrated information;
(4) integration by middleware, where middleware is a type of computer software used to connect and support distributed applications to achieve the objective of data integration;
(5) uniform data access, in which a logical integration of data is accomplished at the data access level; through the uniform access layer, physically distributed data resources can be easily accessed; and
(6) common data storage, where physical data from several sources are extracted, transformed and loaded (ETL) into a common data storage place (e.g. a data warehouse), which then can be accessed by different applications, to achieve the objective of data integration.

Figure 3.1 General Integration Approaches (Modified from Ziegler and Dittrich 2004)

3.3.2  Integrated Information Platform Selection

It is observed that in actual practice, practitioners have the tendency to focus attention only on the management functions for which they are responsible, and rarely look at a project from a unified perspective, mainly because of time pressures. Russell and Froese (1997) and Froese and Staub-French (2003) highlighted the importance of having a unified or holistic view of a project. The ability to make associations amongst different management functions can, to some extent, help achieve the unified project view, and such associations are also believed by the author to be beneficial for construction performance diagnosis in terms of searching out supportive data evidence. The key focus of this thesis is not to study the pros and cons associated with different data integration approaches, but to formulate and demonstrate the merits of an approach that makes use of the experience-based knowledge of practitioners to explain actual construction performance on an ongoing basis. An important premise of the thesis is that access can be readily achieved to models and data generated or collected in support of the primary functions of construction management, and, in addition, that associations can be forged amongst the data representations used for these functions. Thus, while it is desired to test the diagnostic approach ideas presented in an integrated construction information environment, they are largely independent of any one specific approach used to create an integrated data environment.
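To make this premise concrete, the following is a minimal, illustrative sketch of a surrogate integrated project data store in which records from different management functions are kept separately but carry shared identifiers so that associations can be forged across functions. It is not the data model of any particular system; all class and field names (Activity, Drawing, SiteRecord, ProjectStore, etc.) are assumptions introduced for illustration only.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List

@dataclass
class Activity:                    # process view (hypothetical fields)
    activity_id: str
    description: str
    planned_start: date
    planned_finish: date
    drawing_ids: List[str] = field(default_factory=list)   # links to the physical view
    record_ids: List[str] = field(default_factory=list)    # links to the as-built view

@dataclass
class Drawing:                     # physical view: drawing control (hypothetical fields)
    drawing_id: str
    title: str
    date_issued: date

@dataclass
class SiteRecord:                  # as-built view: daily site data, RFIs, photos, ...
    record_id: str
    record_date: date
    record_type: str
    note: str

@dataclass
class ProjectStore:
    """Surrogate integrated platform: one store per view, associations by shared IDs."""
    activities: Dict[str, Activity] = field(default_factory=dict)
    drawings: Dict[str, Drawing] = field(default_factory=dict)
    records: Dict[str, SiteRecord] = field(default_factory=dict)

    def evidence_for_activity(self, activity_id: str) -> List[SiteRecord]:
        """Follow stored associations to pull as-built records relevant to one activity."""
        act = self.activities[activity_id]
        return [self.records[rid] for rid in act.record_ids if rid in self.records]
```

The point of the sketch is only that, once such associations exist, evidence for an activity can be retrieved by following links rather than by manually reconciling separate databases.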
Use is made herein of a research system called REPCON to implement and test the approach, for three reasons: (a) it supports multiple construction management functions within a single system and associations can be forged between the data elements of these functions; (b) access to the source code and programming assistance allows reasonably rapid prototyping, thereby allowing more comprehensive development of the approach; and (c) it allows the approach developed to be tested on "full scale" projects. A quick overview of REPCON system features relevant to this thesis is given in the last part of this chapter.

3.4  Performance Diagnosis Knowledge and Management

In this section primary concepts and constituents of knowledge management, a discipline born in the mid-1990s (Koenig and Srikantaiah 2004) and researched heavily in the past decade, will be introduced and discussed from the performance diagnosis perspective.

3.4.1  Knowledge and Its Management

In the Merriam-Webster dictionary knowledge is defined variously as (1) the fact or condition of knowing something with familiarity gained through experience or association, and (2) acquaintance with or understanding of a science, art, or technique. In academia, there is a range of views as to what constitutes knowledge. For instance, Harris (2005) thought of knowledge as a state of 'knowing' (from information, ideas, or understanding gained from experience, observation, or study) and as the sum of what has been perceived, discovered or learned. Among the most cited authors on knowledge management between 1995 and 2001 (Koenig and Srikantaiah 2004), Davenport and Prusak (1998) defined knowledge as a fluid mix of framed experience, values, contextual information, and expert insight that provides a framework for evaluating and incorporating new experiences and information. However, in the words of Albert Einstein, "The only source of knowledge is experience." Kazi (2005) appears to have agreed, arguing that experience is the most important and valuable knowledge source for the construction industry. Consensus has yet to be achieved by academics on a definition of knowledge management. In Davenport and Prusak's (1998) viewpoint, knowledge management is concerned with the exploitation and development of the knowledge assets of an organization with a view to furthering the organization's objectives. Quintas' (2002) definition of knowledge management is that it is the discipline of creating a thriving work and learning environment that fosters the continuous creation, aggregation, use and re-use of both organizational and personal knowledge in the pursuit of new business value. Although researchers have different views on what knowledge management is, most of them acknowledge that knowledge management can bring significant benefits to a company. For instance, Kazi (2005) suggested that for construction contractors, knowledge or "experience" management can be the solution to counterbalance the effect of the impending retirement of skilled employees. There does seem to be a consensus by researchers that the capture, saving, transfer, refinement and reuse of knowledge are the primary constituents of knowledge management (Davenport and Volpel 2001, Tserng and Lin 2004). All of these constituents need to be addressed when formulating a knowledge-based diagnostic approach.
3.4.2  Knowledge Representation

In order to capture knowledge on causal relationships, knowing how to represent or express the causal knowledge is necessary. Different knowledge representation techniques have been developed, each with its own pros and cons. Of these, semantic networks, production rules, and case-based reasoning are thought to be the ones from which some inspiration on how to capture and represent causal relationships can be obtained.

Semantic networks are composed of nodes and links. Nodes represent concepts or objects, and links represent relationships among them. If causes, effects and causal relationships are represented, then a semantic network can be regarded as a cause-effect diagram (Figure 3.2). Bayesian networks, which were discussed in Chapter 2, are a type of cause-effect diagram upon which Bayesian analysis can be conducted. Although the states associated with the factors on the nodes, which might trigger the occurrence of the undesired states of successor factor(s), are not represented directly on the diagram, such diagrams are easy to understand in terms of illustrating the relationships amongst different events. The clarity that semantic networks offer should be a feature of the causal relationships used in the diagnostic approach formulated in this thesis.

Figure 3.2 Cause-Effect Diagram Example

Production rules, sometimes also called IF-THEN rules, constitute another prominent knowledge representation technique, whether or not used in an expert-system application. Along with operators such as equal to/is (=), not equal to/is not (≠) and greater than (>), etc., IF-THEN rules can represent causal relationships as well. Two examples of IF-THEN rules are: IF precipitation is greater than 40mm THEN ground condition is poor; and IF resource is not available THEN activity start is delayed. Compared with semantic networks, the triggering states of the factors, expressed in terms of quantitative or qualitative values, can be directly represented in these rules. This advantage is desired for the diagnostic approach pursued in this thesis.
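As a minimal illustration of the production-rule idea, the sketch below holds an IF-THEN rule as data (a data field, a comparison operator, a threshold, and the asserted effect) and tests its IF part against one observation. The field name, threshold and effect wording follow the precipitation example above, but the class and function names are assumptions, not part of any existing system.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable, Dict

# Map rule operator symbols onto Python comparison functions.
OPS: Dict[str, Callable[[Any, Any], bool]] = {
    ">": operator.gt, ">=": operator.ge, "=": operator.eq,
    "!=": operator.ne, "<": operator.lt, "<=": operator.le,
}

@dataclass
class CausalRule:
    field_name: str     # e.g. "precipitation_mm" in a daily site record (hypothetical field)
    op: str             # one of the operator symbols above
    threshold: Any      # e.g. 40
    effect: str         # e.g. "ground condition is poor"

    def fires(self, observation: Dict[str, Any]) -> bool:
        """IF <field op threshold> THEN <effect>: evaluate the IF part for one observation."""
        value = observation.get(self.field_name)
        return value is not None and OPS[self.op](value, self.threshold)

rule = CausalRule("precipitation_mm", ">", 40, "ground condition is poor")
print(rule.fires({"precipitation_mm": 55}))   # True, so the THEN part would be asserted
```

Note that holding rules as data, rather than hard-coding them, is what later allows threshold states to be edited by the user.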
As to case-based reasoning, it is an approach to solving new problems based on the solutions to similar past problems. The most critical step for this technique is to find similar cases in terms of their attributes, characteristics, and/or contexts. Case-based reasoning mirrors in part the informal reasoning processes people use, consciously or unconsciously, to recall similar cases they experienced in the past to help construct hypotheses for explaining the current outcome observed. Although the results from case-based reasoning are often questioned for not being supported by statistically relevant data, the characteristic of this technique of being able to capture and retrieve the attributes and contexts of past cases, as well as of the one under analysis, is desired by the diagnostic approach.

As to saving, transferring, refining and reusing causal relationship knowledge, the different characteristics of the quantitative and experience-based causal relationships/models must be taken into consideration. For quantitative causal relationships in the construction management domain, the number of such relationships and supporting tools (e.g. CPM) is relatively limited. They are axiomatic, have always been used without much dispute, and, more importantly, will not change over time in different project contexts. Therefore, once such relationships are codified and saved for doing diagnostic work, they can be applied to all scenarios and don't need any further refinement. But for experience-based causal relationships, the situation is different. The number of causal relationships needed to represent all project types, contexts, etc. is uncountable, and consensus on them is very difficult to achieve. And as experience is accumulated, experience-based causal relationships are subject to change or refinement. Therefore, the schema for managing experience-based causal relationships differs considerably from that for dealing with quantitative causal relationships. Detailed discussion of the overall schema of the proposed diagnostic approach, and how it reflects the foregoing discussion and the features desired, is elaborated upon in the last section of this chapter.

3.5  Causal Models

In human thought, causal models are pervasive and they play a critical role in the major domains of human cognition: reasoning, decision-making, judgment, language use, categorization, and learning (Sloman 2005). What are causal models? Simply put, they are a representation of causal relations between events (Pearl 2000).

3.5.1  Quantitative Causal Models

As described before, a quantitative causal relationship refers to a basic equation (e.g. activity duration = scope / (productivity • resource_usage)) or formal logic models (e.g. a CPM network). However, it is important that the users of such relationships understand the assumptions/limitations that underlie them. It is also important that users understand that while the relationships are axiomatic or self-evident in nature, the values assigned to specific variables may reflect a causal reasoning process that to some extent is experience-based in nature, e.g. assigning a value of productivity to determine an activity's duration. In such equations or formal logic models, the order of variables can be permuted without changing the rationality of the mathematical relations among them. For instance, after permuting the variables in the activity duration equation mentioned previously, the new equation productivity = scope / (activity duration • resource_usage) is still valid in an algebraic sense. However, from the diagnosis perspective, this equation doesn't correctly represent the causal relations between the dependent and independent variables, since obviously changing an activity's duration cannot be the cause for a change in productivity. The notion of independent vs. dependent variables is an important one. Sometimes, independent variables are not truly so, and the relationship may be in the form of an experience-based causal relationship. For example, under certain conditions, productivity may be influenced by the resource usage level. Such considerations are usually handled when assigning values to the independent variables, e.g. estimating productivity as a function of project context and resource usage level. As well, for diagnostic purposes, quantitative relationships sometimes need to be expressed in a way that minimizes the number of underlying assumptions. For example, activity duration = scope / (productivity • resource_usage) + idle time, where idle time does not involve resource usage and is usually expected to be zero in planning. Nevertheless, equations serve a conceptual function beyond their mathematical one in that they provide a tool for thinking.
Specifically, in the causal domain, equations are not understood symmetrically but rather in correspondence with the conceiver's causal model (this claim was verified by an experiment on the equation pressure = weight/area in a psychological laboratory (Sloman 2005)). Therefore, in order to take advantage of basic quantitative equations and logic models to diagnose construction performance, arranging the variables involved in a form with causal meaning is necessary and important. For the convenience of discussion, hereinafter basic quantitative equations or logic models having causal meaning are referred to as quantitative causal models, and the independent and dependent variables in these equations/models are titled respectively causal variables and effect variables. It is observed that causal variables in one model could also be effect variables in some other quantitative causal models. In other words, multiple-layer quantitative causal relationships could exist. Recalling the previous discussion on semantic networks or cause-effect diagrams, the integration of multiple quantitative causal models can be conceptually and readily expressed in the form of such a network diagram (Figure 3.3). The nodes and arrows in the diagram represent respectively the causal/effect variables and the causal relations among them. For consistency, hereinafter such arrow diagrams (arrow lines start from causes and end at effects) are referred to as a causal diagram.

Figure 3.3 An Example of Multiple-Layer Causal Diagram (quantitative causal models: X1 = X2 + X3, X2 = X4 • X5, X3 = 2 • X5, assuming these equations have causal meaning)

Based on the quantitative causal models that accompany a causal diagram, if any variance between the planned value and actual value of an effect variable is observed, then the variance can be quantitatively attributed to the causal variables linked directly with that effect variable. Obviously, the causal variables with an attributed variance can then be rationally confirmed as the actual causes (i.e. causes which actually occurred), and the ones without variance are not the actual causes. For the identified actual causes, the variances associated with them (now they could be regarded as the effect variables) can be further attributed to the causal variables on the next layer. Following such a process, for a specific case at hand a series of actual causes can be figured out from all the causal variables involved. For performance measures like time and cost, such quantitative causal models exist. Identifying and arranging them in the form of a causal diagram can guide the performance diagnosis.
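The sketch below is a minimal, illustrative example of this attribution idea, using the activity duration equation introduced earlier. The one-at-a-time substitution scheme shown here (swap in one actual value at a time and record the resulting change in the effect variable) is an assumption made for illustration, not the thesis's prescribed attribution method, and the numeric values are invented.

```python
def duration(scope: float, productivity: float, resource_usage: float) -> float:
    """Quantitative causal model: duration = scope / (productivity * resource_usage)."""
    return scope / (productivity * resource_usage)

def attribute_duration_variance(planned: dict, actual: dict) -> dict:
    """For each causal variable, the contribution is the duration change obtained by
    substituting its actual value alone into the planned case (illustrative scheme)."""
    base = duration(**planned)
    contributions = {}
    for var in planned:
        trial = dict(planned)
        trial[var] = actual[var]
        contributions[var] = duration(**trial) - base
    return contributions

planned = {"scope": 1000.0, "productivity": 10.0, "resource_usage": 5.0}   # 20 days
actual  = {"scope": 1000.0, "productivity": 8.0,  "resource_usage": 5.0}   # 25 days
print(attribute_duration_variance(planned, actual))
# {'scope': 0.0, 'productivity': 5.0, 'resource_usage': 0.0}
# Only productivity shows an attributed variance, so it is confirmed as the actual cause
# at this layer, and its variance becomes the quantity to be explained at the next layer.
```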
3.5.2  Experience-based Causal Models

Basing a performance diagnosis only on quantitative causal models is insufficient, as quantitative causal models do not exist for performance measures such as safety, quality, and productivity. Further, after confirming one or more of the causal variables in the last layer of the quantitative causal model diagram as the actual causes of the performance deviation, practitioners still need to know if they are the root causes, i.e. are there other experience-based layers that contain further causes for the performance deviation? Thus, based on the events experienced in a specific project and the experience-based knowledge of practitioners, usually other causes will be proposed.

For instance, based on the quantitative causal model activity duration = scope / (productivity • resource_usage), if an activity's duration is found to be extended and no actual variance is detected on its scope and resource usage, then obviously it is the causal variable of productivity that in fact caused the extended duration. Here, a caveat regarding the quantitative causal model must be made. It is possible that productivity also had no variance because in reality work may have been interrupted, resulting in no work being done for a period of time. For discussion purposes here, this caveat is ignored. Under this situation, practitioners often would like to hypothesize other cause events, e.g. inclement weather conditions, or poor work face access conditions, etc., as the possible causes for the undesired actual productivity. Similarly, one could hypothesize various events to explain reasons for a variation observed in the resource usage rate. Unfortunately, since there are no further quantitative causal models linking such causes (e.g. weather conditions, work face access) with the last-layer causal variables (e.g. productivity) of the quantitative causal models, diagnosing the actual causes is not nearly as straightforward as for the causal variables in the quantitative causal models. For distinction and convenience in the discussion that follows, causal models used to represent such experience-based causal relations are named experience-based causal models, in which the cause and effect events are respectively titled causal factors and effect factors. With respect to the causal modeling framework adopted in this thesis, the effect factor in turn maps one-on-one to the lowest level causal variable (i.e. a variance) in a quantitative causal model. Similar to quantitative causal models, experience-based causal models could also take the form of a multiple-layered causal diagram. In reality, experience-based causal models that relate causal factors and effect factors are much more complicated than the quantitative causal relationships. For example, the factors might affect each other directly (i.e. the reciprocal causal linkage shown in Figure 3.4(a)), through a loop causal path (Figure 3.4(b)), or, for one effect factor, there may exist causal relations among its causal factors (Figure 3.4(c)). More problematic is that it is very difficult to achieve consensus on the construction of the relations among the selected causal factors, given a group of knowledgeable individuals.

Figure 3.4 Reciprocal, Loop and Dependent Causal Relations

Further, from the perspective of root cause analysis, the number of layers in a causal diagram representing experience-based causal relations could be unlimited, since in an apparently endlessly interconnected world everything seems to influence so many other things, and no matter how deep you go there's always at least one more cause you can look for (Bellinger 2004). From a pragmatic perspective, especially a construction one, as soon as there is a high level of interconnectivity in a causal diagram, one encounters a combinatorial problem that will overtax the capabilities of industry personnel. That is, it is impractical for practitioners to specify complicated multiple-layer experience-based causal relations among the hypothesized factors given the different possible states for each factor.
And in most cases, it does not mirror the way construction personnel causally think in practice (e.g. rainy weather slowed excavation, as opposed to rainy weather caused poor ground conditions, which in turn slowed productivity due to equipment breakdown). Thus, while in an ideal world one would like to persuade industry personnel to use, explicitly or implicitly, more rigorous causal models, either through improved training and/or the allocation of more resources, this is unlikely to happen in the action-oriented world of construction. Hence, simplified and thus imperfect models are sought to help ensure workability while offering substantive improvements in performance diagnosis, because the reasoning process is made explicit. The adoption of such simplification is not unique to this thesis work; in fact it is reflective of much of the previous work by other researchers (see the second finding as to construction performance models at the end of Chapter 2). Specific features of the experience-based causal model form adopted herein, in order to reflect the needs for simplicity and practicality and the current state of knowledge, are as follows:
•  a single layer format as depicted in Figure 3.5, which consists of user-specified causal factors, CFi, i = 1, 2, …, n, and one effect variable;
•  an implicit assumption of independence amongst the causal factors, although there is no formal requirement that causal factors be independent of one another;
•  avoidance of treating a continuum of values for a causal factor by applying a user-specified threshold test, i.e. if the value of a factor falls below a specified value, then it is deemed to have no impact on the effect variable and is thus considered to be inactive; otherwise it is treated as an active contributor to the value of the effect variable (see the sketch following Figure 3.5); and
•  no attempt is made to apportion the variance value for the effect variable amongst the active causal factors.

Figure 3.5 Single Layer Experience-based Causal Diagram
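A minimal sketch of this single-layer format follows: each hypothesized causal factor carries a user-specified threshold test, and a factor is reported as active only if its observed value crosses that threshold; no apportionment of the effect-variable variance is attempted. The factor names and threshold values are illustrative assumptions only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class CausalFactor:
    name: str                               # e.g. "precipitation_mm" (hypothetical data field)
    is_active: Callable[[float], bool]      # user-specified threshold test

@dataclass
class ExperienceBasedModel:
    effect: str                             # e.g. "productivity variance" (the effect factor)
    factors: List[CausalFactor]             # single layer: factors point directly at the effect

    def active_factors(self, observed: Dict[str, float]) -> List[str]:
        """Return the factors whose observed values cross their thresholds."""
        return [f.name for f in self.factors
                if f.name in observed and f.is_active(observed[f.name])]

model = ExperienceBasedModel(
    effect="productivity variance",
    factors=[
        CausalFactor("precipitation_mm", lambda v: v > 40),
        CausalFactor("temperature_c",    lambda v: v < -10),
        CausalFactor("design_changes",   lambda v: v >= 1),
    ],
)
print(model.active_factors({"precipitation_mm": 55, "temperature_c": 5, "design_changes": 0}))
# ['precipitation_mm'] -> only precipitation is flagged as a likely contributor
```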
It is important to emphasize that the purpose of the causal model is not to forecast the value of the effect variable. Rather, it is to try and explain the known value of the effect variable, as determined using the quantitative causal model. With respect to the use of a single layer model, this pragmatic choice mirrors the work of other researchers who have explored the use of regression and neural networks. Although the use of the single layer format ignores the complexity of the causal interactions that might occur amongst the causal factors (e.g. a combination of poor weather and labour unrest can affect labour motivation) and has limitations on how well it can represent the reality of actual projects, there are a number of important features or benefits associated with such a format for organizing causal factors to form a causal model: (a) this format avoids the combinatorial problem of accounting for different factor state values; (b) such a format is readily modifiable by users in terms of adding or deleting factors thought to be relevant, without needing to specify causal relations among the causal factors; (c) the single layer format allows for the direct specification of the state values (threshold values as treated herein) of a causal factor which are assumed to contribute to unsatisfactory performance; and (d) a single layer format can capture most research findings to date on the critical factors that affect project performance, because most performance models developed to date have been single layer ones.

With respect to the implicit assumption of independence amongst causal factors, all that is meant is that the user is free to specify causal factors without considering possible interactions or causal relationships amongst factors (e.g. precipitation leads to poor ground conditions and hence reduced productivity). In the future, consideration should be given to potential correlation amongst factors. Such information could be used to demonstrate corroborating data, and thus help to strengthen assertions as to likely causes for the variance being analyzed.

In terms of treating a spectrum of values for a causal factor and apportioning the value of the effect variable amongst active causal factors, the challenges are very significant. We elaborate on them briefly as follows and deduce how some of these issues can be addressed in a workable manner. In theory, to infer an actual causal relationship between two variables X and Y, three conditions must be met: (1) there must be concomitant variation or co-variation between X and Y; (2) there is temporal asymmetry or time ordering between the two variations, i.e. cause (X) variation must occur before effect (Y) variation; and (3) other possible causal factors that may be producing the observed relationship between X and Y must be eliminated (Selltiz et al. 1959). These three conditions provide useful guidance when attempting to determine which causal factors, if any, may have contributed to the variances of interest. As to the first condition, a variation in the effect factor (Y) is not difficult to determine, i.e. an unsatisfactory performance outcome in safety or quality, or a variance in the last layer of causal variables in the quantitative causal models, can be readily detected. But as to determining the variation of the hypothesized causal factors (X) in the experience-based causal model, the big challenge is that the baseline state of most causal factors is not specified clearly in the plans or estimates, which poses difficulties in determining whether or not the causal factors experienced a variation. Considering this, the diagnostic approach assumes that the threshold values specified by the user for the hypothesized causal factors constitute the baseline states. This is one of the primary assumptions discussed in Chapter 1.
In reality, construction personnel may not have thought deeply about likely baseline state values for causal factors, and may at best only have a vague idea as to threshold values that, once exceeded, contribute to negative performance (although the fuzzy logic technique might help with this problem, most practitioners lack knowledge of this technique, and generation of the requisite fuzzy membership functions would be an enormous task given the goal of being able to treat a broad spectrum of activities and project types). Having considered this, the diagnostic approach should be able to directly show users the actual states or values of a model's causal factors, thereby allowing practitioners to subjectively and directly evaluate whether or not the corresponding factor values could result in a performance problem. For instance, for some activities 'precipitation' might be a causal factor affecting their productivity, but how much precipitation is required to impair the productivity is unclear. For such a situation, searching out actual data on precipitation and displaying it directly to construction personnel could provide them with a clear idea as to what actually happened, and allow them to decide, with the supportive evidence in hand, whether 'precipitation' is a likely actual cause.

As to the second condition, that cause events must happen before effect events, the diagnostic approach should, to the extent possible, allow users to set up a time filter to screen the data and then to examine if these filtered data provide supportive evidence for asserting that the hypothesized causal factors are likely actual causes. For example, high velocity winds could be a causal factor causing a delayed start for an activity sensitive to wind (e.g. a tower crane is necessary for the activity). Then obviously, daily wind speed data for the days immediately before the actual start date of that activity would be of interest, but not the dates after the actual start date. Therefore, filtering data for some data fields of interest in terms of a specified time window is a desired function for the diagnostic approach. In fact, in addition to providing a time filter function, a more comprehensive set of filter functions should be included to help users select specific types of records, drawings, etc.
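The following is a small, illustrative sketch of such a time filter, using the wind example above: candidate evidence records are kept only if they fall in a user-specified window immediately before the observed effect date and exceed the user's threshold. The record structure, field names and numbers are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List

@dataclass
class DailyReading:
    day: date
    wind_speed_kmh: float     # hypothetical daily site data field

def evidence_before(readings: List[DailyReading], effect_date: date,
                    window_days: int, threshold_kmh: float) -> List[DailyReading]:
    """Keep readings in [effect_date - window, effect_date) that exceed the threshold,
    i.e. candidate cause events that occurred before the effect being explained."""
    start = effect_date - timedelta(days=window_days)
    return [r for r in readings
            if start <= r.day < effect_date and r.wind_speed_kmh > threshold_kmh]

readings = [DailyReading(date(2009, 9, d), w)
            for d, w in [(1, 20.0), (2, 65.0), (3, 70.0), (4, 15.0)]]
print(evidence_before(readings, effect_date=date(2009, 9, 4), window_days=3, threshold_kmh=50))
# Returns the high-wind days on Sept 2 and 3, i.e. evidence preceding the delayed start.
```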
The last condition, requiring that all other possible causal factors be ruled out, is the most problematic one to meet. From the literature review on critical factors it was found that for any performance measure the number of potential causal factors is large, and no real consensus exists on a common core of factors. In fact, for experience-based causal relations, Asher (1983) argued that there is a potentially infinite universe of possible variables. It is impossible to eliminate all possible causal factors in an infinite space. Therefore, at some point constraints or logic must be used to examine a finite set of measured factors (Asher 1983). Further, they should accord with data already generated/collected in support of essential management functions, a point made several times before. How many causal factors, or which finite set, should be selected for a causal model to capture the most influential factors is not easy to determine, nor is achieving consensus about these factors from experienced personnel. Different knowledgeable practitioners might use different sets, and the same person might use different sets in different scenarios or project contexts.

Interestingly, based on work in psychology, it has been found that people tend to first focus on the factors that are assumed to have the most causal power (Lien and Cheng 2000). Sloman (2005) also observed that most research on causal learning assumes that people know which variables are relevant in the first place. This is consistent with the findings from the literature review of past related construction research: no matter what technique was used (statistical analysis, neural networks, importance index, etc.), the starting point of the analysis involved the selection of a finite set of factors for further examination based on people's subjective opinion, whether it be the researcher's opinion or results from a survey that presented a smorgasbord of possible factors. Along with considering the Pareto principle (also known as the 80-20 rule), that for many phenomena 80% of the consequences stem from 20% of the causes, it can be reasonably predicted that for an experience-based causal model constructed by the user based on his or her personal experience-based knowledge, and possibly coupled with previous research findings, the number of important causal factors hypothesized would be limited. The elimination process will start from a finite set of causal factors through checking if the relevant variations exist and their occurrence in time. If supportive data evidence is found for one or more of the causal factors, then these factors are likely to explain some if not all of the variation in the effect factor. If all the hypothesized causal factors are eliminated, or if the user is not fully satisfied with the current findings, then the diagnostic approach should allow the user to hypothesize other causal factors or modify the current ones for further examination of the data. It should be clear that the use of a single experience-based causal model cannot guarantee a complete explanation for variance in an effect factor, and thus diagnosing construction performance and searching out supporting evidence might involve a number of iterations. Based on the results obtained, and assuming that the causal models capture the best expertise of a firm and that data in support of ongoing management functions have been fully and accurately recorded, the following can be asserted: the causal factors hypothesized by the user, and the corresponding supportive data evidence found by applying the causal models to a search of the project database, are likely to be the most persuasive actual causes for the performance problem of interest.

3.5.3  Overall Causal Diagram for Diagnosing Performance

Thinking causally about a problem and constructing a diagram that reflects causal processes often facilitates the clearer statement of hypotheses and the generation of additional insights into the topic at hand (Asher 1983). Considering the foregoing discussion about the two types of causal models and how to infer actual causes, the general overall structure of the causal diagram (causal modeling framework) for diagnosing construction performance measures is shown in Figure 3.6 (a). It is comprised of three main layers: (i) a project performance measure layer; (ii) a quantitative causal models layer; and (iii) an experience-based causal models layer. As elaborated upon in later chapters, a causal model based structured diagnosis process having generality for different performance measures is viewed as a significant contribution of this thesis research.
It can be seen that if quantitative causal models exist, then the deviation associated with the performance measure of interest will be quantitatively attributed to the various causal variables through multiple-layer causal paths in the second layer, i.e. the quantitative causal models layer, and then the experience-based causal models, in single layer form, in the last layer of the causal diagram will be used to try and explain the detected actual variances of the causal variables to which no further quantitative causal model is attached. As stated previously, not all measures (e.g. quality, safety) can be expressed through quantitative causal models, and hence the second layer of the causal modeling framework is not applicable for all measures of interest. In this situation, experience-based causal models will be used directly to help figure out the likely causes for the performance measure deviation (Figure 3.6 (b)). The benefits associated with the overall causal diagram structure shown in Figure 3.6 include:
1. This schema can be applied to all primary performance measures;
2. Quantitative causal models organized in a multi-layer form can be pre-coded in the structured diagnostic approach and made use of directly to compute variances of causal variables for different project contexts without any user intervention required;
3. The structure of the causal diagram enables rapid construction of a specific overall causal diagram by taking advantage of sub-modules, e.g. two experience-based causal models, respectively suitable for analyzing a performance deviation for an activity of work type A and an activity in construction phase B, can be used together like one experience-based causal model to analyze the deviation of a specific activity of work type A in construction phase B; and
4. Experience-based causal models in single layer form are relatively easy for seasoned practitioners to specify and allow them to express their personal experience-based knowledge for diagnosing construction performance.

Figure 3.6 Overall Causal Diagram for Diagnosing Performance
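The sketch below illustrates, at a high level, how the three layers of Figure 3.6 could connect: detect a deviation in the performance measure, attribute it through a quantitative causal model where one exists (time, cost), and then apply experience-based models to the remaining causal-variable variances (or directly to measures such as quality and safety). The stand-in model representations (plain functions) and all names are assumptions for illustration, not the thesis's implementation.

```python
from typing import Callable, Dict, List

QuantModel = Callable[[Dict, Dict], Dict[str, float]]        # (planned, actual) -> causal-variable variances
ExperienceModel = Callable[[Dict[str, float]], List[str]]    # observations -> active causal factors

def diagnose(measure: str,
             planned: Dict[str, float], actual: Dict[str, float],
             observations: Dict[str, float],
             quant_models: Dict[str, QuantModel],
             exp_models: Dict[str, ExperienceModel]) -> Dict[str, List[str]]:
    deviation = actual[measure] - planned[measure]
    if deviation == 0:
        return {}                                             # layer 1: no deviation to explain

    if measure in quant_models:                               # layer 2: e.g. time, cost
        variances = quant_models[measure](planned, actual)
    else:                                                     # no quantitative model: e.g. quality, safety
        variances = {measure: deviation}

    hypotheses = {}
    for variable, variance in variances.items():              # layer 3: experience-based models
        if variance != 0 and variable in exp_models:
            hypotheses[variable] = exp_models[variable](observations)
    return hypotheses                                         # candidate causes, pending evidence search

quant = {"duration": lambda p, a: {"productivity": a["productivity"] - p["productivity"]}}
exp = {"productivity": lambda obs: [k for k, v in obs.items()
                                    if k == "precipitation_mm" and v > 40]}
print(diagnose("duration", {"duration": 20, "productivity": 10},
               {"duration": 25, "productivity": 8},
               {"precipitation_mm": 55}, quant, exp))
# {'productivity': ['precipitation_mm']}
```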
In this thesis, we make a careful distinction between the underpinning 'technology' of the diagnostic framework (e.g. causal model based reasoning versus a rule-based expert system), and the expression of knowledge. It has already been explained that the thesis seeks to explore the merits of a causal modeling framework. However, knowledge in that framework can be expressed in part using paradigms from other technologies, e.g. the IF-THEN format often used by rule-based expert systems. Model-based refers to modeling or representing the diagnostic process in a structured way, like the semantic network in Figure 3.2, which to some extent can simulate the thought process behind diagnostic tasks. In fact, rule-based and model-based knowledge representation structures are two different ways that can help implement diagnostic tasks. For the overall causal modeling framework adopted, it can be seen that the whole structure of the diagnostic approach in essence is model-based, first with a series of quantitative causal models and then with single layer user-defined experience-based causal models. However, since the IF-THEN format is suitable for representing or expressing uncertain and ad-hoc knowledge (Alty 1987, Song et al. 1996), in the single layer experience-based causal models it is used to help specify the hypothesized causal factors' threshold states. But it is not this format that directs the diagnostic process of searching for supportive data evidence, i.e. IF-THEN rules expressed in very cryptic form are just used to evaluate whether a causal factor is beyond a user-specified threshold state or not. It is the data fields selected and the corresponding filters specified by the user for the factors hypothesized and grouped in different experience-based causal models that guide the search process, i.e. one does not need the support of an inference engine. Contrasted with pure rule-based reasoning as the basis for a diagnostic reasoning framework, reasoning with model-based knowledge representation has the following main advantages. 1. Separation of control knowledge from domain knowledge (Alty 1987, Hudlicka 1988, Lucas 1997). In the proposed diagnostic approach a series of causal models (quantitative and experience-based) make explicit the causal relationships among the causal variables and construction performance, and also among the causal factors and the corresponding causal variables. It contains only domain knowledge, and the control knowledge for the inference or diagnostic process is embedded in the whole cause-effect diagram. Modularity in terms of each rule is often advertised as a feature of rule-based systems, and any rule as a module can be easily added to and deleted from the knowledge base (Adeli 1988). In practice, however, partly because control knowledge is combined with domain knowledge in these rules, they are not necessarily independent, modification might be difficult, and consistency amongst rules could be difficult to guarantee (Hudlicka 1988). Compare this with a model-based reasoning process where adding/deleting one causal factor or model will not affect the overall structured diagnostic process. The second main advantage of the model-based reasoning process is related to its modularity in terms of models (Alty 1987, Glaser 2002, Berwick et al. 2002, Lamperti and Zanella 2003). Modularity allows the same models to be used several times for diagnosing distinct systems and also for other uses (Hudlicka 1988, Lamperti and Zanella 2003). For example, in the proposed structured diagnostic process, user-defined experience-based causal models in the form of modules, as initially defined for explaining activity duration, could be embedded within the overall cost causal diagram to help explain cost performance, as illustrated in the next chapter.
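A tiny sketch of this modularity point, under the same illustrative assumptions as the earlier sketches: one experience-based causal model module, defined once, is attached as a sub-module to causal variables in two different overall diagrams (time and cost). The dictionary-based diagram representation is a stand-in for illustration only.

```python
# One experience-based causal model module (factor -> threshold rule), defined once.
productivity_model = {
    "applies_to": "productivity variance",
    "causal_factors": {"precipitation_mm": "> 40", "work_face_access": "= poor"},
}

# The same module plugged into the time diagram and the cost diagram.
time_diagram = {"activity duration": ["scope", "productivity", "resource_usage"],
                "productivity": productivity_model}
cost_diagram = {"labour cost": ["labour hours", "wage rate"],
                "labour hours": ["scope", "productivity"],
                "productivity": productivity_model}

assert time_diagram["productivity"] is cost_diagram["productivity"]   # one shared module
```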
3.6  Schema of Proposed Diagnostic Approach

Reflecting the discussion in the previous sections of this chapter, the overall schema of the proposed diagnostic approach has been designed to have five main components, which, along with their interlinking via numbered process lines, are depicted in Figure 3.7. These components are: (1) ready access to project data contained within an integrated construction information platform that supports unified project management, or some surrogate for such a platform; (2) a component that includes routines for exploiting/analyzing quantitative causal models to narrow the focus of inquiry to parameter variances at different levels; (3) a knowledge base component for capturing experience-based expertise in the form of single layer experience-based causal models composed of multiple causal factors that reflect the data generated/gathered in support of essential management functions; (4) a component that allows the user to formulate hypotheses as to causes for unsatisfactory performance using causal models from the third component customized to the project context at hand; and (5) a component responsible for searching out data evidence in the project database in support of the performance hypotheses and reporting the results.

3.6.1  Integrated Information Platform (Component 1, Figure 3.7)

The REPCON research system, which has been developed at the University of British Columbia, was selected as the integrated information platform on which to base the diagnostic approach because it supports unified construction project management and maintains a significant amount of data in support of various construction management functions within a single system. Most importantly, ready access to the system's internal workings and interface tools helps with the implementation of the diagnostic approach described in this thesis.

Figure 3.7 Overall Schema of Proposed Diagnostic Approach

As shown in Figure 3.7, the REPCON system supports nine views of a project, each of which describes an abstraction or representation of a significant dimension of a project. They are: 1. Physical – what is to be built and site context (i.e. product view); 2. Process – how, when, where and by whom; 3. Organizational/contractual – project participants, contractual obligations and entitlements, insurance, bonding, warranties, and evaluation of participant performance; 4. Cost – how much and from whose perspective; 5. Quality – compliance requirements and achievements for input and output products; 6. As-built – what happened, why and actions taken; 7. Change – scope changes, why and consequences for other views; 8. Environmental – the project's natural and man-made environments; and 9. Risk – potential risk events, mitigation measures, risk assignment, and outcomes. An in-depth treatment of the features and benefits of the multi-view application system is presented in Russell and Udaipurwala (2004).
Given the focus on time performance in this thesis, the product, process, organizational/contractual and as-built views are the key ones for which quantitative and experience-based causal models are required in order to search for evidence to help explain project performance. Although the development of these views cannot be claimed as a contribution of this thesis, more detailed aspects of these four views are described hereinafter in order to provide the reader with a clear overall view of the diagnostic approach. The physical view consists of two main parts. One is the Physical Component Breakdown Structure (PCBS) (Figure 3.8), in which a project's physical components can be expressed and organized in a hierarchical structure with up to six levels (system, element, sub-element, sub-sub-element, content, and material). As well, spatial or procedural locations can be described using location set, location, and sub-location constructs. More importantly, for each component the user can flexibly define desired attributes and assign corresponding planned and actual values to describe it. Physical components are mapped onto locations through component attribute values. As well, a component can also be associated with activities, records, pay items, changes, risk issues, etc. in other views. This association feature is central to the diagnostic approach. The second part of the physical view is the Drawing Control system, which is responsible for managing the hundreds of drawings used to portray a project (Figure 3.9). In the drawing control system, all drawings are listed and can be sorted in terms of different data fields. For each drawing listed, useful information for diagnosing performance such as date received, date issued, number of sets received and so on is easily accessible. Similar to PCBS components, each drawing can be associated with relevant activities, pay items, records, or problem codes in other views.

Figure 3.8 PCBS in Physical View

In the process view, the user can define different calendars, resources, project phases and activities (Figure 3.10). From the perspective of this thesis, the most important part of the process view deals with defining a network of activities to represent how the project is to be constructed. Through the Activity List Interface (Figure 3.11), the user can use a hierarchical structure to define and organize activities; for each activity a range of information such as production rate, predecessors, successors, date constraints and activity attributes can be specified, and activity dates (early/late dates) can be computed. Each activity can also be associated with various components in other project views, e.g. PCBS components and drawings in the physical view. As noted previously, such associations are helpful for diagnosing construction performance in terms of searching out supportive or corroborative data evidence.

Figure 3.9 Drawing Control System in the Physical View

Figure 3.10 Project Phases, Resources in the Process View

Figure 3.11 Activity List Interface in the Process View

In the organizational/contractual view, participants can be defined and organized in a hierarchical structure with two levels, participant class and participant (Figure 3.12). For each participant class, the user can define attributes relevant to that class, and they can be selectively inherited by all the participants in that class.
For each participant, the user can also flexibly define specific attributes and assign quantitative, linguistic, Boolean, or date values to them, and evaluate the participant in terms of performance on schedule, quality, cost, safety, relationship and/or communication. Again, associations can be forged with the components in other project views. A process view requirement is that a participant in the form of a responsibility code be assigned to each and every activity.

Figure 3.12 Organizational/Contractual View

The As-Built view consists of two main parts. One is the records management system, in which various types of records (e.g. photos, video clips, letters/memos, shop drawings, site instructions, requests for information, change orders and so on) generated during the construction phase can be listed and organized along with relevant meta data. The actual records themselves are not stored within the system. Each record can also be associated with relevant activities, PCBS components, problem codes, drawings, etc. (see Figure 3.13).

Figure 3.13 Records Management in the As-built View

The other major component of the As-Built view is Daily Site Data (Figure 3.14), in which an activity's actual daily status (e.g. started, finished, on-going, idle, postponed, started and finished), as well as problems encountered, work force and equipment status, can be recorded. Overall site environment data, including weather conditions and daily work force data, can be recorded as well. The As-Built view is a very important source of data evidence in support of explanations for unsatisfactory performance. Complementing this source of evidence is the physical view, which can provide insights on scope changes (actual vs. planned component attribute values) and the timely (or untimely) delivery of accurate drawings.

Figure 3.14 Daily Site Data in the As-built View

3.6.2  Quantitative Causal Model Analysis (Component 2, Figure 3.7)

The second component in the overall schema (Figure 3.7) deals with the analysis of quantitative causal models in order to narrow the focus down to those causal variables that experienced a variance, i.e. given a detected unsatisfactory performance, the user through this component can determine what causal variables need to be further explained. This component is only applicable to quantitative performance measures (e.g. time and cost), since, as discussed previously, quantitative causal models don't exist for other performance measures like safety and quality. For cost or time performance at a specific point in time during the construction phase, the actual total cost and the project's overall actual time performance to date would be compared with the as-planned ones to detect performance variances at the project definition level. Then, by using the predefined quantitative causal models (e.g. scheduling and related analysis routines), variances associated with the causal variables in the quantitative models can be determined. In the causal variable variance analysis process, the data used is from the corresponding views in the aforementioned integrated information platform (illustrated by process line 1 in Figure 3.7). For time performance, activity dates in the process view are of the most interest. The design and implementation of the quantitative causal model component as it relates to project time performance, along with the relevant user interface issues and challenges involved, are described in detail in Chapter 4.
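A small, illustrative sketch of this variance-analysis step for time performance follows: planned and actual activity dates are compared, start and duration variances are computed, and only those beyond a user-specified threshold are kept for further diagnosis. The data structure, field names and dates are assumptions for illustration, not the system's actual interfaces.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple

@dataclass
class ActivityDates:
    activity_id: str
    planned_start: date
    planned_finish: date
    actual_start: date
    actual_finish: date

def significant_variances(activities: List[ActivityDates],
                          threshold_days: int) -> List[Tuple[str, int, int]]:
    """Return (activity_id, start variance, duration variance) for activities whose
    start or duration variance meets or exceeds the user-specified threshold."""
    flagged = []
    for a in activities:
        start_var = (a.actual_start - a.planned_start).days
        duration_var = ((a.actual_finish - a.actual_start)
                        - (a.planned_finish - a.planned_start)).days
        if abs(start_var) >= threshold_days or abs(duration_var) >= threshold_days:
            flagged.append((a.activity_id, start_var, duration_var))
    return flagged

acts = [ActivityDates("A100", date(2009, 6, 1), date(2009, 6, 10),
                      date(2009, 6, 3), date(2009, 6, 20))]
print(significant_variances(acts, threshold_days=5))
# [('A100', 2, 8)] -> the duration ran 8 days longer than planned and warrants explanation
```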
3.6.3  Knowledge Base Composed of Standard Experience-based Causal Models (Component 3, Figure 3.7)
As no further quantitative causal models exist with which to explain the variances associated with the causal variables detected using component 2, in order to determine which causal factors contribute to these variances, use must now be made of construction expertise articulated in the form of experience-based causal models, which are only valid for the situations specified by the user (e.g. project phases, work type, etc.). As expertise grows with time, leading to refinements in understanding, adjustments can be readily made to the causal models. Therefore, it is desirable to separate the function of archiving experience-based knowledge from its use, since doing so gives users more flexibility to quickly construct hypotheses according to their own experience by modifying the saved company-wide (or individual user) standard causal models without altering them on the knowledge management or standards side of the system. The topic of who has the right to change standards, and the protocols involved in doing so, although of great importance in practice, is beyond the scope of this thesis.
Thus, the third component of the overall diagnostic framework, i.e. the knowledge-base component, has the primary responsibility for saving standard experience-based causal models. In this component, the user can select data fields from the integrated information platform (process line 2 in Figure 3.7) to construct standard experience-based causal models comprised of multiple causal factors drawn from the data fields used to represent the different project views. As described later, a set of operators combined with some relatively simple syntax is applied to the selected data fields to allow users to express the threshold states of the causal factors.
Since users can define and save as many standard causal models as they wish in this component, a way to organize them effectively for later use is important. Inspired by case-based reasoning, and noting that meaningful reasoning about performance requires consideration of relevant context, a small set of metadata fields was chosen as attributes to characterize the individual standard experience-based causal models as to their domain of applicability. These data fields correspond to model description, relevant performance measure/causal variable (variance type), project type, project phase/sub-phase, and work type. The primary function of such information is to assist with the automated selection of the appropriate causal models from the standards side for developing hypotheses for specific cases. Since the various modeling components (e.g. activities) have different attribute values (e.g. different sensitivity to precipitation, temperature or design changes), conditions as to the applicability of one or more causal factors in the standard experience-based causal models can be defined for each causal factor, through which checks can be conducted to automatically filter out factors that are not applicable on the project side.
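As an illustration of how such a standard model and its metadata might be represented, consider the following sketch; the field names, operator syntax and example thresholds are assumptions for illustration only, not the system's actual storage format.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class CausalFactor:
    data_field: str          # data field selected from the integrated platform
    operator: str            # e.g. ">", "<", "=", "contains"
    threshold: str           # threshold state, e.g. daily precipitation in mm
    applicability: str = ""  # condition, e.g. "activity.weather_sensitive"

@dataclass
class StandardCausalModel:
    description: str         # model description
    variance_type: str       # performance measure / causal variable explained
    project_type: str        # metadata used for automated selection
    project_phase: str
    work_type: str
    factors: List[CausalFactor] = field(default_factory=list)

# A standard model hypothesizing why superstructure concrete activities lose working time.
rain_model = StandardCausalModel(
    description="Excess rain slows superstructure concrete work",
    variance_type="activity duration (working time) variance",
    project_type="high-rise building",
    project_phase="superstructure",
    work_type="concrete",
    factors=[
        CausalFactor("daily_site_data.precipitation_mm", ">", "25",
                     applicability="activity.weather_sensitive"),
        CausalFactor("daily_site_data.problem_code", "contains", "equipment breakdown"),
    ],
)
```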
3.6.4  Hypothesis Generation for Specific Experience-based Causal Models (Component 4, Figure 3.7)
The fourth component, hypothesis generation, is responsible for making use of the saved standard experience-based causal models to generate likely explanations for the causal variable variances detected in the second component. After obtaining the causal variable variances (process line 3 in Figure 3.7), this diagnostic module is designed to automatically select relevant standard experience-based causal models according to the characteristics and attributes of the activities with variable variances (process line 4 in Figure 3.7). As part of this process, the user can also specify thresholds for different kinds of causal variable variances to choose the ones of interest for further diagnosis. For example, for an activity's duration variance, the user might only be interested in explaining duration variances larger than 5 days. Variances satisfying the specified thresholds are then highlighted. After reviewing the experience-based causal models selected by the system from the standards side for those variances that exceed the threshold values specified by the user, if the user is not satisfied or does not fully agree with the models selected, he or she can modify the causal models by selecting different data fields directly from the first component (process line 5 in Figure 3.7) and clearly expressing their state values (basically threshold values for the causal factors, e.g. precipitation level). Otherwise, the automatically selected models constitute the hypotheses to guide the search for supporting data evidence.
3.6.5  Search and Report Data Evidence (Component 5, Figure 3.7)
This component is responsible for searching out supporting data evidence based on the hypotheses from the fourth component (process line 6 in Figure 3.7) and reporting the results. Upon reviewing the results of the search, the user can pass judgement on whether the evidence identified is sufficient for explaining the performance measure of interest. As stated previously, the search process may involve a number of iterations: if the results of a search do not satisfy the user, the user can modify the hypotheses based on the results already obtained (feedback line 7 in Figure 3.7) in order to initiate a broader (or narrower) search. As experience is gained and knowledge accumulated, experience-based causal models on the standards side can be edited to reflect this experience and knowledge (feedback line 8 in Figure 3.7). Detailed discussion of the implementation of this component and the previous two (the third and fourth components shown in Figure 3.7), along with the relevant challenges, is presented in Chapter 5.
4  Quantitative Causal Models in Performance Variance Analysis
4.1  Chapter Overview
This chapter is focused on the design and implementation details of the second component of the causal modeling framework, namely Quantitative Causal Model Analysis (variance analysis) as shown in Figure 4.1. Topics treated are: (a) quantitative causal model cost variance analysis and associated issues; (b) quantitative causal model schedule variance analysis and associated issues; (c) generality of the quantitative causal model analysis framework; and (d) in-depth examination of the implementation of schedule variance causal modeling with an accompanying example.
Figure 4.1 Component Responsible for Quantitative Causal Model Analysis
4.2  Introduction
In the construction industry, variance is usually defined as any actual or potential deviation from an intended or budgeted figure or plan as it relates to cost, time, resource usage or another measure (Popescu and Charoenngam 1995). More generally, variance analysis is relevant to the analysis of any arbitrary explicit or implicit function to determine the change in a dependent variable as a function of change in one or more independent variables. Hereinafter, in the context of this thesis, variance analysis refers to determining the deviation between planned and actual construction performance and the relevant causal variables used to compute the performance result. Variance analysis is a long-standing concept in construction management and forms an integral part of performance analysis. However, to maximize the insights that can be generated from variance analysis, it is important to fully exploit the structure of the measure of interest so that all potential contributors to variance are identified.
In order to provide the reader with as complete a view as possible of the use of quantitative causal models in performance variance analysis, cost variance analysis is explored first. This helps lend support to the assertion of generality of the causal modeling framework vis-à-vis quantitative causal model analysis. Although the primary focus of the thesis is on time performance, the analysis structure adopted allows for the treatment of different levels of granularity or definition of basic variables (e.g. activity duration vs. scope of work, productivity, resource usage rate) and both explicit and implicit performance functions, helping to ensure generality of the approach.
4.3  Quantitative Causal Models in Cost Variance Analysis
As one of the major tasks in construction cost control, figuring out cost variance in a timely manner during the course of construction can not only help explain cost performance to date but, more importantly, enable managers to direct attention to potential upcoming cost-related problems. In what follows, quantitative causal models for cost performance from the perspective of a contractor are outlined after discussing how the cost performance baseline is prepared. A brief discussion is also given of the kind of experience-based causal models that would be needed to help explain reasons for variance in a quantitative causal variable. As will be observed later, determining the structure of the quantitative models that underlie a performance measure helps to demonstrate how different performance measures are interconnected, allowing the same experience-based causal models to be used for more than one performance measure, an advantage of the adoption of a causal modeling diagnosis framework.
4.3.1  Prepare Cost Performance Baseline and Measure Actual Cost
The cost performance baseline is usually obtained from the cost estimating function, which for different project development stages has different accuracy requirements. The Association for the Advancement of Cost Engineering (AACE) has assigned "class" numbers to the estimate classifications ranging from Class 1 to Class 5 (Table 4.1).
Table 4.1 Different Classes in Cost Estimating (AACE 2007)
Class 5: level of project definition 0% to 2%; end usage (purpose of estimate): concept screening; methodology: capacity factored, parametric models, etc.; expected accuracy range: low -20% to -50%, high +30% to +100%.
Class 4: level of project definition 1% to 15%; end usage: study or feasibility; methodology: equipment factored, parametric models, etc.; expected accuracy range: low -15% to -30%, high +20% to +50%.
Class 3: level of project definition 10% to 40%; end usage: budget authorization or control; methodology: semi-detailed unit costs with assembly level line items; expected accuracy range: low -10% to -20%, high +10% to +30%.
Class 2: level of project definition 30% to 70%; end usage: control or bid/tender; methodology: detailed unit cost with forced detailed take-off; expected accuracy range: low -5% to -15%, high +5% to +20%.
Class 1: level of project definition 50% to 100%; end usage: check estimate or bid/tender; methodology: detailed unit cost with detailed take-off; expected accuracy range: low -3% to -10%, high +3% to +15%.
Such a classification has been widely accepted in the industry. As shown in Table 4.1, different methodologies are used to achieve cost estimates of different accuracy. From the perspective of this thesis, only the cost estimates based on detailed unit costs and detailed quantity take-off (Class 2 and Class 1) have sufficient accuracy to serve as the cost control baseline. The detailed or definitive Class 2 and Class 1 estimates involve a detailed breakdown of cost into a formal structure. Depending on a firm's practice, this structure may correspond to a work breakdown structure which links component costs with associated work activities (adopted in this thesis for discussion purposes), or a cost breakdown structure that is separate from a schedule breakdown. Simply put, the overall direct cost baseline = Σ (budget unit price × planned quantity of input resource), summed over the n input resources involved. Planned quantities of input resources, which usually refer to labor, equipment, and material, are mainly determined by using estimated productivity rates and required output quantities obtained by carrying out detailed quantity take-offs from drawings or sketches. The overall planned cost should then be distributed over the planned duration of the construction project for continuous monitoring during the course of construction. Budgeted Cost of Work Scheduled (BCWS) is such a time-phased cost baseline, against which the actual cost incurred to date can be examined to detect the cost variance. In fact, BCWS is just one element in the Cost and Schedule Control System Criteria (C/SCSC), which defines general procedures for monitoring cost performance of all projects, was fully implemented in 1980 by the Department of Energy (DOE), National Aeronautics and Space Administration (NASA), and Department of Transportation (DOT), and is now also being applied in both Canada and Australia (Popescu and Charoenngam 1995). In C/SCSC there are other elements such as Budgeted Cost of Work Performed (BCWP, also named earned value), Actual Cost of Work Performed (ACWP), and Budgeted Cost at Completion (BAC). Figure 4.2 illustrates these elements at the project definition level. The curvilinear shape of BCWS reflects the observation that a planned workload usually is not evenly distributed over the whole time period of the project, i.e. more activities might be conducted in the middle stage than in the beginning and late stages of the construction project. The variance between ACWP and BCWS is the accounting variance (AV) = ACWP - BCWS, which is one measure of cost deviation to date, but of somewhat dubious value given the potential for including different scopes of work.
For instance, a large positive value could indicate either that the project is far ahead of schedule or that a significant cost overrun, rather than speedy progress, has occurred. A more useful measure of cost performance is given by cost variance (CV) = ACWP - BCWP, which treats actual versus budgeted cost for the same scope of work. In C/SCSC, the difference between BCWP and BCWS at a specific point in project time, called schedule variance (SV), is used to indicate the progress of a project; it expresses time performance in monetary units by measuring the difference between the work scope scheduled to be finished and that actually finished to date. A preferable way to indicate time performance, however, is to measure the duration variance between the planned and actual durations of activities, the approach used later in this thesis, i.e. schedule variance (SV) is measured in time units.
Figure 4.2 Cost and Schedule Control System Criteria
Based on the previous discussion, the steps for preparing the time-phased cost baseline can be summarized as: 1. determine output quantities through detailed quantity take-off from the drawings; 2. based on the output quantities obtained, determine the required quantities of input material resources, and use estimated productivity rates to determine the quantities of input labor/equipment resources; 3. use the detailed unit costs and estimated quantities of input resources to obtain the budget cost; 4. prepare a list of work items or activities to be implemented, and associate the different input resources and their corresponding cost estimates with the activities; 5. develop an as-planned schedule by using the critical path method (CPM); and 6. based on the dates in the as-planned schedule, calculate the cumulative time-phased budget costs by adding up the budget cost of activities period by period, resulting in the BCWS curve in Figure 4.2. In the process of preparing a cost performance baseline, estimated productivity rates and unit costs for input resources might be assumed to be constant over the whole period of construction, which at best is a rather crude approximation of reality.
Correctly recording the actual costs incurred during the course of construction is another crucial task for figuring out the cost variance at any point in time. From the perspective of a contractor, the actual unit costs and actual quantities of input resources already used to implement each and every activity should be recorded, and the actual work finished (i.e. actual output quantities) should be measured as well. As to the actual unit price, the data could be collected from the project accounting system that has account (cost) codes corresponding to the different resources used. Recording the quantities of input resources actually used and the actual outputs could be done on a daily, weekly or biweekly basis. If the latter is adopted, the actual quantities of input resources and outputs are likely assumed to be evenly distributed over that time period. The same situation can be observed for productivity measures; for describing productivity rates, an inverse unit, i.e. input/output, is usually preferred (e.g. mhrs/m2 is often used for the slab construction productivity rate). Given these actual values, the actual cost of work performed (ACWP) to date can be obtained by multiplying the actual quantity of each consumed input resource by its corresponding actual unit cost and then adding up the results. The time period considered could be the period since the last measurement was made, or the period from the beginning of construction.
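The relationships among these C/SCSC quantities are simple enough to compute directly from period-by-period records; the following is a minimal sketch, with invented figures purely for illustration and the sign conventions used in this section.

```python
# Minimal earned value bookkeeping; cumulative figures in $k, invented for illustration.
bcws = [100, 250, 450, 700]   # Budgeted Cost of Work Scheduled
bcwp = [90, 220, 400, 640]    # Budgeted Cost of Work Performed (earned value)
acwp = [110, 260, 470, 720]   # Actual Cost of Work Performed

t = 3  # index of "time now" (the current reporting period)

av = acwp[t] - bcws[t]   # accounting variance: mixes scope and cost effects
cv = acwp[t] - bcwp[t]   # cost variance: actual vs. budgeted cost for the same scope
sv = bcwp[t] - bcws[t]   # schedule variance in monetary units (C/SCSC definition)

print(f"AV = {av} $k, CV = {cv} $k, SV = {sv} $k")
# AV = 20 $k, CV = 80 $k, SV = -60 $k  -> cost overrun and behind schedule
```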
4.3.2  Causal Diagram for Analyzing Cost Variance
Based on what has been discussed, a causal diagram having quantitative causal relationships for doing cost performance variance analysis from the perspective of a contractor is proposed (Figure 4.3). Table 4.2 is a cost variance analysis example for the slab construction activity in a high-rise building which is assumed to have variable slab geometry over the height of the structure. This example is meant to help illustrate the quantitative causal relationships presented in Figure 4.3.
The first layer (1. performance measure layer) in the diagram indicates the total cost variance that has occurred up to the present at the total project level or any work package level. If the variance detected is beyond a specific threshold, there is a need to explain why or from where such total cost variance came. The second layer (2. quantitative causal models layer), dealing with quantitative causal relationships, should then be used to do the variance analysis work.
Figure 4.3 Causal Diagram Composed of Quantitative Causal Models for Cost Performance Variance Analysis
Table 4.2 Cost Variance Analysis Conceptual Example
(For the slab construction activity, the following rows are tabulated for Floor 1 through Floor n, with planned, actual and variance values recorded for each quantity and unit cost; the cells are left blank in this conceptual example.)
Output: surface area (m2); concrete volume (m3)
Material quantity/cost, permanent: concrete (m3, $/m3); steel (tonnes, $/tonne)
Material quantity/cost, temporary: forming materials (m2, $/m2)
Productivity rates: forming/stripping/re-shoring (mhrs/m2); rebar (mhrs/tonne); concrete placement/finishing (mhrs/m3)
Total material cost ($)
Labor quantity/cost, regular time inputs: forming/stripping/re-shoring (mhrs, $/mhr); rebar (mhrs, $/mhr); concrete placement/finishing (mhrs, $/mhr)
Labor quantity/cost, overtime inputs: forming/stripping/re-shoring (mhrs, $/mhr); rebar (mhrs, $/mhr); concrete placement/finishing (mhrs, $/mhr)
Total labor cost ($)
Productivity for floor area (mhrs/m2)
Duration for all tasks (days)
Shared resources: cranage (hrs or days, $/day); flyforms (hrs or days, $/day)
Total shared resource cost ($)
Total cost = M + L + S ($)
Total unit price ($/m2)
Cumulative output (m2)
Cumulative cost ($)
Cumulative duration (days)
For example, for the work package of slab construction as shown in Table 4.2, if the variance of cumulative cost (i.e. total cost up to present, the second to last row in the table) for the finished slabs (i.e. floor levels 1 to n) is beyond the determined threshold, then the variance could be attributed to the variances of different causal variables in the available quantitative causal models.
The second layer of quantitative causal models can be broken down into a number of sub-layers. Sub-layer 2(a) corresponds to attributing the total cost variance detected in layer 1 to the total cost variances associated with the different work packages of the whole project and the different activities within one work package. This part is similar to the model of Roth and Hendrickson (1991). As to the slab construction example, given that a cumulative total cost variance has been found, it must be quantitatively allocated to the total cost variance associated with the construction of each floor slab (i.e. the variances denoted in the fifth to last row in Table 4.2, "Total cost = M (material) + L (labor) + S (shared resource)").
Sub-layer 2(b) indicates that each and every total cost variance identified in the previous sub-layer can be further decomposed into total material, labor, and shared resource cost variance. Here labor can also represent a crew, which as a group might involve workers from different trades. It should be pointed out that these three types of input resource are not necessarily involved in all kinds of activities (e.g. for an excavation activity, no input material is required). For the slab construction example, the total material, labor, and shared resource variances for the slab construction activity on different floors can be identified in the corresponding rows of Table 4.2.
Sub-layer 2(c) then seeks to help explain the total variances of input material, labor and shared resources by attributing them to the variances associated with the different components under the different sub-categories of these types of input resource. As seen in Figure 4.3, material can be categorized into permanent (e.g. concrete and steel) and temporary materials (e.g. forming materials), and labor (e.g. labor required for tasks such as forming/stripping/re-shoring, rebar, and concrete placement/finishing in the slab construction example) can be categorized into regular time and overtime inputs.
The next sub-layer 2(d) makes use of another quantitative causal model, Total cost = Unit cost × Quantity, to determine for each component in the previous sub-layer why the cost variance occurred or where it came from, i.e. for each component to determine whether the cost variance stems from its associated unit cost variance or input quantity variance, or both.
For example, assuming a total material cost variance was identified for the slab construction at floor level 1, by browsing the corresponding rows in Table 4.2 the user is able to know whether quantity variance and/or unit price variance of the relevant input materials (concrete, steel, and forming materials) caused the total material cost variance. It is also observed that in this sub-layer the labor and shared resource quantity variances definitely have a relation with the duration variance of the relevant activity, but since one activity might involve more than one task and the tasks can overlap in time, it cannot be said that the activity's duration variance is the simple summation of all relevant labor quantity variances. For example, for the slab construction activity shown in Table 4.2, the duration variance of the activity on each floor can be identified in the row "Duration for all tasks", but the variance is unlikely to be equal to the summation of the labor quantity variances identified for the tasks of forming/stripping/re-shoring, rebar, and concrete placement/finishing. One observation, however, is that the activity duration variance here also appears in the causal diagram for schedule variance analysis to be discussed later on. This implies that time and cost performance measures are interconnected and that some of the experience-based causal models involved in helping explain activity duration variance could also be used for cost variance analysis.
As stated previously, in cost estimating the quantity of any input resource is primarily determined by using an output quantity obtained through detailed quantity take-off from drawings. Therefore it is reasonable to infer that any change in output quantity will fundamentally cause an input resource quantity variance. Output quantity as a causal variable is illustrated in sub-layer 2(f). As an illustrative example, the output quantity variance for the slab construction activity (e.g. slab surface area, concrete volume) can be seen in the relevant rows in Table 4.2. Besides output quantity, the estimated productivity rate is also a quantitative causal variable in determining the quantity of labor resource; it is illustrated in input sub-layer 2(e) (for simplicity, the same productivity rate is assumed to prevail in regular time and overtime). Productivity rate variance can be measured for tasks and/or for an activity directly, depending on the level of detail at which estimating is done and at which productivity is measured on site. For the slab construction activity example, productivity rate variances for its relevant tasks (i.e. forming/stripping/re-shoring, concrete placement/finishing, rebar) and for the activity (productivity rate for floor area) are illustrated in Table 4.2. Again, there must be a relationship between them, but in practice it is not easy to determine the relationship quantitatively as the tasks might overlap in time.
For the causal variables to which no further quantitative causal relationships are attached, as illustrated at level 3 (the experience-based causal models layer) in Figure 4.3, user-defined experience-based causal models are constructed and used to help further explain the causal variable variances by searching out supporting evidence from the project's database. For instance, material quantity variance could be explained by causal factors such as theft and wastage of material, and the unit cost variance of a shared resource (e.g. a crane) might result from rental price inflation or a supplier change.
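The sub-layer 2(d) relationship Total cost = Unit cost × Quantity lends itself to a simple split of a component's cost variance into a price effect and a quantity effect. The sketch below, with invented figures, illustrates one common convention for allocating the joint term (other allocations are equally defensible):

```python
# Decompose a component's cost variance using Total cost = Unit cost * Quantity.
# Figures are invented for illustration (e.g. concrete for one floor slab).
planned_qty, planned_unit_cost = 320.0, 180.0   # m3, $/m3
actual_qty, actual_unit_cost = 345.0, 195.0     # m3, $/m3

total_variance = actual_qty * actual_unit_cost - planned_qty * planned_unit_cost

# Convention: price effect valued at actual quantity,
# quantity effect valued at planned unit cost.
price_effect = (actual_unit_cost - planned_unit_cost) * actual_qty
quantity_effect = (actual_qty - planned_qty) * planned_unit_cost

assert abs(total_variance - (price_effect + quantity_effect)) < 1e-6
print(f"total = {total_variance:.0f} $, price = {price_effect:.0f} $, quantity = {quantity_effect:.0f} $")
```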
It is found that there is broad agreement between the causal diagram shown in Figure 4.3 and other causal diagrams presented in the decision support systems literature reviewed, in terms of making use of several layers of quantitative causal models to do cost variance analysis. But, as shown conceptually in Figure 4.3, with the assistance of user-defined experience-based causal models it is possible to do a more comprehensive cost performance analysis, which can be regarded as an advantage of the diagnostic approach proposed in this thesis. The use of time as the representative performance measure to demonstrate the workability of the overall structured diagnostic process and of user-defined experience-based causal models is discussed in detail in the next chapter.
4.4  Challenges for Cost Variance Analysis
Although in theory a project's or work package's total cost variance can be broken down into a series of sub-variances associated with the causal variables included in the quantitative causal relationships, there are some practical challenges which are briefly highlighted here for completeness. The challenges associated with cost variance analysis include:
(a) Detecting the contribution of indirect cost variance. It is observed that in preparing the causal diagram (Figure 4.3) for cost performance variance analysis only direct cost is considered, but in practice indirect cost actually accounts for a certain part of total cost, and usually consists of job overhead and general overhead. General overhead and job overhead refer, respectively, to corporate general expenses allocated to a given project and to costs for supervision, site utilities, facilities, insurance, interest, penalties and bonuses, etc. (Popescu and Charoenngam 1995). In practice, planned indirect cost is not often allocated to different activities, and it is rare for actual indirect costs incurred to be recorded against ongoing or finished activities. As a result, figuring out indirect cost variance up to time now during the course of construction is difficult.
(b) Detecting the cost variance of shared resources for an activity. As an example, on a building construction site the tower crane(s) is normally shared by more than one activity. It is very difficult to develop a detailed usage plan for such a resource, and even more of a challenge to record how the activities shared the crane (i.e. how much time or what portion of the crane is used by a specific activity). As a result, it is impossible to accurately determine the relevant variance for shared resources at the individual activity level. Rather, the variance analysis has to be conducted at a more coarse-grained level, e.g. for all of the activities/tasks that comprise a work package.
(c) Using average actual values for doing variance analysis could lead to inaccurate findings. Variance is detected by comparing a causal variable's planned value, which is very likely an average value (e.g. an estimated production rate), to an actual value, which in practice tends to fluctuate over time (e.g. actual production rate). Since from the perspective of a contractor it would be unnecessary to make comparisons on a daily or real-time basis, practitioners usually just record the overall actual outcome for a specific time period (e.g. obtaining the actual production rate by dividing the actual output quantity finished by the actual time used since the last measurement, which in fact is an average actual production rate), rather than the actual result on a day-by-day basis. This practice can conceal underlying problems causing unexpected fluctuations.
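As a small numerical illustration of this last point (with invented figures), an average rate computed over a reporting period can look acceptable even though individual days were far off the estimate:

```python
# Invented daily placement quantities (m3/day) over a ten-day reporting period.
daily_output = [50, 52, 8, 12, 49, 51, 50, 10, 53, 48]   # two disrupted spells are buried here
planned_rate = 40.0                                       # estimated m3/day

average_rate = sum(daily_output) / len(daily_output)      # 38.3 m3/day, close to plan
bad_days = [q for q in daily_output if q < 0.5 * planned_rate]

print(f"average = {average_rate:.1f} m3/day, days below half the planned rate = {len(bad_days)}")
# The period average suggests only a minor shortfall, yet three days were severely disrupted.
```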
4.5  Quantitative Causal Models in Schedule Variance Analysis
Similar to cost variance analysis, schedule variance analysis refers to figuring out the deviation, in time units, between the overall as-planned and as-built (actual) schedules, and further finding out the source(s) of the deviation. Doing this in a timely way can help managers interpret time performance up to time now, and offers them an opportunity to take appropriate corrective actions to bring the project back on schedule. The analysis starts from comparing the as-planned schedule with the as-built schedule, so in the following sub-section the preparation of as-planned and as-built schedules, which are integral to identifying the relevant quantitative causal models, is introduced.
4.5.1  Preparation of As-planned and As-built Schedule
The as-planned schedule represents the contractor's plan of the work in order to meet the time-related requirements of the contract. But a reality of most projects is the encountering of various problems, some anticipated, others not. As a result, the plan for the remaining work could be changed from the original one in order to fulfill the contract target. In this sense, hereinafter the as-planned schedule, also referred to as the target schedule, refers to (1) the original schedule generally developed prior to the beginning of the construction phase, or (2) an interim project schedule, both demonstrating the logic relationships, durations, and expected start/finish dates for all unfinished activities as of the progress date (i.e. data date), including project start. In overview terms, the procedure for developing an as-planned schedule can be summarized as: (1) prepare a list of activities, which are usually organized under a Work Breakdown Structure (WBS) for the project; (2) for each activity in the list, determine its planned duration based on the quantity to be finished and the expected production rate; (3) establish the logic relationships (Finish-Start: FS, Finish-Finish: FF, Start-Finish: SF, Start-Start: SS) among the activities; and (4) obtain early and late start/finish dates along with the free and total floats for the activities using the CPM algorithm. If the schedule has been resource loaded, then there is a need to make sure that on a specific day the planned quantity required of one type of resource does not exceed the planned available quantity. An iterative process, i.e. changing the durations of some activities and/or the logic relationships among some activities and then recalculating the schedule, may be required to meet resource requirements, intermediate milestone requirements, and a project's contractual duration.
In practice, some assumptions are always involved in preparing the as-planned schedule, which might affect the schedule variance analysis. For example, an activity's duration is usually assumed not to include interruptions (i.e. planned duration is equal to planned working time, and no idle time is included). Moreover, given that explicitly expressed logic relationships among activities are met, the implicit predecessors or requirements for starting the activities in the as-planned schedule are assumed to be readily met; thus they are not depicted in the target schedule, in order to avoid clutter. Another assumption for doing schedule variance analysis is that the as-planned schedule is of sufficient quality and detail at the activity level that it is feasible in terms of resource requirements and activity sequencing.
In order to do variance analysis for activity detail level schedules, it is necessary to correctly record the actual start/finish dates and actual duration for each completed activity, and the time to complete for in-progress activities (one operating assumption of the thesis is that as-built time performance is collected at the same level of detail as as-planned, although it is observed that in practice actual dates at a more summary level may be recorded directly). With such information recorded on the construction site, as-built schedules can be generated. If the project's updated planned finish date is found to be later than the one originally required, under ideal conditions the durations and/or logic relationships of the unfinished activities would be modified to guarantee that the original schedule duration will be met. The updated or as-built schedule obtained then becomes the new target schedule. Schedule variance analysis can be carried out between a current as-built schedule and any target schedule whose progress date is earlier than that of the as-built schedule.
4.5.2  Causal Diagram for Analyzing Schedule Variance
Based on what has been discussed, the causal diagram consisting of quantitative causal models for schedule variance analysis is proposed and shown in Figure 4.4. The first layer (1. performance measure layer) in the diagram determines whether there is a need to invoke use of the associated quantitative causal models, i.e. there is a need to explain a variance between planned and actual time performance because the project's target finish date or some intermediate milestone dates will be exceeded (for simplicity of discussion, from this point on attention is directed at overall project duration; the process outlined is equally applicable to intermediate milestones). The second layer (2. quantitative causal models layer) deals with the quantitative causal relationships used to compute or determine values for the first-layer performance measure. For the case at hand, the basic relationships that describe time performance can be broken down into a number of sub-layers. Sub-layer 2(a) corresponds to the network of critical activities and paths which are the determinants of project duration, and the quantitative causal relationships shown in the following sub-layers 2(b) through 2(e) can be applied recursively to all critical activities. This is different from analyzing the deviation of a project's total cost performance, which depends on the cost performance of all activities. Sub-layer 2(a) helps to narrow the breadth of the diagnosis that has to be conducted for schedule variance analysis. However, it is observed that critical paths can shift from the planned critical paths, and detecting such shifts can be very difficult. This issue will be discussed later under challenges.
Figure 4.4 Causal Diagram Composed of Quantitative Causal Models for Schedule Variance Analysis
Sub-layer 2(b) indicates that the finish date variance (difference between planned finish date and actual finish date) of a critical activity as a causal variable is the main contributor to the project's overall time deviation. In other words, the focus should first be directed at the finish date variance of the critical activities shown in sub-layer 2(a). Sub-layer 2(c) then helps to pinpoint which causal variables contributed to the finish date variance. Use is made of the quantitative causal model Finish Date = Start Date + Activity Duration, plus the requirement that an activity's finish date also needs to satisfy the logic relationships with its finish predecessors (herein finish predecessors refer to the predecessors with an SF or FF relationship with the activity). Therefore, for an activity under examination, its start date variance (difference between planned start date and actual start date), duration variance (difference between planned duration and actual duration), and finish predecessor variance (for an SF predecessor, the difference between the predecessor's planned start date and actual start date; for an FF predecessor, the difference between the predecessor's planned finish date and actual finish date) can be regarded as the primary causes of its finish date variance.
The next sub-layer 2(d) seeks to use further quantitative causal models to explain the variances identified at the previous sub-layer. The actual start date of an activity with start predecessors (herein start predecessors refer to the predecessors with an SS or FS logic relationship with the activity) can be affected by the actual start date of its SS start predecessors and the actual finish date of its FS start predecessors. Thus, knowing the start date variance of its SS start predecessors and the finish date variance of its FS start predecessors (i.e. start predecessor variance) can help explain the activity's start date variance. Further, as said previously, an activity may have many implicit predecessors (e.g. access availability, permit and drawing availability, etc.) that are usually assumed to be readily satisfied. But in reality, these implicit predecessors often constitute the actual causes of the start date variance identified in sub-layer 2(c). Therefore, judging whether one or more implicit predecessors caused an activity's start delay is of the same importance as figuring out its start predecessor variance. As to the duration variance at sub-layer 2(c), actual duration is defined to consist of actual working time plus idle time encountered (i.e. unexpected interruptions), while in the as-planned schedule an activity's planned duration usually is its planned working time, i.e. as said previously, no scheduled idle time is assumed. Therefore Duration Variance = Actual Duration - Planned Duration = (Actual Working Time + Idle Time) - Planned Working Time = Working Time Variance + Idle Time. Thus, working time variance and unplanned idle time can be regarded as the causal variables for the duration variance identified in sub-layer 2(c).
Sub-layer 2(e) makes use of another quantitative causal model to analyze the working time variance identified at sub-layer 2(d): an activity's planned duration (working time) D (days) equals its scope Q (e.g. m3) divided by the product of productivity P (e.g. m3/mhr) and resource usage rate R (e.g. mhr/day), or equivalently by the production rate Pr (e.g. m3/day), which is the product of P and R, i.e. D = Q/(P × R) = Q/Pr. In this sense, variances associated with the quantity to be finished, the resource usage rate, productivity or the production rate can be thought of as the possible causes of working time variance.
For the causal variables to which no further quantitative causal models are attached, practitioners still seek to find the causes of the variances encountered. As discussed previously in Chapter 3, in the diagnostic approach user-defined experience-based causal models are used to help search out evidence in support of further explaining the identified causal variable variances. These experience-based models are illustrated at layer 3 in Figure 4.4. For instance, if implicit predecessors are confirmed to exist, the question becomes which ones (e.g. poor site access, permit or resource unavailability). If idle time is identified, what are the causes (e.g. broken equipment, delay in response to a request for information)? Similarly, many causal factors (e.g. low labor skill level, excess rain and unclear site instructions) can help explain production rate variance and, together with the causal factors that help explain quantity variance (e.g. a change in the PCBS), in turn help explain working time variance.
Compared with the previously discussed causal diagram for cost performance (Figure 4.3), the causal diagram for schedule variance analysis (Figure 4.4) is in essence quite similar. Stated another way, the structured diagnostic process with quantitative causal models works for both cost and time, i.e. the total variance identified at the performance measure layer (layer 1) can be attributed to various causal variables through a series of quantitative causal models, and the causal variables experiencing variances can rationally be regarded as contributing to or causing the total variance. The structured diagnostic process also features a third layer (layer 3) for both performance measures, where users can apply their expertise in the form of experience-based causal models in order to seek out the likely actual causal factors for the variances identified in the quantitative causal models layer (layer 2). Some of the experience-based causal models may be applicable to both time and cost performance, while others may be relevant to only a single performance measure. Nevertheless, the form adopted for these models is equally applicable to both performance measures.
4.6  Challenges for Schedule Variance Analysis
Under the guidance of the causal diagram (Figure 4.4), a project's overall time deviation can be attributed to different variances associated with the causal variables in it. Nevertheless, some challenges need to be addressed, which include:
(a) It can be difficult to identify the actual critical activities in a current schedule, i.e. a project's time performance is affected by the actual critical activities, which might differ from the planned critical activities because durations and/or logic may have changed. For instance, as shown in Figure 4.5, Act. 1, 2 and 4 are the planned critical activities. But it is noticed that in the as-built schedule, Act. 3 has consumed all of its float and more, which makes it a critical activity instead of Act. 2. Thus, it is the time performance of Act. 3, not Act. 2, that needs to be explained. However, because one only really knows the actual criticality of any activity on the progress date of the schedule (Winter 2004), activities that form the actual critical path can only be identified by re-computing the schedule day by day, which in fact is difficult to achieve. Therefore, as a compromise measure, besides focusing on the planned critical activities, near-critical activities, which have a relatively low amount of total float and thus have the possibility of impacting project duration, should be examined in the schedule variance analysis process. This feature was incorporated in the proposed diagnostic approach.
Figure 4.5 Shift of Critical Activity
(b) Some causal variables such as productivity, resource usage rate or production rate may not be measured frequently, if at all, and in addition they are very likely to fluctuate from time to time. As a result, attempting to identify variances on a daily basis is extremely difficult, if not impossible, and except for unusual situations, simply not required. Given this, it may be sufficient to treat working time variance as a causal variable with no further quantitative causal relationships, similar to the causal variables of implicit predecessors and idle time. Stated another way, one could choose to work directly with a user-defined experience-based causal model for diagnosing activity working time variance, without consideration of its constituent causal variables such as productivity or resource usage rate. This observation speaks to the need for a flexible implementation of Figure 4.4.
Both activities Act.3 and Act.4 have two start predecessors (Act.1 has FS relationships with them, and Act.2 has SS relationship with Act.3, and Act.5 has SS relationship with Act.4). As shown in the as-built schedule in the figure, Act.1 started and finished on time, but Act.2 and Act.4 started two days later than planned, and Act.3 and Act.5 started one day later than planned. Given the aforementioned assumption (i.e. in the as-built schedule for Act.3 it should follow its planned start logic relationships with Act.1 (FS) and Act.2 (SS)), Act.3 should have not started earlier than Act.2’s actual start date, but it did. In other words, Act.3 didn’t follow the planned start logic relationships and implicit predecessors cannot be confirmed as the reason for not following the planned start logic relationship. But for Act.4, given it should follow its planned start logic relationships, it should have been able to start one day earlier than its actual start date, but it didn’t. In this case, it is reasonable to say that on that day there are likely some implicit predecessors encountered for Act.4.  114  Act.2 Act.3 As-planned schedule Act.1 As-built schedule Implicit predecessor  Act.6 Act.4 Act.5 Progress Date  Figure 4.6 Implicit Predecessors 4.7 Implementation of the Component for Making Use of Quantitative Causal Models In order to demonstrate the workability of the quantitative causal models layer in the structured diagnostic process, time has been selected as the representative performance measure to be implemented in the REPCON system as part of this research work. Given sufficient time and resources, cost performance could also be treated using the same overall structure along with relevant quantitative causal relationships. In the following discussion, a small project (Figure 4.7) is used as an example to illustrate the results of the implementation work and a number of issues that had to be addressed to ensure a comprehensive and accurate time variance analysis. The structure of this project mimics that of full scale projects, especially those that involve significant repetition of physical components and activities over multiple locations. In the REPCON system, hierarchical scheduling has been implemented, in which leaf activities can be grouped under one parent activity and parent activities can also be further rolled up into a parent activity at a higher level. Linear scheduling is also supported through multi-location activity structures. The conceptual and implementation issues along with the merits of hierarchical scheduling have been discussed previously by Udaipurwala and Russell (2000).  115  A34 A03  Location 3 A03-1 (6d)  Finish Milestone  A04 (7d) A05 (5d)  A03-2 (3d)  FF  A34 A03  Location 2 A03-1 (6d)  A04 (4d)  FS-1  A06 (3d)  A03-2 (3d)  A08 (7d)  A10 (2d)  SF3  A09 (10d) SS2  A34  A01 (5d) Start Milestone  A03  FS2  Location 1  A03-1 (6d)  A04 (5d)  A07 (1d)  A03-2 (3d)  A02 (8d) Derived activity  Legend:  Activity Name (Duration)  Logic relationships among activities. If not specified, it is assumed to be finish to start with 0 lag  Parent activity Leaf activity  Figure 4.7 Example Project for Illustrating the Quantitative Causal Model Variance Analysis for Time Performance and Implementation Details It can be seen from Figure 4.7 that in this example project activities take place at one or more of three different locations. The nomenclature used is A0x refers to the parent activity number, and A0x-y refers to the child activity number, if it exists. 
4.7  Implementation of the Component for Making Use of Quantitative Causal Models
In order to demonstrate the workability of the quantitative causal models layer in the structured diagnostic process, time has been selected as the representative performance measure to be implemented in the REPCON system as part of this research work. Given sufficient time and resources, cost performance could also be treated using the same overall structure along with the relevant quantitative causal relationships. In the following discussion, a small project (Figure 4.7) is used as an example to illustrate the results of the implementation work and a number of issues that had to be addressed to ensure a comprehensive and accurate time variance analysis. The structure of this project mimics that of full-scale projects, especially those that involve significant repetition of physical components and activities over multiple locations. In the REPCON system, hierarchical scheduling has been implemented, in which leaf activities can be grouped under one parent activity and parent activities can also be further rolled up into a parent activity at a higher level. Linear scheduling is also supported through multi-location activity structures. The conceptual and implementation issues, along with the merits of hierarchical scheduling, have been discussed previously by Udaipurwala and Russell (2000).
Figure 4.7 Example Project for Illustrating the Quantitative Causal Model Variance Analysis for Time Performance and Implementation Details (activities are labelled Activity Name (Duration); logic relationships not explicitly specified are finish-to-start with 0 lag)
It can be seen from Figure 4.7 that in this example project activities take place at one or more of three different locations. The nomenclature used is that A0x refers to the parent activity number, and A0x-y refers to the child activity number, if it exists. For example, A03-1 and A03-2 are the leaf child activities under the parent activity A03, which along with another activity, A04, is grouped under a derived activity A34 (a derived activity is basically a summary activity, with all of its properties being "derived" from the parents and children below it; logic does not pass through a derived activity). The project is planned to start 01-Jan-2007. After modeling and scheduling the project, its planned finish date is 01-Mar-2007, and the planned early/late dates along with total and free floats are shown in Appendix A.1. The approach adopted in what follows is to describe features and implementation details of the quantitative causal model analysis for time performance, and to treat, where appropriate, the research challenges encountered and how they were resolved. It should be noted that the entire variance analysis process was designed and implemented as an integral part of the research described in this thesis.
4.7.1  Selecting the Target Schedule to Evaluate Current Time Performance
After the commencement of the project, actual (start/finish) dates for each activity can be recorded directly or obtained from the recorded daily site status (in REPCON, under the as-built view, daily site status also provides the opportunity to collect data useful for the experience-based causal analysis, as discussed later). At a frequency selected by the user, the schedule is updated to see whether the overall time performance to date meets the target or not. This corresponds to the first layer in Figure 4.4. For the example project, the first progress date is set to 31-Jan-2007, one month after the planned start date. The actual daily status before that date for the relevant activities is given in Appendix A.2; after updating the schedule, the new planned finish date is found to be 09-Mar-2007 (updated activity dates are shown in Appendix A.3), later than the project's original planned finish date of 01-Mar-2007.
The schedule variance analysis, which can be initiated from either the process view or the as-built view (Figure 4.8), involves comparing the active schedule with the target one. Therefore the first step is to select a target schedule in the pop-up window shown in Figure 4.9. As said previously, the active schedule's progress date must be later than the progress date of the target schedule unless the target schedule is the original base schedule, which normally does not have a progress date (if this is the case, the window shown in Figure 4.10(a) comes up to let the user confirm that the target schedule selected is the initial base schedule); otherwise the comparison does not make sense and a warning is given (Figure 4.10(b)). In the select target project window (Figure 4.9), the finish date of the active schedule (e.g. PROJ03) can be seen and examined against the finish date of the selected target schedule (e.g. PROJ01) to decide if it is necessary to proceed with the schedule variance analysis. The comparison bar chart (Figure 4.11), in which the active schedule is shown above the target schedule, is then displayed after selecting the target schedule.
Figure 4.8 Menu Options to Access Schedule Variance Analysis Function
Figure 4.9 Window for Selecting Target Project
Figure 4.10 Suggestive Information for Selecting Right Target Schedule
Figure 4.11 Comparison Bar Chart Between Active and Target Schedule
4.7.2  Selecting Activities to Diagnose
Corresponding to sub-layer 2(a) in Figure 4.4, after selecting a target schedule and detecting the total time deviation to date, the next step is to select a subset of the critical activities to analyze, since in theory only actual critical activities can affect the overall time performance. But, as discussed previously in the section on challenges for schedule variance analysis, identifying the actual critical activities is difficult (i.e. what is critical given performance to date), so near-critical activities defined by the user in terms of planned total float (e.g. total float less than 5 days) should also be considered for variance analysis. More generally, considering that the user might also be interested in analyzing activities that take place at locations of particular interest, are undertaken by specific responsibilities (e.g. a particular trade), and/or occur during a specific time period, a function that allows the user to flexibly define such filters has been implemented (Figure 4.12). For selecting the responsibility and defining the location range, drop-down lists (Figure 4.12(b) and (c)) are available to facilitate choices from the organizational and physical views, respectively. If, with the help of these filters, the activities of interest still cannot be fully identified, another option, 'Activity Profile', can be selected in the target activities filter menu (Figure 4.12(a)). It allows the user to define an ad hoc group of activities to focus attention on. For instance, the user can define a leaf activity profile which only includes the leaf activities of the example project (Figure 4.13). All the mentioned filters can be used together, i.e. the logic relationship between them is assumed to be 'AND'.
Figure 4.12 Filters for Selecting Activities of Interest: (a) menu options for selecting target activities; (b) define responsibility code filter; (c) define location filter; (d) define near critical activities filter; (e) define date range filter
Figure 4.13 Activity Profile Filter: (a) list of available activity profiles to choose; (b) window for defining an activity profile
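A sketch of how such filters might be combined under AND logic is shown below; the activity fields and filter values are hypothetical, chosen only to mirror the filters described above.

```python
# Hypothetical activity summaries drawn from the target/active schedules.
activities = [
    {"id": "A03-1", "location": "Location 2", "responsibility": "FORM",
     "total_float": 0, "is_leaf": True},
    {"id": "A09", "location": "Location 1", "responsibility": "MECH",
     "total_float": 12, "is_leaf": True},
]

# User-defined filters; every supplied filter must pass (AND logic).
filters = [
    lambda a: a["total_float"] < 5,                           # near-critical filter
    lambda a: a["location"] in {"Location 1", "Location 2"},  # location range filter
    lambda a: a["is_leaf"],                                   # activity profile (leaf activities)
]

selected = [a["id"] for a in activities if all(f(a) for f in filters)]
print(selected)   # ['A03-1']
```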
As a result of the time window filter being executed only against the active schedule, activities that according to the target schedule should have started in the specified time window but actually did not in the active schedule will be filtered out or missed. From the time performance analysis perspective, such activities obviously deserve attention as well. Ensuring that the activity selection is correct required consideration of the various scenarios that could be encountered. To elaborate on the foregoing, Figure 4.14(a) shows the different possible scenarios. In this figure T represents activities in the target schedule and A represents corresponding activities in the active schedule; T-n is the nth possible scenario in terms of the target and active schedule's progress dates, and A-n-m represents the mth possible actual activity scenario corresponding to the nth possible target activity scenario.

Figure 4.14 Different Scenarios of Activities in Target and Active Schedule (legend distinguishes target versus active/actual activities and finished versus to-do work; (a) scenarios T-1 to T-6 with active counterparts A-n-m over the maximum date range bounded by the progress dates of the target and active schedules; (b) scenarios t-1 to t-6 and a-n-m for a date range of interest, Date i to Date j, specified within the maximum date range)

Based on the date range specified (see Figure 4.14(a)), the activities of interest should include T-2, T-3, T-4 and T-5 and their corresponding activities in the active schedule. In addition, A-6-1 or A-6-2 and their corresponding activity T-6 in the target schedule should be included, since although T-6 falls outside and after the time window, it has an actual state (i.e. it started earlier than scheduled, for whatever reason). For any date range specified within the maximum date range the possible scenarios are illustrated in Figure 4.14(b), which along with Figure 4.14(a) can help the reader understand the pseudo code shown in Table 4.3 for identifying the activities of interest in terms of the flexibly defined date range filter. Activities not having a match between the target and active schedules (i.e. activities in the target schedule but not in the active one, or vice versa) can be flagged and reported on. Figure 4.15 shows an illustrative example of applying an activity profile (leaf activities), a date range filter (29-Dec-2006 to 31-Jan-2007), a near critical activities filter (total float < 5 days), and a location range filter (Location 1 to 2) to the result of selecting target schedule activities to analyze for the example project (see Figure 4.11). It is noticed that although A03-1 at location 2 has not actually started in the date range specified, it is still selected for diagnosis since according to the selected target schedule it should have started. If the same date filter is used in MS Project for the example project, as discussed previously, this activity will be missed since its start date in the active schedule falls outside the specified date range. In this diagnostic approach, the date filter is applied to both the active and target schedules, which can help better identify target activities and thus could be regarded as a small improvement; a simplified sketch of this dual-schedule date test is given below, ahead of the full pseudo code in Table 4.3.
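The following simplified Python sketch illustrates the dual-schedule date test just referred to; Table 4.3 gives the full pseudo code, including the marking of matched activities. The class and field names are assumptions made for illustration only:

from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class SchedActivity:
    activity_id: str
    early_start: date
    early_finish: date
    actual_start: Optional[date] = None
    actual_finish: Optional[date] = None

    def start(self):
        return self.actual_start or self.early_start

    def finish(self):
        return self.actual_finish or self.early_finish

def overlaps(a: SchedActivity, win_start: date, win_end: date) -> bool:
    # An activity is of interest if it starts before the window ends and
    # finishes (or is expected to finish) on or after the window start.
    return a.start() < win_end and a.finish() >= win_start

def select_for_window(target_acts, active_acts, win_start, win_end):
    selected = set()
    # Activities that, per the TARGET schedule, should be under way in the window;
    # this catches work that should have started but has not (e.g. A03-1 at L2).
    for a in target_acts:
        if a.actual_finish is None and overlaps(a, win_start, win_end):
            selected.add(a.activity_id)
    # Activities that ACTUALLY overlap the window in the ACTIVE schedule, even if
    # the target schedule placed them outside it (e.g. early starts such as A-6-1).
    for a in active_acts:
        if a.actual_start is not None and overlaps(a, win_start, win_end):
            selected.add(a.activity_id)
    return selected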
Target schedule (cross-hatched)  Active schedule  Figure 4.15 Result of Applying Filters to the Example Project 124  Table 4.3 Pseudo Code for Identifying Activities in Defined Time Window Do first for each activity in the target schedule (activity i, i=1,…,m) IF it has actual start and finish date THEN don’t select and mark it (e.g. T-1) ELSE IF it has actual start date and early/late finish date IF the early/late finish date >= the start date of the date range THEN select and mark it (e.g. T-2, T-3) ELSE don’t select and mark it END IF ELSE it has early/late start and finish date IF the early/late start date < the finish date of the date range and the early/late finish date >= the start date of the date range THEN select and mark it (e.g. T-4, T-5, t-2, t-3, t-4, t-5) ELSE don’t select and mark it (e.g. T-6, t-1, t-6) END IF END IF Do then for each activity in the active schedule (activity j, j=1,…, n) IF its corresponding activity in the target schedule has been selected and marked THEN select and mark it in the active schedule ELSE IF it has early/late start date (i.e. actually the activity has not started yet) THEN don’t select and mark it (e.g. A-6-3) ELSE IF its actual start date < the finish date of the date range and its early/late or actual finish date >= the start date of the date range THEN select and mark it in the active schedule (e.g. A-6-1, A-6-2, a-1-2,3,4,5, a-6-2,3,4,5) and also select and mark its corresponding activity in the target schedule. END IF  4.7.3  Determining the Variance of Schedule Causal Variables  Following the previous step directed at narrowing the focus down to the candidate activities to diagnose, the next step is to identify the variances relevant to the various causal variables depicted in sub-layers 2(b) to 2(d) in Figure 4.4. As for the causal variables in sub-layer 2(e) which were discussed in the section of challenges for schedule variance analysis, the assumption here is that actual daily data for productivity, resource usage rate or production rate is not being collected, so the quantitative causal variable analysis will be carried out only through to sub-layer 2(d). In other words, if working time variance is identified, user-defined experience-based causal models will be used 125  directly to analyze the variance. Nevertheless, it might be possible to attribute the variance to productivity related factors, resource related factors, or both, using data collected as part of the as-built view. As to examining quantity variance, it can also be included in the user-defined experience based causal models as a causal factor and examined by searching relevant data in the physical view (e.g. comparing as-built vs. asplanned attribute values for physical components associated with the activity of interest). Details on experience based causal models are discussed in the next chapter. After the user triggers the analysis process by using the menu option of ‘Find Schedule Variances’ (Figure 4.16(a)), the user interface for specifying the schedule variances of interest comes up first (Figure 4.16(b)). By using the options available, the  (a)  (b) Figure 4.16 Options for Finding Schedule Variances 126  user can decided whether early or late start/finish dates will be used in the variance analysis process. As to the method to calculate variance, there are two options ‘Target schedule - Active schedule’ and ‘Active schedule - Target schedule’. 
For the former one, negative and positive results respectively indicate the occurrence of delay and ahead of schedule, and for the latter option positive values represent delay in start, finish date or extended duration and working time. For the convenience of further discussion in what follows, all of the examples given are based on the option of early dates and variance equal to ‘active schedule - target schedule’. In the bottom half of the window (Figure 4.16(b)) is the causal diagram with multiple sub-layer quantitative causal relationships, which reflects the contents of Figure 4.4. The user can select anyone, some, or all of them at one time for purposes of analysis and to view the results. For the finish date of each activity in the target or active schedule, it exclusively has an early finish date or an actual finish date. However, two scenarios are impossible and can be ignored in analyzing the finish date variance. The first scenario is that an activity in the target schedule has an actual finish date while in the active schedule it has an early/late finish date; the second is that an activity in the target and active schedule has the same actual finish date (see T-1 and A-1 in Figure 4.14(a), since their presence for analysis would be screened out using the date range filter). The pseudo code for identifying finish date variance is given in Table 4.4. Table 4.4 Pseudo Code for Determining Finish Date Variance Do for each activity passing through the filters specified IF it has actual finish date, and the corresponding activity in the target schedule has early finish date (e.g. A-3-1 with T-3, or A-4-1 with T-4 in Figure 4.14 (a)) THEN finish date variance = actual finish date (active schedule) – early finish date (target schedule) ELSE it has early finish date, and the corresponding activity in the target schedule has early finish date (e.g. A-2-2 with T-2; or A-4-2 with T-4 in Figure 4.14(a)) THEN finish date variance = early finish date (active schedule) – early finish date (target schedule) and mark the finish date variance as “undetermined” by using symbol [ ]. END IF  127  From the causal diagram (Figure 4.4) it is observed that three causal variables, start date variance, duration variance and finish predecessor variance have impact on the finish date variance. For the start date of each activity in the target or active schedule, it is impossible that an activity in the target schedule has an actual start date but only an early start date in the active schedule. So there are only three possible scenario combinations, as described by the pseudo code for identifying the start date variance given in Table 4.5. Table 4.5 Pseudo Code for Determining Start Date Variance Do for each activity passing through the filters specified IF it has actual start date, and the corresponding activity in the target schedule has the same actual start date (e.g. A-2-2 with T-2; A-3-1 with T-3 in Figure 4.14 (a)) THEN start date variance= 0 ELSE IF it has actual start date, and the corresponding activity in the target schedule has early start date (e.g. A-4-2 with T-4; A-6-2 with T-6 in Figure 4.14(a)) THEN start date variance = actual start date (active schedule) – early start date (target schedule) ELSE it has early start date, and the corresponding activity in the target schedule has early start date (e.g. 
A-5-3 with T-5 in Figure 4.14(a)) THEN start date variance = early start date (active schedule) – early start date (target schedule) and mark this start date variance as “undetermined” by using symbol [ ]. END IF Further, because of the quantitative causal model that Finish Date = Start Date + Duration, thus duration Variance = finish date variance - start date variance. Figure 4.17 illustrates the relation among these variances, and the pseudo code for figuring out the duration variance is given in Table 4.6. It is noted that currently these three variances can be obtained as well in commercially available scheduling tools (e.g. MS Project, Primavera – see Figures 2.2 and 2.3), and hence they cannot be viewed as a contribution of this research. Nevertheless, their computation is needed for completeness of the current work. Start  As-planned activity  Duration S.V.  Finish  F.V.  Duration + D.V.  As-built activity Figure 4.17 Relation Among Start, Finish and Duration Variances 128  Table 4.6 Pseudo Code for Determining Duration Variance Do for each activity passing through the filters specified Duration Variance = Finish Date Variance – Start Date Variance IF its start date variance or finish date variance is marked as “undetermined” THEN mark this duration variance as “undetermined” by using symbol [ ]. END IF  In theory, if an activity’s finish date is also governed by finish predecessors (predecessors with SF or FF logic relationship with the activity), listing all of the activity’s finish predecessors according to the target schedule along with their corresponding variances can help the user judge to some extent if the finish predecessors’ variances have an impact on the finish date variance and if the planned logic relationships have been actually broken or not. The pseudo code for determining finish predecessor variance is presented in Table 4.7. Table 4.7 Pseudo Code for Determining Finish Predecessor Variance Do for each finish predecessor of an activity passing through the filters specified IF the finish predecessor has SF relationship with the activity of interest THEN show the finish predecessor’s start date variance as the finish predecessor variance ELSE the finish predecessor has FF relationship with the activity, and show the predecessor’s finish date variance as the finish predecessor variance END IF  In the next sub-layer 2(d), two causal variables, which can have an impact on the start date variance, are explicit start predecessor and implicit start predecessor. As to the explicit start predecessor, it refers to the predecessors formally defined in the target schedule in the form of a SS or FS logic relationship with the activity under analysis. Similar to a finish predecessor, in theory, these types of predecessor are the dominant determinants of the actual start date of their successors. Listing the start time predecessors along with their relevant variances can assist the user in evaluating if one or more of the start predecessors’ variances caused the start date variance. The relevant pseudo code for determining start predecessor variance is contained in Table 4.8. As to the finish and start predecessor variances, it is observed that in current commercially 129  available scheduling applications, the user can only find them by manually going through the activity list to identify the start and finish predecessors first, and then to see the predecessors’ relevant start and/or finish date variances. 
Compared with the diagnostic approach here being able to list the predecessor variances directly and automatically, the workload on the user is heavier and the efficiency is low. Thus, this could be viewed as an improvement to current scheduling tools. Table 4.8 Pseudo Code for Determining Start Predecessor Variance Do for each start predecessor of an activity passing through the filters specified IF the start predecessor has SS relationship with the activity of interest THEN show the start predecessor’s start date variance as the start predecessor variance ELSE the start predecessor has FS relationship with the activity, and show the predecessor’s finish date variance as the start predecessor variance END IF  As to implicit predecessors, another causal variable in sub-layer 2(d), they refer to those predecessors (events and relationships) that have not been explicitly included in the target schedule but which in fact can cause the delay of an activity’s start date. In determining whether they exist or not, the number of planned lag days along with the explicit planned start predecessor logic relationships (i.e. SS and FS) should be considered. With a SS logic relationship, the successor is supposed to be able to start on the same date plus the user specified lag value when its predecessor actually started, but with FS logic relationship, the successor is supposed to be able to start on the next date plus user specified lag value after its predecessor actually finished. Such difference (i.e. successor’s start date – predecessor’s finish date – lag – 1, or successor’s start date – predecessor’s start date – lag) is reflected in the pseudo code for confirming if implicit predecessors exist (see Table 4.9). Referring to the small example shown in Figure 4.6 and the relevant discussion in the section of challenges for schedule variance analysis can help better understand the pseudo code. For Act.3 in Figure 4.6, Min (Act.3’s actual start date-Act.1’s actual finish date-lag-1, Act.3’s actual start date-Act.2’s actual start date-lag) = Min (1, -1) = -1, which is <=0, thus no implicit predecessor can be confirmed to exist. But for Act.4, Min (Act.4’s actual start date-Act.1’s actual finish date-lag-1, Act.4’s 130  actual start date-Act.5’s actual start date-lag) = Min (2, 1) = 1, which is >0 and means Act.4 could have started one day earlier, but it didn’t. Thus, the presence of one or more is suggested for Act.4. If the value obtained is 0, it can indicate that in the active schedule the activity of interest actually completely followed the planned logic relationships with its start predecessors. The ability to infer if there exist implicit predecessors for the activities of interest can help the user develop deeper insights into time performance deviation. This ability is not available in current business scheduling applications and has not been discussed in the literature reviewed, and is thus viewed as a contribution of this research. 
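The inference can be summarized in a small Python sketch, shown below for illustration only; Table 4.9 gives the full pseudo code, which also distinguishes actual from early dates. Dates are simplified here to working-day numbers, and the function names are assumptions:

def predecessor_slack(i_start, pred_date, lag, relation, pred_is_start_milestone=False):
    """Days by which activity i started later than its planned start logic required;
    pred_date is the predecessor's finish date for FS logic and start date for SS logic."""
    if relation == "FS" and not pred_is_start_milestone:
        return i_start - pred_date - lag - 1
    return i_start - pred_date - lag          # SS logic, or a start milestone

def implicit_predecessor(slacks):
    """MIN <= 0: the planned start logic alone explains the start date ('No').
    MIN > 0: the activity could have started MIN days earlier, so one or more
    implicit (unmodelled) predecessors are suggested ('Yes')."""
    minimum = min(slacks)
    return ("Yes" if minimum > 0 else "No", minimum)

# Values quoted above for the small example of Figure 4.6:
print(implicit_predecessor([1, -1]))    # Act.3 -> ('No', -1)
print(implicit_predecessor([2, 1]))     # Act.4 -> ('Yes', 1)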
Table 4.9 Pseudo Code for Determining Implicit Start Predecessor Variance Do for each activity (i) passing through the filters specified Do for each start predecessor (j) of activity i IF activity j has FS logic relationship with activity i, and is not a start milestone IF In the active schedule j has actually finished and i has actually started THEN record the value of (i’s actual start date – j’s actual finish date –lag – 1) ELSE IF In the active schedule j has actually finished and i has not started THEN record the value of (i’s early start date – j’s actual finish date –lag – 1) ELSE IF In the active schedule j has not finished and i has actually started THEN record the value of (i’s actual start date – j’s early finish date –lag –1) ELSE In the active schedule j has not finished and i has not started THEN record the value of (i’s early start date – j’s early finish date –lag –1) END IF ELSE activity j has SS logic relationship with activity i, or is a start milestone IF In the active schedule j has actually started and i has actually started THEN record the value of (i’s actual start date – j’s actual start date –lag) ELSE IF In the active schedule j has actually started and i has not started THEN record the value of (i’s early start date – j’s actual start date –lag) ELSE IF In the active schedule j has not actually started and i has started THEN record the value of (i’s actual start date – j’s early start date –lag) ELSE In the active schedule j has not actually started and i has not started THEN record the value of (i’s early start date – j’s early start date –lag) END IF END IF (After do for activity i’s all start predecessors, a list of values is available) Find the minimum value (MIN) in the list of the values recorded IF the MIN <= 0 THEN show “No, MIN” (means no implicit predecessors) ELSE show “YES, MIN” (means there exists implicit predecessors) END IF 131  For the case of duration variance, two contributory causal variables are idle time and extended working time as shown in sub-layer 2(d) in Figure 4.4. As discussed previously, Duration Variance = Actual Duration – Planned Duration = (Actual working time – Planned working time) + Idle Time. In REPCON for each activity it is possible to record its daily status (Finished, Idle, On-going, Postponed, Started, and Started & Finished) in the As-Built view. If this feature is used, then idle time can be explicitly identified. Compared with MS Project and Primavera which don’t have this feature, identifying idle time can help the user to understand better what caused an activity’s duration variance. The pseudo code for identifying the amount of idle time is given in Table 4.10. Table 4.10 Pseudo Code for Determining Idle Time as part of Duration Variance Do for each activity passing through the filters specified IF it has actual start date in the active schedule THEN Idle time = the number of days recorded as “idle” from the progress date of the target project IF the activity has no actual finish date in the active schedule THEN mark this idle time as “undetermined” by using symbol [ ] ELSE do not use symbol [ ] to mark this idle time END IF ELSE Idle time = 0 and mark it as “undetermined” by using symbol [ ] END IF Thus, extended working time = Actual working time – Planned working time (i.e. planned duration) = Duration variance – Idle time, and the relevant pseudo code is given in Table 4.11. 
Table 4.11 Pseudo Code for Determining Working Time Variance Do for each activity passing through the filters specified Extended Working Time = Duration Variance – Idle Time IF its Duration variance is marked as “undetermined” THEN mark this extended working time as “undetermined” by using symbol [ ]. END IF As part of the research process, the implemented prototype for doing schedule  132  variance analysis based on the foregoing pseudo codes was tested on several small project examples to make sure they work properly and yield the correct results. Compared with the example project shown in Figure 4.7, these example projects are relatively small. Thus for further illustration, the schedule variance analysis process is applied to the project depicted in Figure 4.7. To illustrate all of the foregoing variance determinations, Figure 4.18 shows the schedule variance analysis results for the leaf activities of the example project in the time period between the project’s start date (i.e. use original base schedule as the target schedule) and 31-Jan-07 (the active schedule’s progress date). The result can also be printed out in a hard-copy report (see Appendix A.4) by using the menu option of ‘Report’ shown in Figure 4.16(a). In the result, for 00-start milestone it can be seen that the project actually started one day ahead of the original planned project start date, i.e. its start date variance is –1. This is good news given using the analysis option of ‘Active Schedule – Target Schedule’ to compute variance values. Thus a negative number corresponds to a positive result. Because it is a milestone activity, no other causal variable variances (e.g. duration variance, idle time, etc.) are applicable to it. For A01, it can be seen that it is supposed to take place at location L1 and has total float of 1 day in the target schedule. It was actually finished with 1 day delay (finish date variance:1), which can be attributed to its start date variance of –1 (it started one day ahead of its planned start date which is good news) and undesired duration variance of 2 days, but it was not affected by its finish predecessors (as shown in Figure 4.18 it has no planned finish predecessor). For the unexpected duration variance detected, it can be totally attributed to the idle time (idle time:2) that occurred, but not extended working time variance which is 0 in this case. If no variance is found for an activity’s output quantity (which can be achieved in REPCON by checking the actual and planned values of physical components associated with the activity), 0 extended working time variance can help the user reasonably infer that the activity’s actual production rate is equal to the planned rate. In the column of start predecessor variance, it can be seen that A01 has only one start predecessor, which is 00-start milestone at location L1 (00[L1]) having planned FS logic relationship plus 0 lag ([FS0]) with it. The start predecessor’s finish date variance is –1 (milestone’s finish date variance is equal to its start date variance) that according to the planned FS logic relationship  133  Figure 4.18 Schedule Variance Analysis Result with Using Option of ‘Variance=Active Schedule-Target Schedule” (Active Schedule: Progress Date of 31-Jan-07; Target Schedule: Original Base Schedule)  134  could affect A01’s actual start date. Here star ‘*’ indicates that the corresponding start predecessor is a critical activity in the target schedule. Also for A01 it is noted that its implicit predecessor variance is “No,0”. 
As discussed previously about the pseudo code for determining implicit predecessor variance, ‘0’ (Min(A01’s actual start date-00’s actual start date-lag)=Min(29DEC06-29DEC06-0)=0, the actual dates can be found in Appendix A.3) in this case means A01’s actual start date fully complied with its planned start predecessor logic relationship, thus ‘No’ is used to indicate that no implicit predecessor can be confirmed to exist for A01. For activity A02, it is observed that its finish date variance of 1 day can be attributed to its undesired start date variance of 2 days and favorable duration variance of –1 day, which can be further explained by the undesired idle time detected as 2 days and the shortened working time of 3 days. Start milestone (00[L1][FS0]:*-1), which actually finished one day ahead of schedule, is the only one explicit start predecessor that can affect A02’s actual start date, thus if it had complied with the planned explicit start logic relationship, A02 should have been able to start 1 day earlier than its planned start date, but unfortunately it didn’t. The result shown in the implicit predecessor variance is “Yes,3”, here ‘3’ (referring to Table 4.9, Min(A02’s actual start date-00’s actual start date-lag)=Min(03Jan07-29DEC06-0)=3, the actual dates can be seen in Appendix A.3 and non-working days, i.e. Saturdays and Sundays, are not counted in the calculation) means according to the planned start logic relationship A02 should have been able to start 3 days earlier than its actual start date, but it didn’t. Therefore, the existence of implicit predecessor(s) is suggested for A02 (as indicated by ‘Yes’). For the next activity A03-1 at location L1, it can be seen that fortunately it started and finished on time, and no idle time and extended working time variance is identified. Also, no implicit predecessor can be confirmed to exist and it complied with its planned start predecessor logic relationship as well. For the activity A03-1 at location L2, it has an undetermined (indicated by the bracket [ ]) unfavorable 6 days start date variance, which helps the user know that in the time window specified this activity should have started according to the target schedule, but actually it has not up to the progress date of the active schedule. From the start predecessor variance, it is observed that A03-1 at L2’s start date variance is dominated by its start predecessor A34.03[L2][SS0] which has a 6 day start date variance. For activity A03-1 at location L3,  135  it is observed that it finished 9 days earlier (finish date variance:-9) than its planned finish date, which can be attributed to its favorable start date variance of –8 days and –1 day duration variance that resulted from the undesired 1 day idle time that it experienced and favorable 2 days shortened actual working time. A03-1 at L3 has two start predecessors (34.03.03-1[L2][FS0] and 34.03[L3][SS0]) which should determine its actual start date, which respectively has 6 days finish date variance and –8 days start date variance. The result shown in the implicit predecessor variance is “NO, -11” (Min(A03-1@L3’s actual start date–A03-1@L2’s finish date–lag–1, A03-1@L3’s actual start date–A03@L3’s actual  start  date–lag)=Min(25JAN07-08FEB07-0-1,  25JAN07-25JAN07-0)=Min(-  11,0)=-11, please refer to the pseudo code for determining implicit predecessor variance in Table 4.9 and activity dates in Appendix A.3). 
This means that according to the planned start predecessor logic relationships A03-1 at L3 should have only been able to start 11 days later than its current actual start date, but obviously it didn’t. Therefore, it can be reasonably inferred that the activity’s planned start predecessor relationship has been breached (i.e. out of sequence) and implicit predecessors cannot be confirmed to exist (indicated by ‘NO’). The causal variable variances for the next two activities A03-2 at location L1 and A04 at location L1 shown in Figure 4.18 can also be interpreted as per the discussion for other activities. As emphasized several times before, after identifying the causal variable variances the user still needs to know what factors caused the unfavorable variances such as idle time, extended working time and what implicit predecessors occurred. User-defined experience-based causal models to be discussed in the next chapter can offer the user some help in finding the answers by taking advantage of their experience-based knowledge. As mentioned previously, an interim project schedule comprised of the logic relationships, durations, and expected start/finish dates for all unfinished activities as of the progress date can also be used as a target schedule. Figure 4.19 shows the possible comparison scenarios between different active and target schedules for the example project. Now assume that work has progressed on and the actual activity daily status between 31-Jan-2007 and 28-Feb-2007 is recorded (see Appendix A.5), and the schedule updated as of the new progress date (28-Feb-2007) again. The planned/actual dates in the updated schedule are given in Appendix A.6. It is observed that the project’s new planned  136  Legend: Target Schedule  Active Schedule 4  2 1  Base Schedule (Original)  5 3  6  Schedule on 28-Feb-2007 (Interim)  Schedule on 31-Jan-2007 (Interim)  Schedule on 27-Mar-2007 (Final)  Figure 4.19 Possible Comparison Scenarios Between Active and Target Schedule finish date (16-Mar-2007) is later than the project’s original planned finish date and the planned finish date in the interim schedule as of 31-Jan-2007 as well. As an active schedule, the schedule as of 28-Feb-2007 can be compared with both of them as the target schedules (line 2 and 3 in Figure 4.19) to do schedule variance analysis. With the project going on, more interim schedules could be created, which can be compared with the original or any other interim schedule, until the project is actually completed. For instance, for the example project, activity daily status after the progress date of 28-Feb2007 is recorded until the project’s actual finish date of 27-Mar-2007. Appendix A.7 shows the daily status recorded during that period and Appendix A.8 shows the actual dates for all activities. The final actual schedule can be compared with any earlier schedule to do schedule variance analysis (line 4, 5 and 6 in Figure 4.19). As an example, Figure 4.20 shows another schedule variance analysis result using the original base schedule as the target schedule and the schedule with progress date of 28-Feb-2007 as the active schedule (i.e. line 2 in Figure 4.19). This result will be used as an example in the next chapter to illustrate how to make use of user-defined experience-based causal models to find likely factors causing unfavorable idle time, extended working time and implicit predecessor variances. The schedule variance analysis results for other possible scenarios shown in Figure 4.19 are given in Appendix A.9 to A.12. 
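To recap the arithmetic that links the variance columns of Figures 4.18 and 4.20 (computed with the option 'Variance = Active Schedule - Target Schedule'), a minimal Python sketch follows; the function names are illustrative, and the A01 and A02 values are those discussed above:

def duration_variance(finish_date_variance, start_date_variance):
    # Finish Date = Start Date + Duration, hence:
    return finish_date_variance - start_date_variance

def extended_working_time(duration_var, idle_time):
    # Duration Variance = Extended Working Time + Idle Time, hence:
    return duration_var - idle_time

# A01: started 1 day early, finished 1 day late, 2 idle days recorded
dv_a01 = duration_variance(finish_date_variance=1, start_date_variance=-1)   # 2
wt_a01 = extended_working_time(dv_a01, idle_time=2)                          # 0, no extra work time
# A02: started 2 days late, finished 1 day late, 2 idle days recorded
dv_a02 = duration_variance(finish_date_variance=1, start_date_variance=2)    # -1
wt_a02 = extended_working_time(dv_a02, idle_time=2)                          # -3, 3 fewer working days
assert (dv_a01, wt_a01, dv_a02, wt_a02) == (2, 0, -1, -3)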
As to the interpretation of the results, please refer to the discussion related to Figure 4.18.

Figure 4.20 Schedule Variance Analysis Result Using the Option of 'Variance = Active Schedule - Target Schedule' (Active Schedule: Progress Date of 28-Feb-07; Target Schedule: Original Base Schedule)

4.8  Summary

In this chapter, structured causal diagrams for doing cost and schedule variance analysis are proposed, both of which, in general, have three layers: 1. a performance measure layer; 2. a quantitative causal models layer; and 3. an experience-based models layer. This is consistent with the decision support system literature reviewed for diagnosing construction cost performance (e.g. Roth and Hendrickson 1991, Abu-Hijleh and Ibbs 1993), in which the general agreement is that several sub-layers of quantitative causal models should be used to attribute the total performance deviation to sub-variances associated with different causal variables in the models. It is observed that this structure is generally applicable to any performance measure which can be expressed in quantitative terms and computed using either an explicit or an implicit function. Further, the structured diagnostic process proposed herein also features an experience-based models layer, which makes it possible to do a more thorough and comprehensive performance analysis in terms of finding likely actual causal factors that help further explain why variances occurred for those causal variables for which no further quantitative causal models are available. This can be regarded as a particular advantage of the diagnostic approach proposed in this thesis. Time performance is then selected as a representative performance measure to demonstrate the workability of the quantitative causal models layer through a complete implementation of the process. In terms of doing schedule variance analysis by using available quantitative causal models, compared with current commercially available scheduling tools, the prototype implemented in this research can not only identify finish date, start date and duration variances for activities of interest as the business scheduling applications do, but it can also automatically show the user an activity's start and finish predecessor variances which could affect the activity's actual start and finish dates. It is observed that in those scheduling applications the user can only go through the activity list manually to find these variances. Moreover, by taking advantage of the particular feature in the REPCON system of being able to record an activity's daily status in the As-built view, the implemented approach can further attribute the identified duration variance to idle time and/or extended working time variance, which offers the user deeper insight into why the duration variance occurred. The ability to infer if there exist implicit predecessors affecting the actual start date of activities of interest was also implemented in the approach, which is believed to be important for identifying relevant causal factors. Such a capability has not previously been discussed in the literature and does not exist in current commercial applications.

5  Experience-based Causal Models in Performance Variance Analysis

5.1  Chapter Overview

The primary focus of this chapter is on how use is made of experience-based knowledge to help explain causal variable variances.
Topics treated include: (a) a discussion of causal factors found to have an impact on time performance; (b) implementation details of Component 3 of the schema (Figure 5.1), responsible for constructing and saving experience-based causal models on the standard side of the diagnostic approach prototype; (c) implementation details of Component 4 (Figure 5.1), responsible for retrieving and, if necessary, refining experience-based causal models for a specific project and its activities; and (d) implementation details of Component 5 (Figure 5.1), responsible for searching out and reporting data evidence with the guidance of experience-based causal models.

Figure 5.1 Components Responsible for Making Use of Experience-based Causal Models (1. Integrated Information Platform; 2. Quantitative Causal Model Analysis, relating the primary performance measure to causal variables CV1 to CVn; 3. Standard Experience-based Causal Models; 4. Specific Experience-based Causal Models, relating causal factors CF1 to CFn to individual causal variables; 5. Search and Report Data Evidence. CV: causal variable; CF: causal factor)

Extensive use is made in this chapter of screen captures of the diagnostic approach as implemented in order to highlight specific features of the approach. The focus is on the capability created, not on the elegance or lack thereof of the user interface.

5.2  Causal Factors Affecting Construction Time Performance

Different from causal variables in quantitative causal models, causal factors in this thesis refer to variables or events expressed in the form of experience-based functions which convey causal meaning. These functions correspond to the experience-based causal models described in Chapter 3. As discussed in Chapter 2, the investigation of critical causal factors having an impact on different construction performance measures has attracted the interest of many researchers. Considerable effort has been exerted to identify these causal factors (e.g. Assaf et al. 1995, Akinsola et al. 1997, Odeh and Battaineh 2002, Aibinu and Odeyinka 2006, Iyer and Jha 2006) and to establish experience-based quantitative causal relationships between the performance measures of interest and the critical factors (e.g. Thomas and Yiakoumis 1987, Jaselskis and Ashley 1991, Kog et al. 1999, Zayed and Halpin 2005, Hanna and Gunduz 2005). For example, Table 5.1 lists some of the causal factors identified in the literature reviewed which were deemed to be important for, or could have a significant impact on, construction time performance. In the first column of Table 5.1 is a category for organizing the different causal factors. The categorization is based on the author's own interpretation of the literature. While in the research by others the format of a hierarchical structure is widely accepted, no consensus exists on category titles or on how many categories or hierarchical layers should be used to organize the causal factors identified. For instance, Chan et al. (2004b) used five categories, "Project management actions", "Project related factors", "External environment", "Project procedures" and "Human related factors", to organize the factors identified as affecting a project's overall success. Edieb (2007) used a three-layer hierarchical structure (category-subcategory-factors) to categorize different factors contributing to time and cost overruns.
While useful to have, the author does not believe that it is essential to have a consensus-based or widely accepted categorization schema for organizing the myriad possible causal factors that contribute to construction performance deviations (development of a standard categorization would be useful for conducting research on benchmarking, but for the individual firm an internal standard is sufficient to permit inter-project comparisons). As a result, no attempt has been made to argue for a specific categorization of factors in this thesis. The primary intent of categorizing factors herein is to have readers realize that the causal factors identified in the literature reviewed relate to several construction management functions. Thus, as emphasized several times previously, some form of integrated information platform that allows access to the data supporting these management functions is important and necessary for diagnosing construction performance.

Table 5.1 Some Identified Causal Factors Having Impact on Time Performance (rows grouped by category; each row gives Causal Factor: Example of Data Field Mapped onto Factor)

Physical working conditions
  Unforeseen ground condition: Physical view>PCBS>Location attribute values
  Congested working area: As built view>Daily site>Daily site data>Problems
  Restricted access: As built view>Daily site>Daily site data>Problems
  Unsafe working condition: As built view>Daily site>Daily site data>Problems
Weather
  High/low temperature: As built view>Daily site>Site environment data
  Heavy precipitation: As built view>Daily site>Site environment data
  High humidity: As built view>Daily site>Site environment data
Human resource
  Low labor skill: As built view>Daily site>Daily work force data
  Labor unavailability: As built view>Daily site>Daily work force data
  Low manager skill: As built view>Daily site>Daily work force data
  Manager unavailability: As built view>Daily site>Daily work force data
  Lack of experience: Organization view>Participant's attribute values
  High labor turnover rate: As built view>Daily site>Daily work force data
Material/Equipment management
  Material/Equipment shortage: As built view>Daily site>Daily site data>Problems
  Material/Equipment damaged: As built view>Daily site>Daily site data>Problems
  Delayed delivery: As built view>Daily site>Daily site data>Equip. status
  Material theft: As built view>Daily site>Daily site data>Problems
Changes
  Drawing change: Physical view>Drawing controls
  Scope change: Physical view>PCBS>component attribute values
  Construction method change: Process view>Activity list>M&RBS
  Material change: Process view>Activity list>Resource
Communication
  Lack of communication: Organization view>Participant>Evaluation/comm.
  Late approval of test samples: As built view>Records>Inspection reports
  Late approval of permits: As built view>Records>Permits/Certificates

Because the important factors identified in previous research by others are numerous and no consensus on their naming and importance has been achieved, the second column of Table 5.1 lists just some of the causal factors identified in the literature reviewed as important or critical for affecting time performance. As discussed in Chapter 2, possible reasons for the non-consensus observed include that the projects or construction trades studied had highly variable characteristics (e.g. the type of projects differed) and the data studied were collected for different types of activities.
Thus, in the diagnostic approach proposed, in order to define the properties of causal factors to help explain a specific project’s performance, the author believe that being able to explicitly express causal factors as a function of the project context and activity’s characteristics is a key desired feature. In the literature reviewed, there is very often no discussion on how to express what state of a causal factor will affect time performance – i.e. the focus is on identifying the factors themselves but not characterizing possible factor states. That actually resulted in another review finding that most identified causal factors lack clarity in their definition. Thus, another key feature pursued in the diagnostic approach for defining the properties of causal factors is to being able to express the threshold state of causal factors in terms of available data field values – i.e. if the value or some function of the causal variable exceeds a specific value or threshold state, performance will likely be affected. In the later sections detailed discussion on how to achieve these two desired features is presented. Shown in the last column of Table 5.1 are examples of the data fields in the REPCON system. They can be flexibly selected and mapped onto the corresponding causal factors to express their states or meaning. Figures 3.8 to 3.14 illustrated some of these data fields in the physical, process, organizational/contractual and as-built views in the REPCON system. It is emphasized that the mapping relations between data fields and causal factors is not necessarily one-to-one as illustrated in Table 5.1, i.e. more than one data field could be used to define or express one causal factor. For example, “scope change” as a causal factor can be confirmed by examining whether there is difference between the value of two data fields, the actual and the planned attribute value of physical component(s) associated with an activity of interest (e.g. actual vs. planned concrete volume.) Once a clear definition is given to a hypothesized causal factor by  144  using available data fields, consensus on the name is not that important. In reality, the user could name it in any way, since that will not affect if the factor is confirmed as a likely actual cause because supporting evidence is determined in a data search guided by the user-defined causal model. Nevertheless, for purposes of communication and developing a shared image, careful expression of a causal factor is important. As to a likely causal factor, it means that there is evidence in support of the causal factor contributing to the variance of interest. It is very hard, if not impossible, to prove definitively that it was an actual factor, especially when multiple factors are involved, hence the use of the term, ‘likely’, herein.  5.3  Standard Experience-based Causal Models  As previously described on the schema of the proposed diagnostic approach, section 3.6 in Chapter 3, in order to further explain the variances detected for the causal variables in the quantitative causal models for time performance (e.g. idle time, extended working time, and implicit predecessors, as discussed in Chapter 4), two components respectively responsible for capturing, saving and making use of a user’s experience-based diagnostic knowledge in the form of experience-based causal models were designed. 
The first component is a knowledge base in which various standard user-defined experience-based causal models can be constructed and saved (standard as it applies to the context here relates to causal models developed through some kind of consensus process developed within a firm such that the resulting models reflect the combined experience – expertise of a firm’s personnel). Discussion in this section is focused on this component, i.e. the third component illustrated in Figure 3.7 and highlighted in Figure 5.1.  5.3.1 Creating and Organizing Standard Experience-based Causal Models As shown in Figure 5.2, by clicking the ‘Causal Models’ option on the ‘Standards’ side of the REPCON system, the user can access the standard user-defined experience-based models component implemented in this research. The user interface of the component is shown in Figure 5.3. Here, ‘Standards’ side refers to the knowledge-management part of the REPCON system, in which various standard components or elements (e.g. Physical 145  Component Breakdown Structure-PCBS, Environment Breakdown Structure-EBS, etc.) can be defined as knowledge, and then which can be selectively applied by the user to a specific project according to the project’s characteristics or context. With the assistance of predefined standard components as knowledge, a project can be modeled in less time and analyzed efficiently.  Figure 5.2 Accessing the Standard Experience-based Causal Models Component  Figure 5.3 Standard Experience-based Causal Models Component  146  It is noted that on the left side of the component interface (Figure 5.3) there are five attributes for describing experience-based standard causal models, which are “Description”, “Performance”, “Project Type”, “Phase/Subphase”, and “Work Type” (the last four constitute meta data associated with each model). In this component, the user can create as many standard causal models as desired. The description attribute is the key data field, which means for each created causal model its description name is exclusive and must be different from others (for simplicity, here model descriptions simply refer to model number – in practice a more expressive description should be used, e.g. Bldg. site work duration variance). The other attribute values or meta data are used to help the diagnostic approach automatically screen the roster of causal models most appropriate to explain the causal variable variances for different activities of interest. Meta data values can be very coarse grained or fine grained. Use of the former maximizes generality of a causal model (e.g. applies to any project type, total project, and all work types) while use of the latter allows targeting of very specific types of work where relatively definitive knowledge exists as to relevant causal factors and related thresholds. Here we elaborate briefly on the choice and use of the meta data variables. Project type is thought to be important because the context of different types of projects can affect what causal factors should be considered and included in the corresponding standard models. For example, compared with high-rise building projects usually constructed in urban areas and surrounded by other existing buildings, highway road projects are likely less affected by the restriction on site access and facility use (e.g. use of crane), which as potential causes for unexpected time performance are often considered for high-rise building projects. 
However, the option exists to simply specify a choice of ‘general’, making this field project type independent. Phase/Subphase as another meta data field is also important since even for one type of project, activities in different phases/sub-phases can be affected by different causal factors. For example, for high-rise building projects, heavy rain as a potential causal factor is likely included in the causal models applicable to the activities in the construction phases of site preparation and excavation/shoring, but obviously it is meaningless to consider such a factor in the models for helping explain the time performance of activities in the phases of interior finishing or vertical transportation. Work type as another meta data further allows one to  147  consider specific work contexts. Hanna et al.’s work (1999a and 1999b), to some extent, has proved that for activities of different work types (e.g. mechanical and electrical construction) a causal factor’s (e.g. change order) impact on the performance of interest might be different. As to the work type attribute, the system was designed to accept any work type classification system, but use has been made of Masterformat herein because it involves a hierarchical classification of work, allowing users to specify either a general class of work or a specific sub class. One practical issue to consider is the willingness of users to assign attribute values to activities. For example, to benefit from work type, each activity on the project side must have work type associated with it. Not a particularly onerous task, but nevertheless not always done by practitioners. Details on one way of dealing with this issue and how to take advantage of the model attributes to filter standard causal models for use on the project side will be given later. Our own view is that fine-grained specification of meta data values should be avoided – it is best to seek as much generality as possible when formulating experience-based causal models. Overtime, additional models targeted at special situations can be easily added to the roster of causal models and automatically accessed as part of the reasoning process. By using the options under the menu item of ‘Standard Causal Models’ shown in Figure 5.4, the user can create a new experience-based standard causal model, delete or edit an existing one. Any name can be given to the ‘Description’ field except an already exiting one in the pop-up window (Figure 5.5).  Figure 5.4 Interface for Creating/Deleting/Editing Standard Experience-based Causal Models 148  Figure 5.5 Specifying Attributes for a Standard Experience-based Causal Model In the ‘Performance’ field (Figure 5.5), the user must select from the drop-down list a specific performance variance for which the causal model is applicable. Since time performance is the representative performance measure studied in this thesis, the variances of causal variables requiring further explanation by making use of experiencebased causal models include ‘working time variance’, ‘implicit predecessors’, and ‘idle time’, as shown in Figure 4.3. The structured model based diagnostic process of the diagnostic approach is designed to be applicable to other performance measures, such as safety, quality and the causal variables in cost performance related quantitative causal models as described in Chapter 4, thus in future implementation they can be included in the performance drop-down list although they are not explored in this thesis. 
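Before describing the remaining attribute fields, the overall model record and one plausible way its meta data could be used to screen models can be summarized in the following minimal Python sketch. The class and function names are assumptions made for illustration, and the screening rule shown is only indicative; the filtering actually used on the project side is described later.

from dataclasses import dataclass
from typing import List, Optional

@dataclass
class StandardCausalModel:
    description: str                  # unique key, e.g. "Bldg. site work duration variance"
    performance: str                  # e.g. "idle time", "working time variance", "implicit predecessors"
    project_type: str = "General"     # "General" makes the model project type independent
    phase: Optional[str] = None       # blank/None means any phase/subphase
    work_type: Optional[str] = None   # blank/None means any work type (e.g. any Masterformat class)

def applicable_models(models: List[StandardCausalModel], variance: str,
                      project_type: str, phase: str, work_type: str) -> List[StandardCausalModel]:
    """Return every model whose meta data matches the activity being diagnosed;
    coarse-grained values ('General', blank phase or work type) match everything."""
    selected = []
    for m in models:
        if m.performance != variance:
            continue
        if m.project_type not in ("General", project_type):
            continue
        if m.phase not in (None, phase):
            continue
        if m.work_type not in (None, work_type):
            continue
        selected.append(m)
    return selected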
The ‘Project Type’ field shown in Figure 5.5 can also be assigned a specific value by using the corresponding drop-down list. Thus, the corresponding defined causal models are only applicable to the project of the type specified. The available types of project to choose from can also be flexibly defined by the user through clicking the ‘Project Type’ menu option shown in Figure 5.3. The user interface for editing project types is shown in Figure 5.6. The project types defined in the window (e.g. General, High-rise building, Bridge, Highway road, Dam project) are automatically presented in the project type dropdown list. ‘General’ is a default project type that cannot be deleted or edited. The reason for having the type of ‘General’ is that some causal models could be largely independent of project type. Also, another advantage associated with having the ‘General’ project type is that it can improve the efficiency of establishing the knowledge base and make it possible to simultaneously use more than one causal model (General type along with 149  specific type model) to analyze a performance variance, i.e. causal factors in a ‘General’ type model can be used together with causal factors in a model for a specific type of project as hypotheses to diagnose the causal variable variance of activities of the project.  Figure 5.6 User Interface for Specifying Project Types Similar to the ‘Project Type’ attribute, when assigning values to the last two causal model attributes of ‘Phase/Subphase’ and ‘Work Type’ shown in Figure 5.5, the user is also able to select the most appropriate one from the corresponding drop-down list. The available options in the lists are, respectively, from the ‘Phases’ and ‘Work Type’ defined on the ‘Standards’ side of REPCON (see Figure 5.2). The user interfaces for creating standard construction phases and standard work types are respectively shown in Figure 5.7 (a) and (b) (the Phase/subphase interface was not implemented as part of this research, but work type was). It is noted that two and three layer hierarchical structures, respectively, can be used for modeling standard construction phase/subphase and work type. As stated previously, the Masterformat classification has been used for describing work type. Different from the previously discussed attributes, the ‘Phase/Subphase’ and ‘Work Type’ attributes are not must-fill-in fields for a newly created standard experiencebased causal model, i.e. they can be left blank. If this is the case, in essence it means that the corresponding experience-based standard causal model can be applied to any construction phase and/or work type.  150  (a)  (b) Figure 5.7 User Interfaces for Defining and Editing Standard Phases and Work Types After specifying and then confirming the attribute values for a causal model (Figure 5.5), the model is listed on the left side of the standard user-defined experience-based causal models component (as model 1, 2, etc. shown in Figure 5.3), and for a highlighted model the causal factors included in it are listed on the right side of the interface.  151  5.3.2 Defining Causal Factors Once a standard causal model is created, by using the factor related menu options shown in Figure 5.8, the user can Add/Delete/Edit causal factors thought appropriate for that model. As discussed in Chapter 3, user-defined experience-based causal models are assumed to be single layered, so the tree structure for organizing causal factors is as shown on the right side of Figure 5.4 or 5.8. 
The user can create as many causal factors as thought to be relevant in one causal model, and the only restriction is that each causal factor must have a name different from others in the model. The name of a causal factor and its definition is specific to a particular causal model. Thus, different causal models may include a causal factor of the same name, but the definition of the factor may differ considerably between models, reflecting in part different choices of work type, phase/subphase and project type.  Figure 5.8 Adding/Deleting/Editing Causal Factors in a Standard Causal Model The syntax used for expressing a causal factor takes the following form: IF [Function/Operator: Left Bracket: Data Fields: Right Bracket: Condition: Value 1: Value 2] THEN the evidence searched out can support that the causal factor is a likely contributor to the performance variance. Only the items in the brackets need to be 152  expressed explicitly by the user. The words ‘IF’ and ‘THEN’ part in fact are expressed implicitly. The functions/operators implemented to date include Max, Min, Num (to get the number of variables, e.g. RFIs, request for information), Sum, Average, +, -, *, /. The data fields can be selected from the ones corresponding to data collected in support of day-to-day management functions (see Figure 5.11). Left and Right brackets are used when more than one data field is involved. Condition choices include EQ (equal to), NE (not equal to), GT (greater than), GE (greater than or equal to), LT (less than), LE (less than or equal to), WR (within range value 1, value 2) and NW (not within range value 1 and value 2). Value 1 and Value 2 can be numeric, date, or Boolean values. It depends on the data fields selected and functions applied. The user interface for constructing or editing a causal factor (e.g. excess rain) is shown in Figure 5.9. In the ‘Name’ field on the upper left side of the window, the user can give a causal factor any name he or she prefers. As observed from Figure 5.9, there are three tabs, ‘Definition’, ‘Attribute Association’ and ‘Comments’ that can be used to describe the features/properties of a causal factor.  Figure 5.9 User Interface Showing Causal Factor Definition In the first tab of ‘Definition’ (Figure 5.9), a detailed definition of a causal factor’s state beyond which performance is believed to be affected, can be expressed in terms of the syntax just described. As to defining a factor’s threshold state in an explicit manner, in summary form three steps are involved, which are: 1. Select the desired data field(s); 2. Specify filter(s) for each selected data field in order to screen for the appropriate 153  data to examine; and 3. Apply functions/operators to the filtered data to see whether or not the outcome is beyond the specified condition value (i.e. factor threshold state). With using the ‘Add’ button in Figure 5.9 the user can access the relevant interface (Figure 5.10) to execute the aforementioned three steps. Data fields available to choose from are hierarchically organized under different views and listed in the causal factor data field selection window (Figure 5.11), which appears after the user clicks the ‘Select Data Field’ button in Figure 5.10. The type of input (number, ordinal, etc.) is indicated along with units for number fields, when appropriate. 
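For illustration, the syntax and the three-step definition process just described can be represented in a minimal Python sketch; the class and function names are assumptions rather than the prototype's API, and the filter is treated here as already resolved to a list of values (resolving a filter to concrete values is illustrated further below).

from dataclasses import dataclass
from typing import List, Optional

FUNCTIONS = {
    "Max": max, "Min": min, "Sum": sum, "Num": len,
    "Average": lambda values: sum(values) / len(values),
}

def condition_holds(condition, result, value1, value2=None):
    if condition == "EQ": return result == value1
    if condition == "NE": return result != value1
    if condition == "GT": return result > value1
    if condition == "GE": return result >= value1
    if condition == "LT": return result < value1
    if condition == "LE": return result <= value1
    if condition == "WR": return value1 <= result <= value2        # within range value 1, value 2
    if condition == "NW": return not (value1 <= result <= value2)  # not within range
    raise ValueError("unknown condition: " + condition)

@dataclass
class CausalFactor:
    name: str                 # e.g. "Excess rain"
    data_field: str           # e.g. "As-built view > Daily site > Precipitation"
    filters: List[str]        # e.g. ["Daily Site Date(s) EQ AWD"]
    function: str             # one of FUNCTIONS
    condition: str            # EQ, NE, GT, GE, LT, LE, WR or NW
    value1: float
    value2: Optional[float] = None

    def supported(self, filtered_values: List[float]) -> bool:
        """IF function(filtered data) satisfies the condition THEN the factor is
        reported as a likely contributor to the variance under analysis."""
        result = FUNCTIONS[self.function](filtered_values)
        return condition_holds(self.condition, result, self.value1, self.value2)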
Because of time limitations and the goal of the research being to provide proof of concept of the research ideas, but not to develop a complete business application, only data fields under the four important views of As-built, Organizational, Physical, Process are considered in this research (Figure 5.11(a)).  Figure 5.10 User Interface for Defining Causal Factor State  154  (a)  (b)  Figure 5.11 User Interface for Selecting Desired Data Field Data fields from other views (e.g. Cost, Risk, Change, Quality and Environmental) could be readily included as an extension of the work described herein. From the data field list, the user can choose any field desired to define a causal factor. For example, for the causal factor example of “Excess rain”, the user would most likely think the data field of precipitation under the as-built view->daily site category is the appropriate field to use (Figure 5.11(b)). Selecting this field will result in it being shown in the ‘Data Field’ cell in Figure 5.10. The second step involves specifying filter(s) to screen relevant data for the data field selected in step 1, which can be done by using the ‘Add’ button just beside the ‘Define Filter’ part in Figure 5.10 to access the relevant user interface (Figure 5.12). Depending on the data field selected, there are different filter keys, conditions and condition values/variables in the corresponding drop-down lists for use in constructing the desired filters. For example, “Excess rain” as a causal factor is a possible candidate to explain extended working time. For a given project, daily precipitation data is likely to have been recorded over the time period from the project’s actual start date to the current progress  155  Figure 5.12 User Interface for Defining Filters date (i.e. data date). But in order to try and explain working time variance for a specific activity, only precipitation data on the working dates of that activity would be useful. In this case, as shown in Figure 5.12 the user can use ‘Daily Site Date(s)’ as the filter key and then specify the condition of EQ (i.e. equal to) and condition value/variable of ‘Actual Working Dates (AWD)’ to construct the filter for the selected data field of precipitation. When applying this causal factor on the project side of the system to explain different activities with different working time variances, because their actual working dates are different, so obviously the daily precipitation data automatically extracted to examine for different activities would be different. Table 5.2 shows an illustrative example, in which Act1 and Act2 have different planned and actual daily status, and different extended working time variance (i.e. 2 days for Act1 and 1 day for Act2, which can be obtained by using the component described in Chapter 4). It can be seen that Act1’s actual working dates are on dates 3, 4, 5, 6, and 7, but for Act 2 its actual working dates are on dates 1, 2, 3 and 5. Thus, based on the filter as defined in Figure 5.12 the precipitation data extracted to be examined would be different, (15, 20, 15, 10, 0) for Act 1, and (0, 0, 15, 15) for Act 2. 156  Table 5.2 Extracting Precipitation Data for Different Activities Date1 0 Precipitation (mm) Act1 P Act2 Ss  Date2 0 Ps Oo  Date3 15 So Of  Date4 20 Of I  Date5 15 O F  Date6 10 O  Date7 0 F  (Daily activity status: s-start; o-ongoing; f-finish; P-postponed; I-idle; lowercase-planned activity daily status; uppercase-actual activity daily status.)  
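The filter mechanism just illustrated with Table 5.2 can be sketched as follows. The data layout (a dictionary of daily precipitation keyed by project date) and the function name are hypothetical; the point is simply that the filter 'Daily Site Date(s) EQ Actual Working Dates' restricts, for each activity, which recorded values are examined:

    # Daily precipitation (mm) keyed by project date, following Table 5.2 (hypothetical layout).
    precipitation = {1: 0, 2: 0, 3: 15, 4: 20, 5: 15, 6: 10, 7: 0}

    # Actual working dates of each activity, taken from the daily status records.
    actual_working_dates = {"Act1": [3, 4, 5, 6, 7], "Act2": [1, 2, 3, 5]}

    def filter_daily_data(daily_data, dates):
        """Filter 'Daily Site Date(s) EQ Actual Working Dates': keep only the
        values recorded on the given dates."""
        return [daily_data[d] for d in dates if d in daily_data]

    for activity, dates in actual_working_dates.items():
        print(activity, filter_daily_data(precipitation, dates))
    # Act1 [15, 20, 15, 10, 0]
    # Act2 [0, 0, 15, 15]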
It is noted that for the date type filter keys, the user can designate desired date(s) as a condition value/variable by using the +/- operator in the drop-down list beside the one for selecting the date variable and specifying the number of days to add or delete. The use of working dates or calendar dates as part of the filter definition can also be indicated (see Figures 5.12 and 5.13). For example, to explain implicit predecessors, the user might want to examine precipitation data on dates earlier than an activity’s actual start date. In this case, the filter could be constructed as shown in Figure 5.13, which means precipitation data on the calendar dates within the range (WR) of a time period of 3 days and 1 day earlier than an activity’s actual start date would be extracted. If this factor is applied on the project side to the activity Act.n shown in Table 5.3, the precipitation data on dates 3, 4 and 5 will be screened for examination. But if the working dates, not calendar dates option, was selected, the result would be different and precipitation data for dates 1, 2 and 3 would be extracted (here Saturdays and Sundays are assumed to be non-working days). Such flexibility in defining date related filters can help fulfill the second condition required to infer there is a reasonable likelihood of a causal relationship between two variables X and Y, i.e. cause (X) variation must occur before effect (Y) variation, which was already discussed in Chapter 3. Table 5.3 Example Showing Difference Between Working Dates and Calendar Dates WED Date1 Act. n  THU Date 2  FRI Date 3 Ps  SAT Date 4  SUN Date 5  MON Date 6 So  TUE Date 7 Oo  WED Date 8 Of  (Daily activity status: s-start; o-ongoing; f-finish; P-postponed; I-idle; lowercase-planned activity daily status; uppercase-actual activity daily status.)  157  Figure 5.13 Using +/- to Specify Condition Value for Filter Key in Date Type After finishing specifying filter(s) for a selected data field, the result is shown in the ‘Define Filter’ part shown in Figure 5.10 (e.g. Daily Site Date(s) EQ AWDW (Actual Working Dates plus the Working dates date option)). In the next step, the user can further choose and apply a function/operator (e.g. Max, Min, Average, Sum, etc.) from the function/operator drop-down list on the upper side of Figure 5.10 to the data filtered, and specify as well the corresponding condition value (at the bottom side of Figure 5.10) as a causal factor’s threshold state. Using again the “Excess rain” causal factor as an example, the construction expert/practitioner might believe that if the average rate of precipitation on actual working dates of an activity of interest is greater than 10 mm, then this factor could be asserted as a likely actual cause for the extended working time observed. The corresponding function selected and condition value specified are shown in Figure 5.10. After this step, the factor’s definition is then shown in the definition tab (Figure 5.9). Continuing the previous example shown in Table 5.2, once the factor is applied on the project side, for Act1 Average (precipitation)=Average (15, 20, 15, 10, 0)=12 GT 10, and for Act2 Average (precipitation)=Average (0, 0, 15, 15)=7.5 not GT 10, thus according to the user’s experience-based knowledge “Excess rain” as a causal factor could help to explain Act1’s 158  extended working time, but not Act2’s. 
Furthermore, it is observed that the user can also make use of left/right brackets both in the windows for specifying filters (Figure 5.12) and for defining the state of a causal factor (Figure 5.10). The primary function of brackets is to allow users to make use of more than one data field to define a causal factor’s state, and also make it possible to construct more complicated filters for the data field selected. For example, “Unclear design/instruction” as a factor could cause or contribute to extended working time as well. Based on the user’s experience this causal factor might be able to be expressed as “if the number of records, which are associated with an activity of interest and whose record type is RFI (request for information) or SI (site instruction) and whose record date is greater than the actual start date of the activity, is greater than 3”, then this factor (i.e. unclear design/instruction) could be asserted as a likely actual cause for the extended working time. For this example, the set relation (i.e. AND/OR) and left/right brackets can be used to construct the desired filters for the selected data field of records, as shown in Figure 5.14. As another example, change in scope could be another factor contributing to extended working time. By examining whether or not the actual attribute value(s) is different from the planned attribute value(s) of the physical components (PCBS) associated with an activity of interest, the presence of this causal factor could be determined. The definition of this type of factor involves more than one data field, as shown in Figure 5.15. It is noted that through the use of left/right brackets and operators, making use of more than one data field to define a causal factor’s threshold state is also possible in this diagnostic approach. The tab of ‘Attribute Association’ in Figure 5.9 can be used to specify the activity context to which a causal factor is applicable. This is related to one of the key desired features discussed earlier in this chapter for defining a causal factor. Use of this feature is optional, as it requires attribute values to be specified for each activity. Figure 5.16 (a) shows the tab which includes fifteen pre-defined activity attributes that can be used to describe the characteristics of an activity, and which in turn can assist with reasoning about the relevance of a causal factor to help explain performance. These attributes are categorized into three groups – ‘sensitive to’, ‘work type’ and ‘subject to’. On the causal modeling  159  Figure 5.14 Multiple Filters Defined for Data Field Selected  Figure 5.15 Using Multiple Data Fields to Define a Causal Factor  160  (a)  (b)  Figure 5.16 Define Factor’s Attribute Association and Activity’s Attribute Values side of the system, the user can specify condition and threshold value(s) for each attribute, and if this threshold is exceeded for an activity of interest, then the corresponding causal factor will be applied to examine relevant data for that activity. Herein the condition value must be between 0 and 1, corresponding to the scale used on the project side to express an activity’s sensitivity to these attributes (see Figure 5.16 (b)161  with 0 meaning has no sensitivity, 1 meaning direct impact). For the “Excess rain” causal factor example, the user might believe the attribute association causal factor test should only be applied to activities with an attribute value of ‘sensitive to precipitation’ greater than 0.5. 
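Such an attribute-association test can be viewed as a simple gate evaluated before a factor's definition is applied to an activity. A minimal sketch is given below; the attribute names and values are illustrative, and the fallback behaviour for activities with no assigned attribute values follows the treatment described later in this section:

    # Hypothetical activity attribute values on the project side (0 = not sensitive,
    # 1 = direct impact); A01 and A05 follow the examples used in this chapter.
    activity_attributes = {
        "A01": {"sensitive to precipitation": 0.7},
        "A05": {"sensitive to precipitation": 0.0},
    }

    def factor_applies(activity_id, attribute, condition_value):
        """Attribute-association gate: decide whether a causal factor should be
        examined for an activity. If no attribute value has been assigned, the
        screening is ignored and the factor is examined anyway."""
        attrs = activity_attributes.get(activity_id, {})
        if attribute not in attrs:
            return True
        return attrs[attribute] > condition_value

    # factor_applies("A01", "sensitive to precipitation", 0.5) -> True  (factor examined)
    # factor_applies("A05", "sensitive to precipitation", 0.5) -> False (reported as N/A)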
In this case, the user can select a desired condition from the drop-down list and then specify the condition value as 0.5 for the causal factor (Figure 5.16 (a)). For the activity example A01, assume that it corresponds to site clearing and has been assigned a ‘sensitive to precipitation’ attribute value of 0.7 (see Figure 5.16 (b)) which is greater than 0.5. Thus, if a causal model including the causal factor of “Excess rain” as discussed previously is invoked to help explain an extended working time variance associated with activity A01, the precipitation data on A01’s actual working dates will be extracted and checked. However, if the activity’s attribute of ‘sensitive to precipitation’ is assigned a lower value (i.e. less than 0.5), then the factor will not be applied to that activity. It should be noted that the attribute association feature assumes that the user is prepared to assign attribute values to each activity. If this is not done, i.e. no value is assigned to the attributes for each activity, then any screening of causal reasoning that is activity context dependent is simply ignored, i.e. all causal factors included in an applicable causal model will be automatically examined. In the last tab of ‘Comments’ in the ‘Causal Factor’ interface (Figure 5.9), as a seasoned practitioner the user can describe the meaning of the factor, which can help other users fully understand what this causal factor is and how it is supposed to work on the project side. Use of this feature is optional. For instance, Figure 5.17 shows the description for the “Excess rain” causal factor example.  Figure 5.17 Comments Tab in Causal Factor Window  162  In summary, in the component designed for defining and saving standard userdefined experience-based causal models (i.e. the knowledge base illustrated in Figure 3.7), the user can create as many causal models as he or she wants. The context to which each of the models is applicable can also be specified (i.e. applicable to what type of project, what construction phase/subphase, and what work type of activity), and the user is given great flexibility to clearly define a causal factor’s state by selecting data field(s) from different construction management views (illustrated by process line 2 in Figure 3.7), constructing corresponding filter(s), applying suitable function/operators to the data field(s) and specifying condition value(s) as threshold states. As to each causal factor included in a causal model, the user is also able to further specify the context to which it is applicable by taking advantage of the attribute values assigned to an activity. The whole process to capture a user’s (individual or firm) experience-based diagnostic knowledge is transparent and believed to be readily understood by construction practitioners. With the assistance of user encoded experience-based knowledge in the form of causal models, hypotheses for explaining an activity’s time performance can be readily established and used to search for supporting evidence from a project’s data base.  5.4  Use of User-defined Experience-based Causal Models for a Specific Project  After capturing and saving knowledge in the form of user-defined experience-based causal models, the capability to retrieve, refine if necessary, and make use of appropriate ones to explain performance variances for a specific project is central to application of the diagnostic approach. 
Thus, discussion in this section is focused on how user-defined causal models are transferred and refined on the project side of the diagnostic approach.  5.4.1 Retrieve and Modify User-defined Experience-based Causal Models In order to apply user-defined standard causal models to help explain construction performance, the user must first make sure that the relevant models are available on the project side. Here “relevant” means that the project context attribute (i.e. project type) of the standard causal models to be used should match the context of the specific project of interest. The corresponding component on the project side (i.e. the fourth component 163  illustrated in Figure 3.7 or Figure 5.1) containing the relevant experience-based causal models can be accessed by clicking the menu option of ‘Define Causal Models’ under the as-built project view as shown in Figure 5.18. The project experience-based causal models component interface is shown in Figure 5.19. Compared with the standard causal model component (Figure 5.3), it is noted that on the left side of this interface there is no project type attribute to describe the causal models. That is because all of the causal models listed are supposed to be applicable to the particular project type at hand (they should include both General models and models specifically formulated for a particular project type).  Figure 5.18 To Enter Into Project Experience-based Causal Models Component  Figure 5.19 Project Experience-based Causal Models Component  164  A feature of the diagnostic approach as designed and implemented is that user does not need to establish project causal models from scratch. Relevant causal models defined and stored on the standards side as experience-based knowledge can be automatically retrieved and copied over to the project side after the user specifies the type of the project being worked on using the interface shown in Figure 5.20, which can be accessed by using the menu option of ‘Define Causal Models’ in Figure 5.18. The choice of project types in the drop-down list is the same as those defined in Figure 5.6. As an illustrative example, assume at the standards side there exists six defined causal models (Figure 5.21), which are applicable to different types of projects. If the type of project the user is working on is high-rise building (i.e. the user selects High-rise building for the project type in Figure 5.20), the standard causal models retrieved and copied over to the project side will include Model 1, 2, 3 and 6 (Figure 5.19) since they have project type attribute value of High-rise building or General, but not models 4 and 5 which are only applicable to Bridge or Dam projects.  Figure 5.20 User Interface for Choosing Type of Project of Interest  Figure 5.21 Standard Causal Model Examples to Copy Over to Project Side As to the two mutually exclusive options in Figure 5.20, ‘Keep existing causal models at project side’ and ‘Delete existing causal models at project side’, they mean that 165  when copying standard causal models over to the project side, the user can decide whether or not to first delete already existing causal models in the project casual models component. During the analysis process, the user can optionally modify the causal models copied over from the standards side, or use them in unmodified form. For the latter case, the user can retrieve and copy the relevant standard models over to the project side again by selecting the option of ‘Delete existing causal models at project side’. 
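A minimal sketch of this retrieval step is given below. The model list and its project-type assignments are assumed purely for illustration (the example above states only that Models 1, 2, 3 and 6 are of type High-rise building or General, while Models 4 and 5 apply to Bridge or Dam projects), as is the delete_existing flag that stands in for the two options in Figure 5.20:

    # Standard causal models and their project-type attributes; the specific
    # assignments below are assumed for illustration only.
    standard_models = [
        {"name": "Model 1", "project_type": "High-rise building"},
        {"name": "Model 2", "project_type": "General"},
        {"name": "Model 3", "project_type": "General"},
        {"name": "Model 4", "project_type": "Bridge"},
        {"name": "Model 5", "project_type": "Dam project"},
        {"name": "Model 6", "project_type": "High-rise building"},
    ]

    def copy_standard_models(project_type, project_models, delete_existing):
        """Retrieve the relevant standard models to the project side: optionally
        clear the existing project-side models first, then copy every standard
        model whose project type is 'General' or matches the project at hand."""
        if delete_existing:   # 'Delete existing causal models at project side'
            project_models.clear()
        project_models.extend(m for m in standard_models
                              if m["project_type"] in ("General", project_type))
        return project_models

    # copy_standard_models("High-rise building", [], delete_existing=True)
    # -> Models 1, 2, 3 and 6, as in the example above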
Another scenario could be that during the course of construction, some new standard causal models are created on the standards side and the user wants to make use of them along with the already existing ones on the project side. For this case, the user would copy the relevant newly established standard causal models over to the project side by selecting the option of ‘Keep existing causal models at project side’. These two options give users more flexibility in retrieving standard causal models over to the project side, and allows them to flexibly modify models on the project side without worrying about being unable to use unmodified standard causal models again. Although the user is not allowed to construct totally new casual models on the project side from scratch, before using the relevant project casual models as hypotheses for explaining performance copied over from the standard knowledge base, the user is allowed to refine the causal factors in the models to suit the project context at hand (e.g. physical features, timing of work, prevailing economic conditions, etc.). By using the menu options shown in Figure 5.22, new causal factors can be added, and existing ones can be deleted/edited.  Figure 5.22 Menu Options to Add/Delete/Edit Causal Factors in a Project Causal Model  166  The justification for incorporating this feature in the diagnostic approach is two fold: (i) The user might not totally agree with the definition (i.e. data field(s) selected, filter(s) constructed, and/or threshold state specified) of the factors retrieved from the standards side. For instance, for the previously discussed “Excess rain” causal factor example, according to the user’s own experience, she or he might believe for the particular project at hand that if on any actual working date precipitation is greater than 5 mm, then this factor could actually cause extended working time for activities sensitive to precipitation. In this case, after accessing the relevant interfaces (Figure 5.23), the user can modify the factor on the project side as desired. These interfaces are the same as those used on the standards side. In this example, although the factor name might be the same (e.g. excess rain), the real meaning of the causal factors could be different from each other in terms of function used and/or threshold state specified. This reflects the reality that seasoned practitioners often have different subjective opinions or definitions for a particular causal factor, especially in the context of a particular project.  Figure 5.23 User Interfaces for Editing Causal Factors at Project Side 167  (ii) Constructs on the standards side of the system and made use of to define casual factors in standard experience-based models may be changed on the project side for the project at hand. In this case, corresponding modifications must be made on the project side to the relevant causal factors retrieved from the standard side. For example, in the REPCON system, the user can flexibly develop standard daily site problem codes and categories (Figure 5.24), which as one kind of knowledge could be used directly or with some modifications on the project side to record daily problem(s) encountered for different activities. Figure 5.25 (a) shows the problem codes and categories on the project side which can be quickly established with the assistance of the standard problem codes shown in Figure 5.24. 
It is noted that the problem code categories on the project side could be different from those defined on the standards side. If this is the case, problem code category related causal factors retrieved from the standard knowledge base must be modified in the project casual models component. Figures 5.25 (b) and (c) respectively show the daily site status for different activities and on each day how a problem code is recorded against each activity. What performance measure(s) (e.g. time, productivity, cost, etc.) at the level of the individual activity believed by site personnel to have been affected by a specific recorded problem code can also be indicated (Figure 5.25 (c)). Again, the interfaces shown in Figure 5.24 and 5.25 were not developed as part of this research work. However, they are shown here as they are useful in helping readers understand what problem code related casual factors mean, and how they work.  Figure 5.24 Interface for Developing Standard Daily Site Problem Codes  168  (a)  (b) (c)  Figure 5.25 Interfaces for Developing and Using Project Side Problem Codes For example, a causal factor defined on the standards side might be “on actual working dates of an activity if there are any problem codes recorded as having affected time performance and they are from the problem code category of site conditions, then 169  this factor could be a likely actual cause for that activity’s extended working time identified”. But it is noted that on the project side the category name used is 02 Site/work conditions (Figure 5.25 (a)), which is different from 02 Site conditions used on the standards side (Figure 5.24). Thus, the relevant filter defined for the causal factor (e.g. problem code category EQ 02 Site conditions) must be changed on the project side in order to correspond to the problem code category 02 Site/work conditions. In fact, when the user tries to retrieve relevant standard causal models to the project side, the diagnostic approach will automatically check whether the problem code categories used on the project side are different from those defined on the standards side. If yes, warning information (Figure 5.26) will appear, and the user then has to access the project casual models component to modify the relevant causal factors by using the interfaces shown in Figure 5.23.  Figure 5.26 Warning Information for Modifying Problem Code Categories As to how such a problem code causal factor works on the project side, here is an illustrative example. Assume that one activity has been diagnosed with extended working time. The user might think a weather related problem could have caused the extended working time, and thus wants to check actual working dates to determine if there are problem codes from the problem code category of weather and which are flagged as having affected time performance. If yes, then weather could be confirmed as a likely actual cause of the extended duration. To elaborate, Table 5.4 shows an example activity’s daily status and the problem codes recorded against it. If the causal factor problem code equals weather is applied to it, by browsing Table 5.4, it can be seen that the actual working dates for the activity include dates 3, 5, 6, and 7. On these dates, the problem codes recorded are (05.01(s), 01.01(t), 01.02(t), 04.03(c)). In the list, the problems codes recorded as having affected time performance include (01.01(t),  170  01.02(t)). And these two problem codes belong to the weather category numbered 01. 
Thus, with the supporting evidence found (i.e. the problem codes recorded), weather can be confirmed as a likely actual cause for the observed extended working time.

Table 5.4 How the Problem Code Causal Factor Example Works on the Project Side

Activity         Date 1    Date 2    Date 3     Date 4    Date 5    Date 6    Date 7
Daily status     s !P      o !P      f !S       !I        !O        !O        !F
Problem codes    02.01(t)  02.01(t)  05.01(s)   01.01(t)  01.01(t)  01.02(t)  04.03(c)
                 Site not  Site not  Unsafe     Rain      Rain      Snow      Design
                 ready     ready     practices                                changes

(Daily activity status: s-start; o-ongoing; f-finish; P-postponed; I-idle; lowercase-planned daily status; uppercase-actual daily status. ! indicates that one or more problem codes are recorded against the corresponding activity (see Figure 5.25(b)). (*) indicates that * performance is affected by the problem code recorded, e.g. t: time; s: safety; c: cost.)

Simply put, after retrieving causal models from the standards side (i.e. process line 4 in Figure 3.7) and modifying factors on the project side according to the specific context of the project at hand and/or the user's own experience-based diagnostic knowledge (process line 5 in Figure 3.7), hypotheses in the form of experience-based causal models on the project side are then ready to be applied to help explain the construction performance causal variable variances (process line 3 in Figure 3.7) detected in the component described in Chapter 4.

5.4.2 Apply Causal Models to Explain Construction Performance Variances

For explaining time performance, once the user has the schedule variance analysis results as shown in Figure 4.20 and has made sure that the user-defined causal models on the project side have been customized in line with the context of the project at hand and the user's own experience-based diagnostic knowledge, the causal model analysis process can be invoked to explore the likely actual causes for the time variances of extended working time, idle time, and implicit predecessors. By using the menu option of 'Causal Model Analysis' in Figure 5.27, the user can access an interface to determine what type(s) of variance, and with what amount of variance, to explain (Figure 5.28). For the implicit predecessor variance, no condition value is required because the test is simply yes or no. Obviously, if no implicit predecessor is identified, no relevant causal models should be applied to the corresponding activity.

Figure 5.27 Menu Option of Causal Model Analysis

Figure 5.28 User Interface for Selecting Variance(s) to Explain

For different activities with identified time variances satisfying the condition values specified in Figure 5.28, the diagnostic approach is able to automatically select the relevant causal models from the project side causal models component to help search for evidence in support of explaining the time variances. For example, if the schedule variance analysis results are like Figure 4.20 and four causal models are defined on the project side (Figure 5.19), a window showing the causal models identified for the different variances associated with different activities would appear, as shown in Figure 5.29. The first five columns in the window show identifier information for the activities having one
For each activity, what variances have been identified and their corresponding variance value are shown in the next two columns, the same as derived from the schedule variance analysis results (Figure 4.20). In the last column listed are the applicable causal models selected from the project side casual models component for explaining the variances identified.  Figure 5.29 Experience-based Causal Models to Explain Variances of Activities For example, activity A01 at location L1 has idle time of 2 days. Thus, model 2 suitable for explaining idle time (see Figure 5.19) is automatically retrieved and listed. Activity A03-1 at location L2 has an extended working time of 1 day. Thus, both model 1 and model 6 suitable for explaining extended working time variance are listed. At this stage, the user has the opportunity to determine which causal models should not be applied by using the selection boxes in the first column. For example, the user might think for some reason that model 6 should not be used at present (e.g. no consensus has  173  been achieved in the firm as to some threshold states defined for some factors in the model). Thus the user can deselect the model in Figure 5.29 and it will not be applied to help explain the corresponding working time variances. In the process of retrieving relevant causal models from the project side causal models component, the phase/subphase and work type model attributes discussed previously actually have been used to assist in screening the characteristics of the activities and automatically filtering the suitable casual models. In the REPCON system, the user can define construction phase with two depth levels (Phase, Sub Phase) and work type with three depth levels (00.00.00) (see Figure 5.7). For any activity, it might have phase and work type value at any of the allowable levels of depth. For an experiencebased causal model it has the same situation, i.e. a causal model can have phase and work type attribute value at any allowable depth level. In identifying the relevant causal models for an activity, checks must be made to make sure that only the causal model(s) having phase and work type attribute value at a higher level can be applied to the activities with the consistent phase and work type value at the same or lower level. It is possible that the user does not specify phase and/or work type for an activity or a causal model, i.e. the value of work type and phase for an activity or a causal model is nil or blank. From the perspective of the diagnostic approach, a causal model with nil values for phase and/or work type is assumed to be general enough to be applicable to all activities with different work types and different phases. Table 5.5 and 5.6 are included to help understand what was just discussed. Referring to Table 5.5: (1) if a causal model has work type value 00, it can be applied to any activity with a work type value of 00, 00.** or 00.**.**; (2) if a causal model’s work type value is 00.00, it can be applied to activities with work type value of 00.00 or 00.00.**; (3) if a causal model’s work type value is 00.00.00, it then can only be used to explain activities with work type value of 00.00.00; and (4) if a causal model has no work type value, obviously it can be applied to any activity with any work type value. The interpretation for Table 5.6 is similar to Table 5.5. Finally, it is observed that phase and work type as context information must be considered together when identifying relevant causal models, i.e. 
the implicit logic relation between them is "AND". Based on the foregoing discussion, the pseudo code used by the diagnostic approach to automatically identify the relevant causal models for each activity on the project side is given in Table 5.7. As an example, it can be seen that on the project side, causal models with nil values for phase/subphase and work type (Models 1, 2, 3 and 6 in Figure 5.22) can be automatically retrieved for activities with any phase and work type value (Figure 5.29).

Table 5.5 Work Type as Context Information to Identify Causal Model

                          Causal Model
Activity Work Type    00      00.00    00.00.00    Nil
00                    √                            √
00.00                 √       √                    √
00.00.00              √       √        √           √
Nil                                                 √

Table 5.6 Construction Phase as Context Information to Identify Causal Model

                          Causal Model
Activity Phase        Phase    Phase.Subphase    Nil
Phase                 √                          √
Phase.Subphase        √        √                 √
Nil                                              √

After determining which causal models to use, the user can trigger the search routine guided by the experience-based causal models to obtain the causal model analysis results (Figure 5.30). The left hand side of the interface shows information similar to that shown in Figure 5.29. Note that in the last column, headed causal model, model 6 is not there because it was not selected in Figure 5.29. On the right hand side of Figure 5.30, the result for each causal factor in the corresponding causal model (highlighted on the left hand side of the window) is shown. In the window, 'Yes' ahead of a causal factor means relevant data evidence has been searched out and the factor's actual state is beyond the threshold state defined; thus the factor can be asserted as a likely contributing cause for the corresponding variance detected. 'No' means that, guided by the user-defined causal factor definition, no supporting evidence was found in the project's data base (it is possible that practitioners miss collecting and/or inputting data), or that the data found does not yield a factor value beyond the factor's specified threshold state, and thus the factor cannot be asserted as a contributing cause of the detected variance. And 'N/A' means that a causal factor is not applicable to the activity highlighted on the left hand side of the window because the activity's attribute value(s) do not satisfy the causal factor's attribute value(s) specified to indicate in what context it is applicable (see Figure 5.16 and its relevant explanation).

Table 5.7 Pseudo Code for Finding Relevant Experience-based Causal Models

Do for each type of variance detected for each and every activity
  Find a list of causal models applicable to the type of variance of interest from the project side causal models component
  Do for each causal model in the list
    IF the causal model has a three-level work type attribute value (e.g. xx.yy.zz)
      IF the activity doesn't have one (i.e. nil), or has fewer than three levels (e.g. xx or xx.yy), or has a different three-level work type attribute value (e.g. xx.yy.ll)
      THEN remove the model from the list and go to examine the next one
      END IF
    ELSE IF the causal model has a two-level work type attribute value (e.g. xx.yy)
      IF the activity doesn't have one (i.e. nil), or has fewer than two levels (e.g. xx), or has a different two-level work type attribute value (e.g. xx.jj)
      THEN remove the model from the list and go to examine the next one
      END IF
    ELSE IF the causal model has a one-level work type attribute value (e.g. xx)
      IF the activity doesn't have one (i.e. nil) or has a different one-level work type attribute value (e.g. kk)
      THEN remove the model from the list and go to examine the next one
      END IF
    ELSE the causal model has no work type attribute value (i.e. nil)
      THEN keep the causal model in the list and go to examine the next one
    END IF
    (For the causal model(s) not having been removed from the list, the phase value should be further examined by using the following code)
    IF the causal model has a two-level phase attribute value (e.g. phase-x.subphase-y)
      IF the activity doesn't have one (i.e. nil), or has a one-level (e.g. phase-x) or different two-level phase attribute value (e.g. phase-x.subphase-l)
      THEN remove the model from the list and go to examine the next one
      END IF
    ELSE IF the causal model has a one-level phase attribute value (e.g. phase-x)
      IF the activity doesn't have a phase (nil) or has a different one-level phase attribute value (e.g. phase-k)
      THEN remove the model from the list and go to examine the next one
    ELSE the causal model has no phase attribute value (i.e. nil)
      THEN keep the causal model in the list and go to examine the next one
    END IF
(Note: after examining all the causal models in the list, the ones still remaining in the list will be shown in the causal model column of Figure 5.29. The pseudo code as implemented was thoroughly tested to ensure that the relevant causal models were selected.)

Figure 5.30 Causal Model Analysis Results

In summary, it can be seen that on the project side, with the assistance of the retrieved and refined user-defined experience-based causal models as hypotheses for explaining the variances detected, data evidence can be searched out that supports or denies the hypotheses (process line 6 in Figure 3.7).

5.4.3 Report Data Evidence

In order to view the evidence found with the guidance of the project side experience-based causal models, the user can access the reporting function (Figure 5.31) by using the 'Report' button in Figure 5.30. Note that the user can select different report content profiles accommodating different preferences/needs for information. As to defining a report content profile, as seen in Figure 5.32, by using selection boxes the user can decide what information about activities, variances, and causal models/factors to include.

Figure 5.31 Selection of Report Content Profile to Report Results

Figure 5.32 Defining a Report Content Profile

Also, as observed in Figure 5.32, given any problem code(s) found as supporting evidence to help explain the variances of an activity, the diagnostic approach allows the user to view associated evidence (i.e. drawings and daily records) of the problem code(s) recorded against the activity. In other words, the diagnostic approach is able to automatically search out relevant records and/or drawings as supportive evidence for the problem codes recorded on a daily basis, provided the user has associated records with the problem codes of an activity. As said previously, the ability to make association relationships between different constructs in different views is supported by the REPCON system (e.g. records can be associated with project participants, activities, PCBS components, problem codes, etc.; see Figure 3.13). The association relationship between any record or drawing and these constructs is very likely not to be a one-to-one relation, and the same problem code might be assigned to several activities.
For an activity that takes place at multiple physical locations, it is possible that a project record (e.g. photo) is only associated with a specific location of the activity. The pseudo code for finding associated evidence is given in Table 5.8. Detailed examples as to identifying associated evidence are given in a hard copy report example that follows. Table 5.8 Pseudo Code for Finding Associated Evidence of Recorded Problem Codes Do for each problem code (PCi) found as supporting evidence Do for each record (Rj) in the As-built view>Records IF Rj has the selected record type (see Figure 5.32) AND Rj is associated with the problem code PCi AND the activity under examination (An) THEN IF the activity An is a multi-location activity THEN IF Rj is associated with the PCBS location where the activity An is THEN list record Rj as an associated evidence for the problem code PCi ELSE don’t list record Rj as associated evidence END IF ELSE list record Rj as associated evidence for the problem code PCi ENDIF ELSE don’t list record Rj as associated evidence for the problem code PCi ENDIF Note: given the option of drawings is selected in Figure 5.31, in the above pseudo code using drawing (DWj) instead of record (Rj) can help find associated drawing evidence.  179  After selecting a report content profile in Figure 5.31, a causal model analysis results report can be generated. Before showing and discussing the contents of a hard copy report, which is based on the schedule variance analysis results (active project with progress date of 28Feb07 and target project with progress date of 28Dec06) for the example project presented in the previous chapter and the three project causal models selected as relevant (Models 1, 2 and 3 shown in Figures 5.22 and 5.29), the detailed definitions of the casual factor examples in the models are given in Table 5.9. Due to time limitations, the number of factors constructed and fully tested in this research is limited. However, the diagnostic approach itself has no limitation on the number of factors that can be implemented. The properties of a causal factor shown in the table can be transparently, flexibly and easily expressed or recorded by using the tabs of ‘Definition’, ‘Attribute Association’ and ‘Comments’ previously discussed (see Figures 5.9 to 5.17). Table 5.9 Causal Factor Examples for Explaining Time Variances Model Causal Description Factor Name  Factor’s Literal Definition  Excess rain  If the average of precipitation data on actual working dates of an activity of interest is greater than 5 mm, then this factor could be asserted as a likely cause for extended working time. Unclear design/ If the number of records, which are associated with instruction an activity of interest and whose record type is RFI (request for information) or SI (site instruction), is Model 1 greater than or equal to 3 and whose record date is (Working time greater than the actual start date of the activity, then variance) this factor could be asserted as a likely cause for working time variance. Change in If an actual attribute value(s) is different from a PCBS planned attribute value(s) of the PCBS components associated with the activity of interest, then this factor (i.e. scope change) could be asserted as a likely cause for working time variance. Excess rain If on the idle dates of an activity of interest, any precipitation data recorded is greater than 20 mm, then this factor could be asserted as a likely factor Model 2 causing idle time. 
Problem codes If on the idle dates of an activity of interest, there is (Idle time) any problem code(s) recorded as having affected time performance, then this factor could be used to help explain why idle time occurred for the activity.  Attribute Association Sensitive to precipitation GT 0.4  Sensitive to precipitation GT 0.6  Cont’d  180  Model Causal Description Factor Name  Factor’s Literal Definition  Access problem If on the dates between 'the date of actual start dateimplicit variance days' and 'the date of actual start date -1 day(s)' the status of access to site is recorded as poor, then this factor could be asserted as a likely implicit predecessor. Model 3 Problem codes If on the dates between 'the date of actual start date(Implicit (utility) implicit variance days' and 'the date of actual start predecessors) date -1 day(s)', there is any problem code(s) recorded as having affected time performance and it belongs to the utility problem code category, then a utility problem could be asserted as a likely implicit predecessor.  Attribute Association Sensitive to work face access restriction GE 0.5  It is observed that in different models, a causal factor might have the same description or name (e.g. ‘excess rain’ in Model 1 and 2 in Table 5.9), but their detailed definition or threshold state can be different for different variances identified or phase and/or work type. (Preferably, however, a different name should be used to avoid confusion. For example, for model 2 in Table 5.9, the ‘excess rain’ casual factor might be better named as ‘idle day excess rain’). Compared with the literature review finding that very few researchers discussed states of causal factors identified and no consensus was found as to the precise meaning of the factors identified, it is believed that being able to explicitly express a causal factor’s state is a real merit of the diagnostic approach and very important for the meaningful diagnosis of construction performance. The author is comfortable saying that clearly expressing a factor’s state is more important than giving an absolutely correct name to a causal factor. For example, in practice, data relevant to the causal factor of ‘unclear design/instruction’ is not collected, although it was identified as an important factor in many of the papers reviewed. Because as part of the diagnostic approach developed herein it is feasible to access various data supporting different construction management functions, it is possible to use surrogate measures (e.g. number of RFI and SI records, or number of revisions to a drawing or set of drawings) to indicate if the problem of unclear design occurred. This may be regarded as an additional strength of the diagnostic approach. The hard copy report example for the small example project presented in Chapter 4 is shown in Figure 5.33 and the full report is given Appendix B. A number of observations regarding interpretation of the contents of the full report  181  are offered here to help the reader understand the breadth of information that the diagnostic approach can produce. Activity A01 at location L1 was identified as having an idle time variance of 2 days, and model 2 was used to explain the variance. On its idle dates (02Jan07 and 03Jan07), the precipitation data searched out is 29 mm and 18 mm respectively. As the first is obviously greater than the specified threshold state of 20 mm, the factor of “Excess rain” is asserted as a likely actual cause for A01’s idle time. 
For the causal factor of “Problem codes”, it is found that on A01’s idle dates, the problem codes recorded as having affected time performance include “1.1 heavy rain” and “9.2 shortage of utilities”. Therefore, they can also be asserted as likely or contributing causes for the idle time detected. It is interesting to note that for this activity, precipitation recorded as objective data on a daily basis corroborates the heavy rain problem code subjectively recorded. But this is not always the case. For the multi-location activity A03-1 at location L2, which was also identified as having idle time of 2 days, it is observed in the report that the problem code of “1.1 heavy rain” was recorded against it; thus it can be asserted to be a likely or contributing cause of the variance. But the precipitation values on its idle dates are 19 mm and 13 mm, and since neither is greater than (i.e. GT) the specified threshold state (20 mm), “Excess rain” as a causal factor is ruled out as a likely actual cause for the idle time variance identified. It seems that there is a contradiction in this example, i.e. the results from the two causal factors are not supportive of one another (an explanation for which might be that the specified experience-based or statistically-based threshold state is inappropriate for the activity for any number of reasons, including inadequate knowledge). Nevertheless, with the useful evidence searched out, the user is believed to be able to make the final judgment on the likely actual cause for the observed idle time with more confidence. Clearly, one must be careful in specifying threshold values. The state of knowledge is not such that they can be specified with great certainty; in fact, they are subject to considerable uncertainty. The question is whether one simply specifies a relatively loose bound (e.g. uses 10 mm of rain instead of 20 mm), or uses some technique (e.g. a fuzzy logic based rule) to deal with the uncertainty surrounding a bound. In essence, we have opted for the former: specify a crisp threshold, but one that reflects a zone of uncertainty. It is left to future work to explore other ways of specifying a bound, although it is noted that the more information required from the user, the less practical a schema becomes. The approach used mirrors current practice in other areas of construction management, e.g. specifying a definition of near-critical activities: an activity with 6 days of float might not really be less critical than one with only 5 days.

Figure 5.33 Causal Model Analysis Report Example

Again with respect to activity A01, another finding is that for the problem code “9.2 shortage of utilities” recorded on 02Jan07, the record LT-01 (i.e. a letter) was searched out as supportive associated evidence, helping to corroborate the validity of the problem code subjectively recorded. As another example, for activity A02 at location L1 it can be seen that on one of its idle dates the problem code of “2.7 unexpected geotechnical conditions” was recorded, and a photo (PH-03) was found as the problem code’s corroborative evidence due to its association with the activity and the problem code. There is no guarantee, however, that associated evidence can be found for every problem code recorded. It may simply not exist, or, even if it does, no association may have been made by system users. As an example, on another idle date of A02 the problem code of “5.7 inadequate instructions” was recorded, but no relevant associated evidence was found in the project records.
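The logic by which such associated evidence is located follows the pseudo code of Table 5.8 and can be sketched as below. The record structure shown is hypothetical and far simpler than the REPCON data model; it is intended only to show how association with the problem code, the activity and, for multi-location activities, the location determines what is reported:

    def associated_evidence(records, problem_code, activity, location=None):
        """Return the identifiers of records (or drawings) that corroborate a
        recorded problem code: a record qualifies if it is associated with both
        the problem code and the activity, and, for a multi-location activity,
        with the location where the activity took place."""
        evidence = []
        for r in records:
            if problem_code not in r["problem_codes"] or activity not in r["activities"]:
                continue
            if location is not None and location not in r["locations"]:
                continue   # e.g. a photo tied to a different location is not reported
            evidence.append(r["id"])
        return evidence

    # Hypothetical associations mirroring the examples above.
    records = [
        {"id": "LT-01", "problem_codes": ["9.2"], "activities": ["A01"], "locations": ["L1"]},
        {"id": "PH-03", "problem_codes": ["2.7"], "activities": ["A02"], "locations": ["L1"]},
    ]
    # associated_evidence(records, "9.2", "A01") -> ["LT-01"]
    # associated_evidence(records, "5.7", "A02") -> []  (no associated evidence found)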
As to a causal factor’s attribute association, it is noted in Table 5.9 that the casual factor of “Excess rain” in model 2 is sensitive to precipitation GT 0.6, which means if an activity’s attribute value of sensitive to precipitation is less than 0.6, then the casual factor should not be used to help explain the activity’s corresponding variance. For example, activity A05 at location L3, which in the example project is assumed to be an indoor mechanical activity, has a “sensitive to precipitation” attribute value of 0. Thus, in the report it can be seen that its test result is N/A, even if heavy precipitation occurred on its idle dates. The same outcome happened for activity A09 at location L2, whose attribute value of sensitive to precipitation is 0 as well. Thus, this characteristic of the diagnostic approach allows an activity’s context to be captured and taken advantage of to assist in reasoning about the likely actual causes for the variances detected. Activity A02 at location L1 was also identified as having one or more implicit predecessors, since given that all its explicit predecessor logic relationships were met, it should have been able to start 3 days earlier, but it did not. Model 3 in Table 5.9 is used to help explain an implicit predecessor variance. According to the causal factor of “Access problem”, it is found that before A02’s actual start date, the access to site on  184  01Jan07 was poor, which thus can be reasonably asserted as a likely actual implicit predecessor. But for another causal factor of “Problem codes (utility)” it can be seen in the report that on the 3 days before A02’s actual start date, no problem code from the category of utilities/city was recorded as having affected time performance. Thus, utility shortage or unavailability cannot be asserted as an implicit predecessor encountered by activity A02. For this example, the user might think that one day with poor access to site cannot fully explain the variance, so within the diagnostic approach she or he can easily and flexibly refine the experience-based model (e.g. propose more causal factors and/or modify the problem code relevant factor to include more categories) to search for other contributing causes for the variance. While probably self-evident, it must be realized that user-defined experience-based causal models will not be perfect, and thus diagnosing likely actual reasons for unexpected construction performance might be an iterative process, involving refinement of the causal models (effectively broadening or narrowing the search for supporting evidence in the project’s data base). For working time variance, Model 1 in Table 5.9 can be applied. For example, activity A03-1 at location L2 was detected as having a 1 day extended working time. Guided by the user-defined causal factor of “Excess rain”, the precipitation data on its working dates was examined as part of the search process. However, due to the average value of the precipitation data (Average(0,8,6,0,0,0,8)=3.14) being not greater than 5 mm, this factor cannot be asserted as a likely actual cause for the extended working time detected (as always, an average value as a test should be carefully used - a more sophisticated test would involve checking for a cluster of consecutive days that experienced heavy precipitation). As to another causal factor of “Change in PCBS”, it can be seen that the relevant data searched out (i.e. planned vs. 
actual attribute values at location L2 of the two PCBS components associated with A03-1, PR.02.01 and PR.02.03.) indicates that there is difference between planned and actual attribute values, thus this factor is asserted as a likely actual cause for A03-1’s extended working time. As to the last causal factor in Model 1, since no RFI or SI record was found to be associated with A03-1 at location L1, it cannot be regarded as another likely actual cause. It must be emphasized again that the diagnostic approach just aims to identify likely actual causes for the causal variable variances identified, but not to quantitatively allocate or apportion  185  the variances to the causes asserted. For example, activity A03-2 at location L1 was identified as having an extended working time variance of 8 days. Based on the relevant data searched out, all of the factors in Model 1 can be asserted as contributing to the variance. But it cannot be said that “Excess rain”, “Change in PCBS”, and “Unclear design/instruction” should be respectively responsible for x, y or z days of the 8 extended working variance. For the diagnostic approach as designed, if a causal model is refined on the project side based on experience gained (feedback line 7 in Figure 3.7), then if the user thinks that the refined project causal model is good enough for future reuse, it can be saved back to the standards side as knowledge by using the option of ‘Save as standard causal model’ in Figure 5.22 (feedback line 8 in Figure 3.7).  5.5  Summary  It can be seen from the discussion in this chapter that the diagnostic approach developed allows the user to flexibly construct standard experience-based causal models/factors as diagnostic knowledge and further have them automatically copied over to the project side where they can be further refined according to a specific project’s context to diagnose likely actual causes for different performance variances detected. Compared with the current state-of-the-art of commercial construction management application software which does not offer any help in reasoning about actual causal factors for performance variances, it is believed that the diagnostic approach presented herein could give practitioners useful insights as to why performance variances happened, thus advancing the state-of-the-art. In comparison with previous research by others which has focused on developing explanatory construction performance models (see Table 3.2), this approach offers significant improvements on diagnosing performance by exploiting both expert experience-based knowledge and a project’s data base. For example, compared with the research work conducted by Roth and Hendrickson (1991) and Abu-Hijleh and Ibbs (1993), which can only explore quantitative causal relationships or examine very few causal factors, the diagnostic approach presented here allows the user to formulate and examine an unlimited number of causal factors. In Yates (1993)’s approach, a knowledge  186  base for diagnosing causal factors was established, but the user is not allowed to flexibly modify the diagnostic related knowledge according to their own experience-based knowledge and the contexts of a specific project at hand, features which form an integral part of the approach conceived in this thesis. In Moselhi et al. 
(2004)’s system, although practitioners can flexibly propose an unlimited number of causal factors for cost and time variances detected, these factors are simply assumed to be actual causes and no data gathered in support of different construction management functions is searched out from the project data base as evidence to help corroborate that the factors selected are legitimate. In contrast, the diagnostic approach described herein is guided by flexibly defined causal models/factors to search out data evidence to help assert or deny likely actual causes, which is very important if users are to have confidence in accepting the analysis results. Finally, compared with Dissanayake et al. (2008)’s research involving AHP, fuzzy logic, and neural network techniques to diagnose causal factors, the holistic structured causal model based diagnostic process embedded in the diagnostic approach is transparent and believed to be easy to understand by construction practitioners, and thus more acceptable to them. All of these advantages associated with the diagnostic approach can be regarded as contributions to the body of construction management knowledge and eventually to the tool-kit of practitioners.

6 Conclusions

6.1 Chapter Overview

The primary focus of this chapter is on:
1. providing a brief summary of the causal modeling performance diagnostic approach as designed and implemented;
2. use of a representative and realistically-sized building project to i. demonstrate the capabilities and workability of the diagnostic approach as designed and implemented, and ii. assess the usefulness of the approach through an experiment with three groups of graduate construction students all with prior industry experience;
3. setting out contributions of the work including an assessment of how well the formulation of the diagnostic approach responds to the tests on generality, data integration, transparency, ease of use, and flexibility set out in Chapter 1; and
4. presenting suggestions for future work, some of which address limitations in the current approach.

6.2 Summary of the Research

It is not uncommon to see construction projects, irrespective of project type, scale and technical complexity, encounter time and cost overruns, and sometimes even have unsatisfactory quality and safety outcomes. Diagnosing construction performance in a timely manner not only helps practitioners grasp current construction performance status, but more importantly can help them pinpoint the likely actual causes for performance and thus a basis on which to further identify and carry out corrective actions to bring a project back on track. In academia, researchers have done a substantial amount of research work focused on identifying important general causal factors that affect performance and establishing construction performance models. However, most of the models have been developed from a predictive, not an explanatory, perspective. As a result, their ability to explain a specific project's performance deviation is limited. Further, and as demonstrated through the detailed literature review presented in Chapter 2, the limited number of explanatory models developed to date have not made use of practitioners' experience-based knowledge to search for evidence in the large amount of heterogeneous data collected in support of ongoing management functions to help explain actual performance.
Recognizing the potential room for improvement, the general objective of this research was to develop a practical causal modeling based diagnostic approach capable of making use of seasoned practitioners’ experience-based diagnostic knowledge expressed in the form of causal models to help identify likely actual causes along with supporting data evidence to explain unsatisfactory performance, while at the same time facilitating the capture of this knowledge for future reuse. The diagnostic approach is desired to be directly usable by practitioners, and applicable to all key project performance measures. For performance measures defined by quantitative causal relationships (e.g. cost and time performance) the diagnostic approach can first help users narrow the diagnostic focus down to causal variables with variances by using available quantitative causal models. Details and the implementation of the corresponding component were discussed in Chapter 4. As to performance measures (e.g. productivity) and causal variables with no quantitative causal models available to help further explain them, another two components were designed to deal with them by making use of user-defined experiencebased causal models. The first of these, the standard causal model component, relates to knowledge management for capturing and saving a practitioner’s diagnostic knowledge in the form of experience-based causal models applicable to different types of projects and different construction contexts. The second component, which is on the project side, is responsible for automatically filtering appropriate causal models from the standard causal model component according to the characteristics of the specific project at hand and activities of interest. The user is able to customize the models as deemed necessary. Finally, another component of the approach is responsible for using the causal models on the project side as hypotheses to search out evidence to help determine what hypothesized factors are the likely actual causes for the variances detected, and then reporting the results. Chapter 5 discussed the implementation work of these three components. It is believed that with the assistance of the diagnostic approach,  189  practitioners can know better, with supporting evidence at hand, what factors likely account for unexpected construction performance encountered in a specific case, which in turn could be very helpful for settling construction claims and improving performance of unfinished work.  6.3  Validation  Use is made of a mid-sized building project to achieve a number of objectives. They are to: •  demonstrate the workability of the diagnostic approach as designed and implemented;  •  demonstrate the kinds of performance insights that can be generated, especially given access to an integrated information environment; and  •  assess the degree to which the original research objectives have been met and the responsiveness of the diagnostic framework to the tests set out in Chapter 1.  The example was developed wholly at arm’s length from the author by the author’s supervisor as part of a problem-based learning module for a graduate course on construction planning and control. It was made available to the author after the standard experience-based causal models were formulated by the author. The example is meant to be representative of a large class of building projects – e.g. high-rise construction, and it is reflective of experience gained on a number of actual high-rise projects. 
The example portrays a realistic initial schedule and actual performance achieved and conditions encountered during the first 2 plus months of execution. Data captured during the execution phase reflects a philosophy of comprehensive and consistent reporting, especially of daily status – in this sense it represents more of an ideal of what one would like to see as opposed to what one typically encounters in practice. Analysis is made of data from two schedule updating cycles. In addition to providing a test bed for examining the workability of the diagnostic approach, the example was also used as a basis for an experiment for assessing the incremental value of the diagnostic approach as compared with conventional practice in terms of the thoroughness of identifying relevant causal factors and the time required to  190  diagnose performance problems. This experiment was designed and then carried out with the cooperation of some savvy construction management graduate students (i.e. senior level graduate students nearing completion of their program and who also have prior industry experience) and a seasoned industry practitioner who is also enrolled in the Master of Engineering program. Findings from the experiment are discussed in detail. Finally, the key concepts and details of the diagnostic approach were introduced to three seasoned industry personnel in order to obtain their views about the usefulness of the diagnostic approach and improvements or modifications seen as desirable from their practical perspective.  6.3.1  Apply the Diagnostic Approach to Upper Crust Manor Building Project  The example building project used to validate workability of the diagnostic approach is concerned with construction of a 6-story upper middle-class residential building named Upper Crust Manor. Only a very brief description of the project context is given here. A more extensive description may be found in the problem-based learning assignment from the graduate course CIVL 520 Construction Planning and Control (Russell, 2008). The ground floor houses an entrance with sitting area, 2 suites and an exercise room, floors 2 through 5 house 3 suites per floor, and the sixth floor houses two penthouse suites. The mechanical penthouse houses elevator equipment and ventilation equipment. The north side of the property is bounded by a lane which must be kept operational throughout the construction period. The south side of the property faces onto a tree-lined residential street. The city requires that existing trees be protected, and both vehicular and pedestrian traffic cannot be blocked. The west side of the property is bounded by another property which has another condominium project on it. The east side of the property is bounded by a sidewalk, then a boulevard and a street. Figures 6.1 shows the North-East and SouthWest Elevations of the project. The footprint of the project’s substructure is 127 feet by 108 feet. Foundation elements consist of spread footings. Figure 6.2 shows the foundation plan view, with the superstructure component of the project being bounded by gridlines 3-7 and B-F. The physical components of the project, e.g. physical locations, foundation system and  191  structural system which will be used for later discussion, are modeled in the REPCON system and shown in Figure 6.3.  
Figure 6.1 Elevations of Upper Crust Manor Project (North-East and South-West)

Figure 6.2 Upper Crust Manor Project's Foundation Plan View

Figure 6.3 Physical Components of Upper Crust Manor Project

It is anticipated that notice to proceed will be received in time for a 20 October 2003 start of fieldwork. The normal workweek will be Monday to Friday, 7:30 am to 4:00 pm. As move-in over the days of 01-03 September 2004 has been promised to the purchasers of the condo units, the project is to be completed no later than 31 August 2004, including receipt of the occupancy permit. Figure 6.4 shows the list of activity planning structures used to model the project.

Figure 6.4 List of Activity Planning Structures of Upper Crust Manor Project (Activities 01 through 13.02.05)

Figure 6.4 (Cont'd) List of Activity Planning Structures of Upper Crust Manor Project (Activities 14 through 44)

Many of these structures involve work at multiple locations (see Appendix C.1). The planned logic, dates, durations and total floats of the activities in the base schedule are given in Appendix C.1. Figure 6.5 shows the project's participants modeled in the organizational/contractual view of the REPCON system. On 20-Oct-03 the construction project received the notice to proceed and started on time.

Figure 6.5 Participants of Upper Crust Manor Project

In what follows, analysis is made of results from two updating cycles, representing progress as of 28 November 2003 and progress as of 31 December 2003. The actual daily statuses (i.e. postponed, started, on-going, idle or finished) for the project's activities up to the first progress date (28Nov03) are recorded and shown in Appendix C.2. After recomputing the schedule with the daily status information, the updated schedule is obtained (Appendix C.3). It is noted that the project's expected finish date has changed from the original finish date of 31-Aug-04 to 16-Sept-04.

First Schedule Variance Analysis (20Oct03 base schedule with 28Nov03 schedule)

In order to diagnose reasons for the schedule deviation, as discussed in previous chapters, the first step in the diagnostic approach as formulated and implemented is to conduct a schedule variance analysis using the base schedule as the target. The focus of the analysis is on the time window between the project's planned start date and the current progress date of 28-Nov-03. Figure 6.6 shows the schedule variance analysis results. It can be seen that up to the current progress date, activity '02 Mobilize & clear site' started and finished on time as planned, with the start milestone '01 Receipt of notice to proceed', its start predecessor, being received on the planned project start date. Activity '03 Bulk excavate substructure' immediately followed activity 02 and started on time, but it finished 10 days later than planned, which as indicated in Figure 6.6 can be attributed to its 7 days of extended working time and 3 idle days. Activity '04 Shotcrete shoring' has activity 03 as a start predecessor, with a start-to-start plus 2-day lag relationship. It should have started on time given that its explicit start predecessor, activity 03, actually started on time, but it did not, starting 2 days later than the planned start date. Thus, some implicit predecessor is likely present, causing the start date variance.
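For concreteness, the kind of quantitative decomposition just described (finish date variance traced back to start date variance plus duration variance, with the duration variance split into extended working time and idle time recovered from daily status records) can be sketched as follows. The sketch is illustrative only and does not reproduce REPCON's internal scheduling calculations; dates are expressed as working-day indices to sidestep calendar logic, and the 10-day planned duration assumed for activity 03 is hypothetical.

```python
from dataclasses import dataclass

# Illustrative sketch only; REPCON's actual scheduling calculations are not shown.
# Dates are integer working-day indices, so weekends and holidays are ignored.

@dataclass
class ActivityRecord:
    planned_start: int
    planned_finish: int
    actual_start: int
    daily_status: list          # one status per working day since the actual start,
                                # e.g. ["working", "working", "idle", "working", ...]

def decompose_schedule_variance(a: ActivityRecord) -> dict:
    """Split a completed activity's finish date variance into its components."""
    planned_duration = a.planned_finish - a.planned_start + 1
    start_variance = a.actual_start - a.planned_start            # late (or early) start
    idle_days = sum(1 for s in a.daily_status if s == "idle")
    working_days = sum(1 for s in a.daily_status if s == "working")
    extended_working = working_days - planned_duration           # extra working time
    duration_variance = extended_working + idle_days
    actual_finish = a.actual_start + len(a.daily_status) - 1
    finish_variance = actual_finish - a.planned_finish
    # By construction, finish_variance == start_variance + duration_variance.
    return {
        "start date variance": start_variance,
        "extended working time": extended_working,
        "idle time": idle_days,
        "duration variance": duration_variance,
        "finish date variance": finish_variance,
    }

# Activity 03-style case: on-time start, 7 days of extended working time, 3 idle days
# (the 10-day planned duration is made up for illustration).
act03 = ActivityRecord(planned_start=3, planned_finish=12, actual_start=3,
                       daily_status=["working"] * 17 + ["idle"] * 3)
print(decompose_schedule_variance(act03))
# {'start date variance': 0, 'extended working time': 7, 'idle time': 3,
#  'duration variance': 10, 'finish date variance': 10}
```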
Although activity 04 is still ongoing as of the progress date, it already shows 11 days of finish date variance, which can be attributed to its 2-day start date variance and 9 days of duration variance resulting from the 7 days of extended working time and 2 idle days it experienced. Activities '05 Excavate wall, core, column footings' and '051 Excavate crane footing' have already started, but both have 9 days of start date variance, which, as seen in Figure 6.6, can reasonably be attributed to their explicit start predecessor, activity 03, which has a finish-to-start logic relationship with them and actually finished 10 days later than its planned finish date. Up to the progress date, activities 05 and 051 have experienced 9 days of finish date variance, which can be fully attributed to their start date variances. The only difference between them is that activity 051 has been entirely completed while activity 05 is still ongoing, so the latter's duration variance, idle time, extended working time variance and finish date variance cannot be definitively determined (indicated by brackets [ ] around the variance values).

Figure 6.6 Schedule Variance Analysis Results for Upper Crust Manor Project (Schedule as of 28-Nov-03 with Base Schedule)

As to activity '44 Procure reinforcing trade at BIDP procurement sequence process step', it started on time but finished one day later than planned, which resulted from its extended working time of one day. The finish date variance then caused the start date variance (1 day) of its successor, '44 Procure reinforcing trade at TENDER procurement sequence process step'. Along with the 2 days of extended working time it experienced, activity 44 at the TENDER process step ended up with 3 days of finish date variance. This finish date variance in turn resulted in the start date variance (3 days) of activity '44 Procure reinforcing trade at EV/AC procurement sequence process step'. The 7 days of finish date variance of activity 44 at EV/AC can be attributed to its start date variance and the extended working time (4 days) it experienced. It can also be seen from the results (Figure 6.6) that the activities from '06.01 F/P/S perimeter wall footings' to '14.01 Erect crane' should have started by the progress date according to the original base schedule, but their starts have been delayed due to their explicit start predecessors' variances.

First Causal Model Analysis (20Oct03 base schedule with 28Nov03 schedule)

In order to further diagnose actual causes for the aforementioned schedule variances which cannot be explained by using quantitative causal models (e.g. activity 03's idle time and extended working time, and activity 04's implicit predecessor), user-defined experience-based causal models are used, as discussed before, to search out data collected as part of the control process and help explain the variances. The causal models defined on the standards side that are applicable either to projects in general or to high-rise building projects specifically are copied over to the Upper Crust Manor project side, as shown in Figure 6.7. Model 1-1 can be used to help explain working time variance for any activity belonging to the construction phase, i.e. the causal factors in it are general enough to be applicable to any construction activity. The same applies to Models 2 and 3, which are targeted at explaining idle time and implicit predecessor variance, respectively, for any activity in the construction phase.
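Before turning to the sub-phase-specific models, the automatic matching of applicable models to activities described above can be illustrated with a minimal sketch. The data structures and names below are hypothetical and much simpler than REPCON's model representation (for instance, the activity attribute thresholds listed in Table 6.1 are ignored here); they are meant only to show the selection logic by variance type, phase and sub-phase.

```python
from dataclasses import dataclass, field

@dataclass
class CausalModel:
    name: str
    explains: str                  # variance type the model addresses
    phase: str                     # e.g. "construction"
    sub_phases: set = field(default_factory=set)   # empty set means any sub-phase

@dataclass
class Activity:
    name: str
    phase: str
    sub_phase: str

def applicable_models(activity, variance_type, models):
    """Pick the project-side causal models that apply to this activity and variance."""
    return [m for m in models
            if m.explains == variance_type
            and m.phase == activity.phase
            and (not m.sub_phases or activity.sub_phase in m.sub_phases)]

models = [
    CausalModel("Model 1-1", "working time variance", "construction"),
    CausalModel("Model 1-2", "working time variance", "construction", {"excavation/shoring"}),
    CausalModel("Model 2",   "idle time",             "construction"),
    CausalModel("Model 3",   "implicit predecessor",  "construction"),
]

act03 = Activity("03 Bulk excavate substructure", "construction", "excavation/shoring")
print([m.name for m in applicable_models(act03, "working time variance", models)])
# -> ['Model 1-1', 'Model 1-2'], matching the pairing used for activity 03 below
```

The sub-phase-specific models defined for the Upper Crust Manor project are described next.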
As for models 1-2, 1-3, 1-4, and 1-5, they are respectively applicable to activities only belonging to  199  Figure 6.7 User-defined Causal Models for Upper Crust Manor Project the construction sub-phases of excavation/shoring, foundations/U/G utilities, substructure or superstructure. In this validation work, because of data availability and time limitations, attention is only focused on front-end site activities (i.e. activities in excavation, foundation and substructure construction sub-phases, but not procurement and superstructure activities). Thus, in Table 6.1 only detailed definitions of the causal factors in models 1-1, 1-2, 1-3, 1-4, 2 and 3 are given, but to some extent it still shows that experience-based causal models can be flexibly defined or customized by the user in terms of using different available data fields, setting different thresholds, using different descriptions to name factors and specifying different applicable activity attribute values. And, if necessary the user can flexibly add/delete any factor s/he prefers to the corresponding experience-based causal models based on their knowledge of the particular project context at hand. For several of the models as defined, it can be observed that extensive use is made of the problem code feature of the as-built view. With the schedule variance analysis results (Figure 6.6) and user-defined causal models (Figure 6.7) at hand, causal model analysis can then be conducted. The applicable appropriate models are automatically selected to help explain the corresponding variances encountered by different on-site activities (Figure 6.8). Partial causal model analysis results are shown in Figure 6.9. Detailed information in the causal model analysis report as to which hypothetical factors are confirmed as likely actual causes and the data evidence found is given in Appendix C.4.  200  Table 6.1 Causal Factor Definitions for Upper Crust Manor Project Model Description  Causal Factor Name  Change in scope  Poor communication Model 1-1 Working time variance  Unclear design /instruction  Miscellaneous problems  Excess precipitation  Inclement weather condition  Model 1-2 Working time variance  Poor ground/site condition  Insufficient equipment  Poor site access  Factor’s Literal Definition If any actual attribute value(s) is different from the planned attribute value(s) of the PCBS components associated with the activity of interest, then this factor could be confirmed as a likely actual cause for working time variance. If the communication evaluation rate for the participant responsible for the activity of interest is poor, then the factor could be confirmed as a likely cause for working time variance. If the number of records, which are associated with an activity of interest and whose record type is RFI or SI, is greater than or equal to 3 and whose record date is greater than the actual start date of the activity, then this factor could be confirmed as a likely actual cause for working time variance. If there are problem codes recorded on actual working dates, which are from the problem code categories of owner/consultants, design/drawings, schedule management, utilities/city, or miscellaneous, and which are recorded as having affected time or productivity performance, then this factor could be confirmed as a likely actual cause for working time variance. 
If the average value of precipitation data on actual working dates of an activity of interest is greater than 10 mm, then this factor could be confirmed as a likely actual cause for working time variance. If there are problem codes recorded on actual working dates, which are from the problem code category of weather condition and having affected time or productivity performance, then this factor could be confirmed as a likely actual cause for working time variance. If there are problem codes recorded on actual working dates, which are from the problem code category of site condition and recorded as having affected time or productivity performance, then this factor could be confirmed as a likely actual cause for working time variance. If there are problem codes recorded on actual working dates, which are from the problem code category of supplies and equipment and recorded as having affected time or productivity performance, then this factor could be confirmed as a likely actual cause for working time variance. If on actual working dates of an activity of interest the access to site recorded in daily site of the as-built view is poor, then this factor could be confirmed as a likely actual cause for working time variance.  Activity Attribute Association Sensitive to: design changes GT 0.2  Sensitive to: design complexity GE 0.4  Sensitive to: precipitation GT 0.5  Sensitive to: ground conditions GE 0.5  Sensitive to: equipment intensive GT 0.6 Sensitive to: work face access restrictions GT 0.4  Cont’d 201  Model Description  Causal Factor Name Excess precipitation Inclement weather condition Labor insufficiency  Model 1-3 Working time variance  High labor turnover rate  Work force problem  Poor ground condition  Excess precipitation Inclement weather condition Model 1-4 Working time variance  Labor insufficiency  Low labor skill  Work and Labor problem  Factor’s Literal Definition If the average of precipitation data on actual working dates of an activity of interest is greater than 15 mm, then this factor could be confirmed as a likely actual cause for extended working time. The definition is the same as ‘inclement weather condition’ in Model 1-2 If on actual working dates of an activity, any labor sufficiency information recorded at trade level in the asbuilt view for the participant responsible for the activity is ‘No’, then this factor could be confirmed as a likely actual cause for working time variance. If on actual working dates of an activity, any labor turnover rate information recorded at trade level in the as-built view for the participant responsible for the activity is ‘No’, then this factor could be confirmed as a likely actual cause for working time variance. If there are problem codes recorded on actual working dates, which are from the problem code category of work force and having affected time or productivity performance, then this factor could be confirmed as a likely actual cause for extended working time. If on actual working dates of an activity, any ground condition information recorded in the as-built view is ‘Poor’, then this factor could be confirmed as a likely actual cause for extended working time. If the average of precipitation data on actual working dates of an activity of interest is greater than 5 mm, then this factor could be confirmed as a likely actual cause for extended working time. 
The definition is the same as ‘Inclement weather condition’ in Model 1-2 The definition is the same as ‘Labor insufficiency’ in Model 1-3  If on actual working dates of an activity, any labor skill information recorded at trade level in the as-built view for the participant responsible for the activity is ‘low’, then this factor could be confirmed as a likely actual cause for extended working time. If there are problem codes recorded on actual working dates, which are from the problem code categories of work force or work and recorded as having affected time or productivity performance, then this factor could be confirmed as a likely actual cause for extended working time.  Activity Attribute Association Sensitive to: precipitation GT 0.5  Work type: labor intensive GT 0.4 Work type: labor intensive GT 0.6  Sensitive to: ground conditions GT 0.3 Sensitive to: precipitation GT 0.3  Work type: labor intensive GT 0.4 Work type: labor intensive GT 0.4  Cont’d  202  Model Description  Causal Factor Name Excess precipitation  Delay in response to RFI Model 2 Idle time  Broken Equipment  Problems on idle dates  Site access problem  Model 3 Implicit predecessor  Drawings unavailable  Excess precipitation  Problems on implicit predecessor dates  Factor’s Literal Definition If the precipitation data on an activity’s idle dates is greater than 30 mm, then this factor could be confirmed as a likely actual cause for idle time. If a RFI’s response date is later than its response required by date, which is associated with the activity of interest and whose record date is later than the activity’s actual start date, then this factor could be confirmed as a likely actual cause for idle time. If on an activity’s idle dates the recorded status in daily site of any equipment used by the activity is ‘Broken’, then this factor could be confirmed as a likely actual cause for idle time. If there are problem codes recorded on an activity’s idle dates, which are recorded as having affected time performance, then this factor could be confirmed as a likely actual cause for idle time. If on the implicit predecessor dates between 'the date of actual start date-implicit variance days' and 'the date of actual start date-1 day(s)', any site access information recorded in the daily site of as-built view is ‘poor’, then this factor can be confirmed as a likely actual cause for delayed start. If a drawing’s date received is between 'the date of actual start date-implicit variance days' and 'the date of actual start date -1 day(s)', which is associated with the activity of interest, then this factor could be confirmed as a likely actual cause for a delayed start. If the precipitation data on the dates between 'the date of actual start date-implicit variance days' and 'the date of actual start date -1 day(s)' is greater than 30 mm, then this factor could be confirmed as an actual cause for a delayed start. If there are problem codes recorded on the dates between 'the date of actual start date-implicit variance days' and 'the date of actual start date -1 day(s)', which are recorded as having affected time performance, then this factor could be confirmed as a likely actual cause for a delayed start.  
Activity Attribute Association Sensitive to: precipitation GE 0.2  Sensitive to: equipment intensive GT 0.3  Sensitive to: work face access restrictions GT 0.6 Sensitive to: design complexity GT 0.3 Sensitive to: precipitation GT 0.2  End  203  Figure 6.8 Causal Models Automatically Selected to Explain Schedule Variances  Figure 6.9 Sample of Causal Model Analysis Results for Upper Crust Manor Project From the report it is noted that activity ‘03 Bulk excavate substructure’ extended working time (7 days) can be explained by making use of Models 1-1 and 1-2 together, in which a total of nine hypothesized factors appear. As seen in Appendix C.4, “Change in scope” is identified as a likely actual cause because the actual volume of contaminated soil (15 m3) is found to be different from the planned volume (0 m3). “Excess precipitation” is also identified as a likely actual cause because the average value of the actual precipitation on the activity’s working dates is greater than 10 mm, which is  204  corroborated to some extent by data found for the causal factor “Inclement weather condition” which corresponds to the presence of problem codes recorded on actual working dates (e.g. on 30Oct03 the precipitation data is 22 mm, and the problem code of “1.1 Rain” was also found as having been recorded on the same day.) Daily site records associated with an activity of interest and recorded problem codes are also identified as part of the search process and then listed as supportive evidence as well (e.g. photos PH11, 12, 13, etc. and one letter LT-06 can be viewed as supportive evidence for the problem “1.1 Rain” recorded). “Poor ground/site condition” is confirmed as another likely actual factor contributing to the extended working time since some problem codes from the problem code category of site condition were recorded (e.g. “2.6 poor ground conditions”, “2.7 Unexp geotech conditions”, and “2.8 contaminated soil” that is also suggestive of the change in scope). Again, supportive daily site records for these problem codes are identified (e.g. PH-28 is identified for “2.6 poor ground conditions” which reflects the muddy conditions impacting construction performance). And, another actual problem encountered and identified by examining relevant problem codes recorded is “Insufficient equipment”. For activity 03’s extended working time, the other four causal factors in the user specified experience-based models are not identified as the likely actual causes. The reasons for this likely include: 1. no supportive data was found (e.g. for “Poor communication”, no communication assessment value was found for the activity’s responsibility T001 Excavation and shoring trade), 2. the data found did not help confirm the actual occurrence of the factor (e.g. “Poor site access”, on activity 03’s actual working dates the site access condition is found to be fair), or 3. the hypothesized causal factor is not applicable to the activity due to the threshold of the attribute values assigned (e.g. “Unclear design/instruction” is only applicable to an activity whose attribute value of “sensitive to design complexity” is greater than or equal to 0.4). It is observed that activity 03’s attribute value for sensitive to design complexity is 0 (see Appendix C.9), thus N/A - Not Applicable, as a test result is shown for the causal factor for activity 03 in the report. As for activity 03’s idle time (3 days) identified, Model 2 can be used to help identify the likely actual reasons. 
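The pattern behind the factor tests just described (confirmed with supporting evidence, not confirmed, or not applicable because an activity attribute threshold is not met) can be sketched for the "Excess precipitation" factor of Model 1-2 as below. The function and data layout are hypothetical; the thresholds mirror the Model 1-2 definition in Table 6.1, and apart from the 22 mm reading on 30Oct03 the precipitation values and the activity's sensitivity rating are invented for illustration.

```python
# Minimal sketch of a causal factor test, assuming simple in-memory project data.
def test_excess_precipitation(activity, project_data,
                              attr_name="sensitive to precipitation",
                              attr_threshold=0.5, rain_threshold_mm=10.0):
    # 1. Applicability: report N/A if the activity's attribute value does not
    #    exceed the threshold associated with the factor (cf. Table 6.1).
    if activity["attributes"].get(attr_name, 0.0) <= attr_threshold:
        return {"result": "N/A", "evidence": []}

    # 2. Evidence search: average precipitation over the activity's working dates.
    rain = [project_data["precipitation"][d] for d in activity["working_dates"]]
    avg_rain = sum(rain) / len(rain) if rain else 0.0
    if avg_rain > rain_threshold_mm:
        evidence = [(d, project_data["precipitation"][d])
                    for d in activity["working_dates"]
                    if project_data["precipitation"][d] > rain_threshold_mm]
        return {"result": "likely actual cause", "evidence": evidence}
    return {"result": "not confirmed", "evidence": []}

# Values other than the 22 mm recorded on 30Oct03 are invented for illustration.
activity_03 = {
    "attributes": {"sensitive to precipitation": 0.8},
    "working_dates": ["27Oct03", "30Oct03", "05Nov03"],
}
project_data = {"precipitation": {"27Oct03": 2.0, "30Oct03": 22.0, "05Nov03": 14.0}}
print(test_excess_precipitation(activity_03, project_data))
# {'result': 'likely actual cause', 'evidence': [('30Oct03', 22.0), ('05Nov03', 14.0)]}
```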
Returning to activity 03's idle time, it can be seen from the relevant report (Appendix C.4) that "Excess precipitation" is identified as a likely actual cause, since on one of the idle days (14Nov03) the precipitation level recorded (38 mm) is greater than the causal factor's user-specified threshold (30 mm). This is corroborated by another confirmed factor, "Problems on idle dates", which looks for problem codes recorded on an activity's idle dates as having affected time performance. It is noted that the precipitation on another idle date, 06Nov03, is 26 mm, less than the specified threshold of 30 mm. However, on the same day the problem code "1.1 Rain" is found, which might prompt the user to question the reasonableness of the specified threshold value. As to the remaining idle date (26Nov03), from the confirmed factor "Problems on idle dates" it can reasonably be said that unexpected geotechnical conditions are the likely actual cause. For the other two hypothesized causal factors in Model 2, "Delay in response to RFI" and "Broken equipment", no relevant supporting data was found, so they cannot be regarded as causes of activity 03's unexpected idle days. For ac