Open Collections

UBC Theses and Dissertations

Community-based, macro-level collision prediction models Lovegrove, Gordon Richard 2006

COMMUNITY-BASED, MACRO-LEVEL COLLISION PREDICTION MODELS

by

GORDON RICHARD LOVEGROVE

B.A.Sc., University of British Columbia, 1982
M.Eng., University of British Columbia, 1988
M.B.A., Simon Fraser University, 1993

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

THE FACULTY OF GRADUATE STUDIES (CIVIL ENGINEERING)

UNIVERSITY OF BRITISH COLUMBIA

December 2005

© Gordon Richard Lovegrove, 2005

ABSTRACT

The burden on communities due to the enormous economic and social costs associated with road collisions has been recognized worldwide as a major problem of epidemic proportions. Given the magnitude and persistence of the problem, spanning many decades, organizations worldwide have initiated engineering and research programs to improve road safety.

There are two main transportation engineering approaches to improving the safety performance of the road component: reactive and proactive. The reactive or traditional engineering approach has been to address road safety in reaction to existing collision histories. While it has proven to be very successful, road safety authorities and researchers are also pursuing more proactive engineering approaches. Rather than working reactively to improve the safety of existing facilities, the proactive engineering approach to road safety improvement focuses on predicting and improving the safety of planned facilities.

Reactive and proactive programs both rely heavily on reliable empirical techniques, including collision prediction models (CPMs). Reactive programs use micro-level collision prediction models, which focus on a single facility. Reliable micro-level CPM methods and techniques have been well researched and refined.
However, while micro-level CPMs successfully support the reactive engineering approach, several shortcomings have been identified in attempts by planners and engineers to use them in proactive road safety planning. Given these shortcomings of micro-level CPMs in planning-level (i.e. macro-level) road safety evaluations, there is a lack of reliable empirical tools to pursue road safety in a proactive manner. In view of this research gap, the main goal of this thesis was to develop macro-level CPMs, and to provide guidelines for their use by planners and engineers, so that road safety could be explicitly considered and reliably estimated in all stages of the road planning process. The approach taken included developing macro-level CPMs using extensive data extraction and Generalized Linear Regression Modeling (GLM) techniques, and then developing guidelines for use of those models based on several case studies of road safety planning applications.

This thesis describes the results of that research on the development and use of community-based, macro-level CPMs using data from 577 neighbourhoods or Traffic Analysis Zones (TAZs) across the Greater Vancouver Regional District in British Columbia, Canada. The models predict mean collision frequency based on associations with variables from one of four neighbourhood characteristic themes: exposure, socio-demographics (S-D), Transportation Demand Management (TDM), and network. A set of model use guidelines has also been proposed. To test whether the developed models and guidelines could be practical and relevant for practitioners, this research has also demonstrated the use of macro-level CPMs in several reactive and proactive case studies.

The development of models and model-use guidelines in this research, together with their application in several case studies, have been offered as contributions toward addressing the research gap that has limited the effectiveness of the proactive engineering approach in road safety improvement programs. It is believed that these tools will contribute significantly to improved safety planning decisions by community planners and engineers, significantly enhanced effectiveness in road safety improvement programs, and, ultimately, to long term social and economic benefits for all communities.

CONTENTS

ABSTRACT
CONTENTS
TABLES
FIGURES
ABBREVIATIONS
ACKNOWLEDGEMENTS
1. INTRODUCTION
1.1 Background
1.1.1 The Road Safety Problem
1.1.2 Road Safety Improvement Programs
1.1.3 Empirical Tools
1.2 Research Problem: Lack of Reliable Empirical Tools for Proactive Engineering Approach
1.2.1 Objective 1: Develop Models and Test Their Significance
1.2.2 Objective 2: Develop Guidelines for the Use of Macro-Level CPMs
1.2.3 Objective 3: Provide Case Studies as Examples of the Use of Macro-Level CPMs
1.3 Thesis Structure
2. LITERATURE REVIEW
2.1 Introduction
2.2 The Magnitude of the Road Safety Problem
2.2.1 Global Context
2.2.2 North American Context
2.2.3 Local Context
2.3 Reactive (Traditional) Road Safety Improvement Programs
2.3.1 Black Spot Programs
2.3.1.1 Statistical Techniques
2.3.1.2 Empirical Bayes Technique
2.3.1.3 Using Collision Prediction Models
2.3.1.4 Identification of Collision Prone Locations using CPMs
2.3.1.5 Ranking of Collision Prone Locations
2.3.2 Diagnosis of Black Spots
2.3.3 Remedy of Black Spots
2.3.4 Collision Prediction Model Development
2.3.4.1 Regression Techniques
2.3.4.2 Model Form
2.3.4.3 The GLM Process
2.3.4.4 Selection of Explanatory Variables
2.3.4.5 Goodness of Fit
2.3.4.6 Outlier Analysis
2.3.5 Issues with the Traditional Reactive RSIP Approach
2.4 Proactive Road Safety Improvement Programs
2.4.1 Road Safety Audits
2.4.2 Combining Micro-Level CPMs & Regional Transportation Planning Models
2.4.3 Dutch Sustainable Road Safety & Transportation Demand Management
2.4.4 Proactive Road Safety Planning Framework
2.5 Development of Macro-Level Collision Prediction Models
2.5.1 Macro-Level CPM Research
2.5.2 Issues Regarding Development & Use of Macro-Level CPMs
2.5.2.1 Availability and Quality of Data
2.5.2.2 Model Transferability across Time-Space Regions
2.5.2.3 Variables
2.5.2.4 Model Form
2.5.2.5 Aggregation Bias
2.5.2.6 Sensitivity Analyses
2.5.2.7 Macro-Level CPM Use in Black Spot Programs
2.5.2.8 Proactive Use in Regional & Neighbourhood Planning
2.5.3 Clues to Macro-Level CPM Variables
2.5.3.1 From Macro-Level CPMs & Other Models
2.5.3.2 From Macro-Level Land Use & Transportation Planning Studies
2.5.3.3 From Micro-Level Models
2.6 Summary
3. METHODOLOGY FOR DATA EXTRACTION & MODEL DEVELOPMENT
3.1 Introduction
3.2 Data Extraction
3.2.1 Geographic Scope
3.2.2 Aggregation
3.2.2.1 Aggregation Units
3.2.2.2 Boundary Effects
3.2.3 Variable Definitions
3.2.3.1 Stratification
3.2.3.2 Screening
3.2.4 Extraction Sources
3.2.4.1 Network Variables
3.2.4.2 Exposure Variables
3.2.4.3 Socio-Demographic Variables
3.2.4.4 Transportation Demand Management Variables
3.2.4.5 Collision Variables
3.3 Model Development
3.3.1 Groupings
3.3.2 Regression
3.3.3 Form
3.3.4 Goodness of Fit
3.3.4.1 Selection of Explanatory Variables
3.3.4.2 Model Refinement
3.4 Summary
4. MODEL DEVELOPMENT RESULTS
4.1 Introduction
4.2 Stratified Results
4.2.1 Modelled & Measured CPMs
4.2.2 Urban & Rural CPMs
4.2.3 Collision Types
4.3 Statistical Associations
4.3.1 Exposure Models (groups 1, 2, 3, 4)
4.3.2 Socio-Demographic Models (groups 5, 6, 7, 8)
4.3.3 Transportation Demand Management Models (groups 9, 10, 11, 12)
4.3.4 Network Models (groups 13, 14, 15, 16)
4.3.5 Integrated Models
4.4 Possible Causal Mechanisms
4.4.1 Exposure Models (groups 1, 2, 3, 4)
4.4.2 Socio-Demographics Models (groups 5, 6, 7, 8)
4.4.3 Transportation Demand Management Models (groups 9, 10, 11, 12)
4.4.4 Network Models (groups 13, 14, 15, 16)
4.5 Summary
5. GUIDELINES FOR MODEL USE
5.1 Introduction
5.2 Selecting the Appropriate Model(s)
5.3 Guidelines for Macro-Reactive Use
5.3.1 Black Spot Programs
5.3.1.1 Identification & Ranking
5.3.1.2 Diagnosis
5.3.1.3 Remedy
5.3.2 Collision Modification Factor Estimation
5.4 Guidelines for Proactive Use
5.4.1 Guidelines for Regional-Level Safety Planning
5.4.1.1 Zones of Influence
5.4.1.2 Data Extraction
5.4.1.3 Interpretation of Results
5.4.1.3.1 Highway Exclusion
5.4.1.3.2 Prediction Variance
5.4.2 Guidelines for Neighbourhood-Level Safety Planning
5.4.2.1 Zones of Influence
5.4.2.2 Data Extraction
5.4.2.3 Interpretation of Results
5.4.3 Transferability Guidelines
5.4.3.1 Data Definitions
5.4.3.2 Sufficient Data Points
5.4.3.3 GLM Process
5.4.3.4 Goodness-of-Fit
5.5 Summary
6. MACRO-REACTIVE SAFETY APPLICATIONS
6.1 Introduction
6.2 Black Spot Program Case Study
6.2.1 Approach
6.2.1.1 Identification & Ranking
6.2.1.2 Diagnosis & Remedy
6.2.2 Results
6.2.2.1 Identification & Ranking
6.2.2.2 Diagnosis & Remedy
6.2.2.2.1 Collision Prone Zones
6.2.2.2.2 Safer Zones
6.2.3 Lessons Learned
6.2.3.1 Simplicity
6.2.3.2 Relevance
6.2.3.3 Early Warning
6.2.3.4 Dual Indicator Diagnosis
6.2.3.5 Thematic Remedies
6.2.3.6 Ranking
6.2.3.7 CMFs
6.3 Macro-Level Collision Modification Factors
6.3.1 Approach
6.3.2 Results
6.3.3 Lessons Learned
6.3.3.1 Complexity
6.3.3.2 Individual versus Cumulative OR Calculations
6.3.3.3 Use of 'Measured' Models
6.3.3.4 Uniformity
6.3.3.5 Different Study & Zone Areas
6.3.3.6 Reasonableness of CMF Estimates
6.4 Summary
6.4.1 Black Spot Programs
6.4.2 Macro-CMFs
7. PROACTIVE SAFETY APPLICATIONS
7.1 Introduction
7.2 Regional Road Safety Planning Case Study
7.2.1 Background
7.2.2 Approach
7.2.3 Results
7.2.3.1 Urban Total Collisions
7.2.3.2 Urban Severe Collisions
7.2.3.3 Rural Total Collisions
7.2.3.4 Rural Severe Collisions
7.2.4 Lessons Learned
7.2.4.1 Influence of the VC Variable
7.2.4.2 Use of Individual or Multiple CPMs
7.2.4.3 Limitations from excluding Highways
7.2.4.4 Data Intensiveness
7.2.4.5 Thematic Mapping
7.2.4.6 Proof of Concept
7.3 Neighbourhood Road Safety Planning Case Studies
7.3.1 Background
7.3.1.1 Road Network
7.3.1.2 Core Size
7.3.2 Model Selection
7.3.3 Approach
7.3.3.1 Road Network
7.3.3.2 Core Size
7.3.4 Results
7.3.4.1 Road Network
7.3.4.2 Core Size
7.3.5 Lessons Learned
7.3.5.1 Selection of Models
7.3.5.2 Three-Way Intersections
7.3.5.3 Other (Non-Safety) Considerations
7.3.5.4 Control Variables versus Sensitivity Analyses
7.3.5.5 Interpretation of Results
7.3.5.6 Optimum Core Size
7.4 Transferability Case Study
7.4.1 Background
7.4.2 Approach
7.4.3 Results
7.4.4 Lessons Learned
7.4.4.1 Transferability of Macro-Level CPMs
7.4.4.2 Adequate Data Points
7.4.4.3 Goodness of Fit z Statistic
7.5 Summary
7.5.1 Regional Planning
7.5.2 Neighbourhood Planning
7.5.3 Transferability
8. CONCLUSIONS, CONTRIBUTIONS & FUTURE RESEARCH
8.1 Introduction
8.2 Summary & Conclusions
8.3 Research Contributions
8.3.1 Development of macro-level collision prediction models as improved and reliable empirical tools for use by planners and engineers in road safety planning
8.3.2 Development of Recommended Guidelines for the Use of Macro-Level Collision Prediction Models in Road Safety Planning, in ways that complement and enhance traditional road safety improvement programs
8.3.3 Demonstration of the Validity of Macro-Level Collision Prediction Models by Testing of Recommended Model-Use Guidelines in Macro-Reactive and Proactive Safety Applications
8.4 Recommendations for Future Research
8.4.1 Macro-Level Collision Prediction Model Development
8.4.2 Macro-Reactive Safety Applications
8.4.3 Proactive Safety Applications
REFERENCES
APPENDIX A: LIST OF POSSIBLE MACRO-LEVEL CPM VARIABLES
APPENDIX B: MODEL DEVELOPMENT GLIM4 OUTPUT SAMPLE
APPENDIX C: SAMPLE LISTING OF URBAN TOTAL CPM CPZ RANKINGS
APPENDIX D: TRANSFERABILITY GLIM4 OUTPUT SAMPLE

TABLES

Table 2.1 Road Collision Costs in British Columbia Using Three Estimating Methods
Table 2.2 Transportation Demand Management Strategies
Table 2.3 Calculating the Road Safety Risk Index
Table 2.4 Macro-level CPM Variables
Table 2.5 Other Macro-Level Study Variables
Table 2.6 Area-Wide Land Use & Transportation Planning Variables
Table 2.7 Micro-Level CPM Variables
Table 3.1 Candidate Variables - Collisions, Exposure, Socio-Demographic
Table 3.2 Candidate Variables - Transportation Demand Management, Network
Table 3.3 Model Groups
Table 4.1 Exposure CPMs - Total/Severe
Table 4.2 Socio-Demographic CPMs - Total/Severe
Table 4.3 Transportation Demand Management CPMs - Total/Severe
Table 4.4 Network CPMs - Total/Severe
Table 4.5 Other CPMs - AM, AM/PM, AM/PM Severe, Non-Rush, Pedestrians
Table 4.6 Integrated CPMs - AM, AM Severe, Bicycle, Non-Rush Collisions
Table 4.7 CPMs by Land Use & Extraction Source
Table 4.8 Averages (Ranges) of the Leading Exposure Variable Powers
Table 5.1 Checklist for Selecting the Appropriate CPMs
Table 5.2 Recommended Model Groups based on Planning Topic and Comparison
Table 5.3 Recommended Exposure Data Type depending on Time Horizon and Scope
Table 5.4 Data Extraction Effort for Planning Analyses, based on Theme & Time Horizon
Table 5.5 Candidate CPM Groups
Table 5.6 Regional Averages
Table 6.1 Urban CPZ & SZ Identification
Table 6.2 Rural CPZ & SZ Identification
Table 6.3 Traffic Calming CMF Estimates
Table 7.1 Three Year Plan Regional Exposure Impacts
Table 7.2 Differences in Collision Sums between Three Year Plan & Base Scenarios (2007)
Table 7.3 Access Road Network Safety
Table 7.4 Relative Comparison to 3-way Offset Network
Table 7.5 GVRD Data Points Available for CPM Transferability
Table 7.6 Transferability Results - GVRD-wide to Vancouver-only

FIGURES

Figure 2.1 Traffic deaths per billion vehicle kilometres traveled
Figure 2.2 United States Road Collision Fatality Rate History
Figure 2.3 BC Road Deaths, 1983 - 2002
Figure 2.4 Empirical Bayes Identification of Collision Prone Locations
Figure 2.5 Subjective Model Goodness of Fit Measures
Figure 2.6 Alternative Neighbourhood Road Patterns
Figure 2.7 Framework for Proactive Road Safety Planning
Figure 3.1 Greater Vancouver Regional District
Figure 3.2 GVRD Traffic Analysis Zones & Census Tracts (1996)
Figure 3.3 GVRD Travel Demand Density
Figure 3.4 GVRD Population Density
Figure 3.5 GVRD Job Density
Figure 3.6 GVRD Collision Densities
Figure 5.1 Conventional & Macro-Reactive Black Spot Methods
Figure 5.2 Impact Propagation
Figure 5.3 'Modeled' Macro-Level CPM Variables & Data Sources
Figure 5.4 Sample of Emme/2 Digital Road Links & Traffic Analysis Zones
Figure 5.5 VKT Impacts due to LRT, with highways included
Figure 5.6 Example of Data Assembly and CPM Application
Figure 6.1 Collision Prone Zones
Figure 6.2 Safer Zones
Figure 6.3 Urban CPZ6
Figure 6.4 Urban CPZ1
Figure 6.5 Rural CPZ10
Figure 6.6 Rural CPZ14
Figure 6.7 Urban SZ1
Figure 7.1 Three Year Plan Zonal Exposure Impacts
Figure 7.2 Urban 'Total' Collision Differences, by Zone
Figure 7.3 Neighbourhood Access Road Network Options

ABBREVIATIONS

AREA = Zonal area
BC = Province of British Columbia, Canada
Black Spot = Hazardous location
Cd = Collision Density
CMF = Collision Modification Factor
CPL = Collision Prone Location
CPZ = Collision Prone Zone
DRIVE = Zonal commuters who Drive
EB = Empirical Bayes
GLM = Generalized Linear Regression Modeling
INTD = Zonal Intersection Density
IALP = Percentage Zonal Intersections Arterial-Local
Macro-level = area-wide (e.g. neighbourhood)
Micro-level = single location (e.g. intersection)
NB = Negative Binomial
OCP = Official Community Plan
PCR = Potential Collision Reduction
RSIP = Road Safety Improvement Program
RSRI = Road Safety Risk Index
SCC = Zonal Shortcut Capacity (measured)
SCVC = Zonal Shortcut Capacity (modelled)
SD = Scaled Deviance
SIGD = Zonal Signal Density
SPD = Average Operating Speed
TAZ = Traffic Analysis Zone
TCM = Total zonal CoMmuters
TLKM = Total Lane Kilometres
VC = Volume to Capacity ratio (average congestion)
WKGD = Zonal jobs to zonal population ratio
ALKP = Percentage of Zone Lane Kilometres Arterial
BCA = Benefit/Cost Analysis
CCR = Collision Risk Ratio
CORE = Zonal Core area
CRP = Core to Zonal Area percentage
CPM = Collision Prediction Model
DRP = Percentage of Zonal commuters who Drive
FS = Average zonal family size
INT = Zonal Intersections
I3WP = Percentage of Zonal Intersections 3-Way
LLKP = Percentage of Zonal Lane Kilometres Local
Macro-Reactive = Reactive use of Macro-level CPMs
MoTH = BC Ministry of Transportation & Highways
NHD = Zonal Home Density
OR = Odds Ratio
POPD = Population Density
RSPF = Road Safety Planning Framework
SCP = Safety Conscious Planning
S-D = Socio-Demographic
SIG = Zonal Signals
SPF = Safety Performance Function
SRS = Sustainable Road Safety
TCD = Zonal Commuter Density
TDM = Transportation Demand Management
UNEMP = Zonal Unemployment
VKT = Vehicle Kilometres Travelled

ACKNOWLEDGMENTS

This research has only been possible with the gracious support of many people. First, my sincere thanks to colleagues at the University of British Columbia, Insurance Corporation of British Columbia, Greater Vancouver Regional District, and TransLink, for their data and for their trusted advice, without which the data for this research would not have been realized. To my supervisor and my committee, I give heartfelt thanks for their patience and guidance in seeing me through the thesis 'birthing' process.
I would especially like to thank Professor Emeritus Dr. Francis P.D. Navin, who inspired me to pursue a PhD, for his timely advice, experience, and 'soft' strategies to 'get it done'. And to Distinguished University Scholar Professor Dr. Tarek Sayed, my supervisor, my thanks for your masterful technical expertise, and strong direction in 'steering it home.' To my employer, the University of British Columbia, my thanks for allowing me to realize my dream. And especially to Mr. Geoff Atkins, my Associate Vice President, for his understanding, encouragement and support during the time it took for this research.

Finally, my thanks and my love to my family and friends for their continual support and encouragement. To my sisters and their families, my nieces and nephews, I thank you all for motivating me to be a good example. To Laura, thank you for your love and support despite the tough times we have faced together and apart. To mom and dad, thanks for always being there, to follow as living examples of love, commitment, and perseverance. Last but not least, I dedicate this thesis to Sarah, my unconditional cheerleader, for putting up with just a "D" for far too long.

1. INTRODUCTION

The topic of this thesis is the development and use of macro-level collision prediction models (CPMs) in reactive and proactive road safety improvement programs (RSIPs). This topic is motivated by the lack of reliable empirical tools for planners and engineers to do proactive road safety planning. The problem will be addressed by developing macro-level CPMs using extensive data extraction and multivariate Generalized Linear Regression Modelling (GLM) techniques, and then developing guidelines for use of those models based on several case studies involving proactive road safety applications.
The implications of successful model development and use guidelines include: new empirical tools for planners and engineers to do proactive road safety planning; significantly enhanced effectiveness in RSIPs; and long term social and economic benefits for all communities.

This introductory chapter consists of three main sections. Section 1.1 presents background information on the significance of the research problem. Section 1.2 describes the research problem and states the research objectives. Section 1.3 describes the structure of this thesis.

1.1 Background

Although many refer to road safety events in terms of accidents, the term collision will be used for clarity in this thesis, because road safety is no accident. The term accident suggests that collisions are totally random events with no way to predict or prevent them. Road collisions that injure people don't just happen without cause, nor are they purely random or accidental. Moreover, a significant proportion of the number of road collisions that occur at a given location in a given time period is systematic and can be predicted if reliable empirical tools are available (Kulmala, 1995). Development and use of improved empirical tools to reliably predict and prevent road collisions is the goal of this thesis. As a start toward that goal, this background section describes the magnitude of the road safety problem, summarizes engineering approaches to the problem, and cites empirical tools used in each engineering approach.

1.1.1 The Road Safety Problem

The enormous social and economic costs associated with persistent, unacceptably high road collision frequencies have been recognized worldwide as a major problem for many decades. Over 3,000 people die worldwide each day from road collision injuries, roughly 1.2 million annually (WHO, 2004). A further 20 to 50 million people suffer injury and/or disability.
Compared with other global health concerns, the magnitude of the toll in human lives lost due to road collisions is considered by many governments and health experts to be a problem of epidemic proportions (WHO, 2004; Gaspers, 2004). Injuries due to road collisions are the 11th leading cause of death worldwide. If current trends continue, injuries from road collisions will be the third largest global 'disease' by 2020 (WHO, 2004; Gaspers, 2004). The economic cost of road collisions and injuries is estimated to range from 1% to 2% of gross national product (GNP) worldwide, totalling US$518 billion annually (WHO, 2004).

The social and economic burden of road collisions in North America is also enormous. More pre-retirement years of life are lost to road collisions in the United States than to cancer and heart disease combined (USDOT, 2001; Waller, 2000). The Canadian Council of Motor Transport Administrators (CCMTA, 1998) reports that the average years of lost life in Canada due to a fatal road collision is 40 years, compared to respiratory disease (9 years), circulatory disease (10 years), or tumors (15 years). The estimated economic cost of road collisions in North America is nearly US$2 trillion per annum (Fricker & Whitford, 2004).

1.1.2 Road Safety Improvement Programs

Given the magnitude of the road safety problem and the observation that road collisions are not purely random events, many organizations worldwide have initiated RSIPs. These programs have identified many factors that contribute to collisions, related to failures in one or a combination of the three road system components: the driver, the vehicle, and the road (Sayed, Abdelwahab, & Navin, 1995). Transportation engineering programs focus on improving the safety performance of the road component of the road system.

There are two main transportation engineering program approaches to improving the safety performance of the road component: reactive and proactive.
The reactive or traditional engineering approach has been to address road safety in reaction to existing collision histories. This approach uses techniques such as empirical Bayes (EB), and micro-level (i.e. single location) CPMs to identify hazardous locations (i.e. black spots) characterized by abnormally high collision frequency and/or severity. After diagnosis to identify the associated collision problems, collision modification factors (CMFs) and benefit/cost (B/C) ratios are used to identify the most cost effective safety countermeasure. While Black Spot Programs are vital and have proven to be very successful, this reactive program approach requires that a significant collision history exist to identify black spots before any action is taken to address the previously undiscovered road safety problems. Moreover, retrofitting countermeasures at identified black spots in existing (i.e. built) communities is usually costly. To prevent black spots from occurring, and to reduce their associated social and economic burdens on society, road safety authorities and researchers are also pursuing more proactive engineering approaches. Rather than working to improve the safety of existing facilities, the proactive engineering approach to road safety improvement focuses on predicting and improving the safety of planned facilities. The goal of the proactive approach is to minimize the road safety risk by evaluating safety throughout each stage of the planning process, to preclude black spots from occurring. If road safety is explicitly addressed as one of the evaluation factors before a project is built, it reduces the number and cost of reactive safety countermeasures that have to be retrofitted into existing communities. Lower-cost RSIP strategies through proactive intervention may ultimately be a better, more sustainable road safety engineering approach than reactive strategies. 
However, the proactive road safety approach can only be effective if reliable empirical tools are available to support it.

1.1.3 Empirical Tools

While reliable empirical tools exist to support the reactive engineering approach (e.g. micro-level CPMs), there is a lack of available tools to pursue road safety in a proactive manner. Yet they are needed to enable planners and engineers to estimate the level of safety of planned projects, of design changes to those projects, and of other proposed safety improvements. Some researchers have tried with limited success to apply micro-level collision prediction models to target safety in planning (Ho et al., 1998; Lord et al., 2004). However, empirical limitations have been identified with this planning-level use of micro-level CPMs. The limitations relate to the fact that micro-level CPMs inherently predict the level of safety at a single location (e.g. intersection or road segment), where traffic volume levels are known or can be accurately estimated via relatively simple, short-term projections. Traffic forecasts for planning-level analyses, by contrast, are often produced by strategic-level transportation planning models calibrated only coarsely to regional screenlines and not to individual street or intersection counts. As such, planning-level traffic forecasts at any one location are usually inaccurate due to their regional screenline level of calibration, their longer-term timeframe (i.e. often twenty years into the future), and their much broader focus (i.e. major road networks only). In addition to errors in forecasts, incorporating micro-level CPMs into planning-level analyses is complex and requires a large amount of programming, which discourages their use by practitioners. Therefore, there is a research gap between what is needed and what is available in terms of reliable empirical tools to facilitate the proactive engineering approach to road safety improvement programs.
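
For reference, the countermeasure selection step of the reactive approach summarized in Section 1.1.2, in which CMFs and B/C ratios are used to pick the most cost-effective remedy, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not the procedure used in this thesis: the names `Countermeasure`, `benefit_cost_ratio` and `rank_by_bc` are hypothetical, the CMF is treated as a multiplier on expected collision frequency (so 0.8 means 20% fewer collisions), benefits are undiscounted, and all numbers are made up.

```python
from dataclasses import dataclass


@dataclass
class Countermeasure:
    name: str
    cmf: float   # assumed multiplier on expected collisions (0.8 = 20% fewer)
    cost: float  # assumed one-time implementation cost


def benefit_cost_ratio(expected_per_year, cm, cost_per_collision, years):
    """Undiscounted B/C ratio for one countermeasure at one location."""
    prevented_per_year = expected_per_year * (1.0 - cm.cmf)
    benefit = prevented_per_year * cost_per_collision * years
    return benefit / cm.cost


def rank_by_bc(expected_per_year, options, cost_per_collision, years):
    """Return the options sorted from highest to lowest B/C ratio."""
    return sorted(
        options,
        key=lambda cm: benefit_cost_ratio(
            expected_per_year, cm, cost_per_collision, years
        ),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical black spot with 10 expected collisions per year.
    options = [
        Countermeasure("new signal", cmf=0.5, cost=500_000.0),
        Countermeasure("improved signage", cmf=0.8, cost=100_000.0),
    ]
    for cm in rank_by_bc(10.0, options, 20_000.0, 5):
        ratio = benefit_cost_ratio(10.0, cm, 20_000.0, 5)
        print(f"{cm.name}: B/C = {ratio:.2f}")
```

A real evaluation would normally discount future benefits and distinguish collision severities; both are left out here to keep the arithmetic visible.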
1.2 Research Problem: Lack of Reliable Empirical Tools for Proactive Engineering Approach

To address the empirical tool gap related to the inability of micro-level CPMs to do proactive, planning-level safety evaluations, research is needed on macro-level CPMs. If successful, macro-level CPMs will allow engineers and planners to target safety in a proactive manner, and will lead to significant reductions in road collision frequencies below the levels achieved to date using reactive techniques. Moreover, proactive tools may be able to complement and enhance traditional reactive Black Spot program effectiveness.

Many researchers have recognized the potential of doing proactive, planning-level safety evaluations, and searched for the necessary supporting empirical tools. Dutch researchers in the mid-1990s searched for ways to develop empirical tools to do planning-level analyses, but ran into the problem of predicting future traffic volumes on 'inner roads' that were not part of the major road network usually included in planning models. A number of proxy variables were proposed in lieu of exposure data on these interior neighbourhood or inner roads (e.g. population, intersections, area) (Poppe, 1997). Although no macro-level CPMs have been developed, the Dutch Sustainable Road Safety (SRS) Program has produced guidelines on enhancing road safety through certain land use and transportation planning principles. Subsequently, researchers in Canada have developed a proactive Road Safety Planning Framework (RSPF) that lays further groundwork towards quantifying a planning-level predictive relationship (de Leur & Sayed, 2002, 2003). This has been followed in the United States by development of the Safety Conscious Planning (SCP) program (Herbel, 2004). A recent SCP conference identified the need for planning-level use of CPMs, or macro-level CPMs.
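
To make the idea of a planning-level model concrete, the sketch below shows how a macro-level CPM might turn zonal descriptors, such as the proxy variables mentioned above, into a predicted mean collision frequency. It assumes a power-exponential form that is common in GLM-based collision modelling; the form, the function name `predict_zonal_collisions`, the variable names and every coefficient are illustrative assumptions, not models or results taken from this section.

```python
import math


def predict_zonal_collisions(exposure, covariates, a0, a1, betas):
    """Predicted mean collisions for one zone.

    Assumed form: a0 * exposure**a1 * exp(sum(beta_j * x_j)),
    where exposure is a zonal measure such as vehicle kilometres
    travelled and x_j are other zonal descriptors.
    """
    linear = sum(betas[name] * value for name, value in covariates.items())
    return a0 * exposure ** a1 * math.exp(linear)


if __name__ == "__main__":
    # Hypothetical zone: all inputs and coefficients are made up.
    zone = {"intersection_density": 0.4, "population_density": 25.0}
    coefficients = {"intersection_density": 0.5, "population_density": 0.01}
    mean = predict_zonal_collisions(
        exposure=12_000.0, covariates=zone, a0=0.05, a1=0.7, betas=coefficients
    )
    print(f"predicted collisions over the study period: {mean:.1f}")
```

Because every input is a zonal aggregate, a model of this kind does not need a traffic forecast for each individual street or intersection, which is the property that makes the macro-level approach attractive for planning.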
Early research results on macro-level CPMs appear promising and suggest that they offer an improved empirical tool for proactive road safety improvement programs (Hadayeghi et al., 2003; Ladron de Guevara et al., 2004). However, the main obstacles remain: a lack of reliable macro-level CPMs, and a lack of guidelines for their use. While some early work has been done on development of a RSPF, including a Road Safety Risk Index (RSRI) empirical tool (de Leur & Sayed, 2002, 2003), further refinement is needed. The RSPF and RSRI need to be updated, including development of macro-level CPMs as empirical tools and guidelines for their use.

The main goal of this thesis is to develop macro-level collision prediction models, and to propose guidelines for their use by planners and engineers, so that road safety can be explicitly considered and reliably estimated in all stages of the road planning process. Three objectives are identified for this research.

1.2.1 Objective 1: Develop Models and Test Their Significance

An initial review of the road safety literature has noted two problems related to development of macro-level CPMs: a lack of data, and the lack of a clear methodology for their development. Three data-related barriers need to be overcome for macro-level CPM development. First, data may not be available in sufficient quantity to enable statistically robust model fitting. Second, the data may be in disparate and non-integrated databases due to multi-jurisdictional boundaries (e.g. municipal, police, insurance broker, road authority). Third, legal issues (e.g. protection of privacy, liability risk) discourage data sharing and central warehousing of road collision databases. The methodological problem is that only two studies have been found on macro-level CPM development, each with a differing methodology.
Moreover, these two studies have identified only a few variables that may or may not have value for planning-level safety evaluations. Relevant, practical, planning-level descriptors are needed in macro-level CPMs to enhance their effectiveness and facilitate their use by practitioners. These data and methodological problems will be approached in two ways. First, possible data sources and model variables will be identified and screened to ensure only the highest quality data and commonly-used, planning-related variables are used in model development. Second, the literature will be reviewed to identify best practices for model development and goodness-of-fit testing. These two approaches will help to ensure that the resulting models are relevant and reliable as empirical tools that planners and engineers can rely on in proactive safety planning.

1.2.2 Objective 2: Develop Guidelines for the Use of Macro-Level CPMs

Despite early studies on the forms of macro-level CPMs, no studies have been found on their application to proactive road safety planning. Practical guidelines for the use of macro-level CPMs are needed to promote consistency in proactive road safety planning. Consistent practices will then promote reliable and comparable estimates of the level of road safety. Guidelines need to be developed to facilitate the proper and consistent use of macro-level CPMs in safety planning applications, including:

o Complement & Enhance Reactive Approach through Macro-Level Black Spot Analysis - Rather than focusing on those individual facilities in the region with pre-existing collision histories, macro-level CPMs can focus on evaluating the level of road safety risk for each existing TAZ across an entire region. This may facilitate earlier detection of black spots in collision prone zones (CPZs) without having collision histories at all road locations in the zone.
Once a CPZ is identified, macro-level CPMs could be further used to diagnose problems, and assess the effectiveness of possible countermeasures across the entire neighbourhood. Earlier detection and remedy of hazards would permit more efficient use of scarce RSIP funding and analytical resources, as well as reduced social and economic burdens on the communities.

o Macro-level Collision Modification Factor (CMF) Estimation - CMFs provide estimates of the collision reduction potential of proposed countermeasures, and are a critical part of the diagnosis and remedy phases in Black Spot programs. Macro-level CMFs have area-wide safety benefits (and impacts) that require quantification, but have not previously been documented. If macro-level models are to be used to identify and remedy CPZs, the feasibility analysis of potential countermeasures will require a reliable estimate of their CMF.

o Regional Planning Processes - Regional plans usually have long-range spatial-temporal influence, often spanning decades and entire regions. They may therefore benefit greatly from integration with macro-level CPM empirical tools, if used properly. Guidelines are needed which set out when and how to employ macro-level CPMs in regional planning processes.

o Neighbourhood Planning Processes - Planning processes that focus on projects within a single neighbourhood or community provide perhaps the greatest number of opportunities for the proactive application of macro-level CPMs to facilitate safer roads. The Municipal Act in the Province of British Columbia, for example, requires every community to conduct regular updates of their growth management and land use plans. In addition, planners are daily developing sub-area and neighbourhood plans in the course of approving new developments and growth areas.
Thus, incorporating the use of macro-level CPMs as an explicit part of neighbourhood-level planning processes has great potential to facilitate sustainably safer communities for all road users. Guidelines are needed which set out when and how to employ macro-level CPMs in these neighbourhood planning processes.

o Transferability of Macro-Level CPMs Between Space-Time Regions - Issues related to time trends and geo-demographic differences cause uncertainty for model transferability. Model transferability is desirable when CPMs developed in one time period and/or geo-demographic region (i.e. time-space region) have applicability for use in another region. Not only does there appear to be only limited agreement between researchers on best practices for transferring micro-level CPMs, but no studies or guidelines have been found to date regarding transferability of macro-level CPMs.

1.2.3 Objective 3: Provide Case Studies as Examples of the Use of Macro-Level CPMs

To demonstrate use of these guidelines and models, several applications of the developed models will be conducted. Two reactive applications will be conducted. First, a macro-level Black Spot analysis will use macro-level CPMs to identify CPZs, related collision problems, and possible remedies. Second, macro-level CPMs will be used to estimate macro-level CMFs. Three proactive applications will be conducted. First, a regional planning application will be carried out to gauge the ability of macro-level CPMs to integrate with and predict road safety risk levels as part of a strategic level, regional transportation planning model. Second, a neighbourhood planning application will be carried out to verify the results of earlier Dutch SRS neighbourhood-level safety planning evaluations. Finally, a test will be conducted of the transferability of macro-level CPMs between different time-space regions.

1.3 Thesis Structure

This thesis is divided into eight chapters.
This first chapter provides an introduction to the thesis via background information, research problems and goals, and thesis structure. Chapter Two reviews the safety literature, and highlights the state of research on, and specific issues related to, development and use of macro-level CPMs. It describes the magnitude of the road safety problem, reactive and proactive engineering approaches, and the current state of their associated empirical tools. Chapter Three describes the data and methodology used to develop the macro-level CPMs used in meeting the goals of this research. Chapter Four discusses the resulting macro-level CPMs, describing their goodness of fit, interpreting their associations, and comparing them with other modeling results found in the literature. Chapter Five contains proposed guidelines for reactive and proactive macro-level CPM use, including model selection, black spot programs, CMFs, regional planning, neighbourhood planning, and model transferability. Chapter Six reports on two reactive safety applications of the developed macro-level CPMs. It discusses the approach and results of their use to identify, rank, and diagnose CPZs, and to estimate macro-level CMFs. Chapter Seven reports on three proactive safety applications of the developed macro-level CPMs. It discusses the approach and results of their use in a regional planning safety evaluation, a neighbourhood planning safety evaluation, and a model transferability test. Chapter Eight contains a summary of the research, together with conclusions, contributions, and future research topics.

2. LITERATURE REVIEW

2.1 Introduction

The purpose of this literature review is to describe the research context and theoretical foundations on which this thesis has been built. To identify and discuss the salient previous research, this chapter is organized into four main sections.
In section 2.2, the magnitude of the road safety problem is reviewed from global, North American, and local contexts. In section 2.3, road safety improvement programs are reviewed, including how empirical Bayes statistical methods and micro-level collision prediction models have helped to address many empirical shortcomings. In section 2.4, proactive road safety improvement programs and their associated empirical techniques are reviewed, including road safety audits, sustainable road safety, and road safety planning guidelines. In section 2.5, recent research on macro-level collision prediction models is reviewed, including its potential to be an improved empirical tool for safety planning in support of enhanced RSIP effectiveness. The section concludes by reviewing methodological issues in macro-level collision prediction model development and use, including a review of possible candidate variables to use in model development.

2.2 The Magnitude of the Road Safety Problem

2.2.1 Global Context

Unacceptably high road collision frequencies have been recognized worldwide as a major public health and safety problem since the start of the automobile age. Indeed, the World Health Organization (WHO) dedicated World Health Day 2004 to road safety in recognition of the enormous social burden of injuries and deaths due to road collisions, and the need for collective action to raise awareness and address this global problem. Their report, sent to governments around the world, notes that over 3,000 people die each day from traffic injuries, or 1.2 million each year, and that traffic injury is the 11th leading cause of death worldwide (WHO, 2004). A further 20 to 50 million people suffer injury and/or disability. Projections indicate that these figures will increase by about 65% over the next 20 years, making traffic injuries the third largest global 'disease' by 2020, unless there is new commitment to prevention (Gaspers, 2004).
Fatality rates for selected countries are shown in Figure 2.1 (Transport Canada et al., 2004). The global economic burden of road collisions and injuries is also large and rising. It is estimated to be 1 to 2% of gross national product (GNP), and to total US$518 billion annually worldwide (WHO, 2004). Of that total, low- and middle-income countries account for US$65 billion, which is more than they receive in development assistance (WHO, 2004). An international Red Cross report states that under current trends health departments worldwide will be spending approximately 25% of their budgets on traffic collision casualties by 2020 (Ross, 1999).

Figure 2.1. Traffic deaths per billion vehicle kilometres traveled.

2.2.2 North American Context

The human toll in the U.S. is also staggering. Relative to other transportation modes, the risk of death per passenger mile travelled in the U.S. by automobile is 0.91 deaths per billion passenger miles, by far the highest risk, compared with railroads at 0.06, airlines at 0.03, and buses at 0.02 (Waller, 2000). Moreover, ninety-four percent of all transportation fatalities and ninety-nine percent of all transportation injuries occur on roads (USDOT, 2000). Road collision injury is the leading cause of death for people in the U.S. between 6 months and 45 years of age. Thus, road collisions are also the leading cause of lost years of productive life, with more pre-retirement years of life lost in the U.S. due to road collisions than the combined effects of cancer and heart diseases (USDOT, 2001; Waller, 2000). More than 41,000 Americans are killed and three million are injured each year in road collisions (Depue, Zogby, Knipling, & Werner, 2000; Fricker & Whitford, 2004). In a typical two-week period, more people are killed on US roads than the 1,500 (from many nations) that perished with the Titanic (Evans, 1999).
The US National Highway Traffic Safety Administration (NHTSA) estimates that traffic collisions cost society more than US$230 billion in 2000 alone (USDOT, 2003a). Fortunately, the rate of road collision deaths per kilometre traveled has declined despite increasing levels of auto ownership and licensed drivers (see Figure 2.2). However, in absolute terms, the frequency of severe road collisions per year remains at an unacceptably high level (Waller, 2000).

Figure 2.2. United States Road Collision Fatality Rate History.

The pattern of road collisions in Canada is very similar to that in the United States. Traffic collisions account for nearly half of all accidental deaths in Canada each year (Transport Canada, 1999). In 2001, there were 2,778 deaths and 24,403 injuries due to road collisions, or 8.9 deaths and 79 hospitalizations per 100,000 population. The Canadian Council of Motor Transport Administrators (1998) reports that the average years of lost life due to a fatal collision is 40 years, compared to respiratory disease (9 years), circulatory disease (10 years), or tumors (15 years). The economic cost of road collisions to Canadians is estimated at $25 billion annually (Transport Canada & Health Canada, 2004). It would be even higher if auto-related public health costs were included. For example, the public health cost from disease and lives lost due to auto-emission-induced respiratory failure is forecast to reach $38 billion by the year 2020 based on current usage trends (Centre for Sustainable Transport, 1998).

2.2.3 Local Context

Lack of progress in reducing unacceptably high road collision frequencies is also a problem in the Province of British Columbia (BC). The Insurance Corporation of British Columbia (ICBC) notes that 47,653 traffic collisions were reported in 2002 in BC, resulting in 29,347 injuries and 467 fatalities (ICBC, 2004).
A reportable collision is defined as any incident that results in bodily injury or where the property damage exceeds $1,000. In addition to reported collisions, there are an estimated additional 120,000 unreported collisions on average per year in BC (Mercer, 1995). While reported injuries remained essentially unchanged, reported fatalities were up 13% from 2001. Figure 2.3 shows a generally falling trend of traffic fatalities in BC between 1983 and 1996, with fatalities remaining relatively constant since that time (ICBC, 2004). Although BC traffic volume data to translate these collision frequencies into collision rates for comparison could not be found, it is known that vehicle use in the Greater Vancouver Regional District (GVRD), where the majority of the provincial population resides, is growing at over twice the population rate (GVRD, 1993). This suggests that collision rates in BC are following a pattern similar to the rest of North America.

The economic burden of road collisions in BC is also very high. Table 2.1 lists three estimates of the cost per road collision type (Abdelwahab & Sayed, 1993; Sayed & de Leur, 2004; ICBC, 2004). The Human Capital Cost and Willingness to Pay estimating methods are based on economic assessments. The Insurance Claims estimates are based on only the average actual auto insurance claim payments, and do not include the societal costs used in the economic models (ICBC, 2004). Depending on which method is used, these estimates suggest that the Provincial cost of road collisions for the year 2002 alone ranges from $2 billion to $5 billion, or roughly $500 to $1,500 per BC resident annually.

Figure 2.3. BC Road Deaths, 1983 - 2002.

Table 2.1. Road Collision Costs in British Columbia Using Three Estimating Methods¹

Collision Severity      Human Capital Cost    Willingness to Pay    Insurance Claims
Fatal Collision         CDN $1,000,000        CDN $4,200,000        CDN $281,000
Injury Collision        CDN $35,000           CDN $100,000          CDN $44,000
Property Damage Only    CDN $5,000            CDN $6,000            CDN $4,500

¹ The Willingness to Pay estimating method is based on the willingness of road users (consumers) to pay for a safe journey (i.e. one that avoids a collision). The Human Capital Cost estimating method is based on the estimated value of the contribution an individual makes to society in terms of $ impact on GNP (e.g. wages throughout his or her working life). See Miller (1988, 1992) for a full discussion on these two methods.

Given the magnitude of this ubiquitous and persistent road safety problem worldwide, many organizations have initiated Road Safety Improvement Programs (RSIPs). These programs have identified many factors that contribute to collisions. However, each factor can be related to failures in one or a combination of the three road system components: the driver, the vehicle, and the road environment (Sayed, Abdelwahab, & Navin, 1995). In particular, transportation engineering focuses on improving the safety performance of the road component of the road system, using two main approaches: reactive and proactive. First, the reactive or traditional engineering approach addresses road safety in reaction to pre-existing collision histories. Second, the proactive engineering approach seeks to address road safety in the earliest stages of road planning and design, to preclude road collisions from occurring in the first place. Each approach is reviewed, beginning with the most traditional approach, that of reactive RSIPs.
2.3 Reactive (Traditional) Road Safety Improvement Programs

The objective of reactive RSIPs is to identify and treat locations that are hazardous, based on an analysis of collision, traffic, and highway data (Sayed, 1998; Khisty & Lall, 1998). These hazardous locations may consist of intersections, road sections, and/or highway system elements. Several terms are found in the literature when discussing hazardous locations, including collision prone locations, sites with promise, and black spots (Sayed, 1998; Hauer, Kononov, Allery, & Griffith, 2002). In broad terms, RSIPs involve the following three functions (Sayed, 1998):

o Location identification, or detection of locations which are considered hazardous;
o Problem identification, or location diagnosis to find what may be causing the road safety problem; and,
o Solution identification, or problem remedy, to find countermeasures to solve the problem.

2.3.1 Black Spot Programs

Although several terms are used to describe them, in this thesis programs to identify hazardous locations will be termed black spot programs. The underlying assumption of these programs is that road design often plays a significant contributory role in collision occurrence. Sayed et al. (1995) reported that road-related factors have caused up to 32% of collisions in North America and the United Kingdom. Therefore, improving the engineering elements of black spots can avert a significant proportion of collisions. To ensure that resources are spent only on the locations with the highest potential for safety improvements, it is vital that a sound procedure be used to screen the road network in order to properly identify and rank black spots for diagnosis and treatment. A black spot is defined as any location that exhibits a collision potential that is significantly high when compared with some normal collision potential derived from a group of similar locations (Sayed, 1998).
The collision potential of a location is commonly described by several measures, including: collision frequency, collision rate, collision severity, or a combination thereof. The collision frequency measure is defined as the number of collisions occurring at a location during a specific time period. Zegeer (1982) recommended using an observation time period of between one and three years, to minimize the effects of random fluctuations but still remain sensitive to changes over time. McGuigan (1982) showed that frequency alone did not account for traffic volumes or exposure when comparing across different locations. Because this could lead to a bias in favour of high traffic volume locations, which tend to have more collisions, McGuigan recommended that a collision rate measure be used instead to identify black spots. Improved empirical techniques which can correct for this apparent frequency measure bias have been developed and will be discussed later (Hauer, 1995; Hauer, Harwood, Council, & Griffith, 2002).

Collision rate is defined as collision frequency divided by some unit of exposure, usually defined as million-vehicle-kilometres (mvk) for sections, and million-entering-vehicles (mev) for intersections. McGuigan (1982) suggested that this measure accounted for exposure when comparing the safety of different locations, and could be used to identify locations as black spots above a certain collision rate. However, there were several problems with the use of collision rates. First, it identified low-volume roads as hazardous even though their collision frequencies were very low. Second, its simple ratio of collisions per unit traffic volume assumed a linear relationship between collisions and traffic volumes, which has been shown to be incorrect (Hauer, 1995). Third, results were biased for short road segment lengths (Zegeer, 1982; Nicholson, 1980).
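As a minimal sketch of the collision rate measure described above, the following Python computes exposure in mvk for a section and in mev for an intersection, then divides collision frequency by exposure. The function names, the use of an average daily volume as the traffic input, and the sample numbers are illustrative assumptions and do not come from the cited studies.

```python
def section_exposure_mvk(daily_volume, length_km, years):
    """Exposure of a road section in million-vehicle-kilometres (mvk)."""
    return daily_volume * 365 * years * length_km / 1e6


def intersection_exposure_mev(daily_entering_volume, years):
    """Exposure of an intersection in million-entering-vehicles (mev)."""
    return daily_entering_volume * 365 * years / 1e6


def collision_rate(collision_frequency, exposure):
    """Collision rate = collision frequency / exposure."""
    if exposure <= 0:
        raise ValueError("exposure must be positive")
    return collision_frequency / exposure


# Hypothetical 1 km sections, each observed for three years.
busy = collision_rate(40, section_exposure_mvk(20000, 1.0, 3))  # 40 collisions
quiet = collision_rate(2, section_exposure_mvk(200, 1.0, 3))    # 2 collisions
print(f"busy section:  {busy:.2f} collisions/mvk")
print(f"quiet section: {quiet:.2f} collisions/mvk")
```

With these made-up inputs the quiet section has the higher rate despite recording only two collisions, which illustrates the first weakness of the rate measure noted above.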
The severity measure, termed the Collision Severity Index (CSI), is defined as the weighted sum of fatal, injury, and Property Damage Only (PDO) collisions. Various agencies use different weights, based on the cost and impact of collisions. For example, some agencies base the weights on the relative costs of fatal, injury, and PDO collisions. In British Columbia, some road authorities weight fatal collisions as 100 times, and injury collisions as 10 times, the severity of PDO collisions (Sayed, 1998). Some agencies divide the CSI by traffic volume to account for exposure (Sayed, 1998).

Although many road agencies are moving towards using a measure of collision frequency, the traditional approach has been the use of collision rates (Hauer, 1995; Hauer, Harwood, Council, & Griffith, 2002). Zegeer and Dean (1977) suggested a two-step procedure using both collision frequency and collision rate criteria to overcome their individual weaknesses in black spot identification. Exactly how the two criteria are combined differs. Commonly, the first step is to identify black spots using some pre-defined collision frequency threshold. The collision rate criterion is then used to rank those identified locations. Abdelwahab and Sayed (1993) took this one step further using a treble-measure criterion, by also including the collision severity index. Regardless of which measure or combination of measures is used to identify black spots, the aim of RSIPs remains to achieve the greatest safety improvement for the resources invested (Hauer, Kononov, Allery, & Griffith, 2002). However, achieving that aim has been difficult to verify due to the significant non-systematic (i.e. random) components of collision occurrence (Fridstrom et al., 1995).
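The severity index and the two-step screening procedure described above can be sketched as follows. This is only an illustration under stated assumptions: the default weights follow the 100:10:1 example quoted for some British Columbia road authorities, while the function names, the dictionary layout of a site, the frequency threshold, and the sample data are hypothetical.

```python
def severity_index(fatal, injury, pdo, w_fatal=100, w_injury=10, w_pdo=1):
    """Collision Severity Index: weighted sum of fatal, injury, and PDO collisions."""
    return w_fatal * fatal + w_injury * injury + w_pdo * pdo


def two_step_screen(sites, min_frequency):
    """Step 1: keep sites at or above a collision frequency threshold.
    Step 2: rank the retained sites by collision rate, highest first."""
    flagged = [s for s in sites if s["frequency"] >= min_frequency]
    return sorted(flagged, key=lambda s: s["rate"], reverse=True)


# Hypothetical sites: frequency in collisions, rate in collisions per mvk.
sites = [
    {"name": "A", "frequency": 30, "rate": 1.2},
    {"name": "B", "frequency": 3, "rate": 9.0},
    {"name": "C", "frequency": 18, "rate": 2.5},
]
ranked = two_step_screen(sites, min_frequency=10)
print([s["name"] for s in ranked])
print(severity_index(fatal=1, injury=2, pdo=5))
```

Site B has the highest rate but is screened out in the first step because of its low frequency, which is the kind of low-volume false positive the two-step procedure is meant to avoid.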
2.3.1.1 Statistical Techniques

Given this random component in the observed collision frequency at a particular location, one way to consider the occurrence of collisions at a location is as a random variable, with a true mean that is not known with accuracy. That is, the true (i.e. long term) mean collision frequency of a location may never be known, and cannot be predicted with absolute accuracy. Instead, it can be estimated as an expected value using reasonable assumptions and empirical techniques (Sayed, 1998). To do that, the standard engineering practice has been to employ statistical techniques (Sayed, 1998). Statistical techniques are used to screen out randomness in the observed historic mean collision frequency in order to provide an estimate of the expected value of the mean collision frequency with some degree of accuracy. The accuracy of an estimate is usually expressed in terms of an estimate's standard deviation or variance. Thus, the estimate and its variance are used to provide the desired level of confidence that only locations that are truly hazardous have been identified as black spots.

Although early statistical techniques assumed that the occurrence of collisions per unit time was Normally distributed, Norden (1956) and Oppe (1982, 1992) showed a Poisson distribution to be a more reasonable assumption, based on fit with the empirical evidence. The Poisson distribution has the following properties (Sayed, 1998; Johnson, 2005):

o the number of events occurring in a particular time interval or specified region is independent of the number that occurs in another time interval or region (i.e. memoryless);
o the probability that a single event will occur during a very short time interval or in a very small region is proportional to the length of the time interval or size of the region; and,
o the probability that more than one event will occur during a very short time interval or small region is negligible (i.e. rare events).
Relying on this Poisson assumption and a selected reference population of locations with similar traits, Norden (1956) used a rate quality control technique to define a threshold above which a location is identified as a black spot, based on a desired level of confidence. However, there is still a problem with this statistical approach. It is very sensitive to the sample mean of the reference population, which can lead to additional bias and inaccurate safety estimates (Hauer, Harwood, Council, & Griffith, 2002; Sawalha & Sayed, 2005a). To minimize these problems, Sawalha & Sayed (2005a) have summarized key traits to consider in the size and choice of the reference population.

One of the biggest problems in ensuring that only truly hazardous locations are identified as black spots relates to selection bias or Regression-to-the-Mean (RTM) errors (Hauer, 1988; Sayed & de Leur, 2001a). Regression-to-the-Mean describes a statistical phenomenon whereby extreme values of a random variable tend to be followed by less extreme values, even if no change has occurred in the underlying causal mechanism (Hauer, 1988; Sayed, 1998; Sayed & de Leur, 2001a). In road safety terms, RTM occurs when the observed collision frequency (or rate) at a particular location regresses, or returns, to the long-term mean value of collision frequency (or rate) as time goes by. In other words, all else remaining the same (e.g. traffic volumes, driver skills, road conditions, etc.), a period in which the observed mean collision frequency at a particular location is higher tends to be followed by a period with a lower observed mean collision frequency. Given that black spots are usually identified because of a recorded high occurrence of collisions, this RTM bias may lead to a 'false' labelling of a site as hazardous.
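The RTM effect described above can be illustrated with a small simulation. The sketch below is not taken from the cited studies: it assumes that every site has the same true mean collision frequency, draws Poisson counts for a 'before' and an 'after' period, and then picks the sites with the highest 'before' counts, as a naive black spot screen would. The site count, true mean, selection size, and seed are arbitrary assumptions.

```python
import math
import random


def poisson_sample(rng, mean):
    """Draw one Poisson variate using Knuth's multiplication method."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_rtm(n_sites=1000, true_mean=5.0, top_n=50, seed=1):
    """Return the mean 'before' and 'after' counts of the sites selected as worst."""
    rng = random.Random(seed)
    before = [poisson_sample(rng, true_mean) for _ in range(n_sites)]
    after = [poisson_sample(rng, true_mean) for _ in range(n_sites)]
    # Select the sites with the highest 'before' counts.
    worst = sorted(range(n_sites), key=lambda i: before[i], reverse=True)[:top_n]
    mean_before = sum(before[i] for i in worst) / top_n
    mean_after = sum(after[i] for i in worst) / top_n
    return mean_before, mean_after


mean_before, mean_after = simulate_rtm()
print(f"selected sites, before: {mean_before:.2f}, after: {mean_after:.2f}")
```

Because nothing about the simulated sites changes between the two periods, any drop in the selected sites' average count is due to selection alone, which is the bias that a naive before-and-after comparison would wrongly credit to a treatment.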
A statistical technique being used to reduce RTM bias is the empirical Bayes (EB) method (Higle & Witkowski, 1988; Hauer, 1992; Sayed & Abdelwahab, 1997; Sayed, 1998).

2.3.1.2 Empirical Bayes Technique

The EB method is becoming widely practised in road safety evaluation, because it reduces RTM bias, and increases the accuracy of the safety estimate (Hauer, 1988; Sayed & de Leur, 2001a; Hauer, Harwood, Council, & Griffith, 2002). Moreover, EB techniques can be applied not only to black spot identification, but also to the evaluation of countermeasures, or to the assessment of potential safety savings due to site improvements. The method is based on the premise that collision history is not the only clue to the safety of an entity. A second clue is what is known about the safety of similar entities in the same time-space region (the reference population). The EB estimate of safety combines these two clues by means of a weighted average, written conceptually by Hauer et al. (2002) as:

Estimate of expected collisions for a location = (weight) x (collisions expected at similar locations) + (1 - weight) x (count of collisions at this location)     (2.1)

where the value of 'weight' is dependent on assumptions regarding the underlying frequency distributions of the two measures of collisions (i.e. observed count and expected collisions). The theoretical development and methodology for applying the EB technique is reviewed below, and is based mainly on the work of Higle & Witkowski (1988), Sayed (1998), Sayed & de Leur (2001a), and Hauer et al. (2002), with other research as cited.

The empirical Bayes refinement technique is developed beginning with Bayes' Theorem, which defines the conditional probability of the behaviour of an entity as dependent upon two pieces of information. The first is related to some observation of the entity itself. The second is related to some engineering judgement on the underlying properties of the entity.
Mathematically, the theorem² can be stated as (Sayed, 1998; Johnson, 2005):

$$P(\theta \mid x) = \frac{P(x \mid \theta)\, P(\theta)}{\sum_{\theta} P(x \mid \theta)\, P(\theta)} \qquad (2.2)$$

where:
$\theta$ = a parameter such as collision frequency at a location;
$P(\theta)$ = prior probability distribution of $\theta$ events occurring, derived from personal beliefs about the attributes of the entity;
$P(x \mid \theta)$ = the likelihood or observation distribution derived from direct 'observation' of the entity itself; and,
$P(\theta \mid x)$ = the posterior distribution of $\theta$ events expected to occur, derived by a weighted combination of the prior and observation distributions.

Bayes' Theorem uses a subjective prior distribution, wherein parameters are based upon personal beliefs of the properties of the entity, typically derived from engineering judgment and past experience (Sayed, 1998). The empirical Bayes approach uses a more objective prior distribution, based upon empirical data from a reference population of entities with similar properties. Assuming that these observations are typical of similar entities gives rise to the term empirical.

Higle & Witkowski (1988) set out the EB method to identify collision prone locations using collision rates as a measure, based on two assumptions. First, the actual number of collisions at location $i$ during the time period in question ($N_i$) follows a Poisson distribution such that, at any given location, when the collision rate is known ($\lambda_i = \lambda$), the expected value is $\lambda V_i$, where $V_i$ is the number of vehicles passing through location $i$ during the time period in question. The observation probability distribution is then

$$f(N_i \mid \lambda, V_i) = \frac{(\lambda V_i)^{N_i}\, e^{-\lambda V_i}}{N_i!} \qquad (2.3)$$

² Bayes' Theorem was published posthumously in Essay towards solving a problem in the doctrine of chances, in the Philosophical Transactions of the Royal Society of London in 1764.
Their second assumption is that the probability distribution of the reference population collision rate, $f_R(\lambda)$, is the gamma distribution with parameters $\alpha$ and $\beta$, implying that

$$f_R(\lambda) = \frac{\beta^{\alpha}\, \lambda^{\alpha - 1}\, e^{-\beta \lambda}}{\Gamma(\alpha)} \qquad (2.4\ a)$$

With these assumptions, the first step is then to estimate the parameters, $\alpha$ and $\beta$. Although there are various methods used to estimate the parameters, the simplest is the method of sample moments, where $\alpha$ and $\beta$ are chosen so that the mean and variance associated with the gamma distribution are equal to the mean ($\bar{x}$) and variance ($s^2$) of the sample (reference population). This is done by making

$$\bar{x} = \frac{\alpha}{\beta} \quad \text{and} \quad s^2 = \frac{\alpha}{\beta^2} \qquad (2.4\ b, c)$$

or equivalently, by making

$$\beta = \frac{\bar{x}}{s^2} \quad \text{and} \quad \alpha = \beta \bar{x} \qquad (2.4\ d, e)$$

Morris (1988) showed that estimating the parameters this way will lead to biased and inefficient estimation, and recommended estimation of $\beta$ by using

$$\beta = \frac{V^{*} \bar{x}}{V^{*} s^2 - \bar{x}} \qquad (2.4\ f)$$

where:
$V^{*}$ = the harmonic mean of the volumes in the reference population ($V_1, \ldots$)

Using Bayes' Theorem, the second step is to combine the regional probability distribution (the prior distribution) with the location specific collision rate at each location to obtain the location specific probability density functions, or the posterior distribution,

$$f_i(\lambda \mid N_i, V_i) \propto f(N_i \mid \lambda, V_i)\, f_R(\lambda) \qquad (2.4\ g)$$

Given the two assumptions to start, the resulting probability distribution is a gamma distribution and the parameters, $\alpha_i$ and $\beta_i$, are calculated from the original choices of $\alpha$ and $\beta$ and the observed data, $N_i$ and $V_i$, as

$$\alpha_i = \alpha + N_i \quad \text{and} \quad \beta_i = \beta + V_i \qquad (2.5\ a, b)$$

The resulting probability density function is given by

$$f_i(\lambda \mid N_i, V_i) = \frac{\beta_i^{\alpha_i}\, \lambda^{\alpha_i - 1}\, e^{-\beta_i \lambda}}{\Gamma(\alpha_i)} \qquad (2.6)$$

In the final step, location $i$ is considered as collision prone if there is a significant probability that the location's collision rate, $\lambda_i$, exceeds the observed mean of the reference population collision rate, $\lambda_R$.
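The three steps above (moment estimates of the gamma prior, the posterior update, and the probability that a location's rate exceeds the reference mean) can be sketched in Python as below. This is an illustrative sketch, not the authors' implementation. Its assumptions are: the unbiased sample variance is used for $s^2$; the reference mean rate is taken as the simple average of the sample rates; the integral is evaluated with Simpson's rule rather than a statistics library; and all function names and data are hypothetical. The `eb_rate` helper additionally uses the standard result that the posterior mean $(\alpha + N_i)/(\beta + V_i)$ is a weighted average of the prior mean and the observed rate, in the spirit of Equation 2.1.

```python
import math


def gamma_prior_by_moments(rates):
    """Method of sample moments: beta = mean / variance, alpha = beta * mean."""
    m = len(rates)
    mean = sum(rates) / m
    variance = sum((r - mean) ** 2 for r in rates) / (m - 1)  # assumed: unbiased form
    beta = mean / variance
    return beta * mean, beta


def posterior_params(alpha, beta, n_i, v_i):
    """Gamma prior plus Poisson count gives a gamma posterior (alpha + N_i, beta + V_i)."""
    return alpha + n_i, beta + v_i


def eb_rate(alpha, beta, n_i, v_i):
    """Posterior mean rate written as a weighted average of prior mean and observed rate."""
    weight = beta / (beta + v_i)
    return weight * (alpha / beta) + (1 - weight) * (n_i / v_i)


def gamma_pdf(lam, shape, rate):
    """Gamma density, evaluated in log space for numerical stability."""
    if lam <= 0.0:
        return 0.0  # correct at the origin only because shape > 1 is enforced below
    log_pdf = (shape * math.log(rate) + (shape - 1) * math.log(lam)
               - rate * lam - math.lgamma(shape))
    return math.exp(log_pdf)


def prob_rate_exceeds(lambda_r, shape, rate, steps=2000):
    """P(lambda_i > lambda_r) = 1 - integral of the posterior density over [0, lambda_r]."""
    if shape <= 1.0:
        raise ValueError("this sketch assumes shape > 1")
    h = lambda_r / steps  # steps must be even for Simpson's rule
    total = gamma_pdf(0.0, shape, rate) + gamma_pdf(lambda_r, shape, rate)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * gamma_pdf(k * h, shape, rate)
    return 1.0 - total * h / 3.0


# Hypothetical reference population: collision rates per million vehicles.
reference_rates = [0.8, 1.1, 0.6, 1.5, 1.0, 0.9, 1.3, 0.7]
alpha, beta = gamma_prior_by_moments(reference_rates)
a_i, b_i = posterior_params(alpha, beta, n_i=12, v_i=6.0)  # 12 collisions, 6 million vehicles
lambda_r = sum(reference_rates) / len(reference_rates)     # assumed form of the reference mean
print(f"prior alpha={alpha:.2f}, beta={beta:.2f}")
print(f"EB rate estimate = {eb_rate(alpha, beta, 12, 6.0):.3f}")
print(f"P(lambda_i > lambda_R) = {prob_rate_exceeds(lambda_r, a_i, b_i):.3f}")
```

A location would then be flagged when the printed probability exceeds a chosen confidence level. The moment estimator shown is the simple one; the Morris (1988) adjustment for $\beta$ could be substituted in `gamma_prior_by_moments` if the reference volumes are available.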
Mathematically, location $i$ is considered collision prone if

$$P(\lambda_i > \lambda_R \mid N_i, V_i) > \delta \tag{2.7 a}$$

or equivalently if:

$$1 - \int_0^{\lambda_R} \frac{\beta_i^{\alpha_i}\,\lambda^{\alpha_i-1}\,e^{-\beta_i\lambda}}{\Gamma(\alpha_i)}\,d\lambda > \delta \tag{2.7 b}$$

where: $\delta$ = the desired level of confidence, such as 0.95, or 0.99; $\lambda_R$ = the observed mean of the reference population collision rate, taken over its $m$ locations (2.7 c); and, $m$ = the number of locations in the reference population.

Sayed, Navin & Abdelwahab (1997) recommend a small modification to Higle & Witkowski's method. Instead of using Equation (2.7 b) to calculate whether a location is collision prone, a critical collision rate, $\lambda_{cr}$, is calculated for each location. This critical rate corresponds to a probability $\delta$ that the location $i$ collision rate, $\lambda_i$, exceeds the reference group collision rate, $\lambda_R$, and is calculated by solving the equation for $\lambda_{cr}$ (substituting $\alpha + \lambda_{cr}V_i$ for $\alpha_i$ in Equation 2.7 b):

$$1 - \int_0^{\lambda_R} \frac{\beta_i^{\alpha+\lambda_{cr}V_i}\,\lambda^{\alpha+\lambda_{cr}V_i-1}\,e^{-\beta_i\lambda}}{\Gamma(\alpha+\lambda_{cr}V_i)}\,d\lambda > \delta \tag{2.8}$$

The ratio of the observed collision rate, $\lambda_i$, to the critical rate calculated from Equation 2.8 represents the collision proneness of the location. If the value is less than one, the location is not considered collision prone. The higher the value above one, the greater the degree of collision proneness.

While this represents the conventional Black Spot program using the EB method, there are two modified approaches that provide enhancements. First, Sayed & Abdelwahab (1997), using this modified EB method, have introduced what is called the Modified Black Spot program. The primary difference between traditional and modified programs lies in the addition of correctability criteria in the identification of collision prone locations. In the traditional program, locations are identified as collision prone if they exhibit a significant number of collisions above the reference population mean (with some specified level of confidence). In the modified version, locations are also compared to reference population means, but in a more focused manner. To be considered collision prone in this modified method, locations must also exhibit a significant number of correctable (i.e. road-related) collisions.
This subtle but significant difference is based on the fact that, for the most part, only road-related collisions are correctable by RSIPs (Sayed & Abdelwahab, 1997). They recommend that the collision measure for each location be re-calculated to determine correctability using the following equations:

$$\text{Collision Frequency} = \sum_{j=1}^{N_i} W_j \tag{2.9}$$

$$\text{Collision Rate} = \frac{\sum_{j=1}^{N_i} W_j}{\text{Exposure Measure}} \tag{2.10}$$

where: $W_j$ = degree to which the $j$th collision belongs to the road environment group (0.0 to 1.0, determined from police reports and collision claims records); and, $N_i$ = total number of collisions at location $i$ during a certain time period.

Using these re-defined measures, and collision data from the Province of British Columbia Ministry of Transportation & Highways (MoTH), Sayed & Abdelwahab (1997) demonstrated both a reduction in the total number of collision-prone locations with correctable causes (i.e. road-related), and a significant change in the assigned rankings of identified black spots. Therefore, in applications of the modified black spot program, it is expected that significant efforts and resources could be saved by no longer conducting detailed engineering studies and determining remedies for locations which may not even be correctable from a road safety engineering point of view.

The second enhancement to traditional black spot programs is termed the Countermeasure-Based approach, and has been described in Sayed, Navin & Abdelwahab (1997). It differs from traditional black spot programs in the way that it identifies hazardous locations. In traditional programs, where collision patterns are not considered, locations with a higher number of collisions are more likely to be identified as collision-prone. However, in the countermeasure-based approach, locations are identified as collision prone when they exhibit well-defined collision patterns that can be remedied by specific countermeasures.
In effect, the countermeasure-based approach reverses the traditional black spot approach by first looking for well-defined collision patterns that can be targeted by specific countermeasures and then searching for locations that have an over-representation of these patterns. The objective of the method is to identify locations that are most promising to be treated by road improvements regardless of their total number of collisions. For example, an intersection that exhibits an over-representation of left-turn collisions may suggest left-turn bays as a countermeasure.

In the countermeasure-based method, the term over-representation of a particular collision pattern is defined as a high likelihood that, if a collision occurs, it will involve the pattern under investigation, $x_i$, or a higher than usual rate of occurrence, $p_i$, of pattern $x_i$ in $n_i$ collisions, mathematically expressed in Equation (2.11) as

$$p_i = \frac{x_i}{n_i} \tag{2.11}$$

The variable $p_i$ is assumed to be random, with its distribution (the Observation distribution) following a Binomial distribution. Note that instead of using an exposure measure (e.g. traffic volume) to calculate the rate of occurrence, the total number of collisions, $n_i$, is used. It is assumed that the prior distribution for $p_i$ across the reference group is a Beta distribution, with parameters $\alpha$ and $\beta$ determined by fitting all observations of ($x_i$, $n_i$) pairs in the reference group to the Beta distribution using the method of sample moments. The resulting posterior distribution is also a Beta distribution, with parameters for each location $i$ as

$$\alpha_i = \alpha + x_i \quad \text{and} \quad \beta_i = \beta + n_i - x_i \tag{2.12 a, b}$$

A location is then considered as having an over-representation of a particular collision pattern if the probability that its pattern ratio $p_i$ exceeds the reference group average $\bar{p}$ is significant. Mathematically, a location is considered to have an over-representation of a particular collision pattern if

$$P(p_i > \bar{p} \mid x_i, n_i) > \delta \tag{2.13}$$

The exposure measure related to traffic volume is still considered relevant, but only as one of the criteria in establishing the reference group of similar locations. Using the EB method, the countermeasure-based approach was applied to collision data from the MoTH. A comparison with the traditional black spot approach using the same data set identified many locations that had not been previously identified as black spots, but which had collisions with well-defined patterns. It is therefore considered a valuable contribution, complementary to the (modified) traditional approach, to the empirical tool set for improved identification and RSIP cost-effectiveness.

While the EB method is an invaluable contribution to improving the precision and cost effectiveness of traditional black spot programs, and the subsequent modified black spot and countermeasure-based approach programs, there are two problems that surface when using the EB method by Higle & Witkowski (1988) for safety estimation (Hauer, 1992; Sawalha, 2002). First, using a location's collision rate as a measure of its safety has been shown to have the potential for misleading results (Hauer, 1995). Although the rate measure was used originally as a way to equalise compared locations for differences in intensity of use (i.e. traffic levels), Hauer (1995) has shown that there is often a non-linear relationship between collision rate and traffic levels. Therefore, the collision rate is usually an inappropriate measure of safety. Hauer (1995) recommends the use of multivariate regression models, which account for differences in intensities in all cases, to conduct safety comparisons and evaluations.

The second shortcoming of the EB method relates to the difficulty in finding a suitable reference group to provide estimates of the mean and variance of the prior distribution.
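Before that second shortcoming is examined in detail, the countermeasure-based test of Equations (2.11) to (2.13) can be illustrated with a minimal Python sketch. It applies the posterior update to a Beta prior and integrates the posterior density above the reference group average with a simple midpoint rule. The function names, the prior parameters, and the example counts are hypothetical; the midpoint rule is an assumption chosen to avoid external libraries, and it is only reliable when both posterior parameters exceed one.

```python
import math


def beta_pdf(p, a, b):
    """Beta(a, b) density at p, for 0 < p < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1.0) * math.log(p)
                    + (b - 1.0) * math.log(1.0 - p))


def prob_pattern_exceeds(alpha, beta, x_i, n_i, p_bar, steps=20000):
    """Posterior probability that the pattern ratio p_i exceeds p_bar."""
    a_i = alpha + x_i           # posterior parameter alpha_i
    b_i = beta + n_i - x_i      # posterior parameter beta_i
    width = (1.0 - p_bar) / steps
    # Midpoint rule over (p_bar, 1); midpoints never touch 0 or 1.
    area = sum(beta_pdf(p_bar + (k + 0.5) * width, a_i, b_i)
               for k in range(steps))
    return area * width


# Hypothetical prior with mean 0.2, and a site with 7 left-turn
# collisions out of 10 collisions in total.
p = prob_pattern_exceeds(alpha=2.0, beta=8.0, x_i=7, n_i=10, p_bar=0.2)
print(f"P(p_i > p_bar) = {p:.3f}; over-represented at 0.95: {p > 0.95}")
```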
Without reliable reference group estimates of the mean and variance, the parameters of the prior distribution cannot be estimated reliably, and the method defaults back to non-empirical Bayesian, with only plausible prior and likelihoods. As the method of sample moments is usually used to estimate the mean and variance, the EB method requires that a sufficiently large reference group (i.e. sample size) be found. It also requires that the traits of the reference group closely match those of the location in question (location $i$), something that is not often possible. In the more usual case where traits of the reference group and of the subject location do not match, one could find several possible reference groups, each with its own mean and variance. Therefore, great care must be exercised in the selection of the members of the reference group in order to ensure that they belong to the same reference population as the location whose safety is to be estimated. The greater the number of traits that have to be matched by the reference group, the harder it becomes to find a reference group of significant size. In the case of matching many traits, for purposes of parameter estimation a real reference population hardly exists (Hauer, 1997). When the reference group collision data are from a reference population that hardly exists, it is not reliable to use the method of sample moments to calculate the prior distribution parameters (Sawalha, 2002). Fortunately, this reference group-based estimate can be refined by using multivariate regression to develop collision prediction models to provide a site-specific estimate of the regional reference group mean collision frequency and its variance (Sayed, 1998; Hauer, 1992; Sawalha & Sayed, 2005a).
2.3.1.3 Using Collision Prediction Models

The multivariate regression EB method is superior to the traditional EB method of identification in that it uses collision frequency instead of collision rate, and it overcomes the difficulty posed by the lack of a large enough reference group (Hauer, 1992, 1995; Sawalha, 2002). The multivariate method does this by employing Safety Performance Functions (SPFs), or, as they will be termed in this thesis, collision prediction models (CPMs). Although these CPMs must be developed using regression techniques that still rely on collision data from the same reference group whose traits are different from those of the subject location, the reference group locations need only be categorically similar (e.g. two-lane roads, signalized intersections, unsignalized intersections, etc.) to the subject location (Hauer, 1992; Sawalha & Sayed, 2005a). In other words, the reference group used to develop the CPMs need not have totally matching traits, as was required under the EB method described by Higle & Witkowski (1988). Therefore, the advantage of using CPMs is that they capture the systematic relationship between the expected level of safety of an entity (the dependent variable) and a set of traits (explanatory variables) over a range of values for these traits. Moreover, using a CPM, a location-specific prior distribution can be derived for each location from an imaginary reference group, with estimates of the mean and variance that are more reliable than those obtained by the method of sample moments (Hauer, 1992). The only caution is to use the appropriate CPM, one that was developed using traffic, geometric, and collision data from locations of the same category; CPM development and selection are discussed further in Section 2.3.4. At this point, the EB method assuming previously developed and appropriately selected CPMs will be discussed.
After development and selection of the CPM, the first step in estimating the safety of a location is to enter the values for each of the location's traits into the CPM equation to predict the location-specific mean of the prior distribution, $E(\Lambda)$. Using the mean for this imaginary reference group of sites with similar traits, the prior distribution variance,

$$Var(E(\Lambda)) = \frac{[E(\Lambda)]^2}{K} \tag{2.14}$$

can also be calculated by assuming that the prior distribution follows a gamma distribution with the shape and scale parameters shown in Equations (2.15 a, b), respectively (Hauer, 1992, 1997) as:

$$\alpha = K \tag{2.15 a}$$

$$\beta = \frac{K}{E(\Lambda)} \tag{2.15 b}$$

The overdispersion or shape parameter, $K$, is an output derived during development of the chosen CPM. Obtaining $K$ from the model development process, and using it with the developed model to derive location-specific estimates for $E(\Lambda)$ and $Var(E(\Lambda))$, then completes the first step, that of defining the virtual reference population (prior distribution).

The second step is to gather whatever local data is available to refine the prior estimate provided by the CPM. Additional refinement of the CPM estimate is needed because it is impractical to develop CPMs with enough variables to incorporate all the possible relevant traits in estimating the safety of a specific location. In other words, the EB approach considers the CPM estimate as still only providing one of the two clues necessary, namely how safety varies in a reference population of similar road locations portrayed by the prior distribution (Sawalha, 2002). In the absence of local data, $E(\Lambda)$ would be taken as the best estimate of the collision potential of the location with those traits. However, when a local, site-specific collision history is available, the EB method uses it to refine the prior distribution according to Bayes' theorem. Therefore, the second step in estimating the safety of the location is to use the observed collision history of the specific location itself to refine the estimate.
The result of this refinement is a posterior distribution that represents how the mean collision frequency, $\Lambda$, varies in a subpopulation of entities having not only very similar traffic and geometric traits, but also a very similar collision history. The mean of the resulting posterior distribution is also termed the empirical Bayes safety estimate for location $i$, $EB_i$. The posterior distribution is also gamma distributed with shape and scale parameters, respectively, as shown in Equations (2.16 a, b) (Hauer et al., 1988, 2002a; Kulmala, 1995):

$$\alpha_i = K + count \tag{2.16 a}$$

$$\beta_i = \frac{K + E(\Lambda)}{E(\Lambda)} \tag{2.16 b}$$

where: $count$ = the location's collision history.

Hauer et al. (1988) showed that the mean, $EB_i$, and variance, $Var(EB_i)$, of the posterior distribution are given by Equations 2.17 and 2.18 as:

$$EB_i = E(\Lambda \mid Y = count) = \frac{\alpha_i}{\beta_i} = \left[\frac{E(\Lambda_i)}{K + E(\Lambda_i)}\right](K + count) \tag{2.17}$$

$$Var(EB_i) = Var(\Lambda \mid Y = count) = \frac{\alpha_i}{\beta_i^2} = \left[\frac{E(\Lambda_i)}{K + E(\Lambda_i)}\right]^2 (K + count) \tag{2.18}$$

It should be noted that Equation 2.17 could be rewritten according to the original EB weighted average in Equation 2.1 as (Sawalha & Sayed, 2005a)

$$EB_i = weight \cdot E(\Lambda_i) + (1 - weight) \cdot count \tag{2.19}$$

where: $count$ = the observed collision frequency at location $i$;

$$weight = \frac{K}{K + E(\Lambda_i)} = \frac{1}{1 + E(\Lambda_i)/K} = \frac{1}{1 + \dfrac{Var(E(\Lambda_i))}{E(\Lambda_i)}} \tag{2.20}$$

and $Var(E(\Lambda_i)) = E(\Lambda_i)^2 / K$ = the variance of the CPM estimate (2.14).

Hauer (1997) notes that $E(\Lambda)$ and $count$ must pertain to equal time periods. If not, Equation 2.20 for $weight$ is modified by a ratio, $r$, equal to the number of time units to which $count$ pertains divided by the number of time units to which $E(\Lambda)$ pertains, as:

$$weight = \frac{1}{1 + r\,\dfrac{Var(E(\Lambda))}{E(\Lambda)}} \tag{2.21}$$

Sawalha (2002) has shown that the EB safety estimate is location-specific, always lies between the observed and predicted collision values, and can reduce RTM bias. The higher the value of $K$, the closer the EB safety estimate is to the CPM prediction. In models with low $K$ values, the EB safety estimate nudges closer to the observed count.
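A short worked example may help to show how the weighted average behaves. The Python sketch below computes the prior variance, the weight, the EB safety estimate, and its variance for the equal-time-period case only (the ratio r is taken as one). The function name and the input values are hypothetical and are not drawn from any model in this thesis.

```python
def eb_safety_estimate(predicted, count, k):
    """EB refinement of a CPM prediction, equal time periods assumed.

    predicted : CPM prediction E(Lambda) for the location
    count     : observed collision frequency at the location
    k         : overdispersion (shape) parameter of the CPM
    """
    var_pred = predicted ** 2 / k                    # prior variance
    weight = 1.0 / (1.0 + var_pred / predicted)      # equals k / (k + predicted)
    eb = weight * predicted + (1.0 - weight) * count
    var_eb = (predicted / (k + predicted)) ** 2 * (k + count)
    return eb, var_eb, weight


# Hypothetical location: the CPM predicts 4 collisions, 10 were observed.
eb, var_eb, weight = eb_safety_estimate(predicted=4.0, count=10, k=2.0)
print(f"weight = {weight:.3f}, EB = {eb:.2f}, Var(EB) = {var_eb:.2f}")

# A larger K pulls the estimate toward the CPM prediction.
eb_high_k, _, _ = eb_safety_estimate(predicted=4.0, count=10, k=20.0)
print(f"EB with K = 20: {eb_high_k:.2f}")
```

With these inputs the weight is one third, so the estimate of 8 sits two thirds of the way from the prediction toward the observed count; raising K to 20 moves it back to 5.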
With CPMs able to refine EB safety estimates, it is logical to extend their use to refinement of methods for identification of collision prone locations.

2.3.1.4 Identification of Collision Prone Locations using CPMs

The enhanced EB method using CPMs to identify CPLs is reviewed below, and is based on that set out in Sayed (1998) and Sawalha & Sayed (1999). The three-step procedure below generally follows the EB method set out by Higle & Witkowski (1988), with refinements for using CPMs. The first two steps, obtaining an estimate of the expected mean collision frequency, $E(\Lambda)$, of locations similar to the location in question, and using the location's collision history, $count$, to provide an EB safety estimate, $EB_i$, of the location, have been reviewed above in Equations 2.14 through 2.21.

In step one, the prior distribution of the imaginary reference population is defined using the appropriate CPM to predict the expected collision frequency and its variance for a specific location, with parameters given previously in Equations (2.15 a, b). The specified norm could be the 50th percentile ($P_{50}$) of the prior distribution's probability density function, or more usually the CPM prediction, $E(\Lambda)$. These two values are similar for high values of $K$ (i.e. above 1).

In step two, the location's observed collision frequency ($count$) is used, together with $E(\Lambda)$ and $K$, to define the gamma density function of the posterior distribution describing the variation of the location's collision potential, with parameters as given previously in Equations (2.16 a, b), repeated below for convenience.

The third and final step is to compare the value of the location's level of safety, $EB_i$, to the regional average or norm for locations with identical traits (i.e. $E(\Lambda)$ or $P_{50}$).
The location is considered collision prone if there is a significant probability, $\delta$ (usually not less than 0.95), that its EB safety estimate, $EB_i$, exceeds the specified norm, with the posterior parameters repeated from Equations (2.16 a, b) as

$$\beta_i = \frac{K}{E(\Lambda)} + 1 \quad \text{and} \quad \alpha_i = K + count \tag{2.16 a, b}$$

As each of these values is only an estimate of true mean values based on assumed probability density functions, this comparison is not simply a question of whether or not $EB_i > E(\Lambda)$. It is usually done through use of the integral of the posterior probability density function in the range from 0 to $E(\Lambda)$, to determine with what level of probability or within what Bayesian credible interval the two values are different and $EB_i > E(\Lambda)$. If one minus the result of integrating over this range exceeds the desired level of confidence, the subject location is identified as collision prone. Mathematically, the subject location is identified as collision prone when the following condition is met:

$$1 - \int_0^{E(\Lambda)} f_{EB}(\lambda)\,d\lambda = 1 - \int_0^{E(\Lambda)} \frac{\left[K/E(\Lambda) + 1\right]^{(K+count)}\,\lambda^{(K+count-1)}\,e^{-\left[K/E(\Lambda)+1\right]\lambda}}{\Gamma(K + count)}\,d\lambda > \delta \tag{2.22}$$

Figure 2.4 depicts graphically how the $EB_i$ is used to verify that a location $i$ is collision prone, using a 95% level of confidence, and $E(\Lambda)$ or $P_{50}$ as the prescribed threshold.

[Figure 2.4. Empirical Bayes Identification of Collision Prone Location. The figure shows the posterior density with the threshold, $E(\Lambda)$ or $P_{50}$, marked; the area to the left of the threshold should be less than 5% for location $i$ to be identified as collision prone at the 95% confidence level.]

2.3.1.5 Ranking of Collision Prone Locations

Where a group of locations has been identified as black spots, they will need to be ranked to ensure the locations in highest need are given more attention.
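Before any ranking takes place, each candidate must first pass the identification test of Equation 2.22. A minimal Python sketch of that test follows, using the CPM prediction as the norm. The function names and input values are hypothetical, and the incomplete gamma function is evaluated with a standard power series as a stand-in for a statistical library.

```python
import math


def reg_lower_gamma(a, x):
    """Regularized lower incomplete gamma P(a, x), by its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= x / (a + n)
        total += term
        n += 1
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))


def is_collision_prone(predicted, count, k, delta=0.95):
    """Test of Equation 2.22 with the CPM prediction as the norm.

    Returns (flag, probability that the posterior mean exceeds predicted).
    """
    shape = k + count              # alpha_i, Equation 2.16 a
    rate = k / predicted + 1.0     # beta_i, Equation 2.16 b
    prob = 1.0 - reg_lower_gamma(shape, rate * predicted)
    return prob > delta, prob


# Hypothetical locations sharing one CPM prediction of 4 collisions, K = 2.
for observed in (10, 5):
    flag, prob = is_collision_prone(predicted=4.0, count=observed, k=2.0)
    print(f"count = {observed}: probability = {prob:.3f}, collision prone = {flag}")
```

With these inputs the location with 10 observed collisions passes the 0.95 threshold, while the location with 5 does not, even though both exceed the prediction of 4.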
Traditionally, ranking criteria have simply used the calculated Potential Collision Reduction (PCR), based on the difference between the observed and the expected collision frequency (McGuigan, 1982), now calculated with an RTM correction as (Sayed, 1998)

$$PCR = EB - E(\Lambda) \tag{2.23}$$

While the attractiveness of this ranking criterion is that it focuses safety improvement efforts on the locations with the highest potential for collision reduction, it does have two weaknesses. The first weakness is the use of the expected mean term, $E(\Lambda)$, in the equation. Mean implies that up to half of the 'similar' sites would have lower mean collision frequencies than the expected mean. In these cases, the PCR would then actually be a greater value, suggesting these locations should have a higher ranking. However, the means for these locations are unknown and varied, making that a difficult calculation. It has been suggested that perhaps a comparison be made to the lowest tenth percentile, which may be closer to what is realistically possible in terms of safety improvement for each site (Hauer et al., 2002b). Again, this would require knowledge of the traits of each site in the reference group, which may not be practical.

The second weakness of this PCR criterion is that it tends to ignore low collision frequency locations that have experienced significant collision increases. Sawalha & Sayed (1999) have shown that augmenting the PCR ranking criterion with a second risk-reducing criterion, Collision Risk Ratio (CRR), can address this concern, based on the ratio of observed to expected collisions, defined as

$$CRR = \frac{EB}{E(\Lambda)} \tag{2.24}$$

A third 'screening' criterion has been proposed using collision modification factors (CMFs) to rank sites according to their anticipated safety benefits from application of countermeasures, defined as (Hauer et al.
2002b):

$$\text{Anticipated safety benefit} = \text{observed (RTM corrected) collision frequency} \times CMF = EB_i \cdot CMF \tag{2.25}$$

This criterion is attractive because of its ease of translation with CMFs, which are used by many practitioners. However, identified black spots are not necessarily similar and hence not treatable to the same degree by the same countermeasure. Moreover, differing countermeasures have differing CMFs. Therefore, this criterion would require additional input of some prior knowledge of each site and its possible countermeasure (and CMF), for each site in the group being screened, to ensure an accurate ranking. Further testing has been recommended to confirm a practical and reliable ranking method to screen sites with promise (Hauer et al., 2002b).

2.3.2 Diagnosis of Black Spots

Once a black spot is identified and ranked for treatment, diagnosis begins to find what may be causing the safety problem. To ensure completeness and accuracy, and to avoid premature conclusions, a systematic methodology is followed during the diagnostic phase (Sayed, 1998). There are generally two steps to diagnostic investigations: collision analysis and site-specific analysis. First, the collision history (usually for the last 2-3 years) is analyzed to identify over-represented clusters of particular collision types. This is done by comparing the actual percentage of collision types at the location with the average percentage of these types at similar locations. Second, location-specific data (physical and operational) are identified and analyzed, including consultation with local road agencies, site traits, and observations of driver characteristics. Depending on the significance of the identified black spot problem, analysis may also include performance of conflict studies, and/or video recordings (NAASRA, 1988; Sayed, 1998). This location-specific information is used to identify possible causes of the over-represented collision types.
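The first diagnostic step, comparing collision type percentages at the location with those at similar locations, can be sketched in a few lines of Python. This is a deliberately simple screening comparison rather than a statistical test; the function name, the optional margin, the collision type labels, and all counts and shares are hypothetical.

```python
def overrepresented_types(site_counts, reference_shares, margin=0.0):
    """Flag collision types whose share at the site exceeds the reference share.

    site_counts      : dict of collision type -> count at the location
    reference_shares : dict of collision type -> average share at similar sites
    margin           : assumed tolerance added to the reference share
    """
    total = sum(site_counts.values())
    flagged = {}
    for ctype, n in site_counts.items():
        share = n / total
        ref = reference_shares.get(ctype, 0.0)
        if share > ref + margin:
            flagged[ctype] = (share, ref)
    return flagged


site = {"rear-end": 6, "left-turn": 10, "sideswipe": 4}
reference = {"rear-end": 0.45, "left-turn": 0.25, "sideswipe": 0.30}
for ctype, (share, ref) in overrepresented_types(site, reference).items():
    print(f"{ctype}: {share:.0%} at site vs {ref:.0%} at similar locations")
```

In practice the comparison would be made with a significance test such as the Beta-Binomial approach reviewed earlier, since small collision counts make raw percentages unstable.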
In identifying possible causes of over-represented collision types, two questions need to be answered (Sayed, 1998). First, what existing conditions at the location could contribute to the occurrence of collisions? Second, what changes, if any, could be made to reduce the number and/or severity of these collisions? The over-representation of a certain collision type may point to a specific problem. For example, an over-representation of wet weather collisions may be attributed to a poor pavement texture or a poor drainage system or both. Once the causes of the safety problem have been identified, the next step is to generate and analyze a list of possible countermeasures (remedies) to identify the most appropriate safety treatment for this specific location.

2.3.3 Remedy of Black Spots

Having discovered why the CPLs are hazardous, the last step in the reactive road safety improvement method is to identify the most cost-effective countermeasure(s) to solve the problem. As a decision-aid, many publications relate over-represented collision types, causes, and countermeasures, segregated by location and type (Box, 1976; FHWA, 1981; NAASRA, 1988; Sayed, 1998). Often, more than one countermeasure is identified that has the potential to remedy the problem, and an economic analysis is then conducted to gauge their respective economic efficacy. The most popular economic analysis method involves conducting a benefit-cost (B/C) study of each candidate countermeasure (Sayed, 1998). In B/C studies, the ratio between the present value of benefits and the present value of costs is called a benefit-cost ratio. Countermeasure costs are those related to installation and operation of the countermeasure. Benefits are usually valued according to the reductions in: travel time, vehicle operation, collision frequency, and collision severity.
Each candidate countermeasure with a B/C ratio > 1 is considered economically feasible, and the feasible candidates are then ranked according to their B/C ratios. Feasible countermeasures are selected for implementation according to their ranking, subject to budget considerations. If the cost of the selected countermeasure would make the cumulative project cost exceed the budget, it is skipped, and other feasible countermeasures further down the ranked list are selected. In selecting countermeasures for implementation this way, the effectiveness of safety funding can be maximized. In some cases, knowledge-based expert systems are available as an empirical RSIP decision-aid tool (Sayed, 1997). Regardless of what decision-aid tools are used, the final choice of which countermeasures to implement should involve some engineering judgement and experience.

Following the implementation of countermeasures, an evaluation of their effectiveness is usually conducted, including the use of Odds Ratios (ORs) and CPMs. Post-implementation evaluation of countermeasures contributes important data for reference in CMF estimation, and helps to ensure that road safety programs remain cost effective. The method used to estimate the improvement effect of any given countermeasure is based on the approach described in Hauer (1997), Sayed (1998), and Sayed & de Leur (2001a).
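The B/C ranking and budget-constrained selection described earlier in this section can be sketched in Python as follows. The function name, the countermeasure names, and all present values are hypothetical; the sketch assumes that benefits and costs have already been discounted to present values.

```python
def select_countermeasures(candidates, budget):
    """Rank feasible countermeasures by B/C ratio and select within budget.

    candidates : list of (name, present value of benefits, present value of costs)
    budget     : total funding available, in the same units as the costs
    """
    # Keep only economically feasible candidates (B/C ratio above one).
    feasible = [(name, b / c, c) for name, b, c in candidates if b / c > 1.0]
    feasible.sort(key=lambda item: item[1], reverse=True)

    chosen, spent = [], 0.0
    for name, ratio, cost in feasible:
        if spent + cost > budget:
            continue  # skip it and try candidates further down the ranking
        chosen.append(name)
        spent += cost
    return chosen, spent


candidates = [
    ("left-turn bay", 900, 300),     # B/C = 3.0
    ("signal retiming", 200, 50),    # B/C = 4.0
    ("resurfacing", 1000, 400),      # B/C = 2.5
    ("lighting", 300, 150),          # B/C = 2.0
    ("median barrier", 400, 500),    # B/C = 0.8, not feasible
]
chosen, spent = select_countermeasures(candidates, budget=500)
print(chosen, spent)
```

In this example resurfacing is skipped because it would push the cumulative cost past the budget, and the lower-ranked lighting project is funded instead.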
The reduction in PDO collisions, and in severe collisions (fatal & injury), are calculated separately using the OR according to Equation (2.26) as

$$OR = \frac{A/C}{B/D}, \quad \text{with Treatment Effect} = OR - 1 = CMF \tag{2.26 a, b}$$

where: $A$ = the number of collisions in the comparison group that occurred in the period before the countermeasure was implemented; $B$ = the EB safety estimate of the number of collisions that would have occurred at the site had no treatment taken place; $C$ = the number of collisions in the comparison group that occurred in the period after countermeasure implementation; and, $D$ = the number of collisions at the location that occurred in the period after countermeasure implementation.

The comparison group must be a randomly selected sample of locations in the same time-space region as the subject location, regardless of collision history. The role of the comparison group is to represent the time trend from the before to the after period (Sayed & de Leur, 2001a). All of the quantities are directly observable except for quantity $B$, which is calculated using the EB refinement and CPM techniques according to Equation 2.27 (Sayed & de Leur, 2001a) for the subject location as

$$B = EB_a = EB_b\,\frac{E(\Lambda_a)}{E(\Lambda_b)} \tag{2.27}$$

where: $EB_a$ = the EB safety estimate of the treated location in the after period had no treatment taken place; $EB_b$ = the EB safety estimate of the treated location in the before period; $E(\Lambda_a)$ = the mean collision frequency estimated by the CPM for the location using its traffic volumes in the after period; and, $E(\Lambda_b)$ = the mean collision frequency estimated by the CPM for the location using its traffic volumes in the before period.

As these observed and calculated values all have assumed underlying probability distributions, the OR in Equation (2.26 a) must be adjusted for the expected variance in its estimate.
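As a point of reference before that adjustment is made, the unadjusted calculation in Equations (2.26) and (2.27) can be sketched in Python. The function names and all input values are hypothetical, and the sketch returns only the point estimate, without the variance correction.

```python
def expected_without_treatment(eb_before, cpm_before, cpm_after):
    """Quantity B: the before-period EB estimate scaled by the CPM ratio."""
    return eb_before * cpm_after / cpm_before


def odds_ratio(a, b, c, d):
    """Unadjusted OR = (A / C) / (B / D)."""
    return (a / c) / (b / d)


# Hypothetical treated site: EB estimate of 8 before treatment, with CPM
# predictions of 4 (before volumes) and 5 (after volumes).
b = expected_without_treatment(eb_before=8.0, cpm_before=4.0, cpm_after=5.0)

# Hypothetical comparison group counts (A before, C after), and D = 6
# collisions observed at the treated site after treatment.
ratio = odds_ratio(a=200, b=b, c=180, d=6)
print(f"B = {b:.1f}, OR = {ratio:.3f}, treatment effect = {ratio - 1.0:.3f}")
```

A negative treatment effect, as in this example, indicates fewer collisions than would have been expected without treatment, after allowing for the time trend in the comparison group.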
Using the method of statistical differentials, the mean and variance of the OR can be obtained as shown in Equations (2.28), (2.29), and (2.30) as (Sayed & de Leur, 2001a)

$$E(OR) = \left[\frac{A/C}{B/D}\right]\left[1 + \frac{Var\,B}{B^2} + \frac{Var\,C}{C^2}\right] \tag{2.28}$$

$$Var(OR) = \left[\frac{A/C}{B/D}\right]^2\left[\frac{Var\,A}{A^2} + \frac{Var\,B}{B^2} + \frac{Var\,C}{C^2} + \frac{Var\,D}{D^2}\right] \tag{2.29}$$

where:

$$Var\,B = Var(EB_b) = (EB_b)\,\frac{E(\Lambda_b)}{K + E(\Lambda_b)} \tag{2.30}$$

$Var\,A$, $Var\,C$ = variances of $A$, $C$, for the comparison group in the before and after periods, respectively; and, $Var\,D = D$, since the observed number of collisions is assumed to be Poisson distributed.

Again, as with all EB refinement techniques, the OR method relies heavily on the use of CPM estimates. However, for the programs on which these techniques rely to be effective, the CPMs must be properly developed. Therefore, it is imperative that CPMs be developed in accord with methods recommended in the literature.

2.3.4 Collision Prediction Model Development

The procedures involved in CPM development are reviewed below in several sections, including consideration of regression method, model form, GLM process, selection of explanatory variables, goodness of fit, and outlier analysis. A more detailed discussion of each consideration is given in Sawalha & Sayed (1999), Sayed & de Leur (2001b), and Sawalha & Sayed (2005a).

2.3.4.1 Regression Techniques

There are two main approaches to estimating CPM parameters. The traditional approach has used linear regression, assuming a Normal (Gaussian) distribution error structure. However, it has been found that the standard conditions under which conventional linear regression would be appropriate (Normal model errors, constant error variance, and the existence of a linear relationship between the response and explanatory variables) cannot be assumed to exist when modeling the occurrence of traffic collisions (Sayed & de Leur, 2001a).
More recently, generalized linear regression (GLM) assuming a Poisson or Negative Binomial (NB) error structure has become the norm, with many GLM models available in the literature (Hauer et al., 1988; Miaou & Lum, 1993; Miaou, 1996; Sayed & Rodriguez, 1999; Sawalha & Sayed, 2001). The GLM method has the advantage of overcoming the limitations associated with the use of conventional linear regression in modeling discrete, non-negative, and rare events such as traffic collisions. The resulting collision model predictions are able to fit the observed collision data much better than do conventional linear regression models. In addition, the non-Normal error distributions better describe the unexplained random variations in road collisions (Kulmala, 1995; Miaou, 1996). With this non-linear, non-Normal regression capability, the choice of model form is also facilitated.

2.3.4.2 Model Form

Using the GLM method allows the use of non-linear CPM equations, which have now become the convention for modelling collisions (Hauer, E.; Harwood, D.W.; Council, F.M.; Griffith, M.S., 2002). Sawalha & Sayed (1999) note that the non-linear model form should satisfy two conditions. First, the model must yield logical results. Specifically, there must be zero risk of collision with zero exposure. In other words, when no vehicles are using a road segment, there can be no collisions on that road segment. This fundamental zero exposure equals zero collision risk property (zero risk logic) underscores a major weakness in linear collision models, which can lead to the illogical prediction of negative collisions. Hauer et al. (1988) showed that the most appropriate model form for intersections relates collisions to exposure via the product of traffic flows raised to some power. In addition to traffic flow, Kulmala (1995), Miaou (1996), and Sawalha & Sayed (1999) have shown that there are many variables affecting collision events, such as geometric features and traffic controls.
Based on empirical case studies, they suggest that the proper CPM form consists of the exposure measure(s) (raised to some power) multiplied by an exponential function incorporating other explanatory variables. They show that this CPM form is logical yet recognizes other (less dominant) non-exposure variables, and that its non-negative interactivity fits the data well. For the expected number of collisions predicted at intersections, the recommended model form is:

$$E(\Lambda) = a_0 \times V_1^{a_1} \times V_2^{a_2} \times e^{\sum b_j x_j} \tag{2.31 a}$$

Similarly, the model form for the expected number of collisions predicted for road segments is recommended as

$$E(\Lambda) = a_0 \times L^{a_1} \times V^{a_2} \times e^{\sum b_j x_j} \tag{2.31 b}$$

where: $E(\Lambda)$ = expected or predicted collision frequency for a specific collision type; $L$ = road segment length; $V$, $V_1$, $V_2$ = road segment and intersection major/minor road traffic volumes; $x_j$ = explanatory variables (e.g. grade, driveway density, land use, approach width); and, $a_0$, $a_1$, $a_2$, $b_j$ = model parameters derived via the GLM process.

The second condition that the non-linear model form should satisfy is that there must be a known linear transformation or link function that can linearize this model form for the purpose of coefficient estimation during the GLM process. Sawalha & Sayed (1999, 2001) have demonstrated how this transformation can be done with the use of a logarithmic link function. For example, the log-linear form of Equation (2.31 a) would become

$$Ln[E(\Lambda)] = Ln(a_0) + a_1 Ln(V_1) + a_2 Ln(V_2) + \sum (b_j x_j) \tag{2.31 c}$$

2.3.4.3 The GLM Process

Once a model form has been chosen, there are several GLM statistical software packages (e.g. GLIM4, SAS) that can be chosen to do the actual regression analysis (Dobson, 1990). They can be used to model data that follow a wide range of probability distributions belonging to the exponential family, including: Normal, Poisson, binomial, NB, gamma, and many others. A generic GLM methodology has been set out by Hauer et al.
(1988), Bonneson & McCoy (1993), and Miaou (1996). Subsequently, Sawalha & Sayed (2001, 2005a) have critiqued and refined that methodology. Although the CPM methodology has been refined, a well developed CPM will still only explain variation in the collision data related to the systematic variation. As several researchers have shown, the systematic variation accounts for between 40% and 70% of the total variation observed in collision data (Kulmala, 1995; Fridstrom et al., 1995; Miaou, 1996). They have shown that the remaining variation or residual prediction error around $E(\Lambda)$ is mainly³ due to purely random variations which are best described by Poisson or NB distributions. As part of the GLM process, the software requires the user to specify which distribution the residuals are expected to follow. Which of these error distributions is chosen is left to the user to specify based on a review of the collision data used. Sayed & de Leur (2001b) noted that collision model parameters can be estimated first assuming a Poisson error structure, then checked via a dispersion factor:

$$\sigma_d = \frac{\text{Pearson}\,\chi^2}{n - p} \qquad (2.32)$$

where:

$$\text{Pearson}\,\chi^2 = \sum_{i=1}^{n} \frac{[y_i - E(\Lambda_i)]^2}{\mathrm{Var}(y_i)} \qquad (2.33)$$

$n$ = Number of locations;
$y_i$ = Observed mean collision frequency at location $i$ over a specified time period (e.g. 3 years);
$\mathrm{Var}(y_i)$ = Variance of the observed mean collision frequency at location $i$;
$E(\Lambda_i)$ = Expected mean collision frequency for location $i$ as obtained by the CPM; and,
$p$ = Number of CPM parameters.

If $\sigma_d$ exceeds unity, meaning the data is over-dispersed, then an NB error distribution should be assumed (Sayed & de Leur, 2001b). Kulmala & Roine (1988), Miaou & Lum (1993), Kulmala (1995), and Miaou (1996) have shown that most collision data is over-dispersed, with variance greater than the mean, and suggest that the NB distribution is usually the better fitting error assumption for model development.
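As a rough illustration of the model form and the dispersion check described above, the following Python sketch evaluates the intersection form of Equation (2.31a) and computes the dispersion factor of Equation (2.32) under a Poisson assumption, where Var(y_i) is taken equal to E(Λ_i). The function names and every numeric value are hypothetical, chosen only for illustration; they are not parameters from this thesis or from the cited studies.

```python
import math

def predict_intersection(a0, a1, a2, v1, v2, b=(), x=()):
    """Expected collisions: a0 * V1^a1 * V2^a2 * exp(sum_j b_j * x_j)."""
    linear_part = sum(bj * xj for bj, xj in zip(b, x))
    return a0 * (v1 ** a1) * (v2 ** a2) * math.exp(linear_part)

def pearson_chi2(observed, predicted, variances):
    """Sum over locations of (y_i - E_i)^2 / Var(y_i), as in Equation (2.33)."""
    return sum((y - mu) ** 2 / var
               for y, mu, var in zip(observed, predicted, variances))

def dispersion_factor(observed, predicted, n_params):
    """Pearson chi-square divided by (n - p), assuming Poisson errors.

    Assumption: under a Poisson error structure the variance equals the
    mean, so the predicted values are used as the variances.
    """
    chi2 = pearson_chi2(observed, predicted, predicted)
    return chi2 / (len(observed) - n_params)

# Hypothetical observed and predicted collision frequencies at five sites.
observed = [2, 0, 7, 1, 12]
predicted = [3.0, 1.0, 4.0, 2.0, 6.0]
sigma_d = dispersion_factor(observed, predicted, n_params=3)
print(f"dispersion factor = {sigma_d:.3f}")
```

A dispersion factor well above 1.0 for these made-up numbers would point toward the NB error structure. Note also that the multiplicative form returns zero collisions whenever either traffic volume is zero, which is the zero risk logic required of the model form.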
Miaou & Lum (1993) have identified three possible sources of extra-Poisson dispersion or overdispersion in collision data: complex and unknown causes, traffic exposure data errors, and non-homogeneous road environment. Complex and unknown causes relate to omitted explanatory variables which have not been identified, or for which insufficient data exists to model. Traffic exposure data errors relate to the inaccuracies that can arise in traffic data depending on the sampling methodology. Non-homogeneous road environment causes refer to in-situ effects which can influence drivability and driving environment throughout the day or year (e.g. weather, lighting). In any case, once an error distribution, model form, and link function have been specified, the multivariate regression software is able to provide estimates of the model parameters, and an estimate for the value of the overdispersion or shape parameter, $\kappa$, using one of three methods: Maximum Likelihood Estimation (MLE), the expected value of the $\chi^2$ statistic, or the mean Scaled Deviance (Lawless, 1987). Sawalha & Sayed (2005a) have shown that the ML method can provide a reasonable estimate of $\kappa$. Miaou (1996) has confirmed that the MLE method provides accuracy comparable to the other methods. Although there are several ways to estimate $\kappa$, an iterative estimate through the MLE method was used in this research, as described in Sawalha & Sayed (2005a). As the GLM process proceeds, another model development decision is which variables to select when building the CPM.

³ Some residual variation around the CPM estimate for $E(\Lambda_i)$ may also be due to systematic factors not included as variables in the CPM, possibly because of insufficient data to provide statistically reliable regression parameters for them as explanatory variable parameters. See further discussion following, and in Fridstrom et al. (1995), and Miaou (1996).
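To make the maximum likelihood idea concrete, the sketch below estimates κ for an NB distribution with variance E(y) + E(y)²/κ, holding the predicted means fixed. This is a simplification and an assumption on our part: the iterative procedure referred to above re-estimates the regression parameters and κ in turn, whereas this sketch performs only the κ step, using a golden-section search. The function names and data are hypothetical.

```python
import math

def nb_log_likelihood(kappa, observed, predicted):
    """Log-likelihood of NB counts with mean mu and shape kappa."""
    total = 0.0
    for y, mu in zip(observed, predicted):
        total += (math.lgamma(y + kappa) - math.lgamma(kappa)
                  - math.lgamma(y + 1)
                  + kappa * math.log(kappa / (kappa + mu))
                  + y * math.log(mu / (kappa + mu)))
    return total

def estimate_kappa(observed, predicted, lo=1e-3, hi=1e3, tol=1e-8):
    """Maximize the NB log-likelihood over kappa, searching on log(kappa).

    Assumption: the likelihood has a single interior maximum on [lo, hi],
    which is typical of over-dispersed data.
    """
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = math.log(lo), math.log(hi)

    def f(t):
        return nb_log_likelihood(math.exp(t), observed, predicted)

    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc > fd:            # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:                  # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return math.exp((a + b) / 2)

# Hypothetical observed counts and fitted means at eight locations.
observed = [0, 3, 1, 10, 2, 25, 4, 0]
predicted = [2.0, 3.0, 2.0, 5.0, 3.0, 12.0, 4.0, 1.0]
kappa_hat = estimate_kappa(observed, predicted)
print(f"estimated kappa = {kappa_hat:.3f}")
```

In a full fit, this step would alternate with re-estimation of the model parameters until both stabilize.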
2.3.4.4 Selection of Explanatory Variables

While many possible factors influence collision events, not all of them may be appropriate as explanatory variables in a CPM. The recommended method for adding independent variables is a forward stepwise procedure (Sawalha & Sayed, 2005a). In this procedure, variables are added to the model one by one and tested for significance, starting with the exposure variables. Each time a variable is added to the model, the GLM process is repeated to evaluate the newly added variable based on several criteria. First, the t-ratio (equivalent to the Wald statistic) of the added variable's estimated coefficient must be significant at the 95% confidence level. Second, its logic (i.e. +/- sign) should be assessed as to whether it meets with intuitive expectations. Last, the addition of the variable to the model should cause a significant drop in the CPM's Scaled Deviance (SD) at the 95% confidence level. The SD is defined as the likelihood ratio test-statistic, which measures twice the difference between the maximized log-likelihoods of the studied model and the full or saturated model. The full model is one with as many parameters as there are observations, which fits the data perfectly but becomes impractical as a forecasting model (Kulmala, 1995). Therefore, the full model, which possesses the maximum log-likelihood achievable under the given data, provides a baseline for assessing the goodness of fit of an intermediate model with p parameters.
If the error structure is Poisson distributed, the SD is defined as:

$$SD = 2 \sum_{i=1}^{n} y_i \ln\left[\frac{y_i}{E(\Lambda_i)}\right] \qquad (2.34.a)$$

If it follows an NB distribution, McCullagh & Nelder (1989) define it as:

$$SD = 2 \sum_{i=1}^{n} \left[ y_i \ln\left(\frac{y_i}{E(\Lambda_i)}\right) - (y_i + \kappa) \ln\left(\frac{y_i + \kappa}{E(\Lambda_i) + \kappa}\right) \right] \qquad (2.34.b)$$

As SD is asymptotically $\chi^2$ distributed with $(n - p)$ degrees of freedom for exponential distributions, a drop of at least $\chi^2_{0.05,1} = 3.84$ is needed to confirm that the added variable significantly enhances CPM predictive accuracy, and that it is not correlated with other independent variables already in the model (Sawalha & Sayed, 2005a). If the added variable meets all criteria, it remains in the model, and the testing process is repeated for the next explanatory variable to be added. The selection process ends when all desired variables have been tested. The focus on model development then shifts to assessing and refining overall model goodness of fit.

2.3.4.5 Goodness of Fit

While individual variables are tested to confirm significant contributions to the model, how well the model itself predicts or fits observed data also needs to be tested. Goodness of fit methodology follows that set out by Sawalha & Sayed (1999), Sayed & de Leur (2001b), and Sawalha & Sayed (2005a). There are quantitative and qualitative ways to assess overall model fit. Quantitatively, the SD, Pearson $\chi^2$, and $\kappa$ measures are used to determine goodness of fit. The SD and the Pearson $\chi^2$ statistics should be less than the $\chi^2$ distribution value with $(n - p - 1)$ degrees of freedom at a 95% confidence level. Although no minimum value is recommended, a review of previously developed CPMs reveals that estimates for $\kappa$ usually exceed 1.0 (Sawalha, 2002); therefore, this quantitative measure was also used in this thesis. Several subjective measures are also commonly used. A plot of the mean collision frequency predicted by the model, $E(\Lambda_i)$, versus those observed at each location, $y_i$, should show points clustered around the 45 degree line.
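The scaled deviance definitions and the 3.84 threshold used in the forward stepwise procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, terms with y_i = 0 are treated as contributing zero to the y·ln(y/E) part (the usual limiting convention), and the numbers are invented for demonstration only.

```python
import math

CHI2_CRITICAL_1DF_95 = 3.84  # chi-square value for 1 degree of freedom at 95%

def sd_poisson(observed, predicted):
    """Scaled deviance for a Poisson error structure, Equation (2.34.a)."""
    return 2.0 * sum(y * math.log(y / mu)
                     for y, mu in zip(observed, predicted) if y > 0)

def sd_negative_binomial(observed, predicted, kappa):
    """Scaled deviance for an NB error structure, Equation (2.34.b)."""
    total = 0.0
    for y, mu in zip(observed, predicted):
        first = y * math.log(y / mu) if y > 0 else 0.0
        total += first - (y + kappa) * math.log((y + kappa) / (mu + kappa))
    return 2.0 * total

def variable_improves_fit(sd_without, sd_with, critical=CHI2_CRITICAL_1DF_95):
    """True if adding the variable drops the SD by at least the critical value."""
    return (sd_without - sd_with) >= critical

# Hypothetical example: SD before and after adding a candidate variable.
print(variable_improves_fit(sd_without=120.0, sd_with=115.0))  # drop of 5.0
print(variable_improves_fit(sd_without=120.0, sd_with=118.0))  # drop of 2.0
```

In practice the SD values would come from refitting the GLM with and without the candidate variable; the comparison shown here covers only the deviance criterion, not the t-ratio or sign checks.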
A second subjective measure is to plot the Pearson Residuals ($PR_i$) versus the predicted collisions for each location. The Pearson Residual is defined as:

$$PR_i = \frac{E(\Lambda_i) - y_i}{\sqrt{\mathrm{Var}(y_i)}} \qquad (2.35)$$

A well-fit model should see values of $PR_i$ clustered around zero over the full range of predictions. A third subjective measure of model goodness of fit is to plot the average of squared residuals (SR) versus the predicted collision frequency. The average of squared residuals is defined as:

$$\text{Average of SR} = \frac{\sum_{i=1}^{n} \left(E(\Lambda_i) - y_i\right)^2}{n} \qquad (2.36)$$

where $n$ is the number of locations in each group (e.g. five locations at a time). This is typically done after ranking the locations in order of predicted collision frequency, by plotting the averages of predicted collisions versus the averages of squared residuals (taken in groups) over the full set of locations (Sawalha & Sayed, 2001). For a well fit model, the points should cluster about the variance function line for a negative binomial error distribution as:

$$\mathrm{Var}(y_i) = E(y_i) + \frac{E(y_i)^2}{\kappa} \qquad (2.37)$$

Figure 2.5 shows typical plots of Pearson Residuals (a), and Squared Residuals (b) for a CPM.

[Figure 2.5. Subjective Model Goodness of Fit Measures. a. Predicted vs Pearson Residual; b. Predicted vs Squared Residual (plotted with $\kappa$ = 2.8).]

2.3.4.6 Outlier Analysis

In assessing overall model goodness of fit, one or more of the quantitative and/or qualitative assessment measures may not meet expectations. This lack of fit is often related to problems with data quality. Addressing possible data quality problems using outlier analysis is the last step in refining model fit. This identifies and removes those unusual or extreme observations that are not typical of the rest of the data.
It is an especially important step if either of the SD or Pearson $\chi^2$ values is close to or above the $\chi^2$ distribution value. Outliers may include either response or explanatory data points. They may be caused either because the data points are genuinely different or because errors took place during data collection and/or recording. The method summarized below follows that from Sayed & Rodriguez (1999) and Sawalha & Sayed (2005b), who describe the use of the Cook's Distance (CD) measure for outlier analysis. The higher the value for a given observation, $CD_i$, the stronger its influence on the model.

$$CD_i = \frac{h_i}{p\,(1 - h_i)} \left(r_i^{PS}\right)^2 \qquad (2.38.a)$$

where:
$h_i$ is the leverage value;
$r_i^{PS}$ is the standardized residual of point $i$, calculated as

$$r_i^{PS} = \frac{y_i - \hat{y}_i}{\sqrt{\mathrm{Var}(y_i)\,(1 - h_i)}}, \quad \text{so that} \quad \left(r_i^{PS}\right)^2 = \frac{PR_i^2}{1 - h_i} \qquad (2.38.b)$$

$PR_i$ is the Pearson Residual defined in Equation (2.35); and,
$p$ is the number of parameters.

Thus, the CD measure is made up of two components. First, $PR_i$ reflects how well or how poorly the model fits the observation, $y_i$. Second, $h_i$ reflects how far the data is from the rest of the points. Sawalha & Sayed (2005b) suggest sorting the observations in descending order of CD and, in stepwise progression, removing points with the largest CD value. As each point is removed, regression parameter estimation is re-run with $\kappa$ fixed at its previous value to observe the change in goodness of fit via the SD statistic. If the SD change is greater than $\chi^2_{0.05,1} = 3.84$, the GLM software is re-run to provide new estimates of all parameters, including a new $\kappa$ and a new CD for each remaining point. Sawalha & Sayed (2005b) recommend that each variable's t-statistic continue to be monitored to ensure it remains significant throughout outlier analysis. This stepwise outlier analysis is repeated until the change in SD becomes less than 3.84. When the last outlier has been removed from the dataset, regression software is re-run one last time to determine a new estimate for each parameter, including a new $\kappa$ value.
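The ranking step of the outlier analysis can be sketched as below. The sketch assumes the standard GLM definition of Cook's distance, CD_i = h_i / (p(1 - h_i)) × (r_i^PS)², with the standardized residual taken as PR_i / sqrt(1 - h_i), and it assumes that the Pearson residuals and leverage values have already been produced by the regression software. Function names and numbers are hypothetical.

```python
import math

def standardized_pearson_residual(pearson_residual, leverage):
    """Pearson residual scaled by sqrt(1 - h_i)."""
    return pearson_residual / math.sqrt(1.0 - leverage)

def cooks_distance(pearson_residual, leverage, n_params):
    """Cook's distance from a Pearson residual, leverage, and parameter count."""
    r_ps = standardized_pearson_residual(pearson_residual, leverage)
    return leverage / (n_params * (1.0 - leverage)) * r_ps ** 2

def rank_by_cooks_distance(pearson_residuals, leverages, n_params):
    """Return (CD, index) pairs sorted from most to least influential."""
    scored = [(cooks_distance(pr, h, n_params), i)
              for i, (pr, h) in enumerate(zip(pearson_residuals, leverages))]
    return sorted(scored, reverse=True)

# Hypothetical residuals and leverages for three locations, model with p = 2.
ranking = rank_by_cooks_distance([0.5, 2.0, -1.0], [0.1, 0.5, 0.2], n_params=2)
for cd, index in ranking:
    print(f"location {index}: CD = {cd:.4f}")
```

The location at the top of the ranking would be the first candidate for removal; the model would then be refitted and the SD change compared with 3.84, as the procedure above describes.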
Examples of fitted CPMs using the GLM process are well documented, including Sayed & Rodriguez (1999), and Sawalha (2002).

2.3.5 Issues with the Traditional Reactive RSIP Approach

Within some limitations, the traditional, reactive engineering approach to road safety has been very effective in identifying and treating hazardous locations. Recent improvements using empirical Bayes techniques and CPMs have further increased the reliability of reactive empirical tools, and the overall RSIP cost effectiveness using the reactive engineering approach. However, there are limitations to the reactive approach. Road safety improvement programs in reaction to existing hazardous locations can only do so much to address unacceptably high collision frequencies. There has been much work done on identification of the shortcomings of the reactive approach versus the need for a more proactive engineering approach to road safety (van Schagen & Janssen, 2000; Waller, 2000; de Leur & Sayed, 2003; Herbel, 2004). De Leur & Sayed (2003) point out four systematic problems with continuing to pursue only reactive strategies. First, only a small proportion of the road system is reviewed for safety (i.e. collision locations only). Second, the road safety of any one location is evaluated only after an extended collision history and several years of excessive collision frequencies identify it as hazardous. Third, collision data on which reactive black spot programs rely is often of questionable quality and/or not fully reported. Without reliable data, the safety burden of hazardous locations on their surrounding communities may lie hidden for many years before detection. Finally, fixing road safety problems after the neighbourhood and supporting road system are built and open is always costly. They suggest that it would be much more efficient to evaluate safety, detect potential problems, and revise designs proactively, before construction begins, as part of the planning process.
To this end, road safety authorities and researchers are now pursuing proactive engineering approaches to prevent black spots and their associated social and economic burdens on society from occurring in the first place.

2.4 Proactive Road Safety Improvement Programs

Rather than working to improve the safety of existing facilities, the proactive engineering approach to road safety improvement focuses on predicting and improving the safety of planned facilities (de Leur & Sayed, 2003). The goal of the proactive approach is to minimize the road safety risk by evaluating safety throughout each stage of the planning process, to preclude black spots from occurring at all. If road safety is explicitly addressed as one of the evaluation factors before a project is built, it reduces the number and cost of reactive safety countermeasures that have to be retrofitted into existing communities (de Leur & Sayed, 2003). Lower-cost RSIP strategies through proactive intervention may in the long term be a more effective and sustainable road safety engineering approach than reactive strategies. Enhanced effectiveness and sustainability can occur when road safety is explicitly evaluated throughout road planning and design, before the driver is exposed to it. Reducing the amount of time and/or distance that a driver is exposed to the driving task invariably reduces the chance of resultant driver errors. Reducing driver errors translates to a lower level of collision risk and consequence, thereby increasing protection for drivers and the rest of society (Waller, 2000; de Leur & Sayed, 2003). Increased protection enhances sustainability when it produces a road system with inherently lower long-term traffic risk, for the long-term social, environmental, and economic benefit of the community (Wegman, 1996). However, the proactive approach can only be effective if supported by reliable empirical tools.
While reliable empirical tools exist to support the reactive engineering approach (i.e. micro-level CPMs), proactive tools are at a relatively early stage of development and not considered reliable (de Leur & Sayed, 2003; Herbel, 2004). Yet they are needed to enable planners and engineers to estimate the level of safety of planned projects, of design changes to those projects, and of other proposed safety improvements. This section reviews the state of development of current proactive empirical tools, including road safety audits, CPMs, Sustainable Road Safety programs, and Road Safety Risk Indices.

2.4.1 Road Safety Audits

Road Safety Audits (RSAs) can be applied either in reaction to an existing identified hazardous location, or proactively as part of a road planning and design exercise. Audits can be conducted at several project stages: feasibility, preliminary design, detailed design, pre-opening, and/or post-opening. Typically, RSAs involve an independent, multi-agency team performing an 'audit' or safety evaluation of a subject road, to identify ways to improve road safety performance (Austroads, 2001). RSAs are a proven road safety engineering tool to provide an explicit, formalized safety evaluation of road projects of any size. The cost to conduct an RSA has been estimated at 0.2% to 0.5% of total project costs, with benefits in the range of at least one fewer fatal collision per year, a first year rate of return in the range of 120% to 146%, and a B/C ratio of 36:1 (ARRB, 1999; Jordan, 2001). Until recently, forecasts on collision potential and safety improvements relied heavily on experience and professional judgment. The advent of micro-level CPMs that relate collision frequency, exposure, and geometric traits has provided an additional decision aid. However, micro-level CPMs can only be used to aid RSAs involving single intersections or road segments, where exposure is known or can be relatively accurately estimated.
The push toward more proactive, planning-level strategies is placing more emphasis on RSAs. Moreover, RSAs are being recommended as part of all regional and community-wide planning exercises (Wegman, 1996; Ho, Nepomuceno & Zein, 1998; Roberts, 1998; de Leur & Sayed, 2003; Hadayeghi et al., 2003; Herbel, 2004). As the practice of planning-level analyses spreads, improved empirical tools are being pursued that are capable of providing reliable safety planning evaluations of an entire community, including entire networks of road segments and intersections.

2.4.2 Combining Micro-Level CPMs & Regional Transportation Planning Models

Early efforts in the pursuit of improved empirical tools to do proactive, planning level analysis included two studies in which micro-level CPMs and Emme/2 regional transportation planning models were used (Ho & Guarnaschelli, 1998; Lord & Persaud, 2004). Both used forecast exposure data generated by Emme/2 software, one of the most widely used transportation planning model packages in North America (INRO, 2003). Emme/2 uses the traditional 4-step transportation-planning algorithm of trip generation, trip distribution, mode split, and trip assignment. Model outputs include traffic volumes, travel times, and shortest paths, which can be aggregated by link, mode, node, zone, sub-area, and region. Built-in Emme/2 macros can translate these outputs into vehicle-kilometres, volume/capacity, and speeds for alternative comparisons, priority-setting, and future capital planning programs (INRO, 2003). The first study to attempt a safety planning evaluation using Emme/2 and micro-level CPMs was conducted using the Greater Vancouver Regional District (GVRD) Emme/2 regional transportation planning model, and early forms of micro-level CPMs (Ho & Guarnaschelli, 1998; Volk, Felipe, Ho & Guarnaschelli, 1999). Two families of micro-level CPMs were used, including:

$$\text{Collisions/year} = a_0 \left(\frac{AADT_{maj}}{1{,}000}\right)^{b_1} \left(\frac{AADT_{min}}{1{,}000}\right)^{b_2}, \text{ for intersections} \qquad (2.39.a)$$

$$\text{Collisions/km-yr} = a_0\,(AADT)^{b_1}, \text{ for corridors} \qquad (2.39.b)$$

where:
AADT = Traffic volumes forecast by Emme/2 for individual road links and intersections; and,
$a_0$, $b_j$ = GLM parameter estimates related to facility type.

Facility types included signalized and unsignalized intersections, and two-lane, multi-lane, and freeway road segments. Equations 2.39.a and 2.39.b were integrated into the regional model using Emme/2 macro-commands. The macros extracted traffic volume forecasts and calculated AADT for each road network link and node. Next, the macros calculated road segment and intersection collision predictions. Finally, collision predictions for each link and node were summed to provide zone-by-zone and regional totals. Two long-range planning scenarios were compared. The first base-network scenario included expected future land uses, employment, and road networks. The second improved scenario added a new regional highway corridor to the first scenario. Three tests were conducted to ensure that the integrated Emme/2-CPM safety module was working properly. First, a simple empirical test confirmed that it agreed with manually calculated CPM estimates, verifying correct programming. Second, a corridor near the highway improvement was evaluated, verifying that the predicted change in collision patterns paralleled the changes in corridor traffic patterns. In the third test, the total collisions were checked to verify that the predicted collisions of the two scenarios differed at both the zonal and regional levels. Unfortunately, the results of this third and critical test did not meet expectations. At the zonal level, differences in safety estimates were marginal. At the regional level, the safety module predicted a 45% increase in collisions at 4-leg signalized intersections, a result inconsistent with zonal results.
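The link-and-node aggregation performed by the safety module macros can be pictured with the short Python sketch below. It is not the Emme/2 macro code, which is not reproduced in this text; it assumes a power form a0·(AADT_maj/1,000)^b1·(AADT_min/1,000)^b2 for intersections and a0·AADT^b1 per kilometre for corridors, and all names, parameter values, and volumes are hypothetical.

```python
from collections import defaultdict

def intersection_collisions(a0, b1, b2, aadt_major, aadt_minor):
    """Predicted collisions per year at one intersection (node)."""
    return a0 * (aadt_major / 1000.0) ** b1 * (aadt_minor / 1000.0) ** b2

def corridor_collisions(a0, b1, aadt, length_km):
    """Predicted collisions per year on one road segment (link)."""
    return a0 * aadt ** b1 * length_km  # per-km rate times segment length

def zone_totals(nodes, links):
    """Sum node and link predictions into zone-by-zone totals."""
    totals = defaultdict(float)
    for node in nodes:
        totals[node["zone"]] += intersection_collisions(
            node["a0"], node["b1"], node["b2"],
            node["aadt_major"], node["aadt_minor"])
    for link in links:
        totals[link["zone"]] += corridor_collisions(
            link["a0"], link["b1"], link["aadt"], link["length_km"])
    return dict(totals)

# Hypothetical network: one intersection and two road segments in two zones.
nodes = [{"zone": "A", "a0": 0.5, "b1": 0.5, "b2": 0.5,
          "aadt_major": 16000, "aadt_minor": 4000}]
links = [{"zone": "A", "a0": 0.01, "b1": 0.5, "aadt": 10000, "length_km": 2.5},
         {"zone": "B", "a0": 0.01, "b1": 0.5, "aadt": 2500, "length_km": 4.0}]

by_zone = zone_totals(nodes, links)
regional_total = sum(by_zone.values())
print(by_zone, regional_total)
```

Because the regional total is simply the sum of the zonal totals, a large regional change with only marginal zonal changes, as reported in the third test, signals an inconsistency somewhere in the inputs or the facility classification rather than in the summation itself.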
If there was such a significant change predicted at the regional level, it should have first been predicted at the zonal level, the area most likely to be impacted by the network change (Volk, Felipe, Ho & Guarnaschelli, 1999). An error analysis pointed to three possible problems: misinterpretations of non-standard intersections by the safety module macros; instability in the modelling process; and, inherently large variations in forecast traffic volumes that occur in Emme/2 at the individual link level (Ho & Guarnaschelli, 1998). It may also have been related to a Modifiable Areal Unit Problem (MAUP), possibly due to a lack of hierarchical consistency (Openshaw, 1984). Two recommendations were made regarding future study on planning-level empirical tools. First, as the CPMs used were not able to predict collisions at non-standard intersections, such as signalized T-intersections, nor at freeway interchanges, refined CPMs for these facilities were needed to reduce errors. Second, where local network or facility changes were planned in which traffic volumes were not expected to significantly change, Emme/2 sub-area models should have been used to refine forecasts. In a second similar study, Lord & Persaud (2004) attempted to integrate refined micro-level CPMs with an Emme/2 transportation planning model of the Toronto area. The refined CPMs included those for different road and intersection configurations (e.g. 4 lane arterials, 3-way intersections), and collision categories (e.g. PDO, injury, and fatal). Despite these methodological enhancements, they experienced similar results, including the following conclusions and recommendations:
o Use of micro-level CPMs with Emme/2 to do planning level analyses is cumbersome. All-in-one model software should be developed;
o Models that predict collisions based on road segment length may not be appropriate in use with modelled digital networks.
Modelled links are often shorter when bisected by nodes, and not reflective of actual alignments. For strategic modelling purposes, one link may replace several routes. A correction factor was recommended;
o Micro-level CPMs may be too sensitive to traffic volume values for use with Emme/2. Traffic volume data (and the corresponding explanatory variables) used to develop micro-level CPMs typically explain over 50% of the systematic variation in collision occurrence (Kulmala, 1995; Fridstrom et al., 1995). As such, the reliability of resulting CPMs is heavily dependent on accurate traffic volume forecasts. However, several have pointed out that traffic forecast errors on individual links in Emme/2 (and other strategic transportation models) may be in the order of 40% to 50% (Lin & Navin, 1999; Krishnamurthy & Kockelman, 2003);
o Temporal variations need to be taken into account; a correction factor is needed (Lord & Persaud, 2000);
o A major assumption in using the micro-level CPMs for planning forecasts and safety evaluations is that all characteristics that impact on safety remain fixed except for the one (e.g. traffic volume) being modified, which is not necessarily correct;
o Use of micro-level CPMs to do planning level analyses with Emme/2 is data intensive; moreover, future road and intersection locations, and traffic volumes may be unknown; and,
o Macro-level CPMs should be pursued, incorporating spatial statistical attributes and all road segments, on a zone-by-zone basis, rather than just those road segments included in the digitized Emme/2 road network.
These two studies revealed that micro-level CPMs cannot fill the gap between what is needed and what is available in terms of reliable safety planning tools for proactive road safety improvement programs. Micro-level CPM limitations relate to their prediction of the level of safety at a single location (e.g.
intersection or road segment), where traffic volume levels are known or can be estimated via short-term projections. However, the traffic forecasts at any one location derived from long-term planning-level analyses are known to be inaccurate (+/- 30%) due to their regional screenline level of calibration, their longer-term timeframe (i.e. often twenty years into the future), and their much broader focus (i.e. digitized major road networks only). As an alternative empirical tool, these studies further suggest that macro-level CPMs may fill that gap, and research has been underway to develop macro-level CPMs since the mid-1990s.

2.4.3 Dutch Sustainable Road Safety & Transportation Demand Management

The Dutch Sustainable Road Safety (SRS) program was launched in 1986 as a community-based, proactive road safety strategy, to reduce severe collisions through better-integrated community and transportation planning (Wegman, 1996; van Schagen & Janssen, 2000). Early results suggest that the SRS program has been effective at reducing collisions in a proactive manner. It also provides several important clues for the development of macro-level CPMs. The notion of sustainability was linked with road safety in response to suggestions that the root cause of present road safety problems lies in inadequate community planning, which created a built form that nurtures auto dependence. Increased auto dependence invariably has led to increased auto use, which in turn has led to increased driver exposure and collision risk (Buchanan, 1963; Mackay, 1993; Wegman, 1996). Alternatively, the SRS program works to ensure that all parts of a community's land use and transportation system promote safety in the long term, by planning an integrated, self-reinforcing, inherently safe, community-based traffic system (Wegman, 1996; van Schagen & Janssen, 2000).
Building on the sustainability vision, safety planning initiatives have been launched in five program streams, including (Wegman, 1996, 1997a, 1997b; Schermers, 1999; van Wee, 2000):
o Roads and land use;
o Technology transfer of national standards and seminars for practitioners;
o Financing partnerships with senior governments to encourage early adoption;
o User education and consultation; and,
o Transportation demand management (TDM) augmented with enforcement.
While the first four program streams are common in most RSIPs, the Dutch SRS program also relies significantly on TDM, or a Sustainable Transportation System (STS), something not often done in RSIPs by governments (Brown, 1992; Koltzow, 1993; Wegman, 1996, 1997a). TDM strategies are in effect meant to manage demand for travel in a way that utilizes the transportation system more efficiently. TDM represents a demand-side strategy response to growth, as opposed to the more traditional supply-side policy approach of building more roads (Khisty & Lall, 1998; Richardson, 1999). The premise on which TDM is based is that traditional supply-side growth management responses are not sustainable from either social, environmental, or economic perspectives (Nijkamp, 1994; TRB, 1998). Strategies such as those listed in Table 2.2 have been suggested as more sustainable ways to manage growth in travel demand by increasing transportation system efficiency (Nijkamp, 1994; Khisty & Lall, 1998; Richardson, 1999; VTPI, 2002).

Table 2.2. Transportation Demand Management Strategies.
Improved Transportation Choices: Transit improvements; Walking / cycling improvements; Flexible work weeks; Rideshare programs; Shuttle services; Car sharing; Tele-work, tele-shop, tele-study, tele-conference; Taxi improvements; Bike/transit integration; Guaranteed ride home.
Pricing Incentives: Congestion pricing; Distance-based fees; Employee transportation benefits; Parking cash out; Parking pricing; Pay-as-you-drive vehicle insurance; Fuel tax increases; Vehicle sales tax by size, efficiency; High-Occupancy-Toll lanes; U-Pass, universal transportation pass programs.
Land Use Management: Compact, dense, mixed-use development; Car free planning; Location-efficient development; Smart Growth; Parking management; Transit oriented development; Traffic calming; Bicycle/pedestrian-friendly development; Residential low-speed zones.
Education & Management, and Vehicle Systems Technology: Driver education; Freight transport management; Transit marketing; Commute trip reduction programs; Least-cost planning; Regulatory reforms; Non-motorized transportation encouragement; Smart growth policy reforms; Intelligent Transportation Systems; Advanced Traveller Information Systems; Smart highways; New vehicle technologies; Smart vehicles; New fuels; Collision/incident management; Manufacturer innovations; Traffic flow improvements; GPS/GIS automated monitoring; Peak spreading; HOV parking, road priority; Access management; Rail substitutes for trucks.

Presumably, TDM is considered an effective SRS strategy because it inherently aims to reduce traffic volumes, and in turn, collisions. However, the safety benefits of TDM have not been widely documented. Research continues to quantify the effectiveness of various TDM strategies, and develop improved empirical forecasting tools for decision-makers (Wallace, Mannering, & Rutherford, 1999; Richardson, 1999; Taylor & Ampt, 2003). However, only qualitative clues to TDM road safety benefits have been found to date. Litman (2002) conducted a comprehensive
Litman (2002) conducted a comprehensive Lovegrove 51 literature review to assess the empirical traffic safety benefits of individual T D M strategies, but found no empirical safety-related models. Although T D M progress is difficult to assess, the Dutch SRS program did have promising road safety results from comprehensive monitoring and extensive case studies. As a result, SRS program manuals for community and transportation agencies have now been released, including full engineering and planning requirements for urban areas (van Schagen & Janssen, 2000; CROW, 1998). They include the following SRS program strategies for proactive road safety planning: o Traffic-restrained residential areas should be continuous, densely-zoned, and of large cores; o Keep road design homogeneous, simple to understand, and cognizant of user expectations; o The smallest proportion of the trip should occur on the least safe portion of the network; o Design roads of each function to be unique, understandable, and recognizable to users; o Avoid the need for travellers to do extensive searches when nearing destinations; o Reduce speed on approaches to and within potential conflict points or locations; o Physically separate different transport modes, and different road functions; o Limit the number of engineering solutions and road types; o Prevent conflicts with crossing traffic and pedestrians; o The shortest and safest routes should be the same; o Avoid obstacles alongside the roadway; o Prevent conflicts with opposing traffic; and, o A l l trip lengths should be minimised. Based on these safety planning guidelines, Dutch researchers used simple collision rate ratios together with traffic flow and speed data from a traffic forecasting model to evaluate the safety, accessibility, and mobility impacts of the three neighbourhood road networks shown in Figure 2.6 (CROW, 2000). 
The grid network is based on neo-traditional neighbourhood planning techniques, without explicit consideration of road safety. The discontinuous network is based on the contemporary planning responses to shortcutting problems, which promote cul-de-sacs and interconnected off-road pedestrian pathways. The limited access network is based on SRS guidelines. Estimates of safety for each network were projected by looking at collision rates based on traffic volumes, differential speeds, roadside obstacles, and intersection types (van Schagen & Janssen, 2000). The results suggested a significant safety advantage for the sustainably safe road pattern (i.e. limited access network), with no adverse effects on accessibility or mobility, except on internal auto trips (Poppe, 1997c; Poppe & Galjaard, 1997). They have forecast total collision reductions of up to 20%, and severe collision reductions of up to 60% in the long term (Poppe, 1997c; Poppe & Galjaard, 1997).

Unfortunately, the models used in the Dutch analysis assumed a simple linear exposure-collision relationship, which was previously discussed in section 2.3.4 as an erroneous assumption. While increasing collisions are generally associated with increasing exposure, Hauer et al. (1988), Kulmala (1995), and Sawalha & Sayed (1999) have shown that the relationship is non-linear. Miaou & Lum (1993) have shown that this erroneous linear assumption can introduce model forecasting errors of up to 80%, which casts significant doubt on the accuracy of the Dutch results without further empirical refinements.

Figure 2.6. Alternative Neighbourhood Road Patterns: (a) Grid Network (Neo-traditional); (b) Discontinuous Network (Culs-de-sac); (c) Limited Access Network (SRS guidelines).
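To illustrate why a fixed collision rate can mislead when the underlying relationship is non-linear, the sketch below compares a rate-based (linear) forecast with a power-function forecast of the general kind used in CPMs. The coefficients and traffic volumes are hypothetical values chosen for illustration only; they are not estimates from the Dutch analysis or from any cited study.

```python
def rate_based_forecast(base_collisions, base_volume, new_volume):
    """Linear assumption: a fixed collision rate per unit of exposure."""
    return (base_collisions / base_volume) * new_volume


def power_forecast(a0, b0, volume):
    """Non-linear assumption: collisions = a0 * volume ** b0."""
    return a0 * volume ** b0


# Hypothetical parameters (illustration only, not fitted to any data set).
A0, B0 = 0.05, 0.7
BASE_VOLUME, NEW_VOLUME = 10_000, 20_000

base = power_forecast(A0, B0, BASE_VOLUME)
linear = rate_based_forecast(base, BASE_VOLUME, NEW_VOLUME)
nonlinear = power_forecast(A0, B0, NEW_VOLUME)
overestimate = (linear - nonlinear) / nonlinear

print(f"linear: {linear:.2f}, non-linear: {nonlinear:.2f}, "
      f"overestimate: {overestimate:.1%}")
```

With an assumed exponent below one, doubling exposure less than doubles the predicted collisions, so the fixed-rate forecast overstates the change. An exponent above one would reverse the direction of the error.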
This lack of reliable empirical planning tools has been acknowledged for some time as a significant limiting factor in the Dutch SRS program, to the point that there is uncertainty as to whether the program will meet collision reduction targets (van Schagen & Janssen, 2000). Moreover, all SRS program forecasts have been based on this erroneous linear exposure-collision relationship, using trip forecasting models combined with simple collision rates. Fortunately, early SRS program monitoring has revealed some promising results from this proactive safety planning approach, which has maintained SRS program momentum (Wegman, 1996). Meanwhile, research on improved empirical techniques for forecasting the long-term costs and benefits of the Dutch SRS program has been ongoing since 1994 (Poppe, 1995, 1997a, 1997c; van Schagen & Janssen, 2000).

Several Dutch studies have been conducted in search of improved empirical tools, including development of a safety and environmental impact module that works with a traditional, four-stage, regional transportation planning model (Poppe, 1995, 1997a, 1997c). Initial testing revealed a problem predicting safety on inner roads, that is, local roads that are not modelled inside each TAZ. To predict collisions on any road would require some measure of exposure on that road. However, usually only major roads in each TAZ are included as part of a transportation planning model street network. To address the lack of traffic forecasts on inner roads, Dutch researchers refocused their efforts by looking at ways to infer the level of road safety in a TAZ directly from other proxy indicators, such as the number of inhabitants, the number of jobs, the surface area, and/or overall street pattern. Van Minnen (1999) also studied proxy indicators when trying to quantify the optimal size of a neighbourhood core (i.e. the maximum inner neighbourhood area not bisected by major roads) from a road safety perspective.
His recommended proxy variables include: neighbourhood core size, journey length, traffic volumes, choice of route, car speeds, and accessibility. While both studies produced proxy descriptors, neither took the next step of using them to develop macro-level CPMs.

2.4.4 Proactive Road Safety Planning Framework

Building on the clues provided in the Dutch SRS program and TDM literature, road safety researchers in North America have been researching proactive RSIPs, including: Road Safety Planning Frameworks (RSPF), Safety Conscious Planning (SCP), and Road Safety Risk Index (RSRI) guidelines (Roberts, 1998; de Leur & Sayed, 2002, 2003; Herbel, 2004). The RSPF and its associated RSRI guidelines lay further groundwork towards macro-level CPMs which can facilitate the proactive road safety planning approach. In the conceptual development of the RSRI, de Leur & Sayed (2002) have built on previous research on the fundamental elements used to quantify road safety (Hauer, 1982; Koornstra, 1992; Navin et al., 1999), beginning with an empirical restatement of the fundamental safety risk relationship, as follows:

$$R = E \cdot P \cdot C \qquad (2.40)$$

where:
R = Risk of collision;
E = Exposure = measure to quantify the 'exposure' of road users to potential hazards;
P = Probability = measure to quantify the chance of a vehicle being involved in a collision; and,
C = Consequence = measure to quantify the severity level resulting from potential collisions.

In this formulation, it can be seen that exposure plays a pivotal role in the road safety relationship, something revealed in earlier empirical research and micro-level CPMs (Kulmala, 1995; Fridstrom et al., 1995; Miaou, 1996). Building on this formulation, de Leur & Sayed (2003) took a further step to relate road safety back to the influence of transportation planning and land use, something postulated in the Dutch SRS and TDM initiatives. They recommended the RSPF shown in Figure 2.7 (de Leur & Sayed, 2003).
The aim of the RSPF is to explicitly incorporate road safety as early as possible into every stage of road planning and design, including road safety audits, road safety multiple account evaluations, and road safety planning guidelines.

Figure 2.7. Framework for Proactive Road Safety Planning (de Leur & Sayed, 2003). [Flowchart spanning the Planning Stage, Post-Planning Stage, and Design Stage, with guiding principles grouped under Exposure, Probability, and Consequence.]

To operationalize the framework, they have recommended the use of road safety assessment guidelines to quantify an RSRI for each facility (de Leur & Sayed, 2002). For each planned or built facility, an RSRI can be quantified using the thirty safety planning principles given in Table 2.3 related to exposure, probability, and consequence. The RSRI combines qualitative and quantitative safety measures in an analytically hierarchical approach (Wedley, 1990; de Leur & Sayed, 2002). Quantitatively, micro-level CPMs provide location-specific collision forecasts along the planned facility. Qualitatively, ratings for the remaining factors listed in Table 2.3 are calculated using engineering judgment and experience (de Leur & Sayed, 2002). An overall RSRI rating is the final operational safety measure. RSRI ratings have been calculated for several existing highway sections, and compare favourably with manual assessments by highway engineers. The MoTH Highway Safety Branch has now endorsed the RSRI as a valuable decision-aid tool that promotes consistent ratings independent of the observer, and that facilitates formulation of appropriate road improvement strategies (de Leur & Sayed, 2002).
Although the RSRI provides an improved empirical safety-planning tool, it is acknowledged that more improvement is needed (de Leur & Sayed, 2002). The assessment guidelines provide a valuable first step in empirical refinements for planning analyses, but their subjective nature is a limiting factor and could lead to accuracy problems (de Leur & Sayed, 2002). De Leur & Sayed (2002) suggest further refinement to provide improved, planning-level collision predictions and associated economic benefit/cost analyses, noting that the RSRI is limited from wider application only by a lack of empirical tools. The full RSRI application can then be used to facilitate improved road safety and community planning decisions.

Table 2.3. Calculating the Road Safety Risk Index (de Leur & Sayed, 2002).

Exposure
  Land Use
    1. Sum the product of traffic volume and distance between O-D pairs.
    2. Estimate reduced traffic caused by mode shift due to land use.
    3. Quantify separation and connectivity between conflicting land use types.
    4. Degree/compactness of commercial development; access control.
  Network Shape
    1. Sum volume x distance on each route, or use a CPM.
    2. Discourage travel by increased delay, distance or cost; promote HOV.
  Mode Choice
    1. Promote HOV facilities; estimate reduced traffic caused by mode shift.
    2. Provide facilities for non-motorized travel; estimate reduced traffic.

Probability
  Manoeuvrability
    1. Conduct level of service analysis to measure system performance.
    2. Estimate manoeuvrability restrictions in terms of delay or interference.
    3. Sum the product of vehicle manoeuvres and the traffic volume.
  Geometric Design
    1. Qualitative assessment of the magnitude of topographic constraints.
    2. Determine the frequency and degree of horizontal and vertical curves.
  Functionality
    1. Estimate the unintended use of the desired function of each facility.
    2. Use CPMs to estimate safety of different roadways.
    3. Identify number/severity of deviations from consistent road character.
  Conflicts
    1. Measure safety impact by the total number of conflict points.
    2. Use CPMs to estimate safety of different facilities.
    3. Identify the number of conflict points that are excessively problematic.
  Road Friction
    1. Determine the number and severity of confining geometric elements.
    2. Estimate/qualify the traffic elements causing friction (parking, etc.).
  Road Predictability
    1. Use a qualitative assessment of the number of unpredictable locations.
    2. Identify locations of complex geometric design and attempt to simplify.

Consequence
  Vulnerable Users
    1. Qualitatively assess the accommodation of vulnerable system users.
  Reduce Speed
    1. Determine where speed is a problem and multiply by affected volume.
    2. Calculate the speed differential and magnitude of speed reductions.
  Roadside
    1. Estimate the cut and fill as a surrogate for topographic constraints.
    2. Determine the frequency and degree of horizontal and vertical curves.
    3. Assess factors contributing to the potential for roadside encroachment.
    4. Identify roadside locations that require protection for errant vehicles.

2.5 Development of Macro-Level Collision Prediction Models

The literature on the results of proactive planning use of micro-level CPMs, Dutch SRS program initiatives, and RSRI guidelines reveals significant empirical limitations, and the need for improved tools to proactively evaluate safety in planning. The discussion around that need for improvement points to the use of macro-level CPMs using community-based proxy variables as explanatory variables. The same conclusion was reached by road safety authorities and researchers participating in two recent North American safety planning seminars (Chatterjee et al., 2003). Their discussions, documented together with results of case studies and current research, concluded with recommendations to pursue improved CPMs for use in long-range planning.
This section reviews the results of studies on the development and use of macro-level CPMs, including methodological issues and possible proxy variables.

2.5.1 Macro-Level CPM Research

Several attempts have been made to develop macro-level CPMs, with mixed results. Initial efforts used linear regression and failed to produce adequately fitted models. Levine et al. (1995) related zonal collisions to area, population, road-miles (on major arterials, freeways, and freeway ramps), and employment (in retail, manufacturing, service, military, and financial sectors). However, the resulting multivariate model was based on linear regression, which was noted in section 2.3.4 as violating the non-negative, non-Normal, non-linear nature of the collision-exposure-traits relationship. Fotheringham (2000) used a Geographically Weighted Regression (GWR) model with similar explanatory variables to analyze the spatial variability of collisions between zones, with inconclusive results. Several methodological aspects of the GWR technique were appealing in addressing spatial-statistical concerns. However, Hadayeghi et al. (2003), following the same GWR methodology, found similarly inconclusive results. Kim & Yamashita (2002) tried to relate police collision data with land use categories using similar geo-statistical techniques, again without success. While geo-statistical techniques appear to hold promise, they do not appear to have reached sufficient sophistication to model road collisions.

Most recently, two attempts were made to develop macro-level CPMs using non-linear, non-Normal Generalized Linear Regression Modelling (GLM) techniques, and results appear promising. First, Hadayeghi et al. (2003) used GLM techniques to develop macro-level CPMs predicting the mean collision frequency expected for an entire traffic zone, using data aggregated across 463 traffic zones in Toronto, Canada.
This aggregation was a significant departure from the methodology followed for micro-models. Instead of using data from an individual link or node (or a series of links or nodes with similar traits), macro-level CPMs were built using data summed across all nodes and/or links in each zone, across an entire community or region. Zonal sums of vehicle-kilometres-travelled (VKT), and zonal averages of congestion (VC) were extracted from Emme/2 output. Average zonal congestion was derived for each zone by averaging the volume/capacity ratios across all modelled links in the zone. Freeway data was not considered a good predictor of zonal collisions and was excluded because freeways are typically limited access facilities, with characteristics and traffic flows that relate more to the freeway segment itself than the surrounding neighbourhood through which it runs (Hadayeghi et al., 2003). The city of Toronto provided 1996 geo-coded collision data, broken down by severity and time of day. Population data was aggregated zonally from a 1996 Toronto regional survey.

After confirming data availability, final choice of variables and model development followed the usual stepwise forward procedure. Explanatory variables included vehicle-kilometres-travelled, arterial road lane-kms, number of households, area, posted speed, average zonal congestion, intersection density, total employed labour force, and total minor road kilometres. Several other possible explanatory variables were explored, including different employment sectors, certain land uses, neighbourhood geometry, driver age, gender, road conditions, collision reporting practices, and police enforcement levels. The resultant macro-level CPMs predicted either total or severe collisions, for either all day or rush hour time periods, based on the following mathematical form:

$$E(\Lambda) = a_0 \, VKT^{b_0} \, e^{\sum_i b_i x_i} \qquad (2.41)$$

where:
$E(\Lambda)$ = Dependent variable (Mean Collision Frequency);
$a_0$, $b_0$, $b_i$ = Constants;
$VKT$ = Zonal total of Emme/2 forecast vehicle-kilometres travelled; and,
$x_i$ = Zonally aggregated explanatory variables (e.g. households, population, intersections).

Both Poisson and negative binomial (NB) error distributions were tested in the GLM regression process, and the NB distribution was found to best represent the data. The significance of each variable was gauged by looking at several criteria, including: intuitively logical signs, Pearson $\chi^2$ statistics at 95% confidence level, and contribution to overall model fit. The overall model goodness of fit was judged via the shape parameter, $\kappa$, Pearson $\chi^2$ between 0.8 and 1.2, Pearson R-Square, and $R_\kappa^2$ statistics. No outlier analysis was conducted. The $R_\kappa^2$ statistic is based on Miaou (1996):

$$R_\kappa^2 = 1 - \frac{\kappa_{MIN}}{\kappa} \qquad (2.42)$$

where:
$\kappa$ = Overdispersion factor for the subject model; and,
$\kappa_{MIN}$ = Overdispersion factor for a model with only one term (i.e. a constant).

From the results, Hadayeghi et al. (2003) concluded that:

o Emme/2 vehicle-kilometres-travelled (VKT) output is a reasonable proxy of actual zonal traffic patterns.
o Increasing collisions were associated with zones of increasing VKT, households, major road kilometres, and intersection density. These results confirmed intuitive expectations.
o Decreasing collisions were found to be associated with zones of increasing average posted speed, not an intuitive result. A possible explanation was offered that higher posted speeds occur where the road environment is engineered to be 'safer'.
o Decreasing collisions were found to be associated with zones of increasing average zonal congestion, not an intuitive result. A possible explanation was offered that increasing congestion produces lower operating speeds, resulting in fewer collisions.
o Increasing morning peak hour collisions tended to be associated with zones of increasing total employed labour force and minor road kilometres.
o Morning peak period CPMs had generally better fits, suggesting that they explained more of the data. A possible explanation was offered that the dominating model influence of the exposure variable came from an Emme/2 morning peak hour model.
o The Severe Collision model goodness of fit was no better than that for Total Collisions.

The second macro-level CPM successfully fitted was developed by Ladron de Guevara et al. (2004) using 1998/99 collision data from Tucson, Arizona. These macro-level CPMs were developed assuming a non-linear, exponential function, with NB error distribution, using simultaneous equation and log-linear transformation techniques. Geo-statistical tools were used to aggregate road network, demographic, and economic geo-coded data for each of 859 traffic zones. Zonal road data was aggregated according to each of seven AASHTO (2001) definitions, for ease of use in the road planning and design process. All road classes were included in the data, from local roads to state highways. However, roads on zonal boundaries were not assigned due to uncertainty over collision and zonal boundary geo-coding errors. This was estimated to introduce an error of less than 5%. Collision data was provided from the Arizona State database for the years 1998 and 1999. The errors in point source location of collisions and other zonal data were assumed to be sufficiently random so as to be insignificant.

One difference from conventional CPM methodology involved the reliance on population density as a proxy for exposure, and not using a leading exposure variable (e.g. VKT, AADT) in the model form. Although this represented a significant departure from conventional CPM form recommendations (Kulmala, 1995; Miaou, 1996; Hauer et al., 1988; Sawalha & Sayed, 1999; Hadayeghi et al., 2003), it was done for several reasons.
First, inner roads in the TAZ typically do not have reliable traffic volume forecasts, a result well documented in other research (Poppe, 1997c; Hadayeghi et al., 2003; Lord & Persaud, 2004). On the other hand, population levels are a readily available user input, thereby precluding errors inherent in forecast exposure measures (e.g. Emme/2 outputs). Second, users in smaller communities may not have models available to forecast VKT, but almost always have population data. Hence, using population density instead of VKT could be more practical for all users, especially in long-range planning studies. Finally, a study by Cox (2003) was cited as finding that population density correlates strongly with VKT. To improve population density fit, the number of minors (age < 17) was separated out from total population, to remove non-drivers typically unrelated to collisions. Separating out youth as a separate variable also provided an indicator of family size.

While using population density as a simplifying assumption has practical methodological merit for model development and data collection, it fails to uphold the empirical property inherent in recommended CPM form with a leading exposure variable. That is, when there is no exposure, there can be no collisions (Miaou, 1996; Sayed, 1998; Sawalha & Sayed, 2005a). While this omission is significant, the apparent goodness of fit of the macro-level CPMs is appealing. The simplifying exposure assumption produces the general model form:

$$E(\Lambda) = e^{\sum_i b_i x_i} \qquad (2.43)$$

where:
$E(\Lambda)$ = Dependent variable in collisions per two years;
$b_i$ = Model parameters; and,
$x_i$ = Independent variables.

Due to the simultaneous equation methodology, forward stepwise procedures to identify explanatory variables were not possible.
Instead, numerous candidate models were tested, with final model choice based on several criteria: the minimum value of Akaike's Information Criterion (AIC), a criterion related to model log-likelihood at convergence, and to the number of model variables (Miaou, 1996); maximum $R_P^2$ statistic, which is based on standardized residuals; and, minimum $G^2$ statistic, which is based on individual deviances. Screening criteria used to verify variable significance included: the parameter's estimated coefficient p-value < 0.05 (i.e. a 95% confidence level); and, parameter's sign and magnitude agreement with theoretical expectations. No outlier analysis was conducted. Several CPMs were fitted, each predicting fatal, injury, or PDO collisions, with the following associations:

o Increasing population density was associated with increasing collisions in all models;
o Increasing minor population density was associated with decreasing severe collisions. It was suggested that this was due to more responsible driving habits of parents;
o Zones containing increasing amounts of total employment, intersection density, major arterial roads, minor arterial roads, and/or urban collector roads were associated with increasing injury and PDO collisions; and,
o Increasing intersection density was associated with decreasing severe collisions. It was suggested that this was capturing the effect of slower average traffic speeds at intersections.

Although several calibrated macro-level CPMs were developed by both studies, nowhere in the literature was evidence found of macro-level CPMs being used as part of RSIPs.

2.5.2 Issues Regarding Development & Use of Macro-Level CPMs

In view of the limited amount of research on macro-level CPMs, several issues surrounding inconsistencies in model development and lack of guidelines for their use need to be dealt with.
These issues are summarized below, including: availability and quality of data; transferability; explanatory variables; model form; aggregation bias; macro-reactive black spot programs; and, proactive regional and neighbourhood-level planning.

2.5.2.1 Availability and Quality of Data

Whatever approach is chosen for developing CPMs, sufficient data of sound quality is a cornerstone of well-fit, reliable models, and is an issue that needs to be addressed (Miaou, 1996; Chatterjee et al., 2003; USDOT, 2003b; Herbel, 2004). Without data sources and practical techniques that can provide relevant future values of explanatory variables, the CPM is of little practical value for road safety and community planning purposes. For safety planning in particular, current and future data values must be practical to extract or calculate across all zones.

There are four barriers to obtaining adequate quality data. First, a statistically significant sample size may simply not be possible, especially in smaller communities. Second, collision data may not be available in a format suitable to allow downloading and/or analysis. Third, many less severe collisions are not attended by police, or are simply not reported. Fourth, where collisions are reported, many jurisdictions have multiple agencies that compile and store the data, often in numerous and incompatible databases. While the first barrier is difficult to overcome, several jurisdictions in North America have overcome the latter three through central, integrated collision data warehousing programs. For example, the Insurance Corporation of British Columbia (ICBC) handles most auto collision insurance claims in BC, and centrally warehouses the claims data as part of its provincial mandate. De Leur & Sayed (2001) have shown that claims data can be used as a reasonable proxy for reported collisions, in order to develop collision prediction models.
Under the auspices of its Road Safety Program, ICBC has taken its central claims database and geo-coded all auto collision claims since 1995 (Zein & Navin, 2000). However, the majority of jurisdictions do not yet have this central, geo-coded database capability. Continuing to rely on sporadic and suspect data puts program effectiveness and model reliability in doubt.

2.5.2.2 Model Transferability across Time-Space Regions

An emerging technique to deal with lack of data may be to focus on model transferability, in order to gain use of CPMs developed in other areas where quality data is available. Research is underway on how to transfer previously developed CPMs for use in geo-demographic regions and/or time periods (i.e. space-time regions) other than those for which they were developed. In an early study, Mountain et al. (1998) observed that collision rates declined by 6% per year from 1975 to 1995, and suggested that it was due in large part to RSIPs. However, it was not confirmed by their subsequent CPMs, which introduced a time trend variable into the collision prediction equation. Other studies have also found no evidence of time trend benefits from RSIPs (Miaou, 1996; Noland, 2002). Noland (2002) performed an analysis of the road safety benefits of road infrastructure improvements from 1984 to 1997, using fixed-effects NB regression analysis of cross-sectional time series databases across all 50 U.S. states. He showed that only marginal reductions in collision rates were due to road-related safety improvements over that time. Instead, he concluded that the majority of safety improvements were due to natural demographic changes, and proactive safety programs such as seat-belt use, reduced alcohol consumption, and improved medical technology. Most recently, Hirst et al.
(2004) have proposed a time trend correction factor to apply externally, once a collision prediction is made, based on the gap in time between model development and analysis periods, and on observed local time trends and engineering judgement. However, time trend recommendations based on collision frequency are at odds with observations by others who contend that collision frequencies have plateaued (Koornstra, 1992; van Schagen & Janssen, 2000; Waller, 2000).

Sawalha & Sayed (2005b) have recommended a methodology for transferring micro-CPMs between time-space regions that appears adaptable for macro-level CPMs. Although this recommended methodology still requires some amount of data for the transferred CPM calibration, it does appear to take data sample size and quality into account in minimizing time-space transference error. The method involves the CPM constant, $a_0$, and shape parameter, $\kappa$, being simultaneously re-estimated while fixing the values of the other original model parameters. Data of questionable quantity or quality used in the calibration would be at least partly reflected in low $\kappa$ and/or high SD values for the calibrated model, and possibly the subsequent EB safety estimates (depending on sample size). An additional quantitative goodness of fit measure, the z statistic, is introduced to test transferred models. Using this z criterion, the transferred CPM is considered to be successfully calibrated if, in addition to meeting the SD and Pearson $\chi^2$ measures, it has the following statistical property:

$$z = \frac{\left| \chi_P^2 - E(\chi_P^2) \right|}{\sigma(\chi_P^2)} < 1.00 \qquad (2.44)$$

where:

$$\text{Pearson } \chi_P^2 = \sum_{i=1}^{N} \frac{[y_i - E(\Lambda_i)]^2}{Var(y_i)}$$

$$E(\chi_P^2) = N, \qquad \sigma(\chi_P^2) = \sqrt{2N(1 + 3/\kappa) + \sum_{i=1}^{N} \frac{1}{E(\Lambda_i)\,[1 + E(\Lambda_i)/\kappa]}} \qquad (2.45)$$

$N$ = the number of data points used to calibrate the model;
$E(\Lambda_i)$ and $\kappa$ are all taken from each calibrated CPM; and,
$y_i$ and $Var(y_i)$ are derived for each individual observation in the new data set.
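The transferability check in Equations 2.44 and 2.45 can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function name, argument names, and data are hypothetical, and the variance of each observation is assumed to take the negative binomial form Var(y_i) = E(Λ_i) + E(Λ_i)²/κ.

```python
import math


def transfer_z_statistic(observed, predicted, kappa):
    """Return (pearson_chi2, sigma, z) for a recalibrated NB model.

    observed  -- collision counts y_i in the new data set
    predicted -- model predictions E(Lambda_i) for the same zones
    kappa     -- recalibrated NB shape parameter
    """
    n = len(observed)
    # Assumed NB variance: Var(y_i) = mu_i + mu_i**2 / kappa.
    pearson_chi2 = sum(
        (y - mu) ** 2 / (mu + mu ** 2 / kappa)
        for y, mu in zip(observed, predicted)
    )
    expected = n  # E(chi2_P) = N
    sigma = math.sqrt(
        2 * n * (1 + 3 / kappa)
        + sum(1 / (mu * (1 + mu / kappa)) for mu in predicted)
    )
    z = abs(pearson_chi2 - expected) / sigma
    return pearson_chi2, sigma, z


# Hypothetical zonal data, for illustration only.
y = [3, 5, 4, 9]
mu = [2.0, 4.0, 6.0, 8.0]
chi2, sigma, z = transfer_z_statistic(y, mu, kappa=2.0)
print(f"chi2 = {chi2:.3f}, sigma = {sigma:.3f}, z = {z:.3f}, pass = {z < 1.0}")
```

A transferred model would pass this particular criterion when the returned z is below 1.00; the SD and Pearson χ² checks mentioned in the text would still need to be applied separately.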
While transferability test results using this methodology are better than results using other methodologies, further testing has been recommended to refine the technique (Sawalha & Sayed, 2005b). As an interim measure, the use of this methodology should be limited in scope, and the subject of much future research.

2.5.2.3 Variables

In the two macro-level CPM studies (Hadayeghi et al., 2003; Ladron de Guevara et al., 2004), twelve explanatory and seven dependent variables have been identified as significant. Consideration needs to be given to whether these variables are the most relevant, and whether others are better or can enhance macro-level CPMs for planning-level use. Although variables for micro-level CPMs focus on traffic engineering geometries or other detailed road features, variables for macro-level CPMs need to support policy and planning decisions, as well as traffic safety decisions. Thus, macro-level CPMs need variables that are relevant on an area-wide or macro-level basis, as planning tool descriptors. In other words, variables are needed that facilitate CPMs sensitive to and relevant for simultaneous transportation planning and traffic safety engineering decision considerations, not just one or the other (Chatterjee et al., 2003).

A first step to addressing this simultaneous safety planning need would be to review the literature on transportation planning and road safety planning research, to identify all relevant measures. However, it was noted in section 2.3.4 that model development is computationally difficult with large sets of candidate variables. The resources required for testing variables and models in different orders and combinations (as per Ladron de Guevara et al., 2004) may be computationally prohibitive. For example, with eight candidate variables, there would be $2^8 = 256$ possible subset models, which may be manageable. However, with 15 test variables, that number rises to $2^{15} = 32{,}768$ possible models to be tested!
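The growth in the number of candidate models can be checked by brute force. The short sketch below enumerates every subset of n candidate variables, counting the empty subset as the constant-only model, and confirms that the count matches 2^n; the function name is hypothetical.

```python
from itertools import combinations


def count_subset_models(n_candidates):
    """Count every subset of candidate variables, including the empty
    (constant-only) model, by explicit enumeration."""
    variables = range(n_candidates)
    return sum(
        1
        for size in range(n_candidates + 1)
        for _ in combinations(variables, size)
    )


for n in (8, 15):
    # Enumeration agrees with the closed form 2 ** n.
    assert count_subset_models(n) == 2 ** n
    print(n, count_subset_models(n))
```

Each added candidate variable doubles the number of subset models, which is why exhaustive testing quickly becomes impractical.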
One technique to deal with large sets of candidate variables that has been recommended by Sawalha & Sayed (2005a) is to use a forward stepwise procedure (used by Hadayeghi et al., 2003). However, Miaou (1996) has advised caution when choosing variables using forward stepwise procedures, because the order in which candidate variables are added to the model can influence their significance and overall model fit. To address both computational and stepwise procedure concerns, careful screening and grouping of candidate variables has been recommended (Miaou, 1996). Another caution raised by Miaou (1996) relates to studying the variance of the data used in testing variable significance to make final selection choices. If there is little variance observed in the data, chances are the variable association will not be significant (e.g. low t-statistic, insignificant SD drop). However, the variable may indeed be causal. Therefore, engineering judgement is essential in the final variable selection and data extraction methodology (Miaou, 1996).

2.5.2.4 Model Form

Assuming data of sufficient quality and quantity are available to support variable selection, the next issue is choice of model form. Many early micro-level CPMs were developed using non-linear functions with one exposure variable raised to some power. More recent models include the product of an exponential function as in Equation (2.31 a, b) (Kulmala, 1995; Hauer & Persaud, 1996; Miaou, 1996; Sawalha & Sayed, 2005a). On the pretence of facilitating model use for practitioners not having access to complex transportation planning models and exposure forecasts, Ladron de Guevara et al. (2004) have suggested using a modified exponential model form without any leading exposure variable as in Equation 2.43. However, this form violates zero risk logic.
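The zero risk argument can be made concrete by evaluating both model forms for a zone with no travel and no other activity. The sketch below follows the general shapes of Equations 2.41 and 2.43; the function names, coefficients, and variables are hypothetical and are not taken from either study.

```python
import math


def cpm_with_exposure(vkt, x, a0, b0, b):
    """Form of Equation 2.41: a0 * VKT**b0 * exp(sum(b_i * x_i))."""
    return a0 * vkt ** b0 * math.exp(sum(bi * xi for bi, xi in zip(b, x)))


def cpm_without_exposure(x, b):
    """Form of Equation 2.43: exp(sum(b_i * x_i)), no leading exposure term."""
    return math.exp(sum(bi * xi for bi, xi in zip(b, x)))


# Hypothetical coefficients and a zone with no travel and no other activity.
A0, B0 = 0.002, 0.8
B = [0.01, 0.05]
empty_zone = [0.0, 0.0]

with_exposure = cpm_with_exposure(0.0, empty_zone, A0, B0, B)
without_exposure = cpm_without_exposure(empty_zone, B)
print(with_exposure, without_exposure)
```

With zero exposure the first form predicts zero collisions, whereas the second returns e^0 = 1 collision even though nothing in the zone could generate one.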
Alternatively, one way to facilitate use while providing zero risk logic may be to develop a second, measured type of macro-level CPM form, in addition to macro-level CPMs that employ VKT as the lead exposure variable. This second form would use road-lane-kilometres (TLKM) as the lead exposure variable, such as

$$E(\Lambda) = a_0 \, TLKM^{a_1} \, e^{\sum_i b_i x_i} \qquad (2.46)$$

Although there has been no macro-level CPM research on this form, TLKM complies with zero risk logic, and can be measured and forecast directly by practitioners, including those who have no access to transportation modelling or traffic flow forecasts.

2.5.2.5 Aggregation Bias

The way in which data is extracted and aggregated influences whether underlying causal mechanisms are screened or revealed in the resulting models (Davis, 2004). Many have cautioned that using improperly aggregated, macro-level data correlated by geographic area may screen causal associations between collisions and predictor variables, leading to biased results (Freedman, 1997; Turner, 1997; Kmet et al., 2003; Davis, 2004). Aggregation (or ecological) bias is a phenomenon that can produce a statistically significant association suggesting a causal relationship that is in direct contradiction to the true underlying causal association. For example, Davis (2004) showed how one micro-level CPM inadvertently screened the speed/collision relationship when traffic volume data was improperly aggregated and not stratified. He showed that a systematic sampling method, which first stratified the collision data according to traffic volume levels, facilitated a statistically correct CPM that was able to correctly associate the underlying causal relationship. To minimize aggregation bias, data extraction and model development should be done in a systematic, well-stratified manner.
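A toy example shows how pooling data across strata can reverse an association. The sketch below is not a reproduction of Davis (2004): the data values are invented, and ordinary least squares is used purely for brevity rather than the Poisson or negative binomial regression a CPM would use. Within each hypothetical traffic volume stratum, collisions rise with speed, yet the pooled slope is negative.

```python
def ols_slope(xs, ys):
    """Slope of the least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical (speed in km/h, collisions per year) observations per stratum.
STRATA = {
    "low_volume": ([60, 70, 80], [2, 3, 4]),
    "high_volume": ([30, 40, 50], [8, 9, 10]),
}

# Slope within each stratum: positive in both.
stratum_slopes = {name: ols_slope(xs, ys) for name, (xs, ys) in STRATA.items()}

# Slope after pooling all observations and ignoring volume: negative.
pooled_x = [x for xs, _ in STRATA.values() for x in xs]
pooled_y = [y for _, ys in STRATA.values() for y in ys]
pooled_slope = ols_slope(pooled_x, pooled_y)
```

The reversal arises because the high-volume sites in this invented data set have both lower speeds and more collisions, so ignoring volume masks the within-stratum relationship.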
The usual practice is to guide the sampling method and resulting model development based on some prior knowledge of the underlying causal mechanisms (Miaou, 1996; Davis, 2004). Barring an understanding of the underlying causal mechanisms, Davis (2004) recommends using as large a data set as possible, stratifying classes of possible models, and then identifying a member of each class that best fits the data. If the true causal mechanism model is a member of that class, then goodness of fit tests will confirm it. The need to carefully stratify CPMs, with emphasis on exposure variables, reinforces earlier results that exposure is a major influence in traffic collision models (Fridstrom et al., 1995).

2.5.2.6 Sensitivity Analyses

Once stratification is done to minimize aggregation bias, the tendency is to use individual variable associations for sensitivity analyses, which assumes that each variable influences the model independently of the other variables. For example, practitioners may want to know the change in predicted collision frequency for a planned facility if its design lane width were changed, while keeping the other variables constant. However, doing this can lead to misleading results, for two reasons (de Leur & Sayed, 2001a). First, there may be correlation between model variables such that any change in safety may be due to more than just the changed variable. Second, there may be correlation between the variable and other unforeseen factors not present in the model. As such, this practice is not recommended. The practice still remains an issue, however, suggesting that further study is needed to determine if and when individual associations can be used to do sensitivity analyses. One possibility that has been offered by de Leur & Sayed (2001a) is to do sensitivity analyses only when keeping the variable's value within the range of the explanatory data values originally observed in development of the collision prediction model.
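The range restriction described above can be enforced mechanically before any what-if calculation is made. The sketch below is a minimal illustration under stated assumptions: `toy_predict`, the variable names, the coefficients, and the observed ranges are all hypothetical and stand in for a fitted model and its development data.

```python
import math


def sensitivity_change(predict, base, name, new_value, observed_ranges):
    """Change in prediction when one input changes, refused outside the observed range."""
    low, high = observed_ranges[name]
    if not low <= new_value <= high:
        raise ValueError(
            f"{name}={new_value} is outside the observed range [{low}, {high}]"
        )
    changed = {**base, name: new_value}
    return predict(changed) - predict(base)


def toy_predict(inputs):
    """Hypothetical model: exposure power term times an exponential in lane width."""
    return 0.01 * inputs["exposure"] ** 0.7 * math.exp(-0.2 * inputs["lane_width_m"])


# Hypothetical base case and hypothetical ranges seen in the development data.
BASE = {"exposure": 5000.0, "lane_width_m": 3.5}
OBSERVED_RANGES = {"exposure": (500.0, 20000.0), "lane_width_m": (3.0, 3.8)}

delta = sensitivity_change(toy_predict, BASE, "lane_width_m", 3.7, OBSERVED_RANGES)
```

A guard of this kind addresses extrapolation only; it does not remove the concern about correlation among variables noted above.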
2.5.2.7 Macro-Level CPM Use in Black Spot Programs

Although no use of macro-level CPMs in black spot programs has been documented, existing methodology for the reactive use of micro-level CPMs in black spot programs would appear to be relatively simple to adapt to macro-level CPMs. If so, the resulting macro-reactive black spot program may have several benefits. First, rather than focusing on those individual facilities in the region with pre-existing collision histories, macro-level CPMs can focus on evaluating the level of road safety risk for each neighbourhood across an entire region. This may facilitate earlier detection of black spots in collision prone neighbourhoods or zones (CPZs) without having collision histories at all road locations in the area. Second, once a CPZ is identified, macro-level CPMs could be further used to diagnose problems, and assess the effectiveness of possible countermeasures across the entire neighbourhood. Third, earlier detection and remedy of hazards would permit more efficient use of scarce RSIP funding and analytical resources, as well as reduced social and economic burdens on the communities. However, guidelines for use and testing through case studies are needed to verify these statements.

If macro-level CPMs are to be effective in enhancing black spot programs, a related issue that will need to be addressed is provision of a macro-level collision modification factor (CMF) for each proposed countermeasure (Miller, 1988; Abdelwahab & Sayed, 1993; Hauer & Persaud, 1996; Sayed, 1998). Improvement in estimation techniques for micro-level CMFs is already an issue that has been identified by Shen & Gan (2003), and the same issue will likely apply to macro-level CMFs. Four methodological weaknesses have been identified that should be kept in mind in any development of macro-level CMFs.
First, many CMF estimates have ignored possible errors due to collision migration, wherein a reduction at the treated site may cause a collision increase at adjacent sites (BTCE, 1992). The concept of collision migration has not been proven and is difficult to verify (Miaou, 1996; Sayed, 1998). Macro-level CPMs, with their inherent area-wide focus, may facilitate resolution of the collision migration question. Second, other CMF estimates have ignored external or historical causal factor errors such as changes in traffic volumes, weather, economic conditions, vehicle fleet, and/or crash reporting practices. Third, other studies were based on methodologies that ignored maturation effects. Maturation refers to the effect of collision trends over time due possibly to changes in vehicle technology, or local driver experience and learning. The technique suggested to account for the effects of maturation and history is to compare before and after collision observations with a comparison group of similar locations (Sayed, 1998; Sayed & de Leur, 2001a). Using comparison data sets has become the accepted method to deal with both problems, and is now being applied via such techniques as the Odds Ratio (Hauer, 1996; Sayed & de Leur, 2001a). Finally, many CMF estimates have been based on only simple before and after observations, without accounting for Regression-to-the-Mean error. As noted earlier, the empirical Bayes technique is now the accepted method to address this error.

2.5.2.8 Proactive Use in Regional & Neighbourhood Planning

Although no proactive, planning-level use of macro-level CPMs has been documented, studies on integrating micro-level CPMs and Emme/2 provide a methodological starting point for regional transportation planning safety evaluations. If successful, macro-level CPMs may provide a means to address identified shortcomings in RSAs and RSRI analyses of planned regional land use and transportation projects.
Moreover, macro-level CPMs may provide the greatest potential safety benefits at the community or neighbourhood level, where the most planning activity occurs on a regular basis. For example, the Municipal Act in the Province of British Columbia requires every community to conduct updates of their growth management and land use plans at least once every five years. In addition, they have ongoing planning processes to develop sub-area and neighbourhood plans in the course of approving new developments and growth areas. Dutch researchers developing the Sustainable Road Safety Program suggest that community-based road safety planning guidelines will facilitate sustainable development, including significantly safer communities for all road users.

2.5.3 Clues to Macro-Level CPM Variables

As a first step in successful macro-level CPM development and use, practical data sources and explanatory variables are needed. Four main explanatory variable themes have emerged in the literature to describe a neighbourhood from a macro-level and safety planning perspective. The four themes include:

o Exposure, related to travel demand;
o Socio-Demographics (S-D), related to population characteristics;
o TDM or STSs, related to efficient use of the road system; and,
o Network, related to the built road environment.

Using these four thematic groupings, this section identifies possible macro-level CPM explanatory variables found in the literature on micro-level and macro-level road safety studies, and on land use and transportation planning studies.

2.5.3.1 From Macro-Level CPMs & Other Models

Table 2.4 lists all variables that were tested, together with those that were found significant, for macro-level CPMs by Hadayeghi et al. (2003), and Ladron de Guevara et al. (2004).
Several non-GLM macro-level model studies also provide useful clues, with variables tested shown in Table 2.5 (LaScala et al., 2002; Kmet et al., 2003; Petch & Henson, 2000; Levine et al., 1995; Ewing et al., 2003; Kim & Yamashita, 2002). Petch & Henson (2000) studied child traffic casualties using geographic information system (GIS) techniques to aggregate traffic exposure, road network, socio-economic, and car ownership data at the enumeration area level into zones based on the respective child density level. While their model development methodology was rigorous, including log-linear transformation of non-linear relationships, their multivariate regression analysis assumed a Normal distribution of residuals (as opposed to the recommended Poisson or NB distributions), rendering their results unreliable.

Table 2.4. Macro-level CPM Variables.^1,2

Socio-Demographic: Total Population; Part-Time Employed; Population Density^2 (/area); Total Employed; Number of Households^1; Employment Density (/area); Household Density (/area); Number of Employees^2; Full-Time Employed; Population under Age 17^2 (%)
Transportation Demand Management: Number of Vehicles; No. of Vehicles per Household
Network: Number of Intersections; Intersection Density^1,2 (/area); Minor Road Kilometres^1,2; Major Road Kilometres^1,2; Area^1,2; Total Road Kilometres; Miles of Urban Collector^2 (%)
Exposure: Posted Speed^1; Volume/Capacity^1; Vehicle Distance Travelled^[?]; In-Flow; Total Flow; Out-Flow
Dependent Variables: Total Collisions/yr^1; Severe Collisions/yr^1; Fatal Collisions^2; PDO Collisions^2; Injury Collisions^2; AM Total Collisions/yr^1; AM Severe Collisions/yr^1
Data Year(s): 1996^1, 1998/99^2
(Note: Variables in bold are those found to be significant in Hadayeghi, 2002^1; Ladron de Guevara, 2004^2)

LaScala et al.
(2002) found that pedestrian injuries for 102 neighbourhoods in California were strongly correlated with traffic flow, school locations, and youth population, and less so with household income, unemployment, and alcohol consumption. S-D, environmental, and combined models were developed. However, they used conventional linear regression and GIS aggregation techniques, and the models cannot be considered reliable. Kmet et al. (2003) did a similar study in Alberta using a Poisson distribution error assumption. They looked for associations between fatal collisions, urban and rural land uses, population density, age, sex, impaired driving citations, education level, unemployment levels, and ethnicity. They found statistical evidence of associations with population density, impaired driving, and rural land use, but insufficient data points and correlation problems precluded multivariate regression. Also, their model form did not include an exposure variable. They concluded that use of population-based data allows stable estimation of collision rates by geographic area.

Table 2.5. Other Macro-Level Study Variables.^3,4,5,6,7,8

Socio-Demographic: Population Density^4,6 (people/acre); Persons per Household^5; High Income Families (%)^3; Subsidized Housing^5 (%); Unemployed (%); Terraced housing (%); Employment^6; Land Use Category; College education (%); Low Income Families (%); Population under Age 17/rd-km^3; Number of attractors/generators; Pop'n aged 15+ with < grade 9 (%); Rate of impaired driving/1,000; Divorced (%); Ethnicity (%)
Transportation Demand Management: Cars per sq. km; No-car households^5 (%)
Network: Urban / Rural road type^4; No. of roundabouts/3+ arm int'ns; Unclassified Road Kilometres^5; Classified Road Kilometres^5,6; Area^3,4,5,6; No. of goods vehicles (24 hr)
Exposure: AADT^3,5 (24 hr); Sprawl^7
Dependent Variables: Fatal Collisions^4,7; Pedestrian Collisions^3,4,5; Accident location open space (km2)
Data Year(s): April/92-March/96^3; 1995-1997^4; May/95-April/98^5
(Note: Variables in bold are significant in LaScala et al., 2002^3; Kmet et al., 2003^4; Petch & Henson, 2000^5; Levine et al., 1995^6; Ewing et al., 2003^7; Kim & Yamashita, 2002^8)

2.5.3.2 From Macro-Level Land Use & Transportation Planning Studies

Although no macro-level CPMs are offered, the literature on macro-level land use and transportation planning contains many variables that suggest some form of association between neighbourhood characteristics, travel patterns, and collision rates, as shown in Table 2.6 (Ewing & Cervero, 2001; Ewing et al., 2003). For example, Ewing et al. (2003) found several indicators of urban sprawl that may correlate with regional traffic fatalities (see asterisks in Table 2.6). They observed that sprawling regions appear to have more collisions than non-sprawling regions. They drew six conclusions that are particularly relevant to macro-level CPM research. First, policy makers need to make decisions based on some reasonable forecasting models. Second, in the absence of better models, simple ratios are better than nothing when doing sketch planning, as used by the US Environmental Protection Agency's Smart Growth Index. Third, trip frequency is associated primarily with socioeconomic level, and less so with built environment. Fourth, trip length is associated primarily with built environment, less so with socioeconomic level. Fifth, mode choice depends on both built environment and socioeconomics. Sixth, VKT is associated strongly with built environment. Most of these conclusions echo earlier findings by Buchanan (1963), as well as more recent findings by Wegman (1996) and de Leur & Sayed (2002) on how travel demand relates exposure directly to collision risk.
Therefore, they may provide additional clues on possible explanatory variables for macro-level CPMs.

2.5.3.3 From Micro-Level CPMs

A review of micro-level CPM variables also provides clues to variables for possible use in macro-level CPMs. They focus on single-facility, road-related variables, most of which cannot be aggregated zonally. Significant variables are listed in Table 2.7 (Kulmala, 1995; Sayed & Rodriguez, 1999; Qin et al., 2004; de Leur & Sayed, 2001a; Greibe, 2003; Sawalha & Sayed, 1999).

Table 2.6. Macro-Level Land Use & Transportation Planning Variables (Ewing & Cervero, 2001).

Socio-Demographic: Family size; Work status; Auto ownership; Family average age; Occupation; LU - Office, retail (%), mix; LU - intensity, mix, building height; Households (#, density, per road km)*; Distance to nearest park, store, gas station; Transit-oriented or auto-oriented (dummy); Jobs/Housing ratio within zone; Employment within 30 min by auto, transit*; Employment, Population density*; Income; Age of homes, buildings (orientation to street); Average block size*
Transportation Demand Management: Parking (price, supply), spaces per worker; Ridership per bus stop, rail station; Homes, workers, jobs, stores near bus, rail; Vehicle Occupancy; Work trip mode split; Transit service level; Street frontage with trees, empty lots; Sidewalk width, benches; Bicycle routes, facilities; Pedestrian path continuity, convenience; Mode split - walk, bike, bus, pool, drive, rail; Transit rides/capita
Network: Grid, non-grid network within zone; Four-way intersections (%) (connectivity); Quadrilateral block shape (%); Arterial-km near rail station; Distance between street lights; Intersections/road kilometre; Topography; Road network type; Regional location*; No. of crosswalks (signalized, unsignalized); Intersections near bus stops; Culs-de-sac/dead-ends near bus stops; Grid street network near bus stop; Discontinuous street network near bus stop
Exposure: Trips/household; Trip length distribution; Midday trips/employee - walk, drive; Vehicle-Distance-Travelled / person, household; Trip time/trip purpose; Transit-Distance-Travelled/person; Services, employees within 1/4, 3 miles*; Trips/person - drive, bus, bike, walk; Vehicle-Hours-Travelled/person; Home-work trip distance*
* Sprawl measures

Table 2.7. Micro-Level CPM Variables. (Kulmala, 1995; Sayed & Rodriguez, 1999; Qin et al., 2004; de Leur & Sayed, 2001; Greibe, 2003; Sawalha & Sayed, 1999)

Population density Median household income Driveway density Traffic calming measures Bicycle/Pedestrian facilities Socio-Demographic Land Use - shops, apartments, low density LU - Industrial / residential / neighbourhood LU - campus, downtown, tourist TDM Vulnerable road users (%, #) Bus stops Lane, Shoulder, Approach widths Roadside hazard rating Approach grade (%) Pavement type Sight distance (% > 400 m, % * 300 m) Horz., Vert. curviness (m/km, £%/#) Intersection control (stop, signal, yield, none) Road lighting (dummy) Intersection angle, density Network No. of Intersection legs (3, 4) No./Type of lanes On-street Parking (length, %, dummy) Median (width, raised, dummy) Major/Minor intersections, driveways (#) No. of mid-block crosswalks Roadway class (%, length) Signals (#, density) Length of road section One-way/two-way traffic Exposure AADT Level of Service (congestion, V/C) Annual/AM Vehicle Kilometres (VKT) Major/minor traffic ratio Passing zones (%) Daily trucks, buses, motorcycles (%) Posted / Operating / Differential Speed Fatalities Injuries Severe Collisions Total Collisions Dependent Variables Pedestrian Collisions Bicycle Collisions Motorcycle Collisions

2.6 Summary

A worldwide priority has been placed on reducing road collision frequencies. Despite all efforts to date, the enormous social and economic costs to society remain at epidemic proportions relative to other health and safety issues.
Traditional, reactive road safety improvement programs using empirical Bayes refinement techniques and micro-level collision prediction models have been effective in identifying and treating hazardous locations. However, this reactive approach requires a pre-existing collision history, treats a limited set of black spots based on available funding, and involves costly retrofits in existing communities. Despite progress in identifying and treating black spots, collision frequencies remain unacceptably high. As a result, researchers and road authorities are pursuing proactive road safety improvement programs. Proactive RSIPs involve explicit evaluation of safety at the planning stage of road projects, to preclude road collision problems before they occur. As such, proactive intervention has great potential to reduce collision frequencies in a sustainable manner. Early efforts at planning-level, proactive safety evaluation included use of micro-level CPMs and strategic, regional transportation planning models (e.g. Emme/2). Unfortunately, those applications were unsuccessful due to the large traffic forecasting variance inherent in Emme/2 forecasts, and the limited road networks involved. Therefore, there is still a gap between what is needed and what is available in terms of reliable empirical techniques to support a proactive approach to RSIPs. To address the gap, research is underway on macro-level CPMs. Dutch and UK researchers have searched for empirical tools to do planning-level analyses, including road safety audits, and sustainable road safety strategies. However, problems with predicting traffic volumes on inner roads have precluded quantification. As an alternative, a number of macro-level or area-wide proxy variables were proposed in lieu of traffic data for inner roads. Although no Dutch macro-level CPMs were developed, the SRS guidelines and suggested proxy variables provided important clues.
Using those clues, researchers in North America have been researching proactive road safety planning programs, including RSPFs and RSRIs in Canada, and SCP programs in the US. Early results in development of macro-level CPMs show promise, but there remain two sets of methodological issues, related to development and use. First, there are development issues related to lack of data and explanatory variables, to disagreement on model form, and to concerns regarding aggregation bias and causal mechanisms. Second, there are methodological issues related to a lack of guidelines for model use in road safety applications. The first step in addressing these issues lies in identification of possible data sources and screening of explanatory variables for macro-level CPMs. From the literature on macro-level CPM studies, macro-level land use and transportation planning studies, and micro-level CPM studies, over 200 possible variables have been identified and stratified in four main groupings. Many have never been tested in macro-level CPMs, but have been in common use as planning variables. However, not all variables are expected to be practical for macro-level CPMs. For example, other than local population and road-lane measures, most variables used in micro-level CPMs cannot be aggregated zonally (e.g. traffic volumes on inner roads). Still other desirable variables may not have a reasonable means to procure data, for example, variables related to driver and/or vehicle characteristics. A means of systematically evaluating these and the other variables, of the over 200 identified in Tables 2.4 through 2.7, to identify those that are most appropriate for use in macro-level CPM development, remains as a next step in this thesis.

3. METHODOLOGY FOR DATA EXTRACTION & MODEL DEVELOPMENT

3.1 Introduction

This chapter on methodology is comprised of two main sections.
In section 3.2, the data extraction methodology is described, including geographic scope, aggregation approach, variable definitions, and sources. In section 3.3, the model development methodology is described, including information on model groupings, regression technique, model form, and goodness-of-fit. It should be noted that the methodologies followed on safety applications of the developed models are included with their associated case study, in Chapters Six and Seven.

3.2 Data Extraction

As described in Chapter Two, sufficient data of sound quality is a cornerstone of well-fit and reliable statistical models. Moreover, a successful data extraction process helps to ensure that the resulting statistical associations reflect underlying causal mechanisms. Following from recommendations found in the literature, this section contains a description of the data extraction process that was followed to maximize those chances of success. After the geographic scope of the data is described, the aggregation approach is described, including choice of aggregation units, and boundary effects. The next section on variable definitions contains a description of how variables have been stratified to minimize aggregation biases, and then screened to ensure relevant, practical models (and available, quality data sources). The discussion concludes with a description of data sources for each of the four thematic variable groups (exposure, S-D, TDM, and network), and for the collision data.

3.2.1 Geographic Scope

The data extracted for model development relates to the geographic area shown in Figure 3.1, comprising the Greater Vancouver Regional District (GVRD) in the Province of British Columbia, Canada (highways are noted in red; TransLink, 2002; GVRD, 2002). The GVRD land area is roughly 3,000 square kilometres, and is comprised of 21 member municipalities.
In 1996, the GVRD population totalled nearly 2 million residents, dwelling in over 600,000 households, working in 900,000 full-time and part-time jobs, and driving over 32,000 lane-kilometres of non-highway^3 roads (Census Canada, 1996; TransLink, 2002). Most of the population has traditionally lived and worked in the western, urban communities clustered around the Central Business District (CBD). Lower-density suburban residential growth areas, rural agricultural lands, and industrial hinterlands are located to the south and east. Given this large geographic scope, the first extraction step was to decide on how to aggregate the data.

^3 In this thesis, the term highway refers to all provincial, federal, regional and civic highways with speed limits greater than 60 km/h. Conversely, non-highways refers to roads with speed limits equal to or less than 60 km/h, including arterial, collector, and local road classifications.

Figure 3.1. Greater Vancouver Regional District.

3.2.2 Aggregation

Aggregation of the data into representative neighbourhoods was done to enable planning-level or macro-level CPM development and forecasting in accord with the objectives of this thesis.

3.2.2.1 Aggregation Units

After consideration of the research objectives, geographic scope, and computational limits, the aggregation unit was based on the 577 TAZs used in the GVRD's Emme/2 transportation planning model (INRO, 2003; GVRD, 1998). The Emme/2 model is a classic four-stage (trip generation, distribution, mode, assignment) gravity-based model, and was described in Chapter 2. Conveniently, the GVRD strategic planning objectives used to establish Emme/2 TAZs coincide closely with the macro-level planning objectives for the models being developed for this research, in two ways. First, the levels of computational effort and efficiency must be reasonably balanced in the number and size of TAZs created.
In the Emme/2 software, computational effort is exponentially related to the number of traffic analysis zones. Effort is minimized by keeping the assignment of trips to the road network between generators and attractors reasonably efficient. Efficiency is optimized in the GVRD's Emme/2 model by choosing zone layouts that keep population and employment densities for each zone at roughly uniform levels regionally. The GVRD's Emme/2 TAZ sizes worked well in ensuring adequate data points in each zone, which in turn facilitated model goodness of fit for this research. Second, data quality and relevance must be maximized, and integration of disparate data sources must be facilitated. To address concerns over data quality, relevance, and integration, the GVRD's Emme/2 zone boundaries have been chosen to overlap as closely as possible with those of census tracts and municipalities. This creates an ability to obtain current and future demographic and land use data at a level of detail sufficient to populate each individual traffic zone. Figure 3.2 shows the relatively close correlation between the Emme/2 zone and census tract boundaries across the GVRD (TransLink, 2002). This TAZ overlap with census tracts and civic boundaries was also in line with research objectives to ensure a practical neighbourhood-focused planning tool. In suburban and rural future growth areas where the GVRD's Emme/2 TAZ boundaries could not be correlated as well with census tracts (due to larger rural TAZs), zonal area and land use maps were used to aggregate census data for each TAZ.

3.2.2.2 Boundary Effects

As part of the aggregation process, three assumptions had to be made concerning the influence of aggregation unit boundary choices, and the coarseness of the TAZ grid. First, Fotheringham (2000) observed that collisions located near zone boundaries may have an inter-zonal influence.
Fortunately, Ladron de Guevara et al. (2004) examined the issue, found that the number of collisions involved on boundaries was less than 5%, and did not significantly impact their results. A similar boundary distribution pattern of collisions was present with the chosen GVRD TAZs. Therefore, a similar assumption for this research, that collisions and other data geo-coded near boundaries would not be a significant influence on adjacent zones, was deemed reasonable, allowing aggregation to be based strictly on geo-coded location in relation to TAZ boundaries.

Figure 3.2. GVRD Emme/2 Model TAZ & Census Tract Boundaries in 1996.

A second aggregation assumption concerned the location accuracy of self-reported collision claims, which comprised the majority of the collision claims database. The reported collision location could have contained significant horizontal errors both along and across the road, large enough to overlap with one or more other traffic zones. Fortunately, when investigating ways to address this possible reporting error, it was noted that ICBC had already geo-coded claim locations as either mid-block or intersection, and as being only on road centrelines. In effect, ICBC has addressed this potential error by a 'split the difference' geo-coding assumption. Therefore, the location accuracy assumption, by default, was to accept the ICBC geo-coding assumption as reasonably accurate. Third, it was understood that the selection of TAZ boundaries and the geographic scale or size of the zones themselves (i.e. coarseness of the aggregation grid) directly influence the homogeneity of data. The greater the geographic scale of TAZs, the greater their data homogeneity.
Whereas the impact of TAZ size has no basic function, the scale will reflect the particular geography of the measure being analyzed. This is an area of much research regarding the Modifiable Areal Unit Problem (MAUP), and touches on how community development patterns (e.g. their efficiency) influence collision patterns (Openshaw, 1984). Based on the reasons previously stated in 3.2.2.1, the assumption in this research is that the use of the GVRD's TAZs - their boundaries and scale - is reasonable for the purposes of macro-level CPM development and identification of underlying causal mechanisms. Review of this assumption, including the effects of MAUP on macro-level CPMs, is left as an area for future research. Having decided on how to aggregate the data, the next step in data extraction was to establish the list of candidate variables that would be used in model development.

3.2.3 Variable Definitions

While proper choices of aggregation units and control of boundary effects greatly facilitated data extraction and macro-level planning, it was equally important to minimize aggregation bias. If not protected against, this bias could have screened true underlying causal mechanisms and led to misinterpretation of the resulting models. For example, despite the GVRD's Emme/2 TAZ selection objective to maintain consistent population and employment levels across zones, there was still some variance in zonal size, especially between rural and urban. Not controlling for the average area and population of zones could have led to larger zones appearing to cause more collisions simply because they have relatively more residents. Therefore, for this research, the possible effects of aggregation bias were reduced in two ways: first, by stratifying variables (e.g. creating rural and urban classes); and, second, by screening variables (e.g. using density measures versus absolute measures).
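The two safeguards just described, stratifying zones by land use and preferring density measures to absolute counts, can be sketched as follows. This is an illustration only: the `Zone` record, its field names, and the sample values are hypothetical and are not drawn from the GVRD data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Zone:
    zone_id: int
    land_use: str      # "urban" or "rural" (hypothetical classification)
    area_ha: float     # zonal area in hectares
    population: int    # absolute measure


def population_density(zone: Zone) -> float:
    """Density measure: residents per hectare, comparable across zone sizes."""
    return zone.population / zone.area_ha


def stratify_by_land_use(zones):
    """Group zones into land use classes so each class can be modelled separately."""
    strata = defaultdict(list)
    for zone in zones:
        strata[zone.land_use].append(zone)
    return dict(strata)


# Hypothetical zones: zone 1 and zone 3 have equal populations but very different areas.
ZONES = [
    Zone(1, "urban", 100.0, 5000),
    Zone(2, "urban", 200.0, 8000),
    Zone(3, "rural", 5000.0, 5000),
]

strata = stratify_by_land_use(ZONES)
densities = {zone.zone_id: population_density(zone) for zone in ZONES}
```

In this invented example the urban and rural zones with identical populations differ fifty-fold in density, which is the kind of distinction an absolute measure would hide.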
The variables screened in this research were the more than 200 previously identified in Section 2.5.3 as being in common use for transportation planning and road safety studies. Other variables that have a significant influence on road safety planning would be desirable (e.g. related to human factors and/or vehicle characteristics), but have not been well researched or quantified, and are left as a topic for future research.

3.2.3.1 Stratification

To minimize aggregation bias, stratification was performed for both independent and dependent variable sets. Stratification of data extracted for the dependent (collision) variables sought the following divisions: collision type, collision severity, collision time period, and collision location. Collision types included vehicle-vehicle, pedestrian-vehicle, and bicycle-vehicle collisions. Collision severity included fatal, injury, and PDO. Collision time periods covered all collisions that occurred in a three-year period, broken into the following periods: total, AM rush hour, PM rush hour, non-rush hour, and day of year. Location categories included: municipality, intersection, segment, and parking lot. After collisions were stratified, the next step was to stratify the explanatory variables.

Stratification of explanatory variables was done in three levels. The first level was discussed in the literature review, and comprised the four themes of exposure, S-D, TDM, and network variables. The second level of stratification sought to provide models using one of two types of exposure variables, either measured or modeled. Models based on measured variables accommodated practitioners who did not have the time or resources to access modeled exposure data. Measured data consisted of information that was obtained directly, either digitally (e.g. using GIS software and geo-statistical macros) or manually from land use, demographic, and road maps and databases.
Modeled data consisted of traffic volume, speed, and congestion output from the GVRD's Emme/2 transportation planning model. Developing two complete sets of modeled and measured macro-level CPMs also allowed for comparison of the practical usefulness and predictive accuracy of each approach. The third level of stratification, related to land uses, was based on two factors: first, the literature suggested that an indicator of predominant land use patterns would be significant in explaining collision patterns; and, second, the average size of urban and rural TAZs was significantly different. Both of these issues could have introduced a MAUP bias, as previously discussed. Initial CPM development using a land use dummy variable indicated that land use was significant as a variable. Therefore, an urban or rural stratification was also introduced as the third and final level for all variables.

After conducting stratification for both dependent and independent variables, the chances of obtaining causality-based, empirical relationships were considered to be maximized. There was still a possibility of having explanatory variables in the same model which were correlated, or with associations which were diametrically opposed. However, correlations were expected to be minimal due to the methodology followed, and were double checked during the model development process. The larger concern after stratifying each set of variables was determining which of the 220 variables previously identified in the literature review to pursue in model construction. Computationally, it was prohibitive to test each variable while at the same time pursuing models in each of the fourteen dependent-variable stratifications and sixteen independent-variable stratifications (i.e. 14 x 16 = 224 individual models).
Therefore, to keep the analytical requirements of developing over 220 possible models using over 220 possible candidate variables at a reasonable level, a screening procedure was conducted to identify the most practical candidate variables for model development.

3.2.3.2 Screening

Several screening criteria were used to ensure that only the most practical of the over 220 possible macro-level variables identified in the literature review and stratification process were selected as candidates for model development. First, the data could not be cost (or time) prohibitive to collect. Second, it had to be extractable in a relatively accurate and replicable manner. Third, variable definitions had to be relatively easy to understand so that the CPMs and CPM results would be trusted by the public, practitioners, and decision-makers. Fourth, the explanatory variables had to be predictable for use in strategic planning exercises. In other words, both current and future (i.e. predicted) values had to be obtainable. Next, they needed to be relevant and practical for use as community design, planning, and road safety descriptors. Finally, there needed to be sufficient data points, occurring in the majority of the traffic zones, over the time period of interest for this research (1996 - 1998).

With scores given based on fit to each of these criteria, each of the 220 variables in the list was assigned an overall ranking for reference in selection of candidate variables for modelling, as shown in Appendix A. Tables 3.1 and 3.2 list the top-ranked 63 variables that best fit the screening criteria, including: possible data source(s), year(s), abbreviation, units, extraction method (e.g. measured/modelled), and descriptive statistics. Once these candidate variables were screened, the data extraction sources were finalized.

Table 3.1. Candidate Variables - Collisions, Exposure, Socio-Demographic.
Collisions (non-highway) | Symbol | Method | Source | Years | GVRD Ttl | Zn Avg
Total collisions over 3 yrs | T3 | Measured | ICBC | 96 - 98 | 257,970 | 451
Severe collisions (fatal & injury) | S3 | Measured | ICBC | 96 - 98 | 63,681 | 114
Property-Damage-Only collisions | PDO | Measured | ICBC | 96 - 98 | 194,289 | 337
Rush hour collisions (6:30-9:30 am; 3-6 pm) | R3 | Measured | ICBC | 96 - 98 | 80,441 | 139
Non-Rush Collisions | NR3 | Measured | ICBC | 96 - 98 | 177,529 | 308
Total AM collisions | AM3 | Measured | ICBC | 96 - 98 | 24,850 | 43
Severe Rush Hour Collisions | RS3 | Measured | ICBC | 96 - 98 | 21,510 | 37
Bicycle/Vehicle Collisions | B3 | Measured | ICBC | 96 - 98 | 4,201 | 7
Pedestrian/Vehicle Collisions | P3 | Measured | ICBC | 96 - 98 | 2,319 | 4

Exposure | Symbol | Method | Source | Year | GVRD Ttl | Zn Avg
Average zonal speed, km/h | SPD | Modelled | TransLink | 1996 | n/a | 40
Average zonal congestion level | VC | Modelled | TransLink | 1996 | n/a | 0.3
Total (transit & auto) km's travelled - AM Pk | TTVKT | Modelled | TransLink | 1996 | 1,942,468 | 3,366
Total Auto km's travelled - AM Pk Hr | VKT | Modelled | TransLink | 1996 | 1,928,799 | 3,372
Vehicle-Distance-Travelled / person | VKTP | Modelled | TransLink | 1996 | n/a | 1.00
Total lane km - from ArcGIS | TLKM | Measured | TransLink | 1996 | 32,252 | 56
Zonal Area (Hectares) | AR | Measured | TransLink | 1996 | 303,000 | 3.5

Socio-Demographics | Symbol | Method | Source | Year | GVRD Ttl | Zn Avg
Urban Zones | URB | Measured | TransLink | 1996 | 479 | n/a
Rural Zones | RUR | Measured | TransLink | 1996 | 93 | n/a
Population (#) | POP | Measured | Census | 1996 | 1,923,000 | 3,300
Population Density (= POP/AR) | POPD | Measured | Census | 1996 | n/a | 30
Participation in labour force [= (EMP + UNEMP)/PO15] (%) | PARTP | Measured | Census | 1996 | n/a | 67%
Employed residents (over age 15) (#) | EMP | Measured | Census | 1996 | 908,000 | 1,600
Employed (= EMP/PO15) (%) | EMPP | Measured | Census | 1996 | n/a | 62%
Employee Density (= EMP/AR) | EMPD | Measured | Census | 1996 | n/a | 3.0
Zone jobs in tourism, retail, govt, const | WKG | Measured | Census | 1996 | 306,000 | 500
Zone jobs per capita (= WKG/POP) | WKGD | Measured | Census | 1996 | n/a | 0.16
Unemployed Residents (over age 15) (#) | UNEMP | Measured | Census | 1996 | 85,000 | 150
Unemployment rate [= UNEMP/(UNEMP + EMP)] (%) | UNEMPP | Measured | Census | 1996 | n/a | 9%
Average income $ | INCA | Measured | Census | 1996 | n/a | $27,900
Median income $ | INCM | Measured | Census | 1996 | n/a | $21,500
Average zonal family size | FS | Measured | Census | 1996 | 64,000 | 3.3
Homes (#) | NH | Measured | Census | 1996 | 673,000 | 1,200
Home Density (= NH/AR) | NHD | Measured | Census | 1996 | n/a | 2.2

Population aged 15 & over (PO15) was 1,476,980 in 1996.

Table 3.2. Candidate Variables - Transportation Demand Management, Network. (See note 4.)

TDM | Symbol | Method | Source | Year | GVRD Ttl | Zn Avg
Total commuters from each zone | TCM | Measured | Census | 1996 | 804,857 | 1,407
Commuter density = TCM / hectare | TCD | Measured | Census | 1996 | n/a | 13
Core area = area w/o major rds x 10^-2 km^2 | CORE | Measured | GVRD | 1996 | 1,242 | 2.2
Core area = CORE / AR (%) | CRP | Measured | GVRD | 1996 | n/a | 29.1%
Shortcut capacity on local road(s), vph | SCC | Measured | TransLink | 1996 | 3,013 | 5
Shortcut 'attractiveness' (= SCC x VC) | SCVC | Modelled | TransLink | 1996 | 1,032 | 2
No. of commuters by biking (%) | BIKE | Measured | Census | 1996 | n/a | 0.7%
No. of commuters by walking (%) | WALK | Measured | Census | 1996 | n/a | 3.3%
No. of commuters as car passengers (%) | PASS | Measured | Census | 1996 | n/a | 7.0%
No. of commuters by transit (%) | BUS | Measured | Census | 1996 | n/a | 8.7%
Mode split - drivers (= DRIVE/TCM) (%) | DRP | Measured | Census | 1996 | n/a | 79.4%
Bus stops (#) | BS | Measured | TransLink | 2002 | 26,000 | 45
Bus Stop Density (/Ha) | BSD | Measured | TransLink | 2002 | n/a | 0.13
Vehicle Occupancy | OC | Measured | Field Trip | 1996 | n/a | 1.09
No. of commuters driving from zone | DRIVE | Measured | Census | 1996 | 573,020 | 1,002

Network | Symbol | Method | Source | Year | GVRD Ttl | Zn Avg
No. of signals | SIG | Measured | TransLink | 1996 | 1,240 | 2
Signal density (/Ha) | SIGD | Measured | TransLink | 1996 | n/a | 0.029
No. of intersections | INT | Measured | TransLink | 1996 | 28,851 | 50
Intersection density = INT / AR (Ha) | INTD | Measured | TransLink | 1996 | n/a | 0.39
No. of intersections / TLKM (INT/TLKM) | INTKD | Measured | TransLink | 1996 | n/a | 0.94
No. of 3-way intersections / INT (%) | I3WP | Measured | TransLink | 1996 | 15,918 | 53
No. of arterial - local intersections / INT (%) | IALP | Measured | TransLink | 1996 | 4,429 | 17
Soft Horizontal Curve (< 45 degrees) | SB | Measured | TransLink | 1996 | 4,600 | 18
Hard Horizontal Curve (> 45 degrees) | HB | Measured | TransLink | 1996 | 5,650 | 15
No. of Arterial lane-km | ALKm | Measured | TransLink | 1996 | 9,400 | 15.8
No. of Collector lane-km | CLKm | Measured | TransLink | 1996 | 10,000 | 19.9
No. of local lane-km | LLKm | Measured | TransLink | 1996 | 13,000 | 26.5
No. of Arterial lane-km / TLKM (%) | ALKP | Measured | TransLink | 1996 | n/a | 34.9%
No. of Collector lane-km / TLKM (%) | CLKP | Measured | TransLink | 1996 | n/a | 26.5%
No. of local lane-km / TLKM (%) | LLKP | Measured | TransLink | 1996 | n/a | 38.6%

Note 4: The Soft/Hard Horizontal Curve definition of 45 degrees refers only to the actual change in direction; it does not refer to a geometric design standard (e.g. where a 40 degree curve refers to a 43.7 m radius).

3.2.4 Extraction Sources

As part of the screening process, possible extraction sources had to be considered in order to confirm data costs, availability, quality, and predictability ratings for candidate variable rankings. For example, regardless of how valuable a variable might have been for macro-level CPMs, the variable could not be used if there was no source from which to extract its supporting data or forecast future values. Based on these reviews, one or more data sources have been identified for each of the four variable themes and for the collision variables, as described below.

3.2.4.1 Network Variables

Network variables consisted entirely of measured data, collected both digitally and manually.
The GVRD transportation authority, known as TransLink, provided digital files of roads, signals, TAZ boundaries, census tract boundaries, and transit routes, which were aggregated using geo-statistical GIS software (TransLink, 2002). Manual aggregations were performed to extract data on intersection types, road geometry, and road-lane-kilometres. While intersection types and geometry could be derived from mapping, the lane-kilometre data required knowledge of the number of lanes in each direction for all roads in each zone. Local roads were assumed to be one lane in each direction. For collector and arterial roads, prior knowledge, augmented by visual examinations, was used to assign laning.

3.2.4.2 Exposure Variables

Exposure variables consisted of either measured or modeled data. TransLink (2002) provided Emme/2 model output files for modeled exposure variables. As the GVRD's Emme/2 model had been constructed in 1996 to model only morning rush hours, PM exposure data was not available. The files contained aggregated data on travel forecasts for a typical morning rush hour across the GVRD, including zonal totals for vehicle-kilometres-travelled (VKT), transit-kilometres-travelled (TKT), average zonal speed (SPD), and average zonal congestion (VC - Volume/Capacity). It was recognized that the accuracy of the modeled exposure data left much to be desired. Lin & Navin (1999) have shown that Emme/2 predictions can contain errors of up to 30%, despite the TransLink practice of calibrating the GVRD Emme/2 model forecasts to within 10% using regional screenlines. This suggests that in individual TAZs, one might expect errors in modeled exposure data of between 10% and 30%.
However, this was felt reasonable for the purposes of this research for two reasons: first, the error was expected to wash out across zones and the region without introducing a systematic bias; and, second, it was consistent with methodology followed elsewhere (Hadayeghi et al., 2002), and therefore the results would be at least comparable with other macro-level CPMs.

In addition to using only modeled exposure data from the AM time period, only non-highway data was used. On the premise that limited-access highways had no causal associations with zonal traffic patterns, all data related to limited-access highways were excluded. With Emme/2 transportation planning model software, this exclusion can be done using each modelled link's associated volume-delay function, which contains an indicator of that link's posted speed. Limited-access highways in the GVRD typically have speed zones exceeding 60 km/h. Having excluded highway data, VKT and TKT were extracted by summing, over all links beginning in, ending in, or passing through a zone, the product of each link's forecast traffic volume and the portion of the link's length lying within the subject zone. Figure 3.3 (TransLink, 2002) shows the density of total transit and vehicle kilometres travelled (TTVKT = VKT + TKT) across the GVRD, showing higher travel densities in downtown areas. Average zonal speed, SPD, was an average of all modelled link speeds (link length / link travel time) in each zone. Average zonal congestion, VC, was the average of all modelled link v/c values in that zone.

Figure 3.3. GVRD Travel Demand Density. (Map legend: TTVKT per hectare, in classes 0 - 12.3, 12.4 - 29.9, 30 - 55.7, 55.8 - 97.8, 97.9 - 198.)

3.2.4.3 Socio-Demographic Variables

All S-D variables were measured, and derived from Census Canada databases. The most recent census data available had been aggregated by enumeration area and census tract for the census years 1996 and 2001.
As the 2001 census occurred during a transit strike in Vancouver, rendering its data unusable for the purposes of this research, the census year 1996 was chosen as the base year for this research. Fortunately, collision data and Emme/2 model output data were also available for this time period. Census Canada data was derived directly from resident interviews done during July 1996 across the region. Figure 3.4 (GVRD, 2002; Census Canada, 1996; TransLink, 2002) shows the regional distribution of population density (POPD), with the highest densities shaded the darkest and focused around regional activity centres, and the lowest densities shaded the lightest and located in the outlying rural areas outlined in blue.

Figure 3.4. GVRD Population Density.

Figure 3.5 (Census Canada, 1996; TransLink, 2002) reveals the spatial distribution of jobs (government, tourism, construction, & retail) formulated in two ways. First, job density per hectare (WKGAD) shows that jobs are generally clustered in regional activity centres. Second, the job to resident ratio (WKGD) paints a much different picture, where neighbourhoods with low jobs per hectare may still have a very high job to resident ratio due to low resident population, as at airports (e.g. see Figure 3.5b) and in agricultural zones. It should be noted that zonal job data in CPM development were based on only the four major sectors: government, tourism, retail, and construction. Although data for other sectors were available, most significantly for education and health, they were not included in order to avoid possible aggregation bias errors. The concern was whether the Census Canada results were reported as being in the zone in which corporate offices were situated, or as being in the zone in which individual job sites were located.
For example, the Vancouver school board had only several hundred of their 6,000 listed employees located in their district office, with most teaching at school sites dispersed in zones across the city. The risk from omitting this data (i.e. reduced model goodness of fit from fewer data points) was judged to be more than offset by the reduction in risk of possible bias from including it. In the worst case, had there been no significant bias from including it, model fit would have improved, with no significant change in model parameters.

Figure 3.5. GVRD Job Density: (a) By Area; (b) By Population. (Vancouver International Airport is labelled on the map.)

3.2.4.4 Transportation Demand Management Variables

Data for all but one TDM variable was measured, and came from several sources. Census data was extracted on the number of commuters (TCM), broken down by choice of travel mode (i.e. DRIVE, PASSENGER, TRANSIT, BIKE, WALK). Using this data, commuter densities and auto occupancy (OCC) could also be calculated. Also, based on Dutch SRS recommendations in the literature, the GVRD and TransLink databases were accessed for two additional variables. First, neighbourhood core area (CORE), defined by van Minnen (1999) as the largest portion of the traffic zone area not bisected by major roads, was derived by visual examination of each of the 577 neighbourhood maps overlaid with land use, roads, and zonal boundaries. To ease data extraction effort and promote consistent variable definitions, the CORE values were simplified to a relative percentage of total zone area. Second, two shortcut-related variables were defined as part of this thesis - shortcut capacity (SCC) and shortcut attractiveness (SCVC) - to provide a zonal descriptor of the neighbourhood's access structure or road network.
These shortcut-related variables followed from Dutch SRS guidelines (CROW, 1998) recommending that access or local roads be used for lower-speed, local traffic only, and not for higher-speed, through traffic. Therefore, neighbourhood access structure could be used as a proxy to describe the potential for penetration, speed, and volume of shortcutting traffic on local roads. The two shortcut-related definitions were drawn intuitively by the author, recognizing that shortcutting in neighbourhoods usually occurs when a semi-direct route through a neighbourhood matches commuter desire lines, and/or when travel demand on perimeter arterials exceeds capacity (e.g. during congested rush hours). If there are no local roads connecting across the zone, or if traditional shortcut routes have been traffic calmed to effectively reduce speeds and directness, the shortcut capacity is considered zero. Conversely, semi-direct, non-traffic-calmed roads matching peak period desire lines create a high shortcut capacity. The second shortcut-related variable, shortcut attractiveness (SCVC), utilized Emme/2 congestion (VC) output data to enhance the measured definition (SCC), and to capture the effect of the growing temptation to try a shortcut route through the neighbourhood during periods of high congestion on its perimeter arterials. Based on these intuitively derived shortcut definitions, mathematical definitions were constructed and are presented in Equations 3.1 and 3.2 below. Data values were assigned to each shortcutting variable by visual examination of neighbourhood street network, land use, and zone boundary maps, augmented by site visits to each neighbourhood.
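As a rough illustration of how Equations 3.1 and 3.2 (given below with their symbol definitions) might be evaluated for a single zone, the following sketch multiplies the factors as the equations are printed. The function and argument names are assumptions made for this example, and the zone values are hypothetical rather than drawn from the GVRD data.

```python
# Typical local road capacity assumed in the thesis (veh/lane/hr).
LOCAL_LANE_CAPACITY = 150

# Degree of zonal traffic calming, following the three levels defined for
# Equation 3.1: fully calmed, some calming, no calming.
CALMING_FACTOR = {"calmed": 0.0, "some": 0.5, "none": 1.0}


def shortcut_capacity(lanes_per_direction, two_way, roads_ns, roads_ew, calming):
    """Shortcut capacity (Eq. 3.1 as printed): lanes x directions x lane
    capacity x number of through local roads x calming factor."""
    directions = 2 if two_way else 1
    through_roads = roads_ns + roads_ew
    return (lanes_per_direction * directions * LOCAL_LANE_CAPACITY
            * through_roads * CALMING_FACTOR[calming])


def shortcut_attractiveness(scc, vc):
    """Shortcut attractiveness (Eq. 3.2): capacity scaled by zonal congestion."""
    return scc * vc


# Hypothetical zone: one two-way through road in each direction, some calming,
# and an average zonal volume/capacity ratio of 0.5.
scc = shortcut_capacity(1, True, roads_ns=1, roads_ew=1, calming="some")
scvc = shortcut_attractiveness(scc, vc=0.5)
```

A fully traffic-calmed zone, or one with no through local roads, yields a capacity of zero, which matches the verbal definition above.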
$$SCC = L \cdot W \cdot C_F \cdot (R_{NS} + R_{EW}) \cdot C_{TC} \tag{3.1}$$

$$SCVC = SCC \cdot VC \tag{3.2}$$

where:
SCC = Shortcutting capacity,
SCVC = Shortcutting attractiveness,
L = Average number of local road lanes in each direction,
W = 1 for one-way, = 2 for two-way,
C_F = Typical local road capacity (assumed as 150 veh/lane/hr),
R_{NS}, R_{EW} = Number of (north-south, east-west) local roads running completely across the zone, combined as (R_{NS} + R_{EW}),
C_{TC} = Degree of zonal traffic calming: = 0 if traffic calmed; = 1 if no calming; = 0.5 if some traffic calming,
A_r = Zonal area, and,
VC = Average zonal congestion level.

3.2.4.5 Collision Variables

After considering the quality of collision data available from municipal and police databases, the Insurance Corporation of BC (ICBC) was chosen as the primary collision data source for three reasons. First, police-reporting practices (for example, in the City of Vancouver, police reports are filed only where paramedics - ambulance, fire - have been called), and self-reporting practices (required only for damage above $1,000) jeopardized collision data availability at a municipal level. Second, ICBC handled all auto insurance collision claims in the GVRD, and centrally warehoused the collision claims data. De Leur & Sayed (2001) have shown that claims data can be used to develop well-fit CPMs in lieu of police and self-reported collision data. Third, ICBC was able to provide geo-coded files for over 250,000 non-highway collision claims in the GVRD for the years 1996, 1997, and 1998 (ICBC, 2003). These three years of data were used to reduce randomness and RTM bias in the data (i.e. summing over three years of data versus just one year of data tends to have a smoothing effect), and to quantify collision types for which data were typically sparse (e.g. severe, bicycle, and pedestrian collision claims).
The availability of actual geo-coded claim data from ICBC was considered a great advance in overcoming many of the traditional unreported, unattended, and/or incomplete (i.e. self-reported) collision data problems associated with local municipal collision databases, and with CPM development. Several jurisdictions across North America have centrally warehoused collision databases; more are expected to follow. The high number of GVRD collision claims (257,970 over 797 km² in 3 years) relative to that in Toronto (53,286 over 633.9 km² in 1996) suggested that under-reporting may have occurred in the Toronto collision data. Both Ladron de Guevara et al. (2004) and Hadayeghi et al. (2003) made reference to this problem in their discussions of model results. In addition to facilitating this research, this central access to the large ICBC database is expected to be beneficial for other metropolitan areas in two ways. First, larger sample sizes and improved data quality from ICBC have facilitated improved collision prediction model stratification, calibration, and accuracy. Second, the resulting models facilitate transferability to, and/or research in, other time-space regions (Sayed & Sawalha, 2005b).

Figure 3.6 shows the spatial distribution of collision density (ICBC, 2003; TransLink, 2002). There is relatively little difference in patterns between the total and severe collision densities, confirming earlier results by Hadayeghi et al. (2003). The collision density patterns across the region are very similar in many respects to the travel demand and population density patterns shown in Figures 3.3 and 3.4, respectively.

Figure 3.6. GVRD Collision Densities. (Panel b: Severe Collision Density.)

3.3 Model Development

With candidate variables screened, and data extracted, model development began on a group by group basis.
Subject to the choice of model form, the model development methodology essentially followed the GLM regression method and forward stepwise procedure described in Sayed & Sawalha (2005a). The method is summarized below.

3.3.1 Groupings

Table 3.3 shows the sixteen model groups derived from the following explanatory data stratification levels:
• Four themes of explanatory variables (Exposure, S-D, TDM, and Network);
• Two land use types (93 rural traffic zones, 479 urban traffic zones);
• Two data derivations (modelled or measured).

Table 3.3. Model Groups

Theme | Land Use | Derivation | Group #
Exposure | Urban | Modelled | 1
Exposure | Urban | Measured | 2
Exposure | Rural | Modelled | 3
Exposure | Rural | Measured | 4
Socio-Demographic | Urban | Modelled | 5
Socio-Demographic | Urban | Measured | 6
Socio-Demographic | Rural | Modelled | 7
Socio-Demographic | Rural | Measured | 8
Transportation Demand Management | Urban | Modelled | 9
Transportation Demand Management | Urban | Measured | 10
Transportation Demand Management | Rural | Modelled | 11
Transportation Demand Management | Rural | Measured | 12
Network | Urban | Modelled | 13
Network | Urban | Measured | 14
Network | Rural | Modelled | 15
Network | Rural | Measured | 16

3.3.2 Regression

For each model developed in each group, the GLM regression method was the same, and followed that of micro-level CPM development closely. The generalized linear regression method had the advantage of overcoming the limitations associated with the use of conventional linear regression in modeling traffic collisions (Hauer et al., 1988; Sawalha & Sayed, 2001). The generalized linear regression modeling software, GLIM4, from NAG (1994) was used, with three user-specified functions. First, a logarithmic link function was used for the linear transformation. Second, the maximum likelihood estimation (MLE) method was chosen to provide parameter estimates, as described in Sawalha & Sayed (2005a). Third, the error structure of the models was assumed to follow the negative binomial distribution as found by Hauer et al. (1988), Kulmala (1995), and Miaou (1996), among others. A sample of a GLIM4 output file is given in Appendix B.
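To make the negative binomial error assumption concrete, the sketch below evaluates the log-likelihood that a maximum likelihood fit would seek to maximize. It is not the GLIM4 procedure itself; it assumes the common parameterization in which each count has mean mu and shape parameter kappa, so that the variance is mu + mu^2/kappa, and the function name is introduced here only for illustration.

```python
import math


def nb_log_likelihood(y, mu, kappa):
    """Negative binomial log-likelihood of observed counts y, given predicted
    means mu and a shape parameter kappa (variance = mu + mu**2 / kappa)."""
    total = 0.0
    for y_i, mu_i in zip(y, mu):
        total += (math.lgamma(y_i + kappa) - math.lgamma(kappa)
                  - math.lgamma(y_i + 1)
                  + kappa * math.log(kappa / (kappa + mu_i))
                  + y_i * math.log(mu_i / (kappa + mu_i)))
    return total


# Probability of observing zero collisions in a zone with mean 1 and kappa 1.
p_zero = math.exp(nb_log_likelihood([0], [1.0], 1.0))
```

As kappa grows large the variance approaches the mean, and the distribution approaches the Poisson case.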
3.3.3 Form

To allow for both measured and modeled groupings while still providing zero risk logic, the model form used built on, but is more generic than, that used by earlier research (Sawalha & Sayed, 2001, 2005a; Hadayeghi et al., 2003; Ladron de Guevara et al., 2004). In this thesis, the model form used is shown in Equation 3.3 as:

$$E(\Lambda) = a_0 Z^{a_1} e^{\sum_i b_i x_i} \tag{3.3}$$

where:
E(\Lambda) = Predicted mean collision frequency;
a_0, a_1, b_i = GLM derived parameter estimates;
Z = Exposure variable (VKT for modeled, TLKM for measured); and,
x_i = Independent, explanatory variables (e.g. CORE, SCC, POPD, VC, etc.).

The log-linear transformation was carried out using a logarithmic linking function in the GLM software, transforming Equation 3.3 into:

$$\ln[E(\Lambda)] = \ln(a_0) + a_1 \ln(Z) + \sum_i b_i x_i \tag{3.4}$$

Using this regression method and model form, model development was then conducted to optimize goodness of fit.

3.3.4 Goodness of Fit

The method used to evaluate variables and refine overall model goodness of fit is described in McCullagh and Nelder (1989), with an updated description pertaining specifically to CPMs contained in Sayed & Sawalha (2005a). Following from those descriptions, GLIM4 software was programmed to provide statistics for overall model fit (i.e. Scaled Deviance, Pearson \chi^2, \kappa), and for individual parameter fit (i.e. standard error, t-value, collinearity, logic). The Pearson \chi^2 and Scaled Deviance (SD) statistics have previously been defined in Equations 2.33 and 2.34, respectively, repeated for convenience below.

$$\text{Pearson } \chi^2 = \sum_{i=1}^{n} \frac{[y_i - E(\Lambda_i)]^2}{Var(y_i)} \tag{2.33}$$

$$SD = 2 \sum_{i=1}^{n} \left[ y_i \ln\left(\frac{y_i}{E(\Lambda_i)}\right) - (y_i + \kappa) \ln\left(\frac{y_i + \kappa}{E(\Lambda_i) + \kappa}\right) \right] \tag{2.34}$$

These statistics provide objective measures, and are asymptotically \chi^2 distributed with n-p degrees of freedom. For these and all other descriptive statistics, 95% was the desired level of confidence used to assess goodness of fit.
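The model form and the two fit statistics above can be sketched in a few lines. This is an illustrative reading of Equations 3.3, 2.33 and 2.34, not the GLIM4 implementation: the function names are hypothetical, and the variance is assumed to take the usual negative binomial form Var(y) = E + E^2/kappa, which is defined elsewhere in the thesis and not in this section.

```python
import math


def predict(a0, a1, z, b, x):
    """Predicted mean collisions: a0 * Z**a1 * exp(sum of b_i * x_i) (Eq. 3.3)."""
    return a0 * z ** a1 * math.exp(sum(b_i * x_i for b_i, x_i in zip(b, x)))


def pearson_chi2(y, mu, kappa):
    """Pearson chi-square (Eq. 2.33), assuming Var(y) = mu + mu**2 / kappa."""
    return sum((y_i - m_i) ** 2 / (m_i + m_i ** 2 / kappa)
               for y_i, m_i in zip(y, mu))


def scaled_deviance(y, mu, kappa):
    """Scaled deviance (Eq. 2.34); the y*ln(y/mu) term is taken as 0 when y = 0."""
    total = 0.0
    for y_i, m_i in zip(y, mu):
        first = y_i * math.log(y_i / m_i) if y_i > 0 else 0.0
        second = (y_i + kappa) * math.log((y_i + kappa) / (m_i + kappa))
        total += first - second
    return 2.0 * total
```

Both statistics are zero when every observed count equals its predicted mean, and grow as predictions depart from observations; each is then compared with the chi-square critical value for n-p degrees of freedom.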
Model assessment was done in two stages: first, as candidate explanatory variables were selected to construct the model, and second, once explanatory variable selection was completed.

3.3.4.1 Selection of Explanatory Variables

In the first stage, explanatory variable selection followed a forward stepwise procedure, as described in Sawalha & Sayed (2005a). In this procedure, each model was constructed by adding one variable at a time, and testing the change in model fit due to the added variable. The first variable tested in each model was the exposure variable, as recommended by Sawalha & Sayed (2005a) and others (Miaou, 1996), due to its usually dominating prediction influence. Additional candidate variables were then systematically drawn from the lists in Tables 3.1 and 3.2. The decision to retain a variable in the model was based on four criteria. First, the logic (i.e. +/-) of the estimated parameter had to be associated intuitively with collisions. Second, the parameter estimate t-statistic had to be significant at the 95 percent confidence level (i.e. > 1.96). Third, the addition of the variable to the model should have caused a significant drop in the Scaled Deviance at the 95 percent confidence level (i.e. > 3.84). Fourth, the variable had to show little or no correlation with any of the other independent variables. For example, transit mode split would be the complement of, and therefore highly correlated with, auto mode split; hence, only one mode split variable (i.e. auto or transit) could be included in any one CPM. Correlation between variables was checked by viewing correlation results in the GLM software. Once explanatory variables were selected for a model, overall model goodness of fit was re-assessed using the SD, Pearson χ², and κ measures. If this second stage resulted in a poor overall fit, then model refinement was pursued.
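The four retention criteria can be expressed as a single check applied after each trial addition in the forward stepwise procedure. The sketch below is only an illustration: the function name and arguments are hypothetical, and the correlation limit of 0.5 is an assumed placeholder, since the text requires "little or no correlation" without stating a numerical threshold.

```python
T_CRITICAL = 1.96        # t-statistic threshold at the 95% confidence level
SD_DROP_CRITICAL = 3.84  # chi-square critical value, 1 degree of freedom, 95%


def retain_variable(expected_sign, estimate, t_stat, sd_before, sd_after,
                    max_abs_corr, corr_limit=0.5):
    """Return True if a candidate variable meets all four retention criteria.

    expected_sign: +1 or -1, the intuitive direction of association.
    max_abs_corr: largest absolute correlation with variables already in the
    model. corr_limit is an assumed value, not one given in the thesis.
    """
    sign_ok = estimate * expected_sign > 0
    t_ok = abs(t_stat) > T_CRITICAL
    sd_ok = (sd_before - sd_after) > SD_DROP_CRITICAL
    corr_ok = max_abs_corr < corr_limit
    return sign_ok and t_ok and sd_ok and corr_ok


# Hypothetical trial: positive coefficient as expected, t = 3.2, SD drop of 15.
keep = retain_variable(+1, 0.8, 3.2, sd_before=520.0, sd_after=505.0,
                       max_abs_corr=0.2)
```

A variable failing any one criterion is dropped before the next candidate is tried.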
3.3.4.2 Model Refinement

If the final stage of model goodness of fit assessment revealed statistics which did not meet expectations, model refinement was carried out to try to improve that fit. Having followed the recommended development method in the first stage, the cause of a poor fit for the resulting model pointed to some aspect of data quality. Therefore, the refinement method focused on an assessment of data quality, and in particular, whether there were any outliers in the data set. Outlier analysis using the Cook's Distance (CD) technique was done, following the method described in Sawalha & Sayed (2005b). It was used to identify and delete those data points (i.e. TAZs) with the highest CD, previously defined in Equation 2.38. High CD values indicate points that are very likely outliers. The analysis was done in stepwise progression, removing the points with largest CD values first. As each high CD data point was removed, the GLM software was re-run while fixing the value of κ at its previous value to test if the point was an outlier. This resulted in a new SD statistic value being calculated based upon the revised database (i.e. with the candidate outlier removed). If the new SD statistic value dropped significantly when compared to its original value, the removed data point was considered an outlier and remained excluded. At the 95% confidence level, a drop in SD > 3.84 per data point deleted was considered to be significant. This outlier search procedure was repeated until the drop in SD was < 3.84, indicating insignificance. Once outliers were removed from the dataset, the GLM software was re-run, setting κ = 0, to determine new estimates for each parameter and for κ, and new descriptive statistics. If the improvement in fit was enough to bring all quantitative assessment statistic values into line with targeted values (e.g. SD and Pearson χ² below the critical χ² value; t-statistics > 1.96), the model was considered well-fit and ready for use in safety applications.
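The stepwise outlier search described above can be sketched as a simple loop. This is a simplified illustration rather than the GLIM4 procedure: the refit is represented by a caller-supplied function `sd_without`, a hypothetical name standing for "refit with κ held fixed and return the scaled deviance", and the toy numbers are invented for the example.

```python
SD_DROP_CRITICAL = 3.84  # chi-square critical value, 1 degree of freedom, 95%


def screen_outliers(ranked_points, sd_without, max_removed):
    """Remove zones in order of decreasing Cook's distance while each removal
    lowers the scaled deviance by more than the critical value.

    ranked_points: zone ids sorted by Cook's distance, largest first.
    sd_without(removed): scaled deviance of the model refit (kappa fixed)
    after excluding the zones listed in `removed`.
    max_removed: cap on the number of zones that may be truncated.
    """
    removed = []
    sd_current = sd_without(removed)
    for point in ranked_points[:max_removed]:
        sd_trial = sd_without(removed + [point])
        if sd_current - sd_trial <= SD_DROP_CRITICAL:
            break  # drop is insignificant, so the search stops here
        removed.append(point)
        sd_current = sd_trial
    return removed


# Toy stand-in for a model refit: each zone contributes a fixed amount to SD.
contribution = {"zone_a": 10.0, "zone_b": 6.0, "zone_c": 1.0}


def toy_sd(removed):
    return 600.0 - sum(contribution[p] for p in removed)


outliers = screen_outliers(["zone_a", "zone_b", "zone_c"], toy_sd, max_removed=10)
```

In the toy case the first two zones each reduce SD by more than 3.84 and are excluded, while the third does not and ends the search. After the loop, the model would be re-estimated on the reduced data set.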
It should be noted that while truncation of outliers is common practice for single-facility micro-level CPMs, its use has not been documented before in development of macro-level CPMs, which aggregate over multiple locations. Therefore, to reduce the possible introduction of bias due to outlier truncation aggregated over several areas while at the same time pursuing some model refinement, in this methodology outlier truncation was limited to an average of seven (for rural) and ten (for urban) TAZs per developed CPM.

3.4 Summary

Based on methodology recommended in the literature, data was extracted and macro-level collision prediction models were developed in a manner intended to maximize data quality and model usefulness. Two precautions were taken during data extraction to facilitate successful models. First, errors due to aggregation bias were minimized through stratification of independent variables into the sixteen model groupings shown in Table 3.3. Second, to ensure practical, relevant models and ease of data extraction for practitioners, an evaluation framework was employed to screen the list of 220 possible variables in Appendix A to identify candidates for testing in model development. The sixty-three candidate variables are shown in Tables 3.1 and 3.2. Based on these candidate variables, several data extraction sources were identified and have been described, including geo-coded databases obtained from ICBC (1996 - 1998), TransLink (1996), GVRD (1996), and Census Canada (1996), with data years noted in brackets. Using this data and a generic macro-level CPM form, a GLM regression process was followed, with negative binomial error distribution, forward stepwise explanatory variable selection procedure, outlier analysis, and goodness of fit criteria as described in Sawalha & Sayed (2005a, b). A sample of a GLM software output file has been given in Appendix B.
The recommended goodness of fit assessment measures were followed to confirm model fit with a 95% level of confidence. Having developed the macro-level CPMs in accord with recommended procedures, with adjustments to address methodological and bias error concerns, the resultant models have been presented in Chapter Four.

4. MODEL DEVELOPMENT RESULTS

4.1 Introduction

Based on the data extraction and model development work described in Chapter Three, the resulting models are discussed in this chapter, in three parts. In section 4.2, model stratifications are presented, including modeled and measured, urban and rural, and collision types successfully developed in each of the original sixteen groupings. In section 4.3, statistical associations are presented, categorized according to the four major model groups (i.e. exposure, S-D, TDM, and network). In section 4.4, an analysis of possible causal mechanisms is presented under each of those same four groups.

4.2 Stratified Results

Model development was conducted in each of sixteen different model groupings (see Table 3.3) using the following data stratifications:
o Four variable themes (exposure, S-D, TDM, and network);
o Two land uses (93 rural traffic zones, 479 urban traffic zones); and,
o Two exposure variable sources (modeled or measured).

In each of the sixteen model groupings, CPMs were developed for at least one of the nine collision types (Total, Severe, PDO, AM, AM/PM, AM/PM-Severe, Non-Rush, Bicycle, Pedestrian). Forty-seven macro-level CPMs were successfully fit, as listed in Tables 4.1 to 4.6 together with descriptive statistics. In the first four tables (Tables 4.1 to 4.4), thirty-five total and severe collision-type CPMs are presented. In Table 4.5, eight CPMs for other collision types are presented, including AM, AM/PM, Non-Rush, and Pedestrian collisions. In Table 4.6, four 'Integrated' CPMs for AM, AM/PM-Severe, Bicycle, and Non-Rush collision types are presented.
Integrated models are those having explanatory variables from more than one of the four data themes (Exposure, S-D, Network, TDM). Although the fits are generally in accordance with goodness of fit criteria, in a small number of models the intercept in the log-linear fit equation, a0, has a low t-statistic (<1.96), suggesting that the constant term leading the CPM may not significantly differ from 1.0 or influence the model. This does not pose significant difficulty since the parameter is the intercept (i.e. a constant, not a variable), and the insignificance is marginal (similar results were noted by Hadayeghi et al., 2003).

Table 4.1. Exposure CPMs - Total/Severe.
Model Group # Pearson Model Form Urban, Modeled, Exposure K DoF Total Collisions!3yr = l.952VKT 0.6889 1.47 1.39 Severe Collisions / 3yr = 0.0S49VKT -0.898 1.7 Total Collisions 13yr = 1.15VKT 0.685 L45vc 1.5 Severe Collisions 13yr = 0.16154FAT 0 7 2 6 5 e 2 m v c 2 Urban, Measured, Exposure Total Collisions 13yr = 92 ASTLKM0A32] 3 _ Rural, Modeled , Exposure 1.2 2.0 Total Collisions/3yr = 0.35736VKToim 1.6 83 83 Severe Collisions 13yr = 0.01082FAT0 9 8 7 4 2.4 Total Collisions 13yr = 0.32368VKT0Mn J m v c 81 Severe Collisions/3yr = 0.01863FAT 4 Rural, Measured, Exposure 0 8672 2.032 vc e 1.6 Total Collisions/3yr = \.92TLKM 0.1549 1.0 85 Severe Collisions 13yr = 0 . 0 6 7 5 M M 1 3 6 9 X 459 495 77 68 79 67 63 94 92 SD X t-Statistics 0.05, dof 470 542 525 467 462 529 522 518 508 510 461 447 520 512 518 470 530 518 93 95 105 105 87 101 94 103 90 99 103 108 Constant =1.8 v k t = 15 Constant = - 6 v k t = 18 Constant = 0.4 vk t= 13 vc = 5 Constant = - 4 v k t = 12 vc = 7 Constant = 18 tlkm = 7 Constant = - 2 vkt = 11 Constant = - 6 vkt= 10 Constant = - 2 vkt = 10 vc = 4 Constant = - 5 vkt = 8 vc = 2 Constant = - 3 t lkm = 9 Constant = - 3 t lkm = 7

Table 4.2. Socio-Demographic CPMs - Total/Severe.
Model Group # Model Form 5 Urban, Modeled, Socio-Demographic K DoF Pearson 2 X x S D 0.05, dof t-Statistics Total Collisions /3yrs=l.822VKT 0.818 Severe Collisions 13 yrs=0.2613VKT1 2.0 454 461 500 505 (0.853vc + 0.00401 wkgd +0.004924 popd -0.5359 fs) e 1.7 457 411 515 508 18509 (l.595vc+ 0.00315w*grf + 0.004599poprf-0.5217/v) Const 1.1 vkt 16 vc 3 wkgd 3 popd 4 fs - 4 Const-2 vkt 15 vc 5 wkgd 2 popd 4 fs -4 Urban, Measured, Socio-Demographic 1.6 463 508 518 514 rp . i /~i 7 / • • in 7 / n n C T r F l / » 8 2 1 8 (0.007462po/>d + 0.06295w«em />- 0.743/v) Total Collisions l 3 yrs-14..2175TLKM -Q \ 1.3 464 437 532 515 Severe Collisions/3yrs=8.3645TLKM0™2 .e(°mSp"P<+™™~"-Const 9t lkm 12 popd 6 unemp 7 fs -5 Const 4 t lkm 12 popd 5 unemp 8 fs -3 Rural, Modeled, Socio-Demographic 2.7 77 rr> . i /~-r 77• • A T 1 1 T/f"T<>-6344 (2.409ve + 0.3529nW) Total Collisions 13 yrs=0.31WKT g 1.9 79 67 64 o / - 77- • li A A 1 T T ^ T - 0 8 5 7 9 (l.705vc + 0.3622n/irf) Severe Collisions 13yrs=0.0\7VKT Q 88 93 98 101 Const - 3 vkt 10 vc 3 nhd 2 C o n s t - 6 vkt 9 vc 2 nhd 2 8 Rural, Measured, Socio-Demographic 2.0 78 Total Collisions 13 yrs=0.055465TLKM]m e y 0.8916 81 Severe Collisions/3yrs=0.046468TLKMl363 e° 91 85 88 94 100 103 Cons t -4 t lkm 9 unemp 3 nhd 4 Constant - 4 tlkm 7 nhd 4 Lovegrove 101 Table 4.3. Transportation Demand Management CPMs - Total/Severe. Model Group # Model Form K Pearson 2 DoF „, 2 SD X t-Statistics X 0.05, dof Urban, Modeled, TDM Total Collisions/3yrs=\.63052VKT 1.9 462 483 510 513 Const 1.4 vkt 15 Severe Collisions 13 yrs = 0.07016VK7* 931' / 0 5 2 9 5 — 0 , 0.6887 (0.07924.vcvc-0.0000207core + 0.000000912rfr/ve) SCVC 9 COre - 5 drive 2 Const-7 vkt 19 scvc 6 core - 6 1.6 462 454 521 10 ' Urban, Measured, TDM 1.5 460 484 517 rj, , , „ „ . . 
/I-1 T ) 0 « T r ^ J ^0.5762 (0.02702.scc-0.0000277cr,re + 0 .000123to») 7bta/ Collisions 13 yrs = 43 J2S5TLKM Q 1.2 461 433 532 7 0 0 M T r ^ , /0 .72O5 (0.019!8.vcc-0.0000334cOre + 0.0000909(Cm) Severe Collisions 13 yrs = 7.22S3TLKM Q Const 14 sec 7 core - 6 tcm 3 Const 7 tlkm 8 sec 4 core - 6 tcm 2 ffi;il:Mrsi5;MiiiiMM 2.6 77 67 Total Collisions 13 yrs=0.303613KKT0 6 4 2 9 1.9 80 65 c /-> n c i / r T " ™ ' 5 (2.352vc+0.00000437<fr/ve) .Severe Collisions 13 yrs=0.025 5 F A . / g 88 Const - 3 vkt 10 vc 3 ted 2 95 Const - 5vkt 8 vc 2 drive 2 ;I2 Rural, Measured, TDM 1.8 79 86 89 Total Collisions 13yrs=0.162513MM'496g 00049*6"" 1.3 82 92 95 Severe Collisions / 3 yrs=Q.020\0\TLKM] 12 ~ 4 9 6 -Constant = - 3 tlkm 10 crp -2 Constant = - 5 tlkm 8 core - 3 Lovegrove 102 Table 4.4. Network CPMs - Total/Severe. Model Model Form Group # 13 \ "' Urban,•Mpdel^ -;N6!rworic DoF Pearson SD K 2.4 X 464 485 X 0.05, dof t-Statistics T * 1 n IV • II (\ O n £ O A W T ° 7 8 5 1 (2.399.v/g</ + 0.7947intrf-0.02213/3Wp) Total Collisions 13 yrs=0.9Q62oVKl g 505 515 Const-0.2 vkt 19 sigd 4 intd 6 i3wp = - 7 2.0 458 463 Severe C o / / / W 3 > ^ Const-6 vkt 16 514 509 vc 3 sigd 4 intd 3 i3wp - 6 l£L.£ Urban, Measured, Network 1.9 463 511 511 514 rr , 1 ^ IV • II O O n - 2 / l O T T - f i t Y 0.8675 (4.748.v/gc/-0.0204/3« .p+0.007193o/Ap) Total Collisions 13 yrs=29.9342TLKM g 1.5 463 463 526 514 C r- IT • II T T n O ^ T T - ^ ^ l O 4 (4.002.v/gd- 0 .01899/3«77+0 .01587a/ *p) Severe Collisions 13yrs=2.7Q96TLKM Q 15 Rural, Mb'dleTried^ etwpfk 2.7 78 66 Total Collisions/3yrs=0.3WVKT0 65* .£">™™<*») 2.0 Severe Collisions/3yrs=0M559VKT09me"1-9"gd 82 70 TO^;!iMra,» M e a s u r e d , Network 2.8 77 90 Total Collisions 13yrs=0A456TLKM,MO e 2.1 1.316 (193.7 .v/grf + 0.01001ia//>-0.0263/«p) 80 103 89 100 97 104 85 98 94 102 C r* IV • II n A « « T f Y\A\SX (212.v/gd + 0.01496/fl/p- 0.03349/*p) Severe Collisions / 3 yrs=0.05653TLKM g Const 11 tlkm 14 sigd 6 alkp 3 i3 wp = -12 Const 3 tlkm 15 sigd 4 alkp 
5 i3wp = - 10 Const-2 vkt 10 vc 2 sigd 3 Const = - 6 v k t 10 s i g d 4 Const = -1.4 tlkm 10 sigd 5 ialp 2 llkp - 5 Constant = - 4 tlkm 9 sigd 5 ialp 3 llkp - 5 Lovegrove 103 Table 4.5. Other CPMs - A M , A M / P M , Non-Rush, Pedestrian. Model Group Model Form # _ _ 5 Urban, Modeled, Socio-Demographic Pearson K DoF 2 SD X 2.9 441 420 491 AM Collisions/3 yrs = 0.1054 7 5 9 4 e<,-2S4w + 0 - W M 3 w ^ + 0-M7,M^-0-2809ft> 6 JJrbjirijMeasured.i S^ eio^ emographic 2.6 410 400 459 AM CoHis^ons/Ty^=2^86TLKM°S651 e(°*°™W + ™ ™ > « ™ P - a * * -0.0000309, core) 10 Urban, Measured, TDM 2.2 404 373 „ , „ (0.00008182tcm + 0.03446scc -0.00004185core) AM Collisions/ 3 yrs = 1 . 8 6 9 7 X K M 0 7 8 e 14 Urban, Measured, Network 2.4 452 427 Attn iv • 17 _ 0 0.8809 p(4.218sigd-0.0197i3wp + 0.00863alkp) AM Collisions / 3 yrs = 2.496TLKM Q Urban, Modeled, TDM 2.0 458 432 449 502 508 AM/PM Collisions / 3 yrs = 0.7411 VKT 0.6254 p (0.007608tkt + 0.05252scvc - 0.00001582core) X 0.05, dof 491 458 452 503 509 t-Statistics const-4 VKT 16 VC 5 WKGD 3 POPD6 FS-2 Const 2 TLKM 15 POPD 7 UNEMP 9 FS - 4 CORE - 6 Const 2.4 TLKM 9.9 TCM 2 SCC 9 CORE - 7 Const 3 TLKM 15 SIGD6 I3WP- 13 ALKP3.2 Const-0.9 VKT 14 TKT5 SCVC6 CORE - 4 9 Urban, Modeled, TDM 2.2 488 458 488 Non-Rush Collisions/3yrs = 2.041 VKT0™1 e(o008658,k, + 0 0 6 7 4 s c v c"° 0 0 0 0 , 9 c o r e + 8 Rural, Measured, Socio-Demographic 4.3 89 105 Pedestrian Collisions /3yrs = 0.0035154TLKM 0.8714 g (o.i797unemp + 0.4022nhd) 16 Rural, Measured, Network' Pedestrian Collisions /3yrs = 0.002027V:A7V/" 8 4 e 4.1 90 1.55 intkd 100 84 86 496 112 113 Const 2 VKT 12 TKT 6 SCVC 7 CORE-5 DRIVE 1.5 Const- 4 TLKM 3 UNEMP2 NHD 4 Const-5 TLKM 4 INTKD 4 Lovegrove 104 Table 4.6. Integrated* CPMs - A M , AM/PM-Severe, Bicycle, Non-Rush. 
Pearson Model Form K DoF y 2 SD Urban, Modeled, Integrated 4.6 432 428 478 AM Collisions / 3 yrs =0.1068 KA'7o-50387Z^A/°-4153e(1-73,'c + 6-2™gd-0-0™""P +i-isoe-o6rfr™o Rural, Modeled; Network 2.1 93 95 105 AM/PM -Severe Collisions / 3yrs = 0.004503 TLKhf)1M VKT35197 Q (' "J"c + IS4-6^d) Rural, Modeled, Integrated 3e+15 81 82 87 Bicycle Collisions/3yrs = 44.389 e (ft«»"«*+fl«W7"«"»-'-» )^ X t-Statistics 0.05, dof 482 Const -6.8 VKT 11 TKLM6.7 VC8.2 SIGD 10.1 I3WP-10 DRIVE 3.0 Const -6 TLKM 4 VKT4 VC2 SIGD 4 Const 1.1 VKT 2.4 BIKE 1.6 FS-1.5 Rural, Measured, Integrated 3.0 81 97 92 103 Non-Rush Collisions /3yrs = Q3079TLKMiMSQ ( ' * ' »»"*"'»^-«»-''-'^-'"»»'»^ Const -2.2 TLKM 9 UNEMP 3.1 SIGD 6.5 IALP2.4 LLKP-5.2 POP 2.8
*Models incorporating all four variable themes

4.2.1 Modeled & Measured CPMs

Twenty-one measured and twenty-six modeled CPMs were developed, as shown in Table 4.7. Goodness of fit for measured CPMs is only slightly lower than that for modeled CPMs. The values for the measured model shape parameters, K, averaged over 2.0. Their parameter estimate t-statistics averaged over 7.0 in absolute terms. This suggested that it was possible to use TLKM as a measured exposure variable to produce macro-level CPMs, and that their development did not require Emme/2 or other complex transportation modelling resources in all cases.

Table 4.7. Numbers of CPMs by Land Use & Extraction Source.

          Modeled   Measured   Totals
Urban        14        10        24
Rural        12        11        23
Totals       26        21        47

4.2.2 Urban & Rural CPMs

Twenty-four urban CPMs and twenty-three rural CPMs were developed, at least one in each group. While the values of K for rural models were on average higher than those for urban models, their t-statistics were significantly lower. Therefore, it appeared that this higher K value was reflecting the smaller rural sample size (i.e. 93 rural zones versus 479 urban zones). This may also be why the overall goodness of fit for rural models was better than that for urban models.
For example, while all rural model statistics were within acceptable ranges (< χ²), eight urban modeled CPMs and five urban measured CPMs had SDs slightly exceeding their χ² statistic. Fortunately, all of them were within 3% of the target χ² statistic, and all of their other fit criteria were met.

4.2.3 Collision Types

While macro-level CPMs were successfully developed for all sixteen groupings, CPMs were not successfully fit for all nine collision types. In fact, thirty-five of the forty-seven CPMs were for total (eighteen) and severe (seventeen) collision types only. Although the causes of most unsuccessful model fits were related to inadequate collision data, PDO CPMs were deliberately not pursued, in view of the fact that PDO collisions by definition are the mathematical complement of, and therefore derived from, total and severe collisions. Each total/severe model pair had similar variables and parameters, but severe models tended to have slightly lower K values and t-ratios. Whereas smaller sample size could be assumed as the contributing factor when K values are higher and t-ratios are lower (e.g. rural versus urban data), in this case a higher degree of dispersion in the data seemed to be the main cause, regardless of sample size. In any case, the smaller sample size did still play some role regarding other collision types, for which a total of twelve other CPMs were successfully fit, including: AM rush hour collisions (5 urban AM3 CPMs), AM/PM rush hour collisions (1 urban R3 model), AM/PM severe (1 rural RS3 model), Non-Rush (1 urban, 1 rural NR3 model), Pedestrian (2 rural P3 CPMs), and Bicycle (1 rural B3 CPM). However, overdispersion was not the only reason that fewer CPMs of other collision types were successfully fit. For example, CPMs for pedestrian and bicycle collision types were developed only in rural zones, with K values and overall fit among the best of all models.
This suggested that additional stratification and/or other explanatory variables may also have been needed to reveal associations and causal mechanisms in these model groups.

4.3 Statistical Associations

For the statistical associations that were found, the observed effects on collisions were consistent across all collision types and model groups. The models revealed that increased collisions were associated with increases in the following explanatory variables:
o Exposure-related: vehicle kilometres travelled (VKT), total road lane kilometres (TLKM), and average zonal congestion (VC);
o S-D-related: job density (WKGD), population density (POPD), unemployment (UNEMP), residential unit density (NHD);
o TDM-related: shortcut capacity and attractiveness (SCC, SCVC), number of drivers (DRIVE), total commuters (TCM), total commuter density (TCD); and,
o Network-related: signal density (SIGD), intersection density per unit area (INTD), intersection density per lane-km (INTKD), arterial-local intersections (IALP), total arterial road lane kilometres (ALKP).

The associations of increasing vehicle-kms, lane-kms, congestion, and number of drivers appeared to confirm intuitive expectations - the more travel, drivers, and/or road-kilometres, the higher the probability (and incidence) of collisions. Shortcut attractiveness (SCVC) was simply shortcut capacity (SCC) multiplied by congestion level (VC), discussed under TDM. The association of increasing signal density (SIGD) with increasing collisions suggested that more signals are not necessarily safer. The association of increasing collisions with increasing intersection density agreed with some previous results (Hadayeghi et al., 2003), but differed from others (Ladron de Guevara et al., 2004), which is likely due to differing variable definitions. This association may suggest that closer attention is needed when planning neo-Traditional communities with redundant street (i.e. grid road) patterns that may foster higher intersection densities. Moreover, these associations appeared to confirm earlier findings by Poppe (1997c) in the Dutch SRS regarding neighbourhood road pattern influences on road safety.

While the above associations varied directly with collisions, several inverse associations were also discovered. Several models revealed that decreased collisions were associated with increases in the following explanatory variables: family size (FS), core size and percentage (CORE, CRP), 3-way intersections (I3WP), and local road lane-kilometres (LLKP). These inverse associations appeared to support earlier research findings, and are discussed more fully below.

4.3.1 Exposure Models (groups 1, 2, 3, 4)

Eleven 'exposure' models were developed, with all variables associated directly with increased collisions. While intuitive and direct associations were found with all exposure variables in the final models, average zonal operating speed (SPD) based on Emme/2 output was found to be significant but inversely associated with collision frequency. That is, increased operating speeds were associated with reduced collision frequency. Intuitively, one might expect that increased operating speeds would be associated with increased collision frequency and severity, so there may have been several factors at work. First, there may have been some correlation with another unidentified variable causing this counter-intuitive result. Second, it may also have been a symptom of the need for improved stratification. Third, and most likely, this speed data came from Emme/2 modelling results, which are not measured and have been shown to have significant errors on individual links. In any case, operating speed was dropped in final model refinement. As previously noted, a similar inverse association was found by Hadayeghi et al. (2003), who used Emme/2 model output (VKT, VC) for all the CPMs.
But instead of using operating speeds from Emme/2 output, Hadayeghi et al. (2003) used posted speeds from Emme/2 input databanks. This speed variable and its unidentified co-variant warrant revisiting, especially as to their association with collision severity.

4.3.2 Socio-Demographic Models (groups 5, 6, 7, 8)

Eleven S-D models were developed. The models revealed that increased collisions were associated with increases in the following S-D variables: job density (WKGD), population density (POPD), unemployment (UNEMP), residential unit density (NHD). Vehicle ownership and part-time employment data were not available. Income data (INC), which can usually be used as a proxy for vehicle ownership, was available but dropped after initial model results showed low t-ratios. Although total job figures were not tested, these results suggested that earlier research recommendations on using only certain employment sectors were at least reasonable. For reasons discussed in Chapter Three (section 3.2.4.3), job density (WKGD) did not include all zonal jobs, just those jobs in the retail, government, construction, or tourism sectors. While the association of increasing collisions with increasing job density (WKGD), population density (POPD), and residential density (NHD) seemed intuitive, the association with increasing unemployment (UNEMP) was difficult to explain. Intuitively, fewer residents with jobs on average should lead to fewer commuters, less exposure, and fewer collisions. Several other studies have tried and failed to confirm this theory by finding a statistically significant association between road safety and unemployment levels (La Scala et al., 2002; Kmet et al., 2003). The results in this research suggested that the relationship between road safety and unemployment may be an inverse association (i.e. higher unemployment was associated with more collisions).
It may have been related to school trips made by stay-at-home parents driving young children to/from school, and/or students making secondary/post-secondary school trips, both made by employable but unemployed people, and both occurring in peak periods. However, further research would be needed to verify the cause of this association, perhaps considering additional socioeconomic factors.

While most S-D variables had a direct association with collision frequency, there was one inverse association. Decreased collisions were associated with increased family size (FS). This association supported earlier results by Ladron de Guevara et al. (2004), who suggested that parents were more responsible drivers. However, it may have also been capturing the influence of the way zonal size was determined (Openshaw, 1984). Zone size has usually been determined for transportation planning model purposes (e.g. Emme/2) to closely match Census tracts, and to keep zonal population within some average range (e.g. 1,000 in Tucson; 3,000 in Vancouver; 5,000 in Toronto). In this case, zones with a larger average family size were likely comprised of a higher number of children and a lower number of adults as compared with other zones. Consequently, in neighbourhoods with larger average family size, one should have seen a lower number of collisions due to fewer commuters (i.e. more children meant fewer adults; fewer adults meant fewer commuters), and/or more responsible drivers (i.e. more parents). Therefore, this family size association has been viewed with caution, as it may actually have reflected an aggregation bias related to zone size.

4.3.3 Transportation Demand Management Models (groups 9, 10, 11, 12)

Seven models using Transportation Demand Management variables were successfully developed.
The models revealed that increased collisions were associated with increases in the following explanatory variables: shortcut capacity (SCC), shortcut attractiveness (SCVC), number of drivers (DRIVE), total commuters (TCM), total commuter density (TCD). Shortcut capacity and shortcut attractiveness variables appeared to have successfully captured the access structure influences that Poppe et al. (1997) proposed. While the shortcut associations were successfully captured, it appeared that TRANSIT, BIKE, and WALK mode split associations had not been. However, some intuitive reasoning suggested that of all mode split variables, only DRIVE needed to be in the final model equations. This reflected the fact that mode split variables (DRIVE, PASSENGER, TRANSIT, BIKE, WALK) were correlated, with any one a statistical reflection and mathematical complement of the rest. DRIVE happened to be the variable with the strongest association (highest t-ratio), because of predominant auto use. The one mitigating TDM variable associated with decreasing collisions was increasing core neighbourhood size (CORE) relative to zonal area. This result confirmed earlier findings by van Minnen (1999). However, further research would be needed to confirm whether the optimum (i.e. maximum) core size also suggested by van Minnen (1999) existed in Canada. Again, as noted earlier, this effect would have to be researched in conjunction with consideration of whether it was related to community development patterns (or efficiency) and MAUPs.

4.3.4 Network Models (groups 13, 14, 15, 16)

Eleven network models were developed. The models revealed that increased collisions were associated with increases in the following explanatory variables: signal density (SIGD), intersection density per unit area (INTD), intersection density per lane-km (INTKD), arterial-local intersections (IALP) in rural areas only, and total arterial road lane kilometres (ALKP).
These results were for the most part intuitive. The association between arterial-local road intersections and collisions in rural areas was likely related to several issues. First, it may have been related to the usual nature of rural intersections, which are often screened by foliage, rolling hills, or other rural features. Second, it could also have been related to the less frequent, unexpected incidence of intersections between low-volume/high-speed rural grid roads and rural residential local roads or farm driveways. One result that differed from earlier research was the association of intersection density (INTD), wherein Hadayeghi et al. (2003) found that urban severe collision data was inversely associated with intersection density.

Although INTD and the other variables were directly associated with all collision frequency types, there were two network variables with inverse associations in both total and severe CPMs. First, decreased collisions were associated with increases in the proportion of 3-way intersections (I3WP). This 3-way intersection result was similar to other research findings (Curtis et al., 2001), and facilitated analyses on street network patterns (e.g. grid versus discontinuous). Second, decreased collisions were associated with an increased proportion of local road lane-kilometres (LLKP). The percentage split between zonal lane-kilometres of each road class - arterial (ALKP), collector (CLKP), and local (LLKP) - had a significant association with collisions. This result was intuitive when considering that running speeds (and volumes) were generally lower on local roads than on arterials. Thus, as the kilometres of local roads increased in proportion to that of arterial roads, the results suggested that fewer and less severe collisions were predicted to occur. As ALKP and LLKP were highly correlated, it was important to note that only one of the two appeared in any given network CPM.
The GVRD regional average is a 32% / 30% / 38% split between arterial, collector, and local road lane-kilometres, respectively. Arterial lane-kilometres (ALKP) was found to be more strongly associated with urban collision prediction, whereas local lane-kilometres (LLKP) was the predominant predictor in rural models. This may have been an indication of the relatively low number of local roads in rural areas, where farms usually take access directly off rural arterial and collector grid roads.

4.3.5 Integrated Models

Four integrated models were also developed to gauge the influence of stratification on model fit and statistical associations. They predicted urban AM rush hour (AM3), rural non-rush hour (NR3), and rural bicycle (B3) collisions. All had relatively high K estimates, and met goodness of fit criteria. The better fit relative to the other groups suggested that more of the data randomness was being explained, which may also suggest a linkage to underlying causal mechanisms. However, this interpretation should be used with caution and requires further research to be sure (Davis, 2004).

One new variable association was revealed in the integrated models, that of increased bicycle collisions with increased bicycle mode split. While this was an intuitive result, it was unfortunate that an association between bicycle use and total collisions was not found (i.e. relating bicycle use to the sum of bicycle, pedestrian, and vehicle collisions). The question of whether increased bicycle use increases or reduces road safety in general has been a subject of much interest. The bicycle CPM may point to an answer, and warrants further research to that end. The bicycle model was also different from the recommended CPM form in that it did not include a lead exposure variable (i.e. neither VKT nor TLKM). This was a decision made given the particular characteristics of bicycle use.
In rural areas, bicyclists often go where roads do not exist, for example on off-road trails using farm fields between homes. Therefore, bicycle use was not as strongly influenced by the presence of roads and/or traffic. Bicycle use has also not been conventionally included in Emme/2 VKT projections. Hence, the modified CPM form was found to better fit the bicycle collision data, and provided another clue to development of empirical safety tools. Pedestrian collision models, on the other hand, did include exposure variables as a leading model coefficient. However, only rural, measured models could be successfully fit. This rural collision association with pedestrians may be indicative of the typical rural walking environment, where most pedestrians must walk along the road shoulder due to ditches and lack of sidewalks, in a much more vulnerable position relative to vehicular traffic.

4.4 Possible Causal Mechanisms

From the observed associations, a number of possible causal mechanisms were analyzed.

4.4.1 Exposure Models (groups 1, 2, 3, 4)

The Exposure models confirmed earlier research regarding the dominant influence of exposure on collision predictions of all types (except bicycles) (Hauer et al., 1996). Given the predominant influence and non-linear relationship with collision frequency that exposure variables have traditionally had, the power parameter values for the lead variables (i.e. VKT, TLKM) were noteworthy. Tanner (1953) found that collisions vary according to the square root of volumes, whereas Dutch SRS researchers (Poppe, 1997a, 1997b, 1997c; Poppe et al., 1997) used collision rates, which imply a linear relationship. Averages and ranges for the exposure variable power (i.e. a1 in Equation 3.3) for the models developed in this research have been presented in Table 4.8, and indicate that, for the most part, power values for VKT were in the range of 0.5 to 1.0, averaging around 0.75.
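To make the meaning of these power values concrete, the short sketch below compares how predicted collisions scale when the lead exposure variable doubles, under a square-root relationship (power 0.5, as in Tanner), the approximate VKT average reported here (0.75), and a linear rate-based relationship (1.0). It assumes the multiplicative CPM form seen in Tables 4.1 to 4.6, in which the lead exposure variable is raised to a power, and it assumes all other terms are held fixed purely for arithmetic illustration. The function name `collision_ratio` is hypothetical.

```python
def collision_ratio(exposure_ratio: float, power: float) -> float:
    """Ratio of predicted collisions after scaling the lead exposure
    variable by exposure_ratio, with every other model term unchanged
    (an arithmetic illustration only, not a sensitivity analysis)."""
    return exposure_ratio ** power


# Doubling exposure under three candidate power values.
for power in (0.5, 0.75, 1.0):
    print(f"power {power}: collisions x {collision_ratio(2.0, power):.3f}")
# power 0.5: collisions x 1.414
# power 0.75: collisions x 1.682
# power 1.0: collisions x 2.000
```

With a power near 0.75, doubling VKT corresponds to roughly a 68% increase in predicted collisions, which lies between the square-root and linear cases.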
Power values for TLKM averaged slightly higher with a bit more dispersion, suggesting that more refinement through additional research would be appropriate.

Table 4.8. Averages (Ranges) of the Leading Exposure Variable Powers*.

Variable  Type    Urban                  Rural                  Overall
VKT       Total   0.733 (0.685 - 0.819)  0.657 (0.634 - 0.704)  0.695 (0.634 - 0.819)
VKT       Severe  0.854 (0.727 - 0.931)  0.883 (0.792 - 0.987)  0.869 (0.727 - 0.987)
VKT       AM      0.632 (0.504 - 0.759)  n/a                    0.632 (0.504 - 0.759)
VKT       Other   0.603 (0.58 - 0.625)   n/a                    0.603 (0.58 - 0.625)
TLKM      Total   0.674 (0.432 - 0.868)  1.076 (0.155 - 1.496)  0.875 (0.155 - 1.496)
TLKM      Severe  0.871 (0.721 - 1.04)   1.497 (1.363 - 1.72)   1.229 (0.721 - 1.721)
TLKM      AM      0.76 (0.415 - 0.966)   n/a                    0.76 (0.415 - 0.966)
TLKM      Other   n/a                    0.972 (0.784 - 1.184)  0.972 (0.784 - 1.184)

*The CPM parameter a1 in Equation 3.3

In addition to the dominant influence that the lead exposure variables (VKT and TLKM) had on the collision models, the results also suggested a significant influence by the congestion (VC) variable. Take for example the evaluation of an urban neighbourhood where a new or widened road was being built. Traditional CPMs may predict an increase in collision frequency due to increases in forecast VKT. However, the newly developed CPMs included VC, which could moderate earlier VKT-only predictions. In at least the short term after the new road was opened, average zonal congestion (VC) would drop and partially offset this predicted collision increase. Therefore, macro-level CPMs appear able to improve the empirical accuracy and reliability of safety evaluations by including a measure of congestion in urban models. It may be advisable to identify a similar corresponding congestion indicator to increase the predictive accuracy in measured models as well.

4.4.2 Socio-Demographic Models (groups 5, 6, 7, 8)

The S-D models introduced several new variable associations. In almost every model, explanatory variables were the same for each collision type.
This commonality of variables among models suggested a fairly strong causal mechanism. However, it also suggested complex relationships, precluding the adjustment of one variable while holding the others constant. This confirmed earlier suspicions by de Leur & Sayed (2001) against using individual CPM variables for sensitivity analyses. For example, one of the associations suggested that increasing the proportion of families with higher numbers of family members (FS) is associated with decreasing collisions in urban areas. The FS parameter estimate suggested that family size played a most dominant S-D role, and may even overshadow population density and job density effects in influencing zonal road safety. However, adjustment of this CPM variable could not be done without some effect on population density and/or home density. Their inter-relationship is typically described as POP = NH × FS. Moreover, planners often link population levels to zonal job levels via WKGD = JOBS/POP. These then are some of the fundamental building blocks to designing a community. Given their interdependencies, they should not be considered in isolation from each other.

4.4.3 Transportation Demand Management Models (groups 9, 10, 11, 12)

The TDM models provided several clues on neighbourhood road safety planning with respect to mode splits, residential core size, and road network. Again, it appeared based on commonality across models that underlying mechanisms have been revealed, but that no one variable could be taken in isolation. This was less so the case with rural models, where mode split (e.g. DRIVE) and core size (CORE) variables appeared in separate measured and modeled CPMs. However, the presence of the CORE variable in nearly all models suggested that it may be a key neighbourhood safety planning variable.
In urban neighbourhoods, which are often surrounded by high traffic areas, guarding against shortcutting in concert with as large a residential core as possible seemed to be a reasonable safety enhancement strategy. This could be done using the SRS discontinuous street network recommended by Poppe (1997c). Reducing auto use (DRIVE) and total commuters (TCM) through TDM or auto-alternatives also seemed to help (e.g. tele-commuting, transit, or walking). As already noted, however, this should only be done in conjunction with a re-evaluation using other CPMs, to ensure that it does not negatively impact other safety variables and predicted results.

4.4.4 Network Models (groups 13, 14, 15, 16)

The Network models also contained explanatory variables that were key neighbourhood planning variables, particularly related to intersection configurations and road class. Signal density was common to all models, suggesting a predominant influence next to the exposure (VKT, VC, TLKM) variables in the models. In urban models, the key to improved neighbourhood road safety may be to prevent intrusion of and conflicts with higher-speed through traffic. One way this might be done is by keeping through traffic on high capacity perimeter arterials, and by minimizing signals through use of three-way intersections that limit conflicts and restrict access points into the neighbourhood. In rural areas, safer solutions pointed to fewer signals. The rural associations also spoke to the need to avoid developing residential enclaves in predominantly rural areas, and to better separate urban and rural land uses and travel purposes.

4.5 Summary

The results showed that it was possible to quantify, on a macro-level TAZ scale, a statistically predictive association between traffic safety and neighbourhood characteristics pertaining to traffic exposure, road network, socio-demographic, and transportation demand management.
Forty-seven macro-level collision prediction models were presented in Tables 4.1 to 4.6, including at least one in each of the sixteen groupings. Models have been developed which predict level of safety in both urban and rural areas. Thus, practitioners in all communities, whether predominantly urban or rural, can conduct neighbourhood safety evaluations. Fifteen of the CPMs relied on only measured data, allowing practitioners without access to major (e.g. Emme/2) model resources to develop and use the models. Each CPM predicted the three-year total of collisions for one of eight types of collisions - Total, Severe, AM rush hour, AM/PM rush hours, AM/PM-Severe, Non-Rush Hour, Pedestrian, and Bicycle - using one or more explanatory variables. Thus, the models showed a potential to address areas of particular safety concern in a particular neighbourhood. However, it is critical to review the definitions and statistical properties of variables, which can significantly impact results and use in other research.

Increased collisions were associated with most explanatory variables; however, a decreased number of collisions was associated with an increase in: family size (FS), core neighbourhood area (CORE), 3-way intersections (I3WP), and local road lane kilometres (LLKP). These statistical associations were analyzed to reveal possible underlying causal mechanisms, which taken together, provided clues to several possible neighbourhood road safety strategies, as follows:

o Increased proportion of 3-way intersections (I3WP) was associated with decreased collisions in urban areas. This result supported earlier research on the relative traffic safety of 3-way versus 4-way intersections, and on discontinuous versus grid internal neighbourhood road networks.

o Decreased proportion of arterial-local intersections (IALP) was associated with increased collisions in rural areas.
Further research would be required to verify whether this association may also apply in urban areas, where busier arterial streets during rush hours leave fewer and smaller gaps for stop-controlled local-street traffic to merge/cross.

o Increased size of the core residential area (CORE) was associated with decreased collisions in both urban and rural areas. This result appeared to confirm earlier research by van Minnen (1999) on the neighbourhood core, an intuitive and now quantified result.

o Increased proportion of local road lane kilometres (LLKP) was associated with decreased collisions in rural areas. This result likely stemmed from the low volume nature of rural roads leading to higher operating speeds. It may not apply to urban areas, where traffic calming is used to reduce speeds, and where traffic volumes are typically higher.

o Increased proportion of families with higher numbers of family members (i.e. family size, FS) was associated with decreased collisions in urban areas. This strategy needed to consider effects on population density, and job/resident ratio.

While these results appeared relatively promising, research on macro-level CPMs has not yet progressed past an early stage of development. As such, these possible safety strategies have been listed with caution until verified by more studies on the underlying causal mechanisms, refined variables, and model-use guidelines. Several safety application case studies have been conducted to begin confirming the practical value and ease-of-use of these models, including macro-reactive use on Black Spots and CMFs, and proactive use on regional and neighbourhood planning and time-space transferability. The results of these case studies have been presented in Chapters Five and Six.

5.
GUIDELINES FOR MODEL USE

5.1 Introduction

While forty-seven macro-level CPMs were presented and analyzed in Chapter Four, their potential to be effective, reliable new tools for use by planners and engineers was not demonstrated. Moreover, nowhere in the review of literature were guidelines or case studies found regarding how to use macro-level CPMs in either reactive or proactive road safety applications. Therefore, the next step in this research was to develop guidelines for using the macro-level CPMs developed in this research.

Consequently, a set of recommended guidelines was developed in two stages. First, as the macro-level CPMs were developed, an initial set of draft guidelines was prepared based on knowledge of variable data ranges, empirical needs, and model properties. Second, as the CPMs were tested in several reactive and proactive safety applications, these initial guidelines were refined into the recommended guidelines proposed in this thesis, based on lessons learned from these case studies. When referring to safety applications in this thesis, the term macro-reactive has been used to categorize macro-level CPM use in reactive road safety applications.

To discuss these guidelines, this chapter has been split into three main parts. In section 5.2, guidelines for selecting the appropriate models are recommended, including a six-step checklist. In section 5.3, guidelines for macro-reactive applications are presented, including methods for black spot programs and CMF estimation. In section 5.4, guidelines for proactive applications are presented, including regional planning, neighbourhood planning, and model transferability.

5.2 Selecting the Appropriate Model(s)

While over forty macro-level CPMs have been developed to select from in this research, it is important to use only the appropriate number of models needed for each safety application, for several reasons.
First, to be practical, the level of effort, amount of time, and computational resources required to use these complex models must be minimized. Second, each model used requires available and quality datasets extracted in a timely manner. Third, while using all models in an analysis may be possible, there may not necessarily be significant gain in the accuracy of the results. Fourth, depending on the task, some models may not be applicable. Fifth, keeping the number of models used, and the associated level of effort, to a minimum encourages more practitioners to apply these CPMs, advancing the standardized consideration of road safety, especially in the emerging safety planning process. Therefore, careful consideration is needed in order to select only the most appropriate models for use while still providing reasonably accurate and reliable results. As an aid to choosing the appropriate model for each safety application, a six-step selection checklist has been recommended, as shown in Table 5.1.

Step A. Application Scope

To begin, the CPM type (i.e. micro- or macro-level) needs to be determined, by confirming whether the scope of the safety application is a single site (e.g. intersection or segment), a particular neighbourhood (i.e. traffic zone), a municipality, or a region (e.g. GVRD). For a single site evaluation only, the use of micro-level CPMs is required, and methods have been previously reviewed. However, for applications involving a neighbourhood, a community, or parts of a region, the use of macro-level CPMs is required, and is the focus of these guidelines.

Step B. Application Task

The second step is to determine whether the application is reactive (e.g. black spots), or proactive (e.g. land use plan, road plan, TDM plan, etc.), to gauge whether some or all of the sixteen macro-level CPM groups are candidates for use. For reactive applications, all sixteen model groups are recommended as candidates for use.
However, for proactive applications, several model groups are shown as optional, depending on several factors, including: predominant land use, relevant (or trigger) variables, choice of planning time horizon, and available data sources, as discussed below.

Step C. Land Use

In the third step, the predominant type of land use in each neighbourhood under evaluation is determined, to further narrow the choice of model group candidates. For example, if the application involves only urban TAZs, then the urban model groups 1, 2, 5, 6, 9, 10, 13, and 14 would be the recommended candidates for use. If both rural and urban zones were involved, then all of the model groups screened in previous steps would continue to be recommended as candidates to the next step.

Table 5.1. Checklist for Selecting the Appropriate CPMs.

A. Scope: Confirm the physical scope of the safety application (choose one)
   Criteria: Single Site; Neighbourhood (Nbd.); Municipality (Mun.); Region (Reg.)
B. Task: Confirm whether the specific task(s) to conduct the safety application are mainly reactive or proactive (either one)
   Criteria: Reactive (Black Spots, Road Design); Pro-active (Nbd Road Design, Nbd LU, Mun. OCP, Reg. LU, Reg. Road, Reg. TDM)
C. Land Use: Confirm the predominant land use type(s) in which the safety application will take place (choose all that apply)
   Criteria: Urban; Rural
D. Relevant (Trigger) Variables: Confirm the main variable theme(s) involved in or impacted by the safety application (choose all that apply)
   Criteria: Exposure; Socio-Demographic; TDM; Network
E. Collision Type: Confirm what collision types will be used in the safety application (choose all that apply)
   Criteria: Total Collisions; Severe Collisions; AM Collisions; AM/PM Collisions; Non-Rush Collisions; Pedestrian Collisions
F. Data: Confirm the quality and source(s) of data available to conduct the application (choose all that apply)
   Criteria: Modelled; Measured

Step D.
Relevant (Trigger) Variables

In the fourth step, the specifics of the application are confirmed to identify which specific variables are relevant to the analysis (i.e. trigger variables). Table 5.2 contains recommended model groups in which to search for trigger variables, depending on planning topic and comparison type. For example, if the safety evaluation involved whether or not to traffic calm an urban neighbourhood in the future (i.e. a future vs present, roads combination) as opposed to assessing the relative safety benefits of alternative future traffic calming schemes (i.e. a future vs future, roads combination), then the three candidate model groups - exposure, TDM, and network - would be recommended. Moreover, the specifics of the traffic calming scheme would dictate the choice of model group(s). In the case where speed humps and other in-road traffic calming devices were proposed as the main treatment (as opposed to road network changes), the shortcut capacity variables (i.e. SCC and SCVC) would be identified as the trigger variables, which translates to use of the urban TDM model groups (9, 10) in which the shortcut variables are contained.

Table 5.2. Recommended Model Groups based on Planning Topic and Comparison Time Horizon

                     Planning Topic(s)
Comparison Type      Land Use Only       Roads Only           TDM Only
Future vs. Present   S-D, TDM, Network   Exp., Network, TDM   TDM, S-D
Future vs. Future    S-D                 Network, Exposure    TDM

Step E. Collision Type

In the fifth step, specific models need to be identified within each remaining candidate model group, by determining which of the six collision types (i.e. Total, Severe, AM, AM/PM, Non-Rush, and/or Pedestrian) are of interest in the safety evaluation. While there are many different models to choose from, not all collision types may be of interest.
For example, if prior knowledge suggests that pedestrian collisions should be checked, then perhaps only the pedestrian collision type CPM would be warranted. Unless the specifics of the application suggest otherwise, for most evaluations, the total and severe collision type CPMs can be used, because they generally are reflective of the data and the most commonly used in safety analyses.

Step F. Data Source

Having identified trigger variables within specific model groups, the last step is to check that datasets can be assembled to use these models. Adequate data ensures that the selected models will provide accurate, reliable and credible results. In assessing the availability of data, there are two considerations. First, the time and effort to assemble datasets for the TAZ(s) of interest must be in balance with the level of accuracy and significance of the decision required. In other words, while high quality data is always desirable, and directly increases the credibility of modelling results and study recommendations, it usually requires a costly amount of time and effort to extract that is not always affordable or needed. Moreover, the type of exposure variable used (i.e. modeled using VKT, or measured using TLKM) can significantly influence CPM estimates, depending on whether the focus of the analysis is at the regional or the neighbourhood level. For example, for regional-level analyses, it has been observed that modeled exposure variables are strongly influenced by changes in regional traffic patterns and are more appropriate for use than measured exposure variables. Fortunately, for long term regional planning and policy reviews, data extraction budgets are usually significant enough to provide modeled data (e.g. Emme/2). However, for neighbourhood-level planning analyses, zonal traffic volumes do not usually change between scenarios under analysis, suggesting that exposure (i.e.
TLKM), and hence collisions, are more sensitive to local changes in street network, population, and land use. Thus, measured exposure variables are usually most appropriate for neighbourhood-level evaluations. Table 5.3 sets out the recommended exposure data type based on the planning scope and time horizon.

Table 5.3. Recommended Exposure Data Type depending on Time Horizon and Scope.*

Desired Planning Horizon      Region-wide          Municipal-wide          Neighbourhood-wide
                              (LU, Roads, TDM)     (Official Comm. Plan)   (LU & Roads)
                              Modeled  Measured    Modeled  Measured       Modeled  Measured
Short Term (< 5 Years)        R        ?           ?        R              NR       R
Medium Term (5 to 15 Years)   R        NR          R        ?              NR       R
Long Term (> 15 Years)        R        NR          R        ?              ?        R

*R = Recommended; NR = Not Recommended; ? = Optional, only if resources/time permit, and data relevant

The second consideration, which often has a direct bearing on the availability and costs of dataset assembly, and consequently influences the final model selection, concerns the choice of desired time periods used in the analysis. For reactive applications, using a minimum of two to three years of collision data was found in the literature to be accepted practice. Therefore, the recommended time horizon is the most recent for which an integrated dataset covering this minimum two or three year collision period can be assembled from all sources: collision records, census surveys, road networks, land uses, and exposure data. For proactive applications, Table 5.4 sets out recommended data sources and forecasting techniques for each model group, categorized by planning horizon. Again, these recommendations are based on ensuring adequate data quality in balance with the time and cost of dataset assembly.

Table 5.4.
Data Extraction Effort for Planning Analyses, based on Theme & Time Horizon

Exposure (Measured): TLKM
   Short Term (< 5 Years): Aggregate manually if small area, or use GIS software
   Medium Term (5 to 15 Years): Aggregate manually if small area, or use GIS software
   Long Term (> 15 Years): Aggregate manually if small area, or use GIS software
Exposure (Modelled): VKT, VC
   Short Term (< 5 Years): Only available through Emme/2 type models
   Medium Term (5 to 15 Years): Only available through Emme/2 type models
   Long Term (> 15 Years): Only available through Emme/2 type models
Socio-Demographic (Measured): POPD, WKGD
   Short Term (< 5 Years): Extrapolate from most recent Census, using historical trends
   Medium Term (5 to 15 Years): Trend or OCP
   Long Term (> 15 Years): Use OCP data from each municipality
TDM (Measured): CORE, SCC
   Short Term (< 5 Years): Extract manually on a zone by zone basis at all levels
   Medium Term (5 to 15 Years): Extract manually on a zone by zone basis at all levels
   Long Term (> 15 Years): Extract manually on a zone by zone basis at all levels
Network (Measured): INTD, SIGD
   Short Term (< 5 Years): Aggregate manually if small area, or use GIS software
   Medium Term (5 to 15 Years): Aggregate manually if small area, or use GIS software
   Long Term (> 15 Years): Aggregate manually if small area, or use GIS software

As a decision aid for each of these six steps, based on the considerations contained in Tables 5.1 to 5.4, Table 5.5 contains a summary of recommended candidate model groups to be considered at each of the six steps in the model selection process. Highlighted on Table 5.5 are the steps taken for the urban traffic calming example discussed above. Following these model selection guidelines, recommended guidelines have also been proposed on how to use models in reactive and proactive safety applications.

Table 5.5. Candidate CPM Groups. (R = Recommended; NR = Not Recommended; ? = Optional; NA = Not Available)*

[The cell entries of Table 5.5 are not recoverable from the extracted text. Columns: micro-level CPMs, followed by macro-level model groups 1 to 16 by theme: Exposure (1-4), Socio-Demographic (5-8), TDM (9-12), Network (13-16). Rows, from START (Eligible Model Groups) to END (Candidate Model Groups): A. Scope (Single Site, Neighbourhood, Municipality, Region); B. Task (Reactive: Black Spots, Rd Design; Pro-active: Nbd Rd Design, Nbd LU, Mun. OCP, Reg. LU, Reg. Road, Reg. TDM); C. Land Use (Urban, Rural); D. Trigger Variables (Exposure, Socio-Demographic, TDM, Network); E. Collision Type (Total, Severe, AM, AM/PM, Non-Rush, Pedestrian Collisions); F. Data (Measured, Modelled).]

*How to use table: Enter table at START, and work down through successive Steps A through F to END. Only the model groups chosen at each step are to continue to be carried forward as candidates for further evaluation at the next step. An example is shown (see arrows) for a safety evaluation of whether or not to implement traffic calming in an urban neighbourhood. In this example, final selection involves modelled (9, 10) and measured (9) macro-level CPMs.
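Steps C and D of the checklist amount to intersecting sets of model groups. The following Python fragment is a minimal sketch of that screening, not part of the thesis method. It assumes the theme-to-group numbering shown in the Table 5.5 column headings and the urban groups listed under Step C; treating the rural groups as the remaining eight, and the function and variable names, are assumptions made for illustration only.

```python
# Theme of each macro-level CPM group, as in the Table 5.5 column headings.
THEME_GROUPS = {
    "exposure": {1, 2, 3, 4},
    "socio-demographic": {5, 6, 7, 8},
    "tdm": {9, 10, 11, 12},
    "network": {13, 14, 15, 16},
}

# Urban groups are listed under Step C.
URBAN_GROUPS = {1, 2, 5, 6, 9, 10, 13, 14}
# Assumption: the rural groups are the remaining eight of the sixteen.
RURAL_GROUPS = set(range(1, 17)) - URBAN_GROUPS


def candidate_groups(land_uses, themes):
    """Return the sorted model groups that survive Steps C and D.

    land_uses: any of "urban", "rural"; themes: keys of THEME_GROUPS.
    """
    by_land_use = set()
    if "urban" in land_uses:
        by_land_use |= URBAN_GROUPS
    if "rural" in land_uses:
        by_land_use |= RURAL_GROUPS

    by_theme = set()
    for theme in themes:
        by_theme |= THEME_GROUPS[theme]

    # Only groups chosen at every step are carried forward.
    return sorted(by_land_use & by_theme)


# Urban traffic calming example with shortcut (TDM) trigger variables.
print(candidate_groups({"urban"}, {"tdm"}))  # [9, 10]
```

Steps E and F would narrow the result further by collision type and data source; they are omitted here because the group-level entries for those steps are held in Table 5.5.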
5.3 Guidelines for Macro-Reactive Use

Having selected the appropriate model(s) based on the specifics of the safety analysis, the logical starting point for development of macro-level model-use guidelines in the absence of other documented guidelines has been to review the well-documented use of micro-level CPMs. Techniques for the use of micro-level CPMs have been reviewed in section 2.3, and detailed in Sayed, 1998; Sayed & Rodriguez, 1999; Sayed & de Leur, 2001a; Hauer et al., 2002a; and Sawalha & Sayed, 2001. Following from these micro-reactive methods, guidelines for macro-reactive use have been proposed below, with guidelines for proactive use proposed in section 5.4. Although macro-reactive guidelines generally follow conventional reactive methods, there are some differences as highlighted in Figure 5.1.

Black Spot Programs using micro-level CPMs (Conventional Reactive Method; Sections 2.3.1.4, 2.3.1.5, 2.3.2, 2.3.3):
1. Single model
2. Single location (one intersection or road segment)
3. Simple decision (1 - ∫ > δ = CPZ)
4. Single ranking (PCR + CRR)
5. Single indicator (over-represented collision patterns)
6. Facility design change
7. Strict problem-to-remedy matching
8. Well documented CMFs for BCA to rank / choose remedy

Black Spot Programs using macro-level CPMs (Macro-Reactive Method; Sections 5.3.1.1, 5.3.1.2, 5.3.1.3):
1. Multiple models (Table 5.5)
2. Single zone (many locations)
3. Majority decision
4. 'Scored' ranking (PCR + CRR, +/- 5%)
5. Two indicators (collision patterns, plus trigger variables)
6. Zone-wide strategies
7. Thematic generation (Exp, S-D, TDM, Net)
8. No macro-CMFs

Figure 5.1. Conventional & Macro-Reactive Black Spot Methods

5.3.1 Black Spot Programs

As in the conventional micro-reactive approach, macro-reactive use involves black spot programs. However, the proposed guidelines differ from the conventional methods in two ways.
First, only the aggregate zonal collision frequency is needed for macro-reactive black spot analyses, as opposed to a collision history for each intersection and road segment for micro-reactive analyses. Second, the unit of analysis is an individual TAZ, not just an individual intersection or road segment within that TAZ. Based on these two differences, several adjustments have been made to the reactive methods, beginning in the identification phase.

5.3.1.1 Identification & Ranking

To identify and rank black spots with macro-level CPMs, four adjustments are needed to the conventional method. First, the observed local collision history is based on a zonal aggregate, which provides the first clue of safety (the observation clue). Second, zonal E(Λ) and Var[E(Λ)] are calculated for each selected CPM, providing the second clue of safety (the location-specific prior clue). Using these two clues, the zonal empirical Bayes (EB) safety estimate and its variance are calculated for each selected CPM according to Equations 2.17 and 2.18, repeated for convenience. However, instead of just one EB safety estimate being calculated for each zone, there are multiple estimates, one for each of the CPMs selected to evaluate that zone.

$$EB_i = E(\Lambda \mid Y = count) = \left[\frac{E(\Lambda_i)}{\kappa + E(\Lambda_i)}\right](\kappa + count) \tag{2.17}$$

$$Var(EB_i) = Var(\Lambda \mid Y = count) = \left[\frac{E(\Lambda_i)}{\kappa + E(\Lambda_i)}\right]^2 (\kappa + count) \tag{2.18}$$

As in conventional reactive methods, zonal E(Λ) can also be used as the reference group norm for comparison with the EB safety estimate to evaluate whether or not the zone is collision prone. This calculation is repeated for each selected model, using Equation 2.22, which leads to the third modification from conventional reactive methods. Third, as each zone has multiple CPZ evaluations, some additional interpretation of results may be required in the event that not all models agree. Therefore, as a general guide, when most models identify a zone as collision prone, it should be considered as a collision prone zone (CPZ).

$$1 - \int_0^{E(\Lambda)} f_{EB}(\lambda)\,d\lambda = 1 - \int_0^{E(\Lambda)} \frac{[\kappa/E(\Lambda) + 1]^{(\kappa + count)}\,\lambda^{(\kappa + count - 1)}\,e^{-(\kappa/E(\Lambda) + 1)\lambda}}{\Gamma(\kappa + count)}\,d\lambda > \delta \tag{2.22}$$

Fourth, as the calculated zonal E(Λ) and EB safety estimates usually differ for each selected model, the resulting differences in rankings for each zone must be reconciled using a modified ranking approach. In the conventional ranking approach, each Collision Prone Location (CPL) is usually evaluated and ranked using only one CPM, with the top-ranked CPL considered the most hazardous for diagnosis and treatment. However, in the macro-reactive method, multiple CPMs are usually used, with the result that the same zone may not necessarily be top-ranked across all selected models. Therefore, an additional step is required to confirm the most hazardous zones for diagnosis, by summing zonal rankings across all selected models, to derive a total ranking score for each zone. In effect, zonal scoring identifies which zones are most frequently ranked at or near the top. With this slight modification, the same twin-ranking criteria, Potential Collision Reduction (PCR) and Collision Risk Ratio (CRR), described in Sawalha & Sayed (1999), can be used to rank and identify CPZs, according to Equations 2.23 and 2.24, repeated for convenience as:

$$PCR = EB - E(\Lambda) \tag{2.23}$$

$$CRR = \frac{EB}{E(\Lambda)} \tag{2.24}$$

The zone(s) with the highest ranking score are recommended as those in most need of attention.

5.3.1.2 Diagnosis

Once a CPZ is confirmed as top-ranked, it can be diagnosed to identify the safety problem using a method similar to the conventional approach in all but one aspect. As in the conventional approach, the diagnosis begins by first looking for over-represented collision patterns. However, in the macro-reactive method, a second indicator can be used in the form of the trigger variables from each of the model themes that identify the zone as collision prone (i.e. exposure, S-D, TDM, or network). The value of each trigger variable is compared with regional averages, to confirm which of the variables are triggering the collision prone identification (i.e. which values are significantly different from regional averages). For example, regional averages for each of these variables for the GVRD are given in Table 5.6. This second indicator beyond that used in the conventional approach, together with additional information obtained from collision patterns and site visits, can be used to provide evidence identifying the overall road safety problem in the subject CPZ. Identifying the zonal safety problem using this dual-indicator technique then allows a search to begin to identify suitable countermeasures.

Table 5.6. Regional Averages.

Patterns                          Urban Zones   Rural Zones
Collision Types
   Severe (fatal & injury)        24.6%         28.0%
   AM                             9.5%          12.7%
Collision Locations
   Arterial                       46%           46%
   Collectors                     26%           26%
   Locals                         28%           28%
Densities
   Intersections/ha               0.46          0.047
   Signals/ha                     0.034         0.00151
   Congestion (VC)                0.32          0.16
Proportions
   3-way intersections            61%, 52% (column order unclear in the extracted text)
   Arterial-Local intersections   17%           14%
   Drivers                        68%           83%
Road Classes
   Arterial-Lane-Kms              32%           33%
   Collector-Lane-Kms             30%           31%
   Local-Lane-Kms                 38%           36%
Peak Collision Period             PM Rush       AM/PM Rush

5.3.1.3 Remedy

In order to match a zonal road safety problem to a suitable zone-wide safety remedy, three adjustments are needed to the conventional process. First, a detailed evaluation of each neighbourhood, including hundreds of road segments and intersections in their supporting transportation systems, is both impractical and not possible with macro-level CPMs. Therefore, instead of a detailed evaluation, a strategic level of safety analysis on a zone-wide basis is conducted. Second, the four main themes (i.e. exposure, S-D, TDM, network) are used to categorize countermeasure strategies.
It is recommended that at least one possible remedy be considered under each theme, as well as an integrated strategy that uses parts from some or all of these themes. In this way, safer, more sustainable strategies such as those proposed in Section 4.5 can be considered. Third, Collision Modification Factors (CMFs) are needed to calculate B/C ratios for the economic and ranking analyses of potential zone-wide remedies. However, macro-level CMF estimates of remedies are not yet well researched. Until additional studies can be done, it is recommended that macro-level CMFs be estimated based on engineering judgement of what can be reasonably expected, in view of benchmarks given in the literature related to micro-level CMFs. Guidelines for obtaining macro-level CMF estimates are proposed below.

5.3.2 Collision Modification Factor Estimation

To address identified research gaps, a macro-level CMF (macro-CMF) estimation guideline has been proposed. It is based on the micro-level CMF estimation method described by Sayed and de Leur (2001) and reviewed in Section 2.3.3, but with minor adjustments related to differences in the zonal unit of analysis. Nonetheless, the macro-CMF calculation uses the same Equation 2.26.a. In discussing this equation, it is important to note that the term collisions should be considered as synonymous with the term claims, because de Leur & Sayed (2001) have shown that claims data can provide a reasonable estimate of collision data for use in CPMs. For use with macro-level CPMs, two minor adjustments are recommended in the choice of comparison groups, as noted below.
$$OR = \frac{A/C}{B/D} \tag{2.26.a}$$

where:
A = the number of collisions (or claims) in the comparison group that occurred in the before period (change in choice of comparison group);
B = $EB_{after}$ = the EB safety estimate of the number of collisions that would have occurred in the after period in the subject zone had no treatment taken place (no change);
C = the number of collisions in the comparison group that occurred in the after period (change in choice of comparison group); and,
D = the number of collisions observed in the after period in the subject zone after treatment took place (no change).

While comparison groups A and C are directly observable quantities, the practitioner may only need to use zonal collision totals in the comparison group rather than those of intersections or road segments. Regardless of what unit of collision data is used, it is important that the comparison group data be drawn from a randomly selected sample of locations (i.e. roads, intersections, or zones) in the same time-space region as the subject zone, regardless of collision history. This is important because the role of the comparison group is to represent an unbiased indicator of the time trend from the before period to the after period (Sayed & de Leur, 2001a). Regarding the derivation of B, it continues to be calculated using $EB_{before}$ factored by CPM predictions according to Equation 2.27, to account for the changes in traffic exposure (i.e. VKT, TLKM) over time between the before and after periods.

$$B = EB_{after} = EB_{before} \times \frac{E(\Lambda_{after})}{E(\Lambda_{before})} \tag{2.27}$$

where:
$EB_{after}$ = what the EB safety estimate would have been in the after period had no treatment of the zone taken place (no change);
$EB_{before}$ = the EB safety estimate of the zone in the before period (no change);
$E(\Lambda_{after})$ = the collision frequency given by the macro-level CPM(s) for the zone using values from the after period for its exposure variables (i.e.
VKT, TLKM), and values from the before period for all other variables (no change); and,
$E(\Lambda_{before})$ = the collision frequency given by the CPM for the zone using values from the before period for all variables (no change).

Whereas calculation of $E(\Lambda_{before})$ involves using all before period values for variables, when calculating $E(\Lambda_{after})$ only the exposure variable value in the macro-level CPM is changed from its before period value to its after period value. For example, if we are conducting a traffic calming CMF estimate, then the estimates $E(\Lambda_{before})$ and $E(\Lambda_{after})$ would both be predicted using the before value of SCC. Moreover, all other non-exposure model variables would also remain at their before period values to derive both $E(\Lambda_{before})$ and $E(\Lambda_{after})$. Only the exposure variable value would be adjusted to after period values to derive $E(\Lambda_{after})$.

Where multiple data points are involved (e.g. several observed collision reductions due to traffic calming), one overall OR and CMF estimate is usually calculated for each collision type, using the method summarized in Chapter Two and described in Sayed & de Leur (2001a). In that method, the individual values of A, B, C, and D are each summed and then combined in Equations 2.28 and 2.29, repeated for convenience below (Sayed & de Leur, 2001a).

$$E(OR) = \frac{(A/C)/(B/D)}{1 + \dfrac{Var(B)}{B^2} + \dfrac{Var(C)}{C^2}} \tag{2.28}$$

$$Var(OR) = \left(\frac{A/C}{B/D}\right)^2 \left[\frac{Var(A)}{A^2} + \frac{Var(B)}{B^2} + \frac{Var(C)}{C^2} + \frac{Var(D)}{D^2}\right] \tag{2.29}$$

Using these adjustments to the conventional approach, macro-CMF estimation can be completed. These proposed guidelines for black spot programs and CMF estimation complete the recommended macro-reactive model-use guidelines. A summary of the differences between the proposed guidelines and the conventional reactive approach using micro-level CPMs is shown in Figure 5.1. Tests of these proposed macro-reactive guidelines in two road safety applications are presented in Chapter Six.
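The odds-ratio calculation of Section 5.3.2 can be illustrated with a minimal Python sketch of Equations 2.26.a, 2.27, and 2.28. The function names and all numeric inputs are hypothetical and serve only to show the order of operations; the variances of B and C are taken as given inputs, and no CMF is derived because that step is described in Chapter Two.

```python
def eb_after(eb_before, cpm_before, cpm_after):
    """Equation 2.27: factor the before-period EB estimate by the ratio of
    CPM predictions (after-period exposure vs. before-period exposure)."""
    return eb_before * cpm_after / cpm_before


def odds_ratio(a, b, c, d):
    """Equation 2.26.a: OR = (A/C) / (B/D)."""
    return (a / c) / (b / d)


def expected_odds_ratio(a, b, c, d, var_b, var_c):
    """Equation 2.28: odds ratio divided by 1 + Var(B)/B^2 + Var(C)/C^2."""
    return odds_ratio(a, b, c, d) / (1.0 + var_b / b**2 + var_c / c**2)


# Hypothetical inputs for one treated zone and its comparison group.
A = 1000.0   # comparison group collisions, before period
C = 950.0    # comparison group collisions, after period
D = 33.0     # observed collisions in the treated zone, after period
B = eb_after(eb_before=40.0, cpm_before=50.0, cpm_after=55.0)

OR = odds_ratio(A, B, C, D)
# Var(B) and Var(C) below are placeholder values, not results from the thesis.
E_OR = expected_odds_ratio(A, B, C, D, var_b=44.0, var_c=950.0)
```

An odds ratio below 1.0 in such a calculation would indicate fewer after-period collisions in the treated zone than the comparison-group trend and the exposure-adjusted EB estimate would suggest.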
Based on the results of these macro-reactive tests, guidelines for proactive use of macro-level CPMs have been proposed in Section 5.4.

5.4 Guidelines for Proactive Use

In the previous section, macro-reactive model use guidelines were proposed based on modifying conventional or micro-reactive methods. Based on the results of testing those macro-reactive guidelines, proactive model use guidelines have been proposed, including those for regional planning, neighbourhood planning, and model transferability.

5.4.1 Guidelines for Regional-Level Safety Planning

For regional safety planning applications, guidelines for using the selected CPMs have been proposed below, including recommendations on 1) choosing zones of influence for the analysis, 2) assembling and processing the data, and, after running the models, 3) interpreting the results.

5.4.1.1 Zones of Influence

The first step in a regional safety planning evaluation is to define the geographic boundaries within which to focus the study. Fortunately, in regional planning processes a regional transportation model (e.g. Emme/2) is usually employed to analyze the changes in exposure (e.g. VKT, VC) across the region, and can be used to examine changes on a zone-by-zone basis. Given that exposure heavily influences CPM estimates, a quick check of the proper scope of influence can be done by reviewing the forecast exposure changes on road links adjacent to the planning project zone(s). It is generally observed in using regional transportation planning models such as Emme/2 that the greater the impact of a project, the more widely the resulting road volume changes propagate throughout the system. Those zones showing significant changes are deemed to be zones of influence, and are recommended for inclusion in the analysis. Those zones where only minor changes are noted are usually not included.
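The screening step described above can be sketched in Python as follows. This is an illustration only: the text does not define what counts as a significant change, so the 5% relative threshold, the function name, and the zone identifiers are all assumptions.

```python
def zones_of_influence(vkt_base, vkt_project, threshold=0.05):
    """Return zone ids whose relative VKT change between the base and project
    scenarios is at least `threshold` (an assumed cut-off, not from the text)."""
    flagged = []
    for zone, base in vkt_base.items():
        if base <= 0:
            continue  # relative change is undefined for zones with no base exposure
        change = (vkt_project[zone] - base) / base
        if abs(change) >= threshold:
            flagged.append(zone)
    return sorted(flagged)


# Hypothetical zonal VKT totals from two transportation model runs.
base = {"101": 1000.0, "102": 2000.0, "103": 500.0}
with_project = {"101": 1100.0, "102": 2010.0, "103": 450.0}
selected = zones_of_influence(base, with_project)
```

In practice the threshold would be chosen by the practitioner, and the same check could be repeated for VC.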
To facilitate the zone of influence identification process, thematic map plots of exposure should be considered. For example, Figures 5.2.a and 5.2.b (Lim, 2004) illustrate the patterns of automobile and transit vehicular volume impact propagation, respectively, due to a planned light rapid transit (LRT) project in the northeast sector of the GVRD, using Emme/2 software. These sample Emme/2 plots showing volume changes on individual road links reveal that the impact of the LRT project extends farther across the transportation system than merely the traffic zones adjacent to and containing the LRT line (thick line).

b. Transit volume changes due to LRT.
Figure 5.2. Impact Propagation (highways excluded)

5.4.1.2 Data Extraction

After selecting the models and confirming the zones of influence, the next step in using the models for regional safety planning consists of data assembly and processing. Figure 5.3 (Lim, 2004) details the relationships between the source data and the required modeled CPM group input data, showing that most of the input data is typically derived from one or more source datasets and agencies, as described in Chapter Three.

Figure 5.3. Modeled macro-level CPM Input Variables & Data Sources (Lim, 2004)

For extraction of modeled exposure data (VKT, VC), the process begins with computation of the forecast volumes for each link in the regional transportation model's digital road network for the given planning scenario. Once computed, each road link, together with its link traits and the computed volume data, must be associated with all zones that it crosses.
This must be done because links in the digital road network are only conceptually represented as running between a pair of origin and destination nodes that are often not in the same zone, and that may or may not be located in the zone of influence. Because many road links only cross through zones (i.e. without origin or destination nodes in those zones), this zonal link aggregation process cannot be done by simply identifying which zones each road link's nodes are located in. Instead, a specialized GIS routine is used to split or cookie-cut the digital road network by the TAZ system. Figures 5.4 a & b are snapshots of a typical Emme/2 TAZ system superimposed over a road network (a) before and (b) after the GIS cookie-cutter process (Lim, 2004). Once the links are associated with zones (b), their associated data can be extracted and queried to filter out road network links that have posted speeds in excess of 60 km/h (i.e. highway links), before finally aggregating VKT and VC for each zone.

a. Before GIS, as Emme/2 road network.
b. After GIS, associated with zones.
Figure 5.4. Sample of Emme/2 Digital Road Links & Traffic Analysis Zones.

Following extraction and aggregation of the modeled exposure data, it is integrated with zonal measured data. For this data integration to translate into accurate results, it is important that values in all applicable datasets covering exposure, S-D, TDM, and road network be current to the planning horizon of interest. With all data assembled and integrated into a dataset compatible for CPM use, collisions for each TAZ can be predicted and results interpreted.

5.4.1.3 Interpretation of Results

After running the CPMs, the last step is to correctly interpret the results, which involves two considerations related to highway exclusions and prediction variances.
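The filtering and zonal aggregation described in Section 5.4.1.2 can be sketched as below. The sketch assumes the GIS cookie-cut has already produced one record per link piece per zone; the record fields and function name are hypothetical, VKT is taken as volume multiplied by piece length, and VC is left out because its zonal definition is given in Chapter Three.

```python
from collections import defaultdict


def aggregate_zonal_vkt(link_pieces, max_speed_kmh=60):
    """Sum VKT by zone over link pieces, skipping pieces whose posted speed
    exceeds `max_speed_kmh` (treated as highway links, as in the text)."""
    totals = defaultdict(float)
    for piece in link_pieces:
        if piece["speed_kmh"] > max_speed_kmh:
            continue  # highway link: excluded from the macro-level CPM inputs
        totals[piece["zone"]] += piece["volume"] * piece["length_km"]
    return dict(totals)


# Hypothetical output of the cookie-cut step: one record per link piece per zone.
pieces = [
    {"zone": 1, "volume": 1000.0, "length_km": 0.5, "speed_kmh": 50},
    {"zone": 1, "volume": 200.0, "length_km": 1.0, "speed_kmh": 50},
    {"zone": 2, "volume": 5000.0, "length_km": 2.0, "speed_kmh": 90},
    {"zone": 2, "volume": 300.0, "length_km": 0.5, "speed_kmh": 60},
]
zonal_vkt = aggregate_zonal_vkt(pieces)
```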
5.4.1.3.1 Highway Exclusion

As the effects of limited access highways have been excluded from the macro-level CPM estimates, the results are not in an absolute sense a true regional collision estimate. As originally intended, the models offer collision predictions for each TAZ that indicate only which of the evaluated scenarios is less safe or more safe. While this exclusion meets most neighbourhood-level planning needs, it requires careful interpretation on regional planning projects. Take for example the impacts on exposure of the LRT project previously presented in Figure 5.2.a, which showed that road impacts are the heaviest in the vicinity of the major transit line. However, Figure 5.5 (Lim, 2004) below shows a plot of the same scenario, but with all highways included. This example reveals that impacts due to excluded highways can be significant and far reaching, and presumably that same impact would also be reflected in their associated collisions. Therefore, because limited access highway collisions have been specifically excluded, CPM predictions should be viewed only as a proxy of the complete regional safety impacts, representing a general indication of safety impacts for planning regional projects. While one would expect that the non-highway macro-level CPM results would still generally reflect reality, if one wished to have an estimate of the true regional safety impacts in absolute terms, additional collision estimates for regional highways would have to be added into the analyses.

5.4.1.3.2 Prediction Variance

The second key consideration in interpreting macro-level CPM results is that the predictions are only expected values. They are estimates of the true value which, as a random variable, usually varies over a wide range, as indicated by the estimate's variance in Equation 2.15, repeated from Chapter Two for convenience.
$$Var[E(\Lambda)] = \frac{[E(\Lambda)]^2}{\kappa} \tag{2.15}$$

The magnitude of that variance can be interpreted in two ways, depending on whether the comparison involves only future scenarios, or present and future scenarios. In comparisons involving only future scenarios, usually only a relative comparison of the future scenarios is desired. To do a relative comparison of two future scenarios, zone by zone collision predictions in each scenario are typically summed across all zones of influence to provide an overall collision estimate for each scenario. The difference in the two sums is then assessed using a hypothesis test to verify whether it is significantly different from zero. This hypothesis test relies on a normally distributed test statistic, T, evaluated at a desired level of confidence, usually 90% or 95%. The difference of sums is significant if the calculated test statistic meets the criterion in Equation 5.1 (Benjamin & Cornell, 1970; Dobson, 1990).

$$\frac{E(\Lambda_1) - E(\Lambda_2)}{\sqrt{Var[E(\Lambda)]} \, / \sqrt{n}} > T_{\alpha/2} \tag{5.1}$$

where:
n = the number of zones of influence used in deriving the sums;
$E(\Lambda_1)$, $E(\Lambda_2)$ = the sums of zonal predictions for the scenarios being compared;
$Var[E(\Lambda)]$ = the greater of $Var[E(\Lambda_1)]$ or $Var[E(\Lambda_2)]$;
$\alpha$ = the desired level of confidence, typically 95%; and,
T = the T-statistic derived from standard T-distribution tables = 1.96 @ 95% for large n.

In the second type of regional planning analysis, one comparing future and present scenarios (e.g. to identify future capacity constraints on an existing network), a different method is usually used to interpret the results. Although the difference of sums technique described above may also be desired, absolute collision estimates for cost-benefit analyses are often needed in this comparison type. While part of these estimates can be derived by simply summing the macro-level model estimates, additional effort may be required to derive highway collision estimates using micro-level models.
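The difference-of-sums check in Equation 5.1 can be sketched as follows. The zonal predictions and variances are hypothetical, the function names are illustrative, and taking the absolute value of the statistic (so that the order of the two scenarios does not matter) is an assumption of this sketch rather than part of the equation.

```python
import math


def difference_of_sums_statistic(pred_1, pred_2, var_1, var_2):
    """Left-hand side of Equation 5.1 for two scenarios evaluated over the
    same n zones of influence, using the greater of the two variances."""
    n = len(pred_1)
    difference = sum(pred_1) - sum(pred_2)
    variance = max(var_1, var_2)
    return difference / (math.sqrt(variance) / math.sqrt(n))


def is_significant(statistic, t_critical=1.96):
    """Compare against the critical value (1.96 at 95% for large n).
    The absolute value is an assumption made for a two-sided comparison."""
    return abs(statistic) > t_critical


# Hypothetical zonal predictions for two future scenarios over four zones.
scenario_1 = [10.0, 12.0, 14.0, 14.0]
scenario_2 = [9.0, 11.0, 13.0, 12.0]
stat = difference_of_sums_statistic(scenario_1, scenario_2, var_1=16.0, var_2=25.0)
significant = is_significant(stat)
```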
Regardless of the type of comparison involved, regional collision prediction sums are typically large, with correspondingly large variances. While statistically valid, these large variances make interpretation of the CPM results difficult. Therefore, a visual qualitative interpretation technique is proposed, involving the use of GIS tools to produce graduated thematic maps of CPM results. These maps allow for a quick visual indicator of variability for each model across the zones of influence. However, in generating these maps, careful selection of the number of ranges, or bins, and their associated values is needed for unbiased analysis. For example, having too few ranges may mask differences among the collision predictions, especially if the predictions vary widely between low and high values. Likewise, using too many ranges may make the maps too difficult to read and interpret.

Figure 5.6 (Lim, 2004) summarizes how the datasets can be integrated and used in a regional planning CPM application: using regional transportation planning models (e.g. Emme/2) to provide modeled exposure data; using GIS software (e.g. MapInfo, ArcGIS) to extract modeled and measured non-highway data, and to zonally aggregate that data; using spreadsheet software (e.g. FoxPro, Excel) to integrate the data for input to the CPMs; using the CPMs to produce collision estimates for each scenario; and finally, using GIS and spreadsheet software to tabulate results for interpretation by the practitioner.

[Figure 5.6 flowchart. Data Assembly and Preparation: EMME/2 3-Year Plan databank, 2001 Census, and Digital Road Atlas, processed with MapInfo, ArcGIS, Excel, and FoxPro. Application of CPMs and Reporting: graphical output (thematic maps of CPM predictions by traffic zone) and tabular output (CPM predictions by traffic zone).]
Figure 5.6. Example of Data Assembly and CPM Application.
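The effect of the number of ranges on a thematic map, discussed in Section 5.4.1.3.2, can be explored before mapping with a small sketch such as the one below. Equal-width ranges are assumed here purely for illustration (GIS packages offer other classification schemes), and the predictions are hypothetical.

```python
def equal_width_bin_counts(values, n_bins):
    """Count how many values fall in each of `n_bins` equal-width ranges
    spanning the minimum to the maximum of `values`."""
    low, high = min(values), max(values)
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for value in values:
        if width > 0:
            # The maximum value is placed in the last range.
            index = min(int((value - low) / width), n_bins - 1)
        else:
            index = 0  # all values identical
        counts[index] += 1
    return counts


# Hypothetical zonal collision predictions with a cluster of low and high values.
predictions = [2.0, 3.0, 4.0, 5.0, 40.0, 45.0, 50.0]
two_ranges = equal_width_bin_counts(predictions, 2)
four_ranges = equal_width_bin_counts(predictions, 4)
```

Comparing the counts for different numbers of ranges shows whether zones with quite different predictions end up sharing one map colour, or whether many ranges are left empty.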
5.4.2 Guidelines for Neighbourhood-Level Safety Planning

In neighbourhood-level planning (e.g. OCP, land use, roads), evaluations typically deal with a much smaller geographic scope than regional analyses. As this smaller scope usually includes an individual zone of influence or a smaller number of zones of influence, the recommended guidelines require a separate discussion on zones of influence, data extraction, and interpretation of results.

5.4.2.1 Zones of Influence

Given the smaller geographic scope of neighbourhood planning, there is much less of a computational need to identify zones of influence. In most cases, the zones of influence will be either the zone in which the planning study is conducted, or, for larger municipal projects, the group of zones contiguous to it. However, in large and/or significant municipal planning projects where resources permit, assessment of which zones exhibit significant change and therefore are zones of influence could be verified using the thematic mapping described in the regional safety planning guidelines (e.g. plots of VC or VKT changes).

5.4.2.2 Data Extraction

The smaller geographic scope of neighbourhoods also usually translates to a smaller budget being approved for data extraction, relying on more use of directly available, less costly measured data. Regardless, it is recommended that data be extracted from the most current census survey, land use, TDM, and road network information available, with extrapolations as necessary for short or long-term planning. For short term planning (< 5 year horizon), population and employment growth are usually extrapolated using predominant historical trends and/or economic cycles. Measured network and exposure data are usually based on approved short-term transportation plans (e.g. 3-Year Plans). For longer term planning, there are usually approved municipal growth plans or Official Community Plans (OCPs) that set out approved (i.e.
expected) future land uses, roads, population, and employment projections, broken down by neighbourhood, in five or ten year increments. After extracting and aggregating the data to run the models, interpretation of the results of the neighbourhood analysis is the next step.

5.4.2.3 Interpretation of Results

As individual neighbourhoods are usually physically isolated from, and therefore not significantly influenced by, bisecting highways, their macro-level CPM predictions will represent a relatively complete evaluation of safety levels on their roads. As such, all comparisons can use the collision predictions from any CPM in whatever formulation (i.e. sums, differences, percentages) provides the desired clarity for practitioners and decision-makers.

5.4.3 Transferability Guidelines

There may be occasions where previously developed macro-level CPMs need to be calibrated, either due to the passage of time (i.e. updating with more recent collision data being available), or in order to use them in other regions (i.e. spatial transfers). Space-time transferability is most commonly performed when models developed using data from one geographic area and time period are transferred for use in a significantly different time and/or place. The decision on what constitutes a significant difference is left up to each individual practitioner. In the case of transferability between two significantly distinct regions, a four-step transfer calibration process is recommended, based on that described in Sawalha & Sayed (2005b) for micro-level CPMs, with minor modifications.

5.4.3.1 Data Definitions

The first step is to ensure that data used in the transferred CPM calibration process generally conforms to the specifications of the original dataset used to develop the models, which requires careful attention to variable definitions and units. While definitions for most variables are fairly self-evident, the valuation of several variables (e.g.
urban, rural, core size, and shortcut capacity) relies heavily on engineering judgment. Therefore, careful consideration of Chapter Three documentation on variable definitions and valuation is recommended.

5.4.3.2 Sufficient Data Points

To meet goodness of fit criteria, the transfer calibration needs to be based on a sufficient number of data points, depending on both data availability and data quality, expected roughly as 15 to 20 TAZs. Moreover, it is important that the predominant land uses in the zones used in the calibration be clearly categorized as either urban or rural.

5.4.3.3 GLM Process

Once the data is extracted and aggregated zonally with proper variable definitions, the GLM software needs to be re-run, following essentially the same method described in Chapter Three for the original model development, but with one change. Instead of fitting the entire model, transfer calibration only involves obtaining new values for the overdispersion parameter, $\kappa$, and for the lead coefficient, $a_0$. For example, using GLIM4 software (NAG, 1994), this would be done using the OFFSET command.

5.4.3.4 Goodness of Fit

While the GLM transfer calibration process is fairly straightforward, the calibrated models need to be tested to ascertain that statistically reasonable fits have been achieved. The recommended goodness of fit criteria are described in Sawalha & Sayed (2003), and given below. While the SD and Pearson $\chi^2$ test statistics can be used, and should not exceed the target $\chi^2$, there is a third test statistic, the z criterion, which is also recommended. Using this z criterion, the model is considered to be successfully calibrated if, in addition to meeting the SD and Pearson $\chi^2$ statistical measures, it has the statistical property shown in Equation 2.44.
$$z = \frac{\chi^2 - E(\chi^2)}{\sigma(\chi^2)} < 1.00 \tag{2.44}$$

where:

$$\text{Pearson } \chi^2 = \sum_{i=1}^{N} \frac{[y_i - E(\Lambda_i)]^2}{Var(y_i)} = \sum_{i=1}^{N} \frac{[y_i - E(\Lambda_i)]^2}{E(\Lambda_i)\,[1 + E(\Lambda_i)/\kappa]} \tag{2.33}$$

$$E(\chi^2) = N; \quad \sigma(\chi^2) = \sqrt{2N(1 + 3/\kappa) + \sum_{i=1}^{N} \frac{1}{E(\Lambda_i)\,[1 + E(\Lambda_i)/\kappa]}} \tag{2.45.a, b}$$

N = the number of data points used to calibrate the transferred model;
$E(\Lambda_i)$ and $\kappa$ are taken from each calibrated CPM; and,
$y_i$ and $Var(y_i)$ are derived for each individual observation in the new data set.

Four case studies have been conducted in Chapter Seven to test these proposed proactive model-use guidelines.

5.5 Summary

Traditionally, the use of collision prediction models in reactive road safety improvement programs has focused mainly on a single facility (i.e. micro-level CPMs). Guidelines for their use in black spot programs have been well researched, with a set recipe of countermeasures and estimated collision modification factors. However, no guidelines were found for the use of macro-level CPMs in safety planning applications. Therefore, following from micro-level CPM methods, guidelines for the macro-reactive use of macro-level CPMs have been proposed in Section 5.3. Based on the results of macro-reactive guideline tests, presented in Chapter Six, recommended guidelines for the proactive use of macro-level CPMs were proposed in Section 5.4, including regional planning, neighbourhood planning, and space-time model transferability. Several case studies testing the proposed proactive model-use guidelines are presented in Chapter Seven.

6. MACRO-REACTIVE SAFETY APPLICATIONS

6.1 Introduction

Guidelines for macro-level CPM use in reactive and proactive road safety applications were proposed in Chapter Five. In this chapter the results of testing those guidelines in two macro-reactive safety applications are discussed. To present the macro-reactive case studies, this chapter has been split into two main parts.
In Section 6.2, the first case study is discussed, including the study approach, results, and lessons learned in applying the guidelines and models to a black spot analysis of GVRD neighbourhoods. In Section 6.3, the second case study is discussed, including the study approach, results, and lessons learned in estimating traffic calming CMFs for three urban GVRD neighbourhoods.

6.2 Black Spot Program Case Study

This first case study was intended to follow the proposed guidelines and conduct a black spot study to identify, diagnose, and recommend remedies for CPZs in the GVRD. Details of the approach to the case study are described below, followed by discussion of results and lessons learned.

6.2.1 Approach

Using the checklist in Table 5.1, and selecting candidate models using Table 5.5, the models were screened as follows:
A. Scope = Regional, therefore macro-level CPM use was warranted;
B. Task = Black Spots, therefore all sixteen model groups were candidates;
C. Land Use = Both rural and urban neighbourhoods existed across the GVRD;
D. Trigger Variables = All variables were potentially trigger variables, therefore all groups;
E. Models = In order to permit detailed CPZ diagnosis, and identify any need for guideline refinements, all thirty-five Total, Severe, and AM CPMs across all sixteen groups shown in Tables 4.1 through 4.5 were selected; and,
F. Data Source = Modeled and measured data existed for the 1996 - 1998 time period; therefore, all thirty-five CPMs could be run.

Having selected the appropriate models, evaluation of the collision potential and ranking of each zone began.

6.2.1.1 Identification & Ranking

To start, models were used to estimate the location specific expected collisions for each zone, including $E(\Lambda)$ and $Var[E(\Lambda)]$.
This clue was used together with the observed three-year zonal count to produce an EB safety estimate with each model, resulting in sixteen (for rural zones) or nineteen (for urban zones) EB safety estimates for each zone. $E(\Lambda)$ was also used as the reference group norm for comparison to identify Collision Prone Zones (CPZs). Using the modified ranking technique, the zones most frequently ranked at or near the top were ranked for diagnosis and remedy.

6.2.1.2 Diagnosis & Remedy

In-office and on-site analyses involved identifying the safety problem using the two indicator technique recommended in the guidelines. As the first indicator, the mean frequencies for each collision type (e.g. Severe), and for each location type (e.g. % collisions on collectors) were compared with the regional means contained in Table 5.6. As the second indicator, trigger variables were identified by comparing values for each CPM variable with their corresponding regional averages (also listed in Table 5.6), to confirm which of these variables was in fact triggering the collision prone ranking. On-site analysis was used to verify in-office findings, and to ensure that other safety factors were not missed (e.g. adjacent land uses, speeding, high truck volumes, existing traffic calming levels). After identifying the zonal safety problems using these two indicators, CPZs were carried forward for identification and analysis of possible remedies. The safety problem-countermeasure search process provided at least one possible remedy strategy for each of the identified trigger variable themes (e.g. exposure, S-D, TDM, network). Using assumed CMFs derived from engineering judgment to generate CPM estimates, each potential remedy was evaluated based on which zonal safety strategy showed the greatest potential zonal collision reduction.
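The identification step of Section 6.2.1.1 can be sketched in Python under two stated assumptions. First, the EB safety estimate is computed with the standard negative binomial weighting, which is consistent with the variance form of Equation 2.15 but is not reproduced from Chapter Two. Second, zones are simply flagged when the EB estimate exceeds the CPM prediction and ordered by that excess, which is a simplified stand-in for the modified ranking technique of the guidelines. All zone data are hypothetical.

```python
def eb_estimate(predicted, kappa, observed):
    """EB safety estimate as a weighted average of the CPM prediction and the
    observed count (standard negative binomial weighting, assumed here)."""
    weight = kappa / (kappa + predicted)
    return weight * predicted + (1.0 - weight) * observed


def rank_collision_prone(zones):
    """Return ids of zones whose EB estimate exceeds the CPM prediction,
    ordered from largest to smallest excess (a simplified ranking)."""
    flagged = []
    for zone_id, (predicted, kappa, observed) in zones.items():
        eb = eb_estimate(predicted, kappa, observed)
        if eb > predicted:
            flagged.append((eb - predicted, zone_id))
    flagged.sort(reverse=True)
    return [zone_id for _, zone_id in flagged]


# Hypothetical zones: (CPM prediction, overdispersion parameter, observed count).
zones = {
    "A": (10.0, 5.0, 25.0),
    "B": (8.0, 4.0, 6.0),
    "C": (12.0, 6.0, 30.0),
}
ranking = rank_collision_prone(zones)
```

Repeating this for each model and tallying how often a zone appears near the top would approximate the cross-model comparison described above.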
6.2.2 Results

6.2.2.1 Identification & Ranking

Following this approach, each of the macro-level CPMs identified at least one CPZ, with the sixteen top-ranked CPZs listed in Tables 6.1 (urban) and 6.2 (rural). A listing of the top 5% of all top-ranked zones across all models is given in Appendix C. Also identified were Safer Zones (SZs), those zones which were considered to be non-CPZ, and which were ranked lowest in terms of collision potential. Figures 6.1 and 6.2 show the geographic locations of the top-ranked CPZs and the bottom-ranked SZs, respectively.

Two observations were made regarding ranking consistency. First, it was hoped that the model groups would provide relatively similar CPZ and SZ identification and rankings. However, the 35 different models identified sixteen different top-ranking CPZs (ten in urban and six in rural zones), and eighteen different lowest-ranking SZs (ten in urban and eight in rural zones). Although this was more variability than expected, the smaller number of ranking differences among rural models was likely due to the smaller number of rural zones in the sample, meeting intuitive expectations. Second, although the ranking of each zone differed across models and model groups, the variance in rankings between models was generally less than 5% (e.g. within 24 urban zones, and within 5 rural zones). Following identification and ranking, a sample of four zones identified as CPZ was carried forward for diagnosis to identify the safety problems.

Table 6.1. Urban CPZ & SZ Identification

Model Group    Exposure  Exposure  Socio-D   Socio-D   TDM       TDM       Network   Network
               Modelled  Measured  Modelled  Measured  Modelled  Measured  Modelled  Measured
CPZ  Total     CPZ1      CPZ2      CPZ3      CPZ3      CPZ1      CPZ3      CPZ1      CPZ3
     Severe    CPZ4      n/a       CPZ5      CPZ6      CPZ7      CPZ3      CPZ8      CPZ9
     AM        n/a       n/a       CPZ10     CPZ6      n/a       CPZ6      n/a       CPZ9
SZ   Total     SZ1       SZ1       SZ1       SZ1       SZ2       SZ3       SZ4       SZ3
     Severe    SZ2       n/a       SZ1       SZ5       SZ6       SZ7       SZ1       SZ8
     AM        n/a       n/a       SZ9       SZ10      n/a       SZ3       n/a       SZ3

Table 6.2. Rural CPZ & SZ Identification

Model Group    Exposure  Exposure  Socio-D   Socio-D   TDM       TDM       Network   Network
               Modelled  Measured  Modelled  Measured  Modelled  Measured  Modelled  Measured
CPZ  Total     CPZ11     CPZ12     CPZ13     CPZ14     CPZ13     CPZ14     CPZ13     CPZ14
     Severe    CPZ15     CPZ16     CPZ16     CPZ12     CPZ16     CPZ12     CPZ16     CPZ16
SZ   Total     SZ11      SZ12      SZ13      SZ14      SZ15      SZ14      SZ14      SZ14
     Severe    SZ13      SZ16      SZ13      SZ16      SZ13      SZ16      SZ17      SZ18

Figure 6.1. Collision Prone Zones.
Figure 6.2. Safer Zones.

6.2.2.2 Diagnosis & Remedy

The diagnosis of the four sample CPZs, together with one SZ, is described below, including recommended remedies for each CPZ.

6.2.2.2.1 Collision Prone Zones

The first of the four CPZs analyzed, urban CPZ6 shown in Figure 6.3, was identified by high rankings produced from the S-D, TDM, and network CPMs. Looking at the collision patterns and trigger variables, together with land use and road maps, provided three clues to possible road safety problems. First, the zonal land uses consisted of old residential homes in a small central core surrounded by predominantly commercial and industrial development, all set in the midst of a discontinuous road grid pattern. Second, the CPZ ranking was triggered by abnormal values in the following trigger variables: AM collisions (high), intersection density (high), signal density (high), volume/capacity ratio (high), arterial-local intersections (high), and 3-way intersections (low). Third, a review of collision locations revealed over-represented arterial collision locations. These geographic clues combined with the higher than average number of peak-period collisions occurring on congested arterials suggested that the safety problem might have been due to local trips forced by irregular street patterns and large commercial and industrial lots onto perimeter arterials.
These local trips diverted onto the arterials appear to have created high arterial-local turning volumes which conflict with higher speed, higher volume through-traffic on the arterial roads. A number of possible remedies were considered to solve these safety problems. The first remedy traditionally considered to remove internal, local trips from the perimeter arterial roads, without reference to the newly developed macro-level CPMs, would be to increase internal zonal street connectivity through new local road construction. However, consulting the CPM predictions based on these new roads, by adjusting trigger variables and re-running the CPMs to provide new collision estimates, forecast that this traditional strategy would not reduce collisions. Instead, the CPMs forecast an increase in the number of collisions, due to increased exposure from increased lane-kilometres (TLKM), and due to an increase in shortcutting capacity (SCC) through the neighbourhood. Alternative remedies predicted by the CPMs to produce fewer collisions were generated using thematic strategies (e.g. exposure, S-D, TDM, network). The recommended countermeasures predicted an overall collision reduction of up to 7%, or 10 collisions per 3 year period, as follows:
• Socio-Demographic - Remove the old residential core and infill with industrial;
• TDM - Reduce shortcut capacity by traffic calming short-cut routes; and,
• Network - Close or restrict left-turning and shortcutting movements at local-arterial intersections to keep through traffic outside the neighbourhood core.

Figure 6.3. Urban CPZ6.

In the second of four CPZs analyzed, the urban CPZ1 shown in Figure 6.4 was identified by the network, TDM, and exposure total CPMs. In reviewing the zone in-office and on-site, three clues were revealed regarding safety problems. First, the zone contained a major regional shopping centre surrounded by suburban residential town homes.
The shopping centre precluded internal neighbourhood connectivity due to large parking lots, with only one cross-zone collector. Second, the main trigger variables were mode split and volume / capacity. Third, nearly 90% of collisions were on perimeter arterials. From these three clues, it appeared that local / through trip conflicts were causing the safety problem, perhaps together with access management and/or intersection controls. A thematic analysis was done to identify community-based strategies to address these road safety problems. Overall collision reductions of up to 7% or 5 collisions per three year period were predicted if the following remedies were implemented:
• Exposure - Arterial access management and signal synchronization, to increase arterial capacity and reduce congestion (i.e. volume/capacity);
• TDM - Transit service improvements to decrease drive mode split; and,
• Network - Increased internal zonal connectivity by converting one of the existing shopping centre parking lot driving aisles into a shared, traffic-calmed entry (i.e. linkages) to/from the residential area. This was meant to remove local-arterial trip conflicts, but without increased lane-kilometres or short-cut capacity.

Figure 6.4. Urban CPZ1.

The third of four CPZs analyzed, the rural CPZ10 shown in Figure 6.5, was identified by exposure, S-D, and TDM models. Three clues were found to possible safety problems. First, trigger variables included arterial-local intersections, signal density, and driver mode split. Second, there were over-represented collisions on rural collector roads of non-Severe crash types. Third, this CPZ contained rural land uses that bordered a congested regional highway. These clues suggested that there was short-cutting through traffic in conflict with low speed, local trips on the internal rural roads.
Conventional solutions would have called for increasing highway capacity by four-laning it, closing some local-arterial intersections, and/or building frontage roads (and some had already been built in this area). However, use of the models improved awareness of the need to reduce traffic conflicts without increased road kilometres, thus leading to consideration of other countermeasures related to the four themes. After assessing potential remedies in each theme, two strategies were recommended that predicted overall collision reductions of up to 3% or 3 collisions per 3 year period if the following remedies were implemented:

• Network - Conversion of the existing full-movement local-arterial intersections to a restricted right-in / right-out intersection (i.e. no crossing, no left-turns), which maintained local access, increased highway capacity, but discouraged shortcutting; and,
• TDM - Implementation of a rural community bus program, to increase resident accessibility to regional transit.

Figure 6.5. Rural CPZ10.

The last of four CPZs analyzed, the rural CPZ14 shown in Figure 6.6, was identified by all four severe CPMs, and four clues were apparent. First, it contained predominantly agricultural land uses, with large, rural residential lots that covered one-third of its area and precluded internal zone connectivity. Second, almost all collisions occurred on zonal perimeter roads. Third, a major regional freeway ran along one boundary, and was another mobility barrier. Fourth, the trigger variables were collector road lane-kilometres (high), local road lane-kilometres (low), and volume / capacity (high). Taken together, these clues suggested that local and through trips were over-loading supporting perimeter collector roads, which in effect were functioning as arterial roads.
A package of theme-related remedies was recommended that predicted collision reductions of up to 14%, or over 5 crashes per 3 year period, as follows:

• Exposure - Reclassify the existing Collector road to an Arterial road, and implement access management measures (e.g. turn restrictions) to increase capacity and reduce congestion (i.e. volume/capacity);
• Socio-Demographic - Introduce a local 'corner store' internal to the zone, to reduce local / through trip conflicts, and vehicular traffic volumes;
• TDM - Introduce Community Shuttle transit to reduce vehicle volumes, drivers, VKT, and volume / capacity; and,
• Network - Increased zonal connectivity of pedestrian/bicycle routes to corner store, keeping more local trips away from arterials.

Figure 6.6. Rural CPZ14.

6.2.2.2.2 Safer Zones

A review of the geographic characteristics and trigger variables that facilitate identification of the safer zones shown in Figure 6.2 was also done. An analysis of SZ1, shown in Figure 6.7, and its associated trigger variables revealed the benefits of one-way, ring-roads with low speed limits. In fact this remedy had already been retrofitted in this zone, suggesting it as a countermeasure that could be pursued even after a neighbourhood was built.

Figure 6.7. Urban SZ1.

6.2.3 Lessons Learned

As a result of conducting the macro-level black spot case study, several lessons can be learned about the reactive use of macro-level collision prediction models.

6.2.3.1 Simplicity

The macro-reactive approach worked reasonably well, in a manner similar to the conventional micro-approach, with differences as shown in Figure 5.1. This simplicity should make it attractive for use by practitioners.

6.2.3.2 Relevance

While a number of zones have been theoretically identified in this research as CPZ, the question remains of what is actually being experienced by users and done by practitioners.
From one informal check, in which a presentation of preliminary macro-reactive CPZ results using circa 1996 - 1998 data was reviewed with ICBC road safety engineers, it was discovered that of two CPZs described in the presentation, one was a current neighbourhood of interest being analyzed by ICBC and municipal staff. Formal verification on a wider scope with both ICBC and municipal road agencies would be a logical next research step.

6.2.3.3 Early Warning

This macro-reactive black spot method appears capable of enhancing conventional black spot programs by facilitating early identification of neighbourhood road safety problems without the need for all roads and intersections to have a reported collision history. Evidence of this was discussed in 6.2.3.2 above, where the macro-reactive method using 1996 data identified road safety problems that are only now being addressed nine years later using conventional reactive methods. Of course, further consultation with municipal staff would likely identify the reasons for the delay (e.g. lack of funding), but the potential for enhancement is still valid.

6.2.3.4 Dual Indicator Diagnosis

The two indicator diagnosis method, looking at over-represented collision patterns, and at trigger variables, did enhance identification of zonal safety problems and remedies.

6.2.3.5 Thematic Remedies

The thematic remedy approach prompted new safety planning considerations of conventional, traditional solutions, which appeared to result in safer remedies. Without these CPMs and a thematic approach, these macro-level remedies might not otherwise have been considered. Further research would be needed to verify the sustainability of these thematic safety strategies in the long term.

6.2.3.6 Ranking

In one instance, the same zone was identified as a top-ranked CPZ by several total models, and as a SZ by one severe model.
While it is possible that above-average frequencies of non-severe collisions were the reason for this contradictory ranking, it does highlight the need to analyze results from as many perspectives as practical. It is also the reasoning behind the recommended use of all sixteen model groups for macro-reactive studies.

6.2.3.7 CMFs

An important step before any recommended remedy can be implemented is to conduct a proper B/C analysis, which relies heavily on CMFs. This macro-reactive case study highlighted gaps in macro-level CMF estimates, which should be the subject of future research.

6.3 Macro-Level Collision Modification Factors

As discussed in the black spot case study, there has been limited research and documentation on macro-level CMFs. Until this lack of CMF information has been rectified, the use of macro-level CPMs in assessing possible remedies will be limited to making assumptions on their collision reduction potential based on engineering judgment and experience, which jeopardizes confidence in the results and subsequent decisions. To alleviate these concerns, the objective of this second case study was to test the previously developed guidelines, and to obtain an estimate of the CMF for traffic calming. Traffic calming was chosen as the subject for this CMF study because of its usual area-wide scope of application and influence. Also, data was available on several recent neighbourhood-wide traffic calming projects in the GVRD. From information reported in a previous traffic calming CMF study conducted for the Insurance Corporation of British Columbia (Geddes et al. 1996; Zein et al. 1998), before and after collision data was obtained, together with details on traffic calming location, scope, and measures.
Three of the four urban neighbourhoods examined in that study were traffic calmed in the early 1990's, considered recent enough so that their data could be integrated and analyzed without major adjustments using the same datasets used to develop the macro-level CPMs (circa 1996). For these three traffic-calmed neighbourhoods, Geddes et al. (1996) used a simple before and after methodology and found that, on average, total collision frequency decreased by 47% (ranging 34% to 60%), and severe collision frequencies decreased by 49% (ranging 25% to 67%).

However, that same study examined results of 85 other traffic calming studies across North America, Europe, and Australia, and found a much wider range of observed collision frequency reductions, ranging from 8% to 95%. While no international traffic calming CMF estimate was offered given those wide ranging results, this wide range somewhat reduces confidence in relying on the GVRD traffic calming CMF estimate in the ICBC study. Using meta-analysis techniques, Elvik (2001) conducted a more recent review of international traffic calming project results. The results of this ICBC study would have been included in his study, but the ICBC study data were not available at the time of his study. Elvik (2001) estimated the area-wide safety benefits of traffic calming as closer to a 15% reduction in severe collision frequency, much lower than the 49% reduction estimated for ICBC. Therefore, in addition to testing the guidelines, it was hoped that this macro-reactive case study would do two things: 1) increase confidence in the previous ICBC estimates for area-wide traffic calming, and 2) resolve at least in part the wide ranging (i.e. 15% versus 49%) traffic calming CMF estimates found between these ICBC and Elvik studies.
6.3.1 Approach

Using the guidelines in Chapter Five, the total and severe urban TDM-related CPMs were chosen, because the analysis involved urban traffic calming, which was related to the shortcutting capacity trigger variables (i.e. SCC, SCVC) found in TDM model groups. As modeled data for each of these three neighbourhoods was not available for the time period involved (improvement years 1989 to 1992), only the measured models could be used.

Having selected the appropriate models, the CMF estimation technique followed the guidelines proposed in Section 5.3.2. For the comparison group values, A and C, the three-year running average of annual collisions reported by ICBC for the GVRD were used, broken out for total and severe collision types. Collision count data for the before and after time periods at each neighbourhood were taken from the original ICBC study. The before and after period no treatment counts producing the EB safety estimate, B, were adjusted for RTM bias using the CPM forecasts for total and severe collisions as per Equation 2.27. The raw count, D, for the after period with improvement was used directly in Equation 2.26.a.

Three assumptions were required in this case study to adjust for temporal and spatial differences between the study data (circa 1991) and the data sets used in model development (circa 1996). First, as these were known to be established residential neighbourhoods with relatively stable demographic and land use characteristics, it was assumed that no significant neighbourhood changes had occurred since the 1991 traffic calming. Second, there was a need to adjust model-derived E(A) estimates (circa 1996) into the same time period for combination with each location's observed before and after collisions (circa 1991), thereby producing EB safety estimates.
To do this adjustment, it was assumed that changes in the regional comparison group collision counts were a reasonable proxy for time trends and other external factors not included in the models. This adjustment is similar in concept to the ratio used in Equation 2.26a (i.e. A/C) to adjust the EB safety estimate between before and after periods. Third, to adjust model variable values for differences between the area of the traffic calming study and the area of the zone, it was assumed that the ratio of areas could be used. This assumption was not considered an issue for variables normalized to zonal area, such as population density (POPD). However, several variables in the CPMs did need adjustment, including core size (CORE), total commuters (TCM), and total lane kilometres (TLKM). While CORE and TLKM could be measured directly off land use and road maps, the TCM value could not, and had to be pro-rated based on area. Land use maps were used to verify that this was a reasonable assumption. Finally, as the measured exposure variable (TLKM) did not change in value from the before to after periods, and no other changes in non-trigger variables were assumed, the value of B in Equation 2.27 simplified to B = EB_after = EB_before.

With these assumptions and datasets, the analysis was conducted and traffic calming CMF estimates calculated for total and severe collisions in two ways. In the first calculation, OR and CMF estimates were calculated for each individual neighbourhood, for comparison with the results derived using the simple method in the original ICBC study. This allowed for checking of the three assumptions used to extrapolate the data from 1996 to the relevant study periods. In the second calculation, the guidelines were followed to produce one overall OR and CMF estimate, as per Equations 2.28 and 2.29, repeated for convenience (Sayed and de Leur, 2001).
$$E(OR) = \frac{(A/C)/(B/D)}{1 + \dfrac{Var(B)}{B^2} + \dfrac{Var(C)}{C^2}} \tag{2.28}$$

$$Var(OR) = \left(\frac{A/C}{B/D}\right)^2 \left[\frac{Var(A)}{A^2} + \frac{Var(B)}{B^2} + \frac{Var(C)}{C^2} + \frac{Var(D)}{D^2}\right] \tag{2.29}$$

6.3.2 Results

Table 6.3 lists the resulting traffic calming CMF estimates for the three GVRD neighbourhoods using the two methods, with the overall averages for traffic calming CMF estimates of (-0.40) and (-0.39) for total and severe collision types, respectively. Both of the overall total and severe collision CMF results were within 20% of the original ICBC result (-0.40/-0.39 versus -0.47/-0.49) with similar variances, respectively, and statistically identical with 99% confidence using a T distribution statistic (n = 3).

Table 6.3. Traffic Calming CMF Estimates.*

                     Macro-Level CPMs     Geddes et al (1996)
Neighbourhoods       Total    Severe      Total    Severe
Mount Pleasant       -0.61    -0.11       -0.46    -0.25
Willingdon-Parker    -0.56    -0.24       -0.60    -0.56
Kelvin North         -0.35    -0.64       -0.34    -0.67
Average              -0.40    -0.39       -0.47    -0.49
Variance              0.03     0.01        0.02     0.05

*Elvik (2001) found area-wide traffic calming reduced severe collision frequency by 15%.

However, while the overall results were in reasonable agreement with the original ICBC results, it was noted that several of the individual total and severe collision CMF results calculated using individual OR calculations were markedly different. In particular, two of three of the severe results calculated from the macro-level CPMs appeared to be closer to the estimate of -0.15 found by Elvik (2001) than the -0.49 in the ICBC study. However, this result did not meet the T statistic test for significance. On the other hand, there was a 90% level of confidence using that same statistic that the -0.15 Elvik severe result was statistically different from both the overall macro-level CPM -0.39 result, and original ICBC -0.49 severe result.
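
The odds ratio calculation in Equations 2.28 and 2.29 can be illustrated with a minimal Python sketch. The function names, the input counts, and the Poisson-style choice of Var(A) = A, Var(C) = C, and Var(D) = D are hypothetical and used only for illustration; they are not the study data. The variance is written in a first-order form with four relative-variance terms, and the CMF is taken as OR - 1, an assumed convention that is consistent with the negative CMF values reported in Table 6.3.

```python
def odds_ratio(a, b, c, d, var_a, var_b, var_c, var_d):
    """Return (E(OR), Var(OR)) in the form of Equations 2.28 and 2.29.

    a, c: comparison-group collisions in the before and after periods
    b:    EB safety estimate for the treated area (no treatment)
    d:    observed after-period collisions with the improvement
    """
    raw_ratio = (a / c) / (b / d)
    # Correction term in the denominator of Equation 2.28.
    correction = 1.0 + var_b / b**2 + var_c / c**2
    expected_or = raw_ratio / correction
    # Sum of relative variances, scaled by the squared ratio (Equation 2.29).
    relative_var = var_a / a**2 + var_b / b**2 + var_c / c**2 + var_d / d**2
    return expected_or, raw_ratio**2 * relative_var


def cmf_from_or(expected_or):
    """Assumed convention: CMF = OR - 1, so a negative CMF is a reduction."""
    return expected_or - 1.0


if __name__ == "__main__":
    # Hypothetical inputs for illustration only (not the thesis data).
    a, c = 1000.0, 950.0
    b, d = 50.0, 30.0
    e_or, v_or = odds_ratio(a, b, c, d, var_a=a, var_b=20.0, var_c=c, var_d=d)
    print(f"E(OR) = {e_or:.3f}, Var(OR) = {v_or:.4f}, CMF = {cmf_from_or(e_or):+.3f}")
```

Pooling the counts across neighbourhoods before calling such a function, rather than averaging three separate results, is one way the single overall estimate could differ from a simple average of individual site estimates.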
Therefore, the results of this case study using macro-level CPMs suggest that GVRD traffic calming CMF estimates for total and severe collisions may lie closer to the ICBC estimates than to the severe -0.15 found by Elvik.

6.3.3 Lessons Learned

6.3.3.1 Complexity

The macro-CMF estimates were derived using the guidelines without excessive complexity, with a level of effort not significantly different from micro-CMF estimation, suggesting that the guidelines work reasonably well for CMF estimation.

6.3.3.2 Individual versus Cumulative OR Calculations

Using the recommended method described in the guidelines based on Sayed and de Leur (2001a) produces significantly different results than those that would otherwise be obtained using simple averages of individual site results.

6.3.3.3 Use of Measured Models

The ability to perform direct measurement of variables for use with measured CPMs was convenient, and removed zonal boundary and size restrictions from the analysis. This flexibility permitted the analysis of traffic calmed neighbourhood areas which were smaller than zonal areas. To do sub-zonal analysis, however, it was important that variables were normalized for area (i.e. densities), or adjusted for the reduced area, so that zonal area/study area mismatches did not introduce errors.

6.3.3.4 Uniformity

When adjusting variables for sub-zonal analysis, an assumption of uniform density across the zone was required. As this impacted directly on model predictions, it was critical to double check using land use and road maps.

6.3.3.5 Different Study & Zonal Areas

While conducting the analysis and having to adjust for differences between the sizes of the study area and the original traffic zone, it was found that CPM predictions were much higher than observed collision counts. On closer inspection, it appeared that the boundaries of the ICBC traffic calming study areas were selected to not include perimeter arterial roads around the neighbourhood.
Although using measured models adjusted for some of this non-perimeter road difference (i.e. smaller values for TLKM), additional research would be appropriate to ascertain an adjustment factor for using macro-level CPMs in study areas that do not include the perimeter arterials.

6.3.3.6 Reasonableness of CMF Estimates

The CMF results presented in Table 6.3 appear to indicate that a traffic calming CMF estimate for the GVRD lies near -0.40, and is similar to earlier results found by ICBC. However, the sample size in this study was small (n = 3), and Elvik (2001) found, using a much larger sample of traffic calming project results, a traffic calming CMF estimate for severe collisions closer to -0.15. Therefore, these CMF estimates should be further examined with additional macro-reactive analysis.

6.4 Summary

Macro-reactive model-use guidelines proposed in Chapter Five have been tested in two case studies, including: 1) Black Spot analysis of urban and rural neighbourhoods across the GVRD; and, 2) derivation of traffic calming CMF estimates. The main objectives of these two safety applications were to test these recommended macro-reactive guidelines, and assess their potential to meet the needs of practitioners. Judging from the level of effort and complexity involved, and the reasonableness of both qualitative and quantitative results, the guidelines appear reasonable for use by practitioners. If adopted for use, macro-level CPMs appear able to provide a new and reliable decision-aid tool for planners and engineers involved in reactive road safety improvement programs. Comments specific to each case study are summarized below.

6.4.1 Black Spot Programs

Following the recommended guidelines to adapt the conventional black spot methods for use with macro-level collision prediction models appears to be feasible and practical.
The approach was not complex, and relatively similar to that already in use, with differences noted in Figure 5.1. CPMs from each of the four explanatory variable themes provided useful insights into road safety problems and possible solutions. They pointed to possible remedial safety treatments that might not otherwise have been considered by conventional methods. Thus, there appeared to be significant potential in using macro-level CPMs reactively. However, before widespread use can be recommended, additional future research needs have been identified regarding: other explanatory variables, wider consultation with practitioners, and macro-level CMF estimates.

6.4.2 Macro-CMFs

The second case study tested the proposed guidelines and provided an initial estimate of a traffic calming macro-CMF. Using the proposed guidelines, macro-CMF estimation was relatively simple and similar to that using micro-CPMs, with estimated GVRD urban traffic calming CMFs for total and severe collisions in the range of -0.40 and -0.39, respectively. These estimates were in general agreement with earlier simple before and after studies performed for ICBC, but significantly higher than a recent area-wide traffic calming CMF estimate for severe collisions by Elvik (2001) using refined meta-analysis techniques. Therefore, additional research is needed to place confidence on these initial estimates, and to provide additional CMF estimates.

Overall, despite promising initial results in these two safety applications, macro-level CPMs and guidelines for their use are at a very early stage in research, which suggests that the results of this initial research need to be used with great caution. In all cases, consultation with local municipal authorities is needed to ensure reasonable data quality and practical, professional results.
Based on the results and lessons learned from these two initial macro-reactive safety applications, guidelines for the proactive use of macro-level CPMs were proposed in Section 5.4, with the results of their testing discussed in Chapter Seven.

7. PROACTIVE SAFETY APPLICATIONS

7.1 Introduction

In Chapter Six, macro-reactive model use guidelines were tested, with promising results. In this chapter, the results of testing proactive model use guidelines in four case studies are discussed. To discuss these four proactive case studies, this chapter has been split into three main parts. In section 7.2, the first case study is discussed, including the study approach, results, and lessons learned in applying the guidelines and models to evaluate the safety of a planned regional transportation project. In section 7.3, the second and third case studies are discussed, including the study approach, results, and lessons learned in two neighbourhood planning projects. In section 7.4, the last case study is discussed, including the study approach, results, and lessons learned in transferring the models between different time-space regions.

7.2 Regional Road Safety Planning Case Study

The first case study involved a test of the regional planning guidelines using GVRD-area data supplied from TransLink (2004).

7.2.1 Background

The regional planning study involved an analysis of TransLink's 2005 to 2007 3-Year Plan, which was comprised mainly of bus service expansion and road improvements throughout the region. Bus improvements were located in growing areas and along main feeder corridors, with annual service hours increasing by 9.5%, from 3.9 million in 2004 to 4.3 million by 2007. With the exception of two major road projects, most of the road improvements were relatively minor and situated in the north-east quadrant of the Greater Vancouver region. Using this Plan for a case study was advantageous for several reasons.
First, its short-term nature provided the opportunity to apply the CPMs to a regional scenario, but with a time horizon not too far into the future, thereby facilitating data extraction via the extrapolation of socio-demographic and road network data. Second, it had been already evaluated using the GVRD's Emme/2 transportation model for a typical AM peak hour period in the 2007 planning horizon year. And third, a base do-nothing comparison scenario had already conveniently been modeled using Emme/2 with 2007 demand (demographics).

A summary of the regional modeled exposure impacts from the Three-Year Plan scenario (Year 2007), as compared to the base do-nothing scenario (Year 2007), is contained in Table 7.1. Under the plan, the total kilometres travelled for automobiles (VKT) and for transit trips (TKT) increased by 0.6% from 2.370 million km to 2.383 million km in the AM peak hour. Due to transit improvements, the regional transportation planning model (Emme/2) forecast a net decrease in volume-to-capacity (VC) values for all traffic zones by 2.1%, with the regional average falling from 0.390 to 0.382.

Table 7.1. Three-Year Plan Regional Exposure Impacts.

                     Scenarios (Year 2007)       Difference   Difference
Regional Totals      Base         3-Year Plan    %            Actual
Network Length       5,130        5,221          1.8%         91
Average V/C          0.390        0.382          -2.1%        -0.008
Transit VKT          14,319       15,557         8.6%         1,238
Auto VKT             2,355,335    2,367,409      0.5%         12,074
Total VKT            2,369,654    2,382,966      0.6%         13,312

Using this data, the objectives of this proactive safety application were to:

• Test the recommended guidelines in a regional safety planning application, using any lessons learned to refine those guidelines;
• Demonstrate proof-of-concept in using macro-level CPMs with Emme/2 in evaluating the safety of modelled scenarios; and,
• Assess whether macro-level CPMs would provide an empirically reliable road safety tool for regional, strategic-level transportation planning processes.

7.2.2 Approach

Following the guidelines in Chapter Five, candidate model groups were screened, and the twenty modeled total and severe CPMs shown in Tables 4.1 to 4.4 were chosen for this case study, based on the following reasoning:

A. Scope = Regional, therefore macro-level CPM use was warranted;
B. Task = Regional roads and transit, but with a short planning horizon in order to use current land use and demographic data, therefore all sixteen model groups were candidates;
C. Land Use = Both rural and urban neighbourhoods exist across the GVRD, so all sixteen groups could be used;
D. Trigger Variables = As the analysis included a comparison of present and future conditions, all variables and model groups were potentially relevant;
E. Collisions = With no specific issues to investigate, the recommended total and severe CPM types shown in Tables 4.1 through 4.4 were desired for this analysis; and,
F. Data = As the modeled data year was 2007, that was chosen as the planning horizon for data extraction.

While measured model groups were shown in Table 6.2 as optional candidates, to minimize extraction efforts only modeled groups were selected.
Having selected the models, the application of the 20 modeled CPMs to TransLink's Three-Year Plan followed the recommended guidelines on zones of influence, data extraction, and interpretation of results. First, to determine zones of influence for the analysis, Emme/2 regional plots of zonal exposure changes as shown in Figure 7.1 (Lim, 2004) were reviewed, with zones of reduced VKT (green) and increased VKT (red) shown in Figure 7.1.a. The majority of traffic zones with VKT decreases occurred near traffic zones with VKT increases. This phenomenon suggests that traffic continued to travel within the same general corridor and merely redistributed to adjacent roadways as a result of the roadway improvements. The redistribution could also be due to reductions in traffic, which would reduce volume-to-capacity ratios (i.e. VC), as shown similarly shaded in Figure 7.1.b. Given the wide ranging exposure impacts, all zones across the region were considered as zones of influence for data extraction and analysis.

As the modeled exposure data from Emme/2 was in the planning year of 2007, datasets from Census (demographics), GVRD (land use), and road networks were extracted and extrapolated to 2007, based on historical and projected growth rates of population, households, and employment. After data extraction and assembly following the guidelines, the CPMs were run to predict the collisions for each traffic zone, relative to the Base scenario, for each of the 20 models, and the results interpreted by taking the differences of sums as recommended in the guidelines. T statistics were used at the 95% confidence level to test whether each difference was significant.

7.2.3 Results

A summary of results is contained in Table 7.2, which lists the CPM estimate differences between the two scenarios, categorized by model group, collision types, land use types, and regional totals, together with T-statistic test results.
Overall, the results indicated that the 3-Year Plan was safer than the base scenario, with three specific observations. First, the results across all groups met with intuitive expectations, with all four regional summary figures significantly different from zero. Second, the two model groups most relevant to the analysis (i.e. exposure w/VC, S-D) were consistent with intuitive expectations and showed significant net collision reductions (i.e. safety improvement) due to the 3-Year Plan. Third, there were some inconsistencies in the results, due to inclusion of exposure group models without VC variables, and inclusion of optional model groups that were not relevant in the analysis.

Regarding the exposure group models, while the group total showed a significant net collision decrease, the non-VC exposure models showed a net increase, which was intuitive given the overall changes in VKT (increased 0.6%) and VC (decreased 2.1%), both of which strongly influence model predictions. Only the S-D groups, both rural and urban, indicated a consistent net safety improvement, revealing the highest net severe and rural collision differences of all groups, a result likely due again to having the VC variable in their equations.

Regarding the optional groups (i.e. TDM and network), of eight CPM predictions from these four groups, four were statistically insignificant. This result indicates the importance of conducting statistical significance testing, as two of these insignificant results predicted a net increase in collisions and may otherwise have led to misinterpretation of the results.

Table 7.2. Differences in Collision Sums between Three-Year Plan & Base Scenarios (2007).

                       Urban Zones (n = 342)                      Rural Zones (n = 72)
Variable Themes        Sum of        Avg. Zonal    t-Statistics   Sum of        Avg. Zonal    t-Statistics
                       Differences   Difference    (95%)          Differences   Difference    (95%)
Exposure
  Total (w/o VC)       853           1.9           1.59           -102          -1.4          1.09
  Severe (w/o VC)      304           0.7           2.05           -39           -0.5          0.93
  Total (w/ VC)        -2,948        -6.7          5.16           -496          -6.9          3.44
  Severe (w/ VC)       -1,265        -2.9          7.36           -115          -1.6          2.25
Socio-Demographic
  Total                -2,778        -6.8          3.75           -719          -10.0         5.38
  Severe               -1,324        -3.2          5.92           -191          -2.6          3.75
TDM
  Total                -354          -0.8          0.60           -453          -6.3          3.27
  Severe               16            0.0           0.11           -124          -1.7          2.04
Network
  Total                100           0.2           0.31           -342          -4.8          3.01
  Severe               -167          -0.4          2.09           -38           -0.5          1.06

                       Urban Zones                                        Rural Zones
Regional Net Summary   Net      Collisions per  Collisions  Average      Net      Collisions per  Collisions  Average
                       Total    Model Group     per Zone                 Total    Model Group     per Zone
Total                  -5128    -1026           -2.4        2.3          -2112    -422            -5.9        3.2
Severe                 -2437    -487            -1.2        3.5          -507     -101            -1.4        2.0

To provide a more effective means of visually analyzing the variability in collision predictions, theme mapping of the results was also produced. A sample of theme mapping for the urban total CPMs for each theme is shown in Figure 7.2 (Lim, 2004), using a uniform gradation interval for ease of comparison.

7.2.3.1 Urban Total collisions

Figures 7.2.a through 7.2.e display, for urban zones, the three-year collision totals, all of which have similar patterns to the changes in Figure 7.1. However, the inclusion of the VC variable in the second exposure CPM group produced a significant change in results. While the non-VC model group (Figure 7.2.a) predicted a net regional increase in collisions of 853 (although not meeting statistical tests for significance), the VC-inclusive model (Figure 7.2.b) showed a significant net decrease of 2,948 collisions region-wide. This change was also apparent graphically, with the distribution of collision total by traffic zone in the non-VC model showing little or no change in collisions (364 zones, +/-20 collisions), while the VC-inclusive model showed more traffic zones with a decrease in collisions.
The three other model maps showed similar region-wide results, with the network model map (Figure 7.2.e) seeming to be the least sensitive.

7.2.3.2 Urban Severe collisions

The majority of changes followed the same patterns as Figure 7.1, except for the network CPM, which forecast relatively low net collision changes across the region. The TDM model group exhibited a diverse range of collision totals throughout the region, but showed the lowest net change (which did not meet statistical tests for significance) of all the models in this group. The exposure models showed a relatively diverse distribution of medium to high range severe collisions, again reflective of the VKT thematic traffic zone distribution. Similar to the urban total CPMs, the non-VC models predicted a net increase in severe collisions, whereas the VC-inclusive model resulted in a significant net decrease in severe collisions. Of the four CPMs which predicted an increase in collisions, this was the only instance where it met statistical tests for significance.

7.2.3.3 Rural Total collisions

Like the urban model results, the pattern of changes in the rural total collision CPMs was also similar to the change patterns in Figure 7.1. The impact of the Three-Year Plan was generally positive in terms of net regional collision decreases for all rural total CPMs, even though the non-VC model results were again considered insignificant.

7.2.3.4 Rural Severe collisions

The resulting rural severe collision changes generally followed that of the rural total collision changes. Even though their predictions indicated a net decrease in collisions, both the network and non-VC exposure model predictions did not meet statistical tests for significance. All other group differences predicted a net decrease in severe collisions. The consistency of both these rural results suggested that the Three-Year Plan was beneficial, in terms of safety, for rural areas.
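
The significance screening described in Sections 7.2.3.1 to 7.2.3.4 can be sketched in a few lines of Python. The sums of differences and t-statistics below are transcribed from Table 7.2; the function names are hypothetical, and the two-tailed critical value uses a normal approximation to the t distribution, which is an assumption made here for simplicity (with 72 or more zones the exact t critical value is only slightly larger).

```python
from statistics import NormalDist

# (area, theme, collision type, sum of differences, t-statistic), from Table 7.2.
TABLE_7_2 = [
    ("Urban", "Exposure (w/o VC)", "Total", 853, 1.59),
    ("Urban", "Exposure (w/o VC)", "Severe", 304, 2.05),
    ("Urban", "Exposure (w/ VC)", "Total", -2948, 5.16),
    ("Urban", "Exposure (w/ VC)", "Severe", -1265, 7.36),
    ("Urban", "Socio-Demographic", "Total", -2778, 3.75),
    ("Urban", "Socio-Demographic", "Severe", -1324, 5.92),
    ("Urban", "TDM", "Total", -354, 0.60),
    ("Urban", "TDM", "Severe", 16, 0.11),
    ("Urban", "Network", "Total", 100, 0.31),
    ("Urban", "Network", "Severe", -167, 2.09),
    ("Rural", "Exposure (w/o VC)", "Total", -102, 1.09),
    ("Rural", "Exposure (w/o VC)", "Severe", -39, 0.93),
    ("Rural", "Exposure (w/ VC)", "Total", -496, 3.44),
    ("Rural", "Exposure (w/ VC)", "Severe", -115, 2.25),
    ("Rural", "Socio-Demographic", "Total", -719, 5.38),
    ("Rural", "Socio-Demographic", "Severe", -191, 3.75),
    ("Rural", "TDM", "Total", -453, 3.27),
    ("Rural", "TDM", "Severe", -124, 2.04),
    ("Rural", "Network", "Total", -342, 3.01),
    ("Rural", "Network", "Severe", -38, 1.06),
]

# Model groups treated as optional in this regional case study.
OPTIONAL_THEMES = {"TDM", "Network"}


def critical_value(confidence=0.95):
    """Two-tailed critical value (normal approximation; an assumption)."""
    return NormalDist().inv_cdf(0.5 + confidence / 2.0)


def is_significant(t_stat, confidence=0.95):
    """True when the reported t-statistic reaches the critical value."""
    return abs(t_stat) >= critical_value(confidence)


def insignificant_rows(rows, themes=None):
    """Rows (optionally limited to the given themes) that fail the test."""
    return [
        row for row in rows
        if (themes is None or row[1] in themes) and not is_significant(row[4])
    ]


if __name__ == "__main__":
    for area, theme, ctype, diff, t_stat in TABLE_7_2:
        flag = "significant" if is_significant(t_stat) else "not significant"
        print(f"{area:5s} {theme:18s} {ctype:6s} {diff:7d} t={t_stat:4.2f} {flag}")
```

Under these assumptions, four of the eight TDM and network predictions fall below the threshold, and only one of the four predicted increases (urban severe, exposure without VC) passes it, in line with the discussion above.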
[Figure 7.2. Urban Total Collision Differences, by Zone. Map panels: a. Exposure (with VC); b. Exposure (without VC); c. Socio-Demographic; e. Network. Legend (collision total differences): 500 to 1,700; 100 to 500; 20 to 100; -20 to 20; -100 to -20; -500 to -100; -1,700 to -500.]

7.2.4 Lessons Learned

From the results of this regional planning application, a number of issues were identified and lessons learned that warrant further discussion and research for future application of these CPMs.

7.2.4.1 Influence of the VC variable

All CPMs that included the VC variable, regardless of collision type or land use type, predicted a significant region-wide net decrease in collisions, attributable to the overall 2.1% drop in VC relative to the Base scenario. This suggests that changes in the VC variable value may be used as a qualitative indicator of changes in zonal road safety. Moreover, CPMs with the VC variable also tended to have, on average, higher κ values than CPMs without it, suggesting that VC may also contribute to model goodness of fit. All models with a VC variable provided collision prediction results that: 1) were in line with intuitive expectations, 2) met statistical tests for significance, and 3) came from better-fitting models. These results raise the question of whether the non-VC exposure models should continue to be used. For example, in this case study, removal of the non-VC models (and the 'optional' TDM and network models) would have increased prediction consistency.

7.2.4.2 Use of Individual, Optional, and/or Multiple CPMs

The guidelines appear to be reasonable in directing which models to use for a particular planning scenario, and how multiple CPMs should be used, including interpretation of the results.
However, models noted as optional in Table 5.5 should not be included without careful consideration of their relevance, and statistical testing of their results, in order to preclude possibly misleading interpretations.

7.2.4.3 Limitations from excluding Highways

It was understood that the CPMs were calibrated using Emme/2 output from road and transit links with posted speeds of 60 km/h or less (i.e. non-highway links). This exclusion may cause problems when interpreting the results of regional projects. The zonal macro-level CPM results could perhaps be augmented by micro-level CPM predictions for each highway segment, to provide a more complete regional safety estimate in absolute terms. Alternatively, the macro-level CPMs could be recalibrated using data that include the highway links.

7.2.4.4 Data Intensiveness

The assembly and processing of the 14 different datasets needed for the 20 different modeled CPM calculations can be a challenging and complex task, if not an impossible one for practitioners who do not have access to the requisite data. To keep the level of effort practical while still pursuing reliable results, the proposed model selection guidelines presented in Section 5.2 recommend that only the minimum number of CPMs be used in each planning process. If the optional models in the study (i.e. TDM and Network) had not been used, only two to five data extractions would have been required, thus addressing the issue of data intensiveness. The insignificance of many of their results also suggests that their optional (as opposed to recommended) classification in the guidelines for this type of regional study was appropriate.

7.2.4.5 Thematic Mapping

Use of tools such as GIS to map the CPM results graphically enhanced both the effectiveness and the efficiency of analyzing the results. However, the use of thematic mapping can also have its drawbacks.
An issue that can significantly affect the analysis of the results is the use of inappropriate or complex thematic ranges. An incorrect number of ranges, or bins, and their associated values can skew the interpretation of the results. For example, having too few ranges may diffuse or wash out the collision predictions, especially if the predictions vary widely between low and high values. Likewise, using too many ranges may make the maps difficult to read and trends difficult to identify. The practitioner should therefore take care to create thematic maps that are easy for non-practitioners to interpret, yet useful for unbiased analysis.

7.2.4.6 Proof of Concept

The application of macro-level CPMs to a regional planning case study of TransLink's 3-Year Plan appeared to be successful. The models behaved as expected when used with a regional transportation planning model and, based on the resulting CPM thematic maps and tabulated results, demonstrated proof of concept. With the demonstration of a regional safety planning application completed, the next test of the CPM use guidelines concerned the use of models in a neighbourhood safety planning application.

7.3 Neighbourhood Road Safety Planning Case Studies

Given the difference in proposed guidelines for regional and neighbourhood planning, it was appropriate that neighbourhood-level safety planning case studies also be conducted to further test the proposed guidelines.

7.3.1 Background

To test the guidelines and demonstrate macro-level CPM use in neighbourhood safety planning, two case studies were conducted, the first on the relative safety of various road network patterns, and the second on the relative safety of various neighbourhood core sizes.

7.3.1.1 Road Network

The first case study was conducted to verify earlier Dutch findings on the relative safety of neighbourhood road network patterns.
Following from the Dutch national SRS guidelines approved in 1996, which call for separation of road functions (i.e. through, distributor, local access), researchers were studying ways to plan neighbourhood road networks that reduce collisions while maintaining accessibility and mobility (CROW, 2000). Three of the four road networks shown in Figure 7.3 ("a", "b", and "c") were analyzed by Dutch researchers. They found that the conceptual SRS network (Figure 7.3.c) appeared to be the safest while maintaining neighbourhood accessibility and mobility. While the results of this earlier Dutch study were helpful for planning sustainably safer neighbourhoods for residents, they were based on collision rates, which imply an assumed linear exposure-collision relationship. In addition to the computational errors introduced when collision rates are used, more recent findings in this thesis suggest that other neighbourhood road network patterns may be safer, including the 3-way Offset network shown in Figure 7.3.d.

7.3.1.2 Core Size

The second neighbourhood planning case study was conducted to verify earlier Dutch findings on the safest neighbourhood core size (CORE). To verify SRS core guidelines, van Minnen (1999) examined the safety of existing commercial and residential neighbourhoods. The Dutch study results indicated optimum safety levels with core sizes in the range of 65 hectares for commercial and 80 hectares for residential neighbourhoods.

The objectives of the two neighbourhood planning case studies in this thesis were to:
o Test guidelines for use of models in neighbourhood safety planning, suggesting refinements and future research areas to enhance user friendliness;
o Estimate the optimum neighbourhood core size from a GVRD road safety perspective, for comparison with earlier Dutch results; and,
o Estimate the optimum neighbourhood road network pattern from a GVRD road safety perspective, for comparison with earlier Dutch results.
7.3.2 Model Selection

Following the recommended guidelines in Chapter Five, the first step in each case study was to select the appropriate models. In both cases, urban measured and modeled CPMs for total collisions were selected. For the network analysis, those from the network and TDM groups (Tables 4.3 and 4.4) were used. For the core analysis, those from the TDM groups (Table 4.3) were used. As the analysis was a relative comparison with no future forecasts desired, the time period most convenient to dataset availability was chosen (i.e. 1996), so that both modeled and measured CPMs could be used.

7.3.3 Approach

After the most appropriate models were selected, the approach for each of these neighbourhood safety planning case studies was tailored to emulate the original Dutch studies, but using the newly developed macro-level CPMs and guidelines.

7.3.3.1 Road Network

Once models were selected, the network analysis was conducted in five steps. First, a scan was conducted across all GVRD municipalities to identify existing sample neighbourhoods that resembled the four test networks under review, including the three in the original Dutch study and the 3-way Offset network, as shown in Figure 7.3. As part of this region-wide scan, to control external factors as much as possible (e.g. traffic regulations, regional location, traffic engineering practices, etc.), the ideal was to find all four sample networks in one municipality. If existing neighbourhoods with these network shapes could be found, they would offer a strong in-situ empirical case by simply observing their relative collision histories. While this search was not successful, several sample neighbourhoods having characteristics of at least one of the test networks were found, three in each of two municipalities.

[Figure 7.3. Neighbourhood Access Road Network Options. Panels: a. Grid Network (Neo-Traditional); b. Discontinuous Network (Culs-de-Sac); c. Limited Access (Dutch SRS Guidelines); d. Offset Network (3-way).]

In the second step, variable values for each of these sample neighbourhoods were used as control values for non-trigger CPM variables in evaluating each test network. For this analysis, non-trigger variables included:
o Exposure - VKT, VC, TLKM;
o S-D - POPD, FS, UNEMP, WKGD; and,
o TDM - DRIVE, TCM, CORE, CRP, SIGD.
By holding these control values the same across all four test networks throughout the analysis, the safety results for each network would more accurately reflect changes in the network and TDM group trigger variables.

In the third step, based on the usual GVRD one-to-two block-width-to-block-length ratio, four scaled versions of each test network were plotted. To ensure that no bias was introduced due to different neighbourhood sizes, each test network was designed to be modular.

In the fourth step, values for each trigger variable were measured from the plotted test networks, to provide the remaining input data needed to run the network and TDM CPMs. For this network analysis, trigger variables included shortcut capacity (SCC, SCVC), intersection density (INTD), and 3-way intersection percentage (I3WP).

In the last step, as the initial CPM forecasts suggested that the recommended SRS road network (Figure 7.3.c) was not significantly different from the grid (Figure 7.3.a) or cul-de-sac (Figure 7.3.b) test networks, a modified SRS network was also created, by offsetting one leg of each of the four internal four-way intersections to produce eight internal three-way intersections (I3WP).

7.3.3.2 Core Size

The second neighbourhood safety planning case study, a search for the optimum core size, involved two steps. In the first step, each urban TDM group CPM equation was reviewed using differential calculus to identify any local maxima which might indicate the safest core size.
However, it was found that safety was maximized asymptotically as core size grows, with no local maximum. In the second step, a scan was done of GVRD neighbourhood core sizes, to determine the range within which neighbourhood core sizes existed, versus collision density (collisions per hectare), and by land use type (i.e. residential/commercial). The results were based on observations of actual neighbourhoods with a full range of core sizes in the GVRD, below and above those found in the earlier Dutch study.

7.3.4 Results

Following the recommended guidelines and using these approaches, the results of each case study are discussed below.

7.3.4.1 Road Network

A summary of the safety evaluations for each of the four road networks is contained in Table 7.3, showing that the 3-way Offset network (Figure 7.3.d) was predicted to be safest overall, with a collision density (Cd) of 2.2 collisions/hectare over a three-year period, and a surprisingly low variance of 0.2. The modified Dutch SRS network was projected to be second safest, at 2.4 collisions/hectare with a similarly low variance, followed by the original Dutch SRS network (Figure 7.3.c) and the cul-de-sac network (Figure 7.3.b), both at 5.7 collisions/hectare. The grid road network (Figure 7.3.a) was projected to be least safe, at an estimated 16.5 collisions/hectare over a three-year period, and not surprisingly had the highest variance, at 8.2.

Table 7.3. Access Road Network Safety*

                               GVRD               Culs-de-  Dutch SRS  Dutch SRS  Offset
Layout                         Average   Grid     Sac       (4-way)    (3-way)    (3-way)
Collision Density (Cd)            3.1    16.5       5.7        5.7        2.4       2.2
Adjusted Cd (collisions/ha)       3.1     4.0       1.4        1.4        0.6       0.5
Variance Cd                       4.3     8.2       1.0        1.0        0.2       0.2
INTD                             0.46    0.50      0.38       0.38       0.50      0.88
I3WP                              52%       0       33%        33%        75%       86%
SCC                               6.0     3.3       0.0        0.0        0.0       0.0

* VKT, VC, TLKM, POPD, FS, WKGD, CORE, SIGD, LLKP, ALKP controlled at same level for each network

For reference, the observed GVRD zonal averages have also been listed, showing a regional average of 3.1 collisions/hectare, which suggested that the predictions for each network were within realistic and reasonable ranges. However, the intuitive expectation was that this regional average would lie between the predicted levels for cul-de-sac networks (i.e. 5.7 collisions/hectare) and grid road networks (i.e. 16.5 collisions/hectare), because most of the GVRD neighbourhood networks were based on one of these two network concepts. One possible reason for this macro-level CPM over-estimation was thought to be the use of control variable values and test network modules (i.e. theoretical versus real neighbourhoods). It was intuitive that using GVRD input data to derive CPM estimates for the grid and cul-de-sac test networks should produce results in line with observed GVRD averages for these network types. As the initial CPM estimates appeared generally high in this regard, an adjustment factor was applied to the collision predictions across all four test networks so that results would be within realistic ranges, for comparison to actual neighbourhoods. The adjustment factor applied to all four test networks was based on the ratio of the GVRD average collision frequency to the predicted collision frequency for the theoretically created grid and cul-de-sac test networks, as shown in Equations 7.1 and 7.2.
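The adjustment described above can be illustrated numerically. The sketch below is a minimal Python example, assuming the predicted collision densities listed in Table 7.3 and the roughly 65%/35% split between grid and cul-de-sac neighbourhoods that accompanies Equations 7.1 and 7.2. The variable and function names are hypothetical, and because the thesis applied the factor before rounding, small differences from the published values are possible.

```python
# Predicted three-year collision densities (collisions/ha) from Table 7.3.
PREDICTED = {
    "Grid": 16.5,
    "Culs-de-sac": 5.7,
    "Dutch SRS (original)": 5.7,
    "Dutch SRS (modified)": 2.4,
    "3-way Offset": 2.2,
}
GVRD_AVERAGE = 3.1        # observed regional average (collisions/ha)
GRID_SHARE = 0.65         # assumed share of grid neighbourhoods
CUL_DE_SAC_SHARE = 0.35   # assumed share of cul-de-sac neighbourhoods


def adjustment_factor(predicted, gvrd_average,
                      grid_share=GRID_SHARE, cds_share=CUL_DE_SAC_SHARE):
    """Ratio of the observed average to the blended grid/cul-de-sac prediction."""
    blended = (grid_share * predicted["Grid"]
               + cds_share * predicted["Culs-de-sac"])
    return gvrd_average / blended


def adjusted_densities(predicted, gvrd_average):
    """Apply the same adjustment factor to every test network."""
    factor = adjustment_factor(predicted, gvrd_average)
    return {name: factor * value for name, value in predicted.items()}


if __name__ == "__main__":
    for name, value in adjusted_densities(PREDICTED, GVRD_AVERAGE).items():
        print(f"{name}: {value:.1f} collisions/ha")
```

Rounded to one decimal place, the adjusted values are 4.0 (grid), 1.4 (culs-de-sac and original Dutch SRS), 0.6 (modified Dutch SRS) and 0.5 (3-way Offset) collisions per hectare, matching the adjusted row of Table 7.3.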
The calculation assumed that the proportions of GVRD neighbourhoods with grid networks versus cul-de-sac networks were split roughly 65% to 35%.

C_i' = \gamma \, C_i    (7.1)

\gamma = \frac{C_{GVRD}}{0.65 \, C_{Grid} + 0.35 \, C_{Cul\text{-}de\text{-}sac}}    (7.2)

Where:
i = test network type (Grid, Cul-de-sac, SRS, 3-way Offset)

The predicted collisions for each network were compared to those of the safest network (C_{3-way}), and the results have been summarized in Table 7.4, including the results of statistical tests for significance. Using a 90% confidence level to test for statistical significance, the 3-way Offset network appeared to be safer than all other networks except the modified Dutch SRS network. The similar safety results of the 3-way Offset and modified Dutch SRS networks were fortunate in that, while the 3-way Offset was safest by a slight margin, it would be expensive to retrofit into existing neighbourhoods (i.e. more road relocations). However, the modified Dutch SRS network could be retrofitted with fewer minor street closures and cul-de-sac creations.

Another interesting result from the relative comparison concerns the conventional use of cul-de-sac and grid street networks across the GVRD. According to these results, in relative terms, cul-de-sac neighbourhood road networks appeared to be significantly safer than grid networks, by a factor of nearly three to one. Of course, this three to one safety advantage needs to be weighed against the corresponding impacts that cul-de-sac networks usually have on community mobility and accessibility (e.g. for transit). Further research on neighbourhood road networks, including evaluation of safety, mobility, and accessibility, would help to clarify this issue.

Table 7.4. Relative Comparison to 3-way Offset Network.

                          Adjusted   Ratio*              t-Statistic
Network Type              Cd         Cd(i) / Cd(3-way)   (critical t, 90% = 2.3)
Grid                        4.0         7.4                 3.4
Culs-de-sacs                1.4         2.5                 2.4
Dutch SRS original          1.4         2.5                 2.4
Modified Dutch SRS          0.6         1.1                 0.3
GVRD Average                3.1         5.7                 3.2
3-way Offset (Cd(3-way))    0.5         1.0                 -

* Before rounding

7.3.4.2 Core Size

The core size case study results served to reinforce earlier modelling results, namely that increasing neighbourhood safety was statistically associated with increasing core size. However, this case study added to that earlier finding by illuminating the range within which this association occurs in the GVRD. From zero to 100 hectares, this association appears to include significant increases in safety (i.e. reduced collision frequency). However, TAZs with core sizes above 100 hectares experience a diminishing safety benefit, with the largest and safest core size in the GVRD observed to be in the range of 400 hectares. The results in this study appear to be within the same order of magnitude as the Dutch findings (65 to 80 hectares). One possible explanation for why the Dutch study found a safety optimum core size (and this study did not) may be the Dutch approach of using (linear) collision rates.

7.3.5 Lessons Learned

Several lessons were learned in undertaking these two neighbourhood safety planning case studies.

7.3.5.1 Selection of Models

For neighbourhood planning, it appears that the most critical part of the approach is to select the appropriate CPM for the safety evaluation. For example, in the neighbourhood road network analysis, the first inclination was to choose only the network group models. However, on closer examination it was found that the optional TDM group models were also warranted, given that road network layout also affected shortcutting capacity (SCC).
Therefore, given the complex and not fully understood relationships between neighbourhood traits and road safety, it is critical that careful consideration be given to all possible trigger variables and model groups.

7.3.5.2 Three-Way Intersections

In neighbourhood road safety planning, the use of three-way intersections appeared to have a dominating influence on road safety improvement. This mitigating influence was highlighted when the modification of the original Dutch SRS network, which included converting 33% of its four-way intersections to three-way intersections, produced a 57% reduction in predicted collisions.

7.3.5.3 Other (Non-Safety) Considerations

While road safety impacts have been an enormous social and economic concern, they cannot be considered in isolation from other community priorities. Taking mobility and accessibility into account, the modified Dutch SRS network, with its easier retrofit potential, would appear to be the more sustainable neighbourhood road network in the long term. However, further research would be needed to verify this hypothesis.

7.3.5.4 Control Variables versus Sensitivity Analyses

While the practice of adjusting one CPM variable while holding others constant (i.e. sensitivity analysis) has previously been cautioned against, the use of control variables in this case may offer one way to conduct a lower-risk pseudo-sensitivity analysis with more realistic results. For example, in the network review study, control variables were chosen and set at values typical for the networks involved, in conjunction with simultaneous adjustment of several trigger variables (e.g. shortcutting, intersection density, three-way intersections). This approach tried to address earlier concerns regarding mono-variable sensitivity analyses by simultaneously working with several control and trigger variables within their generally accepted ranges.
Moreover, recognizing their inter-relationships, several trigger variables and models were used simultaneously. This pseudo-sensitivity technique using test networks may have merit and should be studied further in future safety research.

7.3.5.5 Interpretation of Results

The use of statistical tests in neighbourhood planning to verify the significance of the results was awkward, because of the typically large predicted CPM variance relative to E(Λ) and n. As the calculated t-statistic varies with the square root of the sample size (Equation 5.1), one strategy to improve result reliability is to use more models in the analysis. This improves the estimate in two ways: first, by increasing the sample size, which lowers the critical t-test statistic; and second, by increasing the sample t-value. While the minimum number of models is recommended to minimize data extraction effort in regional planning, this consideration should therefore be balanced against the need for reliable and statistically significant results in neighbourhood planning.

7.3.5.6 Optimum Core Size

Contrary to earlier Dutch studies, no optimum core size was found. The mitigating safety effects of the CORE-size-to-collision-frequency relationship appear to hold true in a range between 0 and 400 hectares, with large increases in safety associated with increases in core size up to roughly 100 hectares. However, the large variance and less than definitive results in this core study require further research, perhaps including identification of additional CPM variables.

7.4 Transferability Case Study

The third safety application involved a test of the proposed transferability guidelines.

7.4.1 Background

In a previous transferability study involving GVRD municipalities, Sawalha & Sayed (2005b) evaluated the transferability of micro-level CPMs that were developed using Vancouver intersection data and then calibrated for use in Richmond, with promising results.
For the transferability case study in this thesis, a GVRD dataset was also available, consisting of the 1996 dataset used in the original macro-level CPM development process described in Chapter Three. As only data from the 1996 dataset were available, only a geographic transfer was done. For comparison with the Sawalha & Sayed (2005b) results, this case study also looked at model transferability between Vancouver and Richmond, among others.

7.4.2 Approach

The approach to this transferability case study involved four steps. The first step involved model selection. Given that the objectives of this case study were to test the guidelines and demonstrate that macro-level CPM transferability was feasible, model selection was based on ensuring a representative test. Consequently, measured and modeled total collision CPMs from all four major variable themes were selected.

The second step in the case study involved development of macro-level CPMs using Vancouver data only. While adequate urban data existed in the City of Vancouver to develop urban CPMs, no data points were available to develop rural models. Table 7.5 contains a summary of the number of urban and rural zones in the largest GVRD municipalities, with Vancouver having the largest number of urban zones (112) but no rural zones, and Langley having the largest number of rural zones (32) but the fewest urban zones (20). Therefore, while only the eight urban CPMs were used in this study, an additional transferability test was done to gauge whether models developed for the City of Vancouver's 112 urban zones could be transferred not just to Richmond's 40 urban zones, but also to Langley's 20 urban zones. This additional transferability test was intended to provide some indication of the influence of sample size on the associated statistical tests of significance related to transferability results.

Table 7.5. GVRD Data Points Available for CPM Transferability

GVRD Cities    Rural Zones    Urban Zones
Vancouver            0            112
Surrey              22             81
Burnaby              2             54
Richmond             3             40
Langley             32             20

Following the proposed transferability guidelines in Section 5.4.3, the third step in the case study involved calibrating the transferred Vancouver models for Richmond and Langley. To do this calibration, all Vancouver model parameter estimates were retained except the a0 values, and the GLM software was re-run using Richmond and Langley data to obtain new estimates for each CPM's a0 and κ parameters. For this study, the actual GLM process and calibration was done using the OFFSET command in GLIM4 software (NAG, 1996).

The fourth and final step involved statistical verification of each calibrated model's goodness of fit, using z statistics calculated using Equation 2.44, repeated below for reference.

z = \frac{\left| \chi^2_P - E(\chi^2_P) \right|}{\sigma(\chi^2_P)} < 1.00    (2.44)

where:

\text{Pearson } \chi^2_P = \sum_{i=1}^{N} \frac{[y_i - E(\Lambda_i)]^2}{Var(y_i)} = \sum_{i=1}^{N} \frac{[y_i - E(\Lambda_i)]^2}{E(\Lambda_i)[1 + E(\Lambda_i)/\kappa]}    (2.33)

E(\chi^2_P) = N, \qquad \sigma(\chi^2_P) = \sqrt{2N(1 + 3/\kappa) + \sum_{i=1}^{N} \frac{1}{E(\Lambda_i)[1 + E(\Lambda_i)/\kappa]}}    (2.45a, b)

N = the number of data points used to re-calibrate the model.
E(\Lambda_i) and \kappa are all from each re-calibrated CPM.
y_i and Var(y_i) are derived for each individual observation in the new data set.

7.4.3 Results

Table 7.6 contains a summary of the results. All z statistics were zero to two decimal places (z = 0.00), except for the modeled TDM CPM for Richmond, where z = 0.01. These near-zero z statistics suggested that all models were transferred successfully, meeting statistical tests with a 95% level of confidence. A number of observations can be made from these results. First, the observation that all but one of the z statistics were zero to two decimal places was surprising. However, all other goodness of fit measures appeared to echo the z statistic, suggesting that the fits were indeed good.
Second, all of the κ values in the transferred models were lower than in the original Vancouver models, meaning a relatively worse-fitting model. This worse fit was not surprising given the forced nature of the GLM fitting process, wherein the transferred model begins the calibration with all but one of its parameters pre-set at the original model values. Hence, it is not surprising that the Langley models had higher κ values than the Richmond models in all but one case, as the smaller Langley sample required less forcing. Moreover, these higher Langley κ values, coupled with near-zero z values, suggest that smaller sample size may not be a hindrance to model transferability (all sample t-values for the new a0 values were well above the t-test statistic of 1.96).

Table 7.6. Transferability Results - From Vancouver to Richmond, Langley.*

                       κ                     a0                     χ²            (unlabelled)
Group / CPM            Vcr   Rmd   Lgly      Vcr    Rmd    Lgly     Rmd   Lgly    Rmd     Lgly
Exposure
   1 Modeled           2.4   1.8   2.1       1.8    1.2    0.90     43    18      2,393     809
   2 Measured          2.3   1.5   1.5       58.7   33.3   16.8     43    20      2,461     999
Socio-Demographic
   5 Modeled           2.5   1.7   2.1       0.50   0.37   0.31     44    18      3,183     890
   6 Measured          2.6   1.5   1.7       71.6   56.4   25.7     44    21      2,649   1,000
TDM
   9 Modeled**         2.7   1.8   2.2       2.0    1.5    1.3      44    19      2,492     839
  10 Measured          2.6   1.4   1.5       29.3   22.2   15.7     59    16      2,990   1,446
Network
  13 Modeled           3.0   2.2   2.5       0.69   0.77   0.65     38    19      2,095     707
  14 Measured          3.0   2.0   1.8       35.4   28.0   14.7     48    21      2,061     916

* Using only Urban, Total CPMs, Richmond (N = 41, SD = 45), Langley (N = 20, SD = 21).
** All transferred CPM z statistics were 0.00, except for Richmond modeled TDM, with z = 0.01.

Third, the decrease in κ values upon transfer calibration was greater in measured models than in modeled models, with κ values for Richmond and Langley models virtually the same. This may have been indicative of the inherent differences between how the values of the VKT and TLKM exposure variables were generated.
VKT values were estimated from a regional Emme/2 model, which tended to 'smooth' forecasts to meet regional screenline calibrations. Although this smoothing would likely introduce larger errors at the zonal level, it would tend to reduce data differences when transferring within the same geographic region. TLKM values, however, were measured directly, and hence more accurately, on a zone-by-zone basis.

Fourth, the re-estimated a0 values for all but one transferred model were lower than those of the original Vancouver CPMs, with those for the Langley CPMs lower again than those for the Richmond CPMs. This was likely because Langley in general had lower traffic volumes (i.e. exposure) than Richmond and Vancouver, with correspondingly fewer collisions. Given the forced nature of the fit, for the same CPM variable values the lead coefficient also served to force down the predicted collisions in moving from busier Vancouver and Richmond to quieter Langley.

Last, the near-zero z values were lower than those found by Sawalha & Sayed (2005b) when transferring micro-level CPMs between Vancouver and Richmond. Although these improved z values may have been due to differences in data quality between the datasets used in the two studies, these results are encouraging for macro-level CPM transferability. However, as each study used datasets that were subsets of the same time-space region, future research should be done on the transferability of models from Vancouver to a more distant setting (i.e. a geographic location outside the GVRD, and/or a time period other than 1996).

7.4.4 Lessons Learned

From these transferability results, several lessons were learned.

7.4.4.1 Transferability of Macro-Level CPMs

The overall results from testing the proposed guidelines suggested that macro-level CPM transferability was feasible and no more complicated than for micro-level CPMs, subject to data availability.
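The verification step used in this case study, the z statistic of Equation 2.44, is straightforward to compute once a transferred model has been re-calibrated. The following Python sketch is illustrative only. It assumes the negative binomial variance form Var(y) = E(Λ)[1 + E(Λ)/κ], and takes the standard deviation of the Pearson χ² as the square root of 2N(1 + 3/κ) plus the sum of 1/Var(y_i); the function names are hypothetical, and the sample values are made up rather than drawn from GVRD data.

```python
import math


def nb_variance(mu, kappa):
    """Negative binomial variance for a predicted mean mu and shape kappa."""
    return mu * (1.0 + mu / kappa)


def pearson_chi2(observed, predicted, kappa):
    """Pearson chi-square of observed counts against predicted means."""
    return sum((y - mu) ** 2 / nb_variance(mu, kappa)
               for y, mu in zip(observed, predicted))


def z_statistic(observed, predicted, kappa):
    """Standardized distance of the Pearson chi-square from its expectation N."""
    n = len(observed)
    chi2 = pearson_chi2(observed, predicted, kappa)
    sigma = math.sqrt(2.0 * n * (1.0 + 3.0 / kappa)
                      + sum(1.0 / nb_variance(mu, kappa) for mu in predicted))
    return abs(chi2 - n) / sigma


if __name__ == "__main__":
    # HYPOTHETICAL zonal data: observed collisions and re-calibrated predictions.
    observed = [12, 30, 7, 55, 21]
    predicted = [15.0, 26.0, 9.5, 48.0, 24.0]
    kappa = 2.0  # hypothetical shape parameter of the re-calibrated model
    print("Pearson chi-square:", pearson_chi2(observed, predicted, kappa))
    print("z statistic:", z_statistic(observed, predicted, kappa))
```

A z value near zero indicates that the Pearson χ² of the transferred model lies close to its expected value N, which is the pattern reported for the Richmond and Langley models.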
7.4.4.2 Adequate Data Points

One question about the feasibility of macro-level CPM transferability concerned how many data points (say, 15 to 20) are needed for a successful model refit. The successful Langley transferability results using only 20 data points suggest that this was not a limiting factor. Moreover, only 92 rural zones were used in the original model development, yet the rural CPMs still met goodness of fit tests. Additional research would be useful in confirming recommended minimum numbers of data points for model transferability.

7.4.4.3 Goodness of Fit z Statistic

The "0.00" values for all but one z statistic suggested that this good fit may have been due to the fact that the transfer data was a subset of the original model development database. However, it could also have meant that some statistical property of macro-level CPMs was screening the true fit of the transferred models. As this result seemed too good to be true, further research would be advised to confirm that the z statistic is indeed an appropriate goodness of fit measure for macro-level CPM transferability.

7.5 Summary

The research on macro-level CPM development and macro-reactive applications in previous chapters of this thesis was the basis for the development and testing of proposed proactive model-use guidelines, including the use of macro-level CPMs in three road safety planning applications. First, a regional-level safety evaluation of TransLink's 3-Year Transportation Plan was conducted by incorporating the macro-level CPMs into the GVRD Emme/2 regional transportation planning model. Second, two neighbourhood-level planning studies were conducted, one on the safest street pattern in a neighbourhood, and the other on the safest size of a neighbourhood's core. Third, a transferability study was done to calibrate Vancouver macro-level CPMs for use in Richmond and Langley.
The results of these three proactive road safety applications are summarized below, including several lessons to be learned in appropriate use of the CPMs. Further research was also identified that could assist in the standardization of the application of macro-level CPMs, as well as advancing their use such that road safety is systematically included in the transportation planning process.

7.5.1 Regional Planning

The results of the regional planning study were in line with intuitive expectations, suggesting a significant regional reduction in road collisions, and demonstrating that integration of the macro-level CPMs into Emme/2 was feasible, with several lessons learned. First, the congestion variable (VC) significantly improved model fit and prediction results in line with intuitive expectations, suggesting that modeled CPMs developed without a VC variable should not be used. Second, excluding highways in the original CPM development meant that the predictions for regional analysis were not absolute estimates of regional levels of safety, and had to be augmented by an estimate of highway collisions if an absolute collision estimate was needed. Further research is needed to verify the best way to incorporate highway data, until which time safety evaluations should be focused on only relative comparisons between scenarios. Third, it was important to use only the most appropriate models, to minimize data intensiveness. Fourth, thematic mapping using GIS software helped to gauge zones of influence when defining analysis boundaries, as well as gauging variability in the results, but needed to be carefully and consistently formulated with appropriate scale ranges. Fifth, the proposed guidelines on CPM use with Emme/2, as used in this study, utilized a multi-platform approach employing multiple GIS and spreadsheet software tools.
A single integrated module where data assembly could be centralized and CPM results calculated to ensure a standardized and consistent approach to the application of CPMs in the planning process should continue to be pursued in future research. Finally, the proposed guidelines appeared sound, and provided a holistic, standardized process for the application of the CPMs, from data acquisition to interpretation of collision prediction results. With further research and refinement, they should help to increase ease of model use and reliability, thereby helping to enhance the validity of CPM use in the regional planning process.

7.5.2 Neighbourhood Planning

The results of the two neighbourhood planning case studies confirmed the feasibility of the recommended guidelines, with several lessons and areas for further research. First, it appeared that the two conceptual networks, labelled as the modified Dutch SRS and 3-way Offset networks, were safer than the conventional grid and cul-de-sac neighbourhood street patterns, due mostly to the increased use of three-way intersections. Further research on these network types is recommended to verify that they promote sustainable road safety in balance with other community goals such as retrofit cost, mobility, and accessibility. Second, selection of appropriate and adequate numbers of models outweighs concerns over data extraction efforts in small sample neighbourhood planning analyses, where statistically significant and absolute collision estimates are required more often than relative comparisons. Careful consideration of all possible trigger variables at work in neighbourhood planning will help to ensure adequate numbers of appropriate models are selected. Third, the reasonableness of the pseudo-sensitivity results from creation of test networks, using control variables and simultaneous adjustment of multiple trigger variables, should be considered in future safety research.
Fourth, although no optimum core size was found, core sizes in the GVRD range from near zero to near 400 hectares, apparently with increasing safety benefits, and should be the subject of additional research.

7.5.3 Transferability

The results of the transferability case study suggested that the proposed macro-level CPM transferability guidelines were reasonable, and that macro-level CPM transferability was feasible, with several lessons. First, a small number of available data points for model calibration did not have a significant impact on goodness of fit, with as few as 20 points being successfully used. Research is needed to verify and recommend minimum sample sizes for macro-level CPM transferability to rural areas, and to smaller communities. Second, further research should be done to verify that the z statistic is appropriate for macro-level CPMs, as the results were unexpectedly good, with all but one value of z equal to 0.00.

Despite promising initial results in field testing, macro-level CPM development and use is at a very early stage. Therefore, the results of this research should be used with caution, pending additional research.

8. CONCLUSIONS, CONTRIBUTIONS & FUTURE RESEARCH

8.1 Introduction

To discuss conclusions, contributions, and future research, this chapter has been split into three main sections. In Section 8.2, a thesis summary is presented together with the main research conclusions. In Section 8.3, three contributions are highlighted, including justifications on how they add to the current level of knowledge in the field of macro-level CPMs. In Section 8.4, future research topics are recommended, related to model development and use.
8.2 Summary & Conclusions

The main purpose of this thesis was to develop macro-level CPMs, and guidelines for their proactive use by planners and engineers, so that road safety could be explicitly considered and reliably estimated in all stages of the road planning and design process. The motivation for this research arose from the need to reduce the number and severity of road collisions below the levels achieved using the traditional road safety engineering approach. The traditional approach has been to address road safety in reaction to existing collision histories, and has proven to be very successful in improving safety at individual sites. However, the requisite collision histories and often costly safety retrofits associated with this reactive approach continue as an enormous burden on existing communities, and the number and severity of collisions remain at unacceptably high levels. Therefore, to reduce these levels road safety authorities and researchers have begun to pursue more proactive engineering approaches.

The proactive engineering approach to road safety improvement has focused on predicting and improving the safety of planned facilities, to preclude black spots from occurring. However, while reliable empirical tools have been developed to support the reactive engineering approach (e.g. micro-level CPMs), none exist for proactive road safety applications. Some researchers have tried to apply micro-level CPMs to conduct safety planning analyses, but the empirical limitations of micro-level models have precluded success. Therefore, a research gap existed between what was needed and what was available in terms of reliable empirical tools to facilitate the proactive engineering approach to road safety improvement programs. To address this gap in empirical tools for proactive safety applications, research has been focusing on macro-level CPMs.
While early Dutch and North American research efforts have shown promise, the gap has remained as a lack of reliable macro-level CPMs, and of guidelines for their use. Therefore, to fill that gap, three research objectives were identified for this thesis. A summary of each objective follows, together with research approach, results, and main conclusions.

One objective of this research was to develop Macro-Level Collision Prediction Models and test their significance in response to two research problems identified in the literature: 1) a lack of data, and 2) a lack of clear methodology. First, sufficient data of sound quality is a cornerstone of well-fit and reliable statistical models, and helps to ensure that the resulting statistical associations reflect underlying causal mechanisms. However, there are barriers that must be overcome to obtain adequate datasets for model development, including disparate and non-integrated databases due to multi-jurisdictional boundaries, and legal issues that discourage data sharing and central warehousing of road collision databases. To overcome these barriers, reviews of possible data sources and data extraction techniques were conducted with UBC, ICBC, and TransLink transportation engineers. Several data extraction sources were identified, including geo-coded databases at ICBC (1996-1998 collision claims), TransLink (1996 digital road map and exposure data), and the GVRD (1996 land use map), and demographic databases at Census Canada (1996 socio-demographics and mode splits by zone). To extract the data with a manageable level of effort, taking into account research objectives, geographic scope, and community-based focus, the 577 TAZs used in the GVRD's Emme/2 transportation planning model were selected as the data aggregation unit. Aggregation did introduce the risk of aggregation bias, which has been known to screen underlying causal mechanisms.
In order to minimize this risk, data was stratified into the nine collision types shown in Table 3.1, and the sixteen model groupings shown in Table 3.3.

The second model development problem related to lack of a recognized methodology, due to a lack of research on this topic. Only two studies had been found on macro-level CPM development, each based on different model forms and regression methods. Moreover, the results from these two studies included few developed macro-level CPMs, with no demonstration of their use in safety planning applications. It was important to have a clear methodology to develop reliable models using relevant, practical, planning-level descriptors in order to ensure model effectiveness, and to facilitate use by practitioners. Therefore, as an initial step to overcoming this model development problem, an extensive literature review was conducted to find possible variables and best practices for model development and goodness of fit testing. Using a screening framework to select the most appropriate variables out of the 220 possible variables found in the literature, the sixty-three candidate variables shown in Tables 3.1 and 3.2 were identified for testing in model development. After extracting and aggregating the data to valuate these candidate variables, they were integrated into a single dataset, and a GLM regression process was followed for model development. The GLM process followed a forward stepwise procedure assuming an NB error distribution, and using a 95% desired level of confidence to assess model goodness of fit. Model fit was assessed using SD and Pearson χ² statistical measures, and, where deficient, refined with an outlier analysis using CD statistical measures. Following this regression process, the 47 macro-level CPMs presented in Tables 4.1 through 4.6 were successfully developed, including at least one in each of the original stratifications.
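The two goodness of fit measures named above can be illustrated with a short sketch. It uses the standard textbook forms of the Pearson χ² and scaled deviance (SD) for a negative binomial model with variance mu + mu²/kappa. The function names, the shape parameter symbol kappa, and the sample numbers are assumptions made for illustration and are not taken from the thesis. In practice, each statistic would be compared against a χ² critical value at the chosen confidence level, with degrees of freedom equal to the number of observations minus the number of estimated parameters.

```python
import math


def pearson_chi2(observed, predicted, kappa):
    """Pearson chi-square for an NB model with Var(Y) = mu + mu**2 / kappa."""
    return sum(
        (y - mu) ** 2 / (mu + mu ** 2 / kappa)
        for y, mu in zip(observed, predicted)
    )


def scaled_deviance(observed, predicted, kappa):
    """Scaled deviance for an NB model with shape parameter kappa."""
    total = 0.0
    for y, mu in zip(observed, predicted):
        term = -(y + kappa) * math.log((y + kappa) / (mu + kappa))
        if y > 0:  # y * ln(y / mu) tends to 0 as y -> 0
            term += y * math.log(y / mu)
        total += term
    return 2.0 * total


# Hypothetical single-zone example: 4 observed collisions, 2 predicted.
print(pearson_chi2([4], [2.0], kappa=2.0))  # 1.0
print(scaled_deviance([4], [2.0], kappa=2.0))
```

Both statistics are zero when predictions equal observations exactly and grow as the fit worsens, which is why values below the critical χ² value indicate an acceptable fit.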
Each macro-level CPM uses one or more explanatory variables to estimate the three-year expected mean frequency in a neighbourhood (or TAZ) of a particular collision type: Total, Severe, AM Rush Hour, AM/PM Rush Hours, AM/PM-Severe, Non-Rush Hour, Pedestrian, and Bicycle.

Based on these model development results, three main conclusions were drawn. First, the results showed that it was possible to quantify, on a macro-level or neighbourhood scale (i.e. an entire TAZ), a statistically significant association between road collisions (i.e. number, severity, and time periods), and the specific traits of a neighbourhood (i.e. traffic exposure, S-D, TDM, and road network). Second, models were developed to predict collisions in either urban or rural areas, which suggested that practitioners in most urban and rural communities should be able to conduct proactive neighbourhood safety evaluations. Third, nearly half of these new CPMs were developed using only measured data, which suggested that practitioners without access to major transportation planning model resources (e.g. Emme/2) would be able to use the models.

However, while data and methodological problems appeared to have been addressed in successful development and fitting of these 47 macro-level CPMs, additional research was needed to promote proper and consistent use of these resulting models, as empirical tools that planners and engineers could rely on in proactive safety planning. Therefore, a second objective of this research was to develop guidelines for the use of macro-level CPMs. In an initial review of the literature on model use, despite finding studies on model development, none were found on macro-level CPM use in road safety applications. However, practical guidelines for model use were important in order to promote consistency in proactive road safety planning practices.
Moreover, consistent practices would then promote reliable and comparable estimates of the level of road safety, which would also provide valuable quality data for further research and refinement of proactive empirical tools. Therefore, based on a review of methods for micro-level CPM use in reactive safety applications, guidelines were proposed on macro-level CPM use in reactive applications (macro-reactive applications). The proposed guidelines contained recommendations on selection of appropriate models, summarized in Tables 5.1 through 5.5, and recommendations on how to use models to enhance conventional black spot programs. To facilitate macro-reactive analyses, guidelines for estimation of collision modification factors (CMFs) for macro-level remedies were also proposed. Based on the lessons learned in development and testing of the proposed macro-reactive use guidelines, proactive model use guidelines were proposed for regional-level and neighbourhood-level safety planning analyses. In addition to model selection, the proactive use guidelines made recommendations on determining the zones of influence, on data extraction, and on interpretation of results. Finally, to facilitate model use in different space-time regions, guidelines were also proposed for model transferability.

However, without testing the proposed guidelines, their practical applicability could not be verified. Therefore, a third objective of this research was to test the proposed model use guidelines by demonstrating the use of macro-level CPMs in macro-reactive and proactive safety applications through several case studies. The testing of model use guidelines was important in order to demonstrate both their usefulness and that of the macro-level CPMs as effective and reliable empirical tools for planners and engineers in safety applications.
The first tests were presented in Chapter Six, wherein the macro-level CPMs and guidelines were used in two macro-reactive case studies. In the first case study, a macro-level black spot analysis was conducted to identify CPZs in the GVRD, to diagnose zonal safety problems, and to recommend possible remedies, in a manner that paralleled the conventional micro-reactive approach but differed from it significantly, as highlighted in Figure 5.1. In the second macro-reactive case study, a macro-level CMF (macro-CMF) estimate was derived for the area-wide safety benefits of traffic calming.

From these macro-reactive case studies, five main conclusions were drawn. First, the macro-reactive model use guidelines appeared relatively practical and straightforward for use by practitioners. Second, the models appeared capable of enhancing the conventional reactive approach by earlier identification of neighbourhood road safety problems, without the need for all roads and intersections in that neighbourhood to have a documented collision history. Third, the dual indicator diagnosis technique, which augmented the conventional over-represented collision pattern review with a second review including trigger variables, appeared to enhance identification of zonal safety problems and remedies. Moreover, CPMs from each of the four explanatory variable themes appeared to help identify effective remedial safety treatments that might not otherwise have been considered by conventional methods. Fourth, the guidelines on macro-CMF estimation appeared reasonable, with a level of effort and complexity similar to that using micro-level CPMs in the conventional OR method. Moreover, the ability to perform direct measurement of variables for use with measured CPMs was convenient, and removed zonal boundary and size restrictions from the analysis, permitting the analysis of neighbourhood areas different from zonal areas.
Overall, it appeared that macro-level CPMs together with their macro-reactive use guidelines were able to provide another reliable decision-aid tool (i.e. in addition to micro-level CPMs) for planners and engineers involved in reactive road safety improvement programs.

In addition to demonstrating model and guideline use in macro-reactive safety applications, the primary objective of this research was to demonstrate their use in proactive safety applications. Therefore, in Chapter Seven, the models and proactive use guidelines were tested in four case studies. In the first case study, a regional planning application was carried out to gauge the ability of the guidelines and models to predict road safety risk levels as part of a strategic level, regional transportation planning process. In the second and third, two neighbourhood planning applications were carried out to verify the results of earlier Dutch SRS neighbourhood-level safety planning evaluations on road networks and core size. In the fourth case study, the transferability of macro-level CPMs between different time-space regions was tested.

Based on those four proactive case studies, seven conclusions were drawn. First, as the results of the regional planning study were in line with intuitive expectations, it appeared that integration of the macro-level CPMs into a regional transportation planning model such as Emme/2 was feasible. Second, as the congestion variable (VC) significantly improved model fit and prediction results in line with intuitive expectations, it appeared that modeled CPMs developed without a VC variable should not be used. Third, it appeared that thematic mapping using GIS software, as a technique to gauge zones of influence when defining planning analysis boundaries, as well as to gauge variability in the results, might be a useful new safety planning technique.
Fourth, the results of excluding highways in the original CPM development revealed that the predictions for regional analysis were not absolute estimates of regional levels of safety. To be accurate, they needed to be augmented by estimates of highway collisions. However, in neighbourhood planning analyses, the results appeared to indicate that zonal predictions were reasonable estimates of the absolute safety level when calculated for a small number of zones. Fifth, it was important to minimize data intensiveness in regional planning analyses, by selecting only the most appropriate models (i.e. if in doubt, select fewer CPMs). However, in neighbourhood planning analyses, where statistically significant and absolute collision estimates were required more often than relative comparisons, the need to select appropriate and adequate numbers of models appeared to outweigh concerns over data extraction efforts (i.e. if in doubt, select more CPMs). Sixth, the use of more than one macro-level CPM in a neighbourhood safety planning study allowed for pseudo-sensitivity safety analyses of test networks, revealing a technique that appeared to have potential in future safety planning research. Seventh, based on results of the transferability case study, it appeared that macro-level CPM transferability was feasible.

Overall, it was concluded that the proposed guidelines appeared sound as a means to encourage more practitioners to use these CPMs, advancing the standardized consideration of road safety, and enhancing the validity of macro-level CPMs for use in the planning process. However, despite reasonable initial results in this field testing, it was also recognized that macro-level CPM development and use was at a very early stage, and that the results of this research needed to be viewed as initial contributions only, pending additional research.
8.3 Research Contributions

Three main contributions of this research are offered, pertaining to the previously identified knowledge gaps.

8.3.1 Development of macro-level collision prediction models as improved and reliable empirical tools for use by planners and engineers in road safety planning

Whereas micro-level CPMs have proven invaluable when used in a reactive engineering approach to road safety, they have been limited to uses with a micro-level or single-location focus, which has limited RSIP effectiveness on a wider scale. Therefore, researchers and road authorities recognized that a more effective approach to road safety in the long term was to focus on techniques and tools that facilitated a proactive approach, where road safety was assessed and improved at all stages in the road planning process. In initial proactive studies, researchers tried to use micro-level CPMs with planning models (e.g. Emme/2). However, empirical limitations of the micro-level CPMs required prohibitively complex and extensive programming steps for use with planning models and analyses, which did not endear the approach to practitioners. Moreover, the associated study results were illogical and did not fit with test observations. Research to solve these problems led researchers to consider other empirical safety planning tools, including macro-level CPMs. Two studies were found on macro-level CPMs, each using different model development methodologies. Although these two studies provided some useful insights into macro-level CPM development, significant issues remained regarding data, model form, and variables that required further research to resolve.

In view of these research problems, this thesis has proposed that the development of the 47 macro-level CPMs listed in Tables 4.1 to 4.6 constitutes a significant contribution to the current level of knowledge towards addressing the previously stated proactive empirical tool research gap, for three reasons.
First, additional variables have been identified which can act as zonal proxy variables in lieu of exposure data on inner roads, including the original list of 220 possible variables in Tables 3.1 and 3.2 that was narrowed to the 63 candidate variables for testing in this research. Second, for each of the 63 candidate variables identified, an extensive data extraction process was undertaken to maximize data quality and model accuracy, including extensive stratification and use of geo-coded data. In addition to the sixteen stratifications listed in Table 3.3, the collision data was also stratified, making a total of over 200 possible model types that were pursued. Using geo-coded collision claims data from ICBC to develop the new CPMs was considered a great advance in overcoming many of the traditional data problems associated with local municipal collision databases. This improved data quality benefits other metropolitan areas in providing improved CPMs with enhanced transferability to and comparison with the many other jurisdictions across North America that have also developed central collision databases. Third, using the improved datasets, and following GLM processes recommended in the literature, many new macro-level CPMs were developed, with many enhancements beyond those in previous studies. In addition to revealing nineteen more explanatory variables associated with collisions, four were quantified with inverse associations to collision frequency, including: family size (FS), core size (CORE), three-way intersections (I3WP), and local-road-lane-kilometres (LLKP). Moreover, macro-level CPMs for use in rural and urban TAZs, and with measured and/or modeled data were developed, which further expand their safety planning applicability.
For these reasons, the development of these forty-seven macro-level CPMs as part of this research has been proposed as being a significant contribution towards addressing the previously identified research gap, that of providing reliable empirical tools for use in road safety planning. The next step towards addressing the identified research gap was to develop guidelines for model use.

8.3.2 Development of Recommended Guidelines for the Use of Macro-Level Collision Prediction Models in Road Safety Planning, in ways which complement and enhance traditional road safety improvement programs

As noted previously, no guidelines had been found in the literature regarding how to use macro-level CPMs in road safety planning applications. However, guidelines are important to promote standardization of data extraction, model development, result interpretation, empirical comparison, and technology transfer, so that researchers and practitioners can continue to refine empirical tools towards more effective road safety improvement programs. In view of the limited research on macro-level CPM use and the lack of model-use guidelines, this thesis has proposed that the macro-level CPM model use guidelines in Chapter Five constitute a significant contribution to the current level of macro-level CPM knowledge. These proposed model use guidelines include both macro-reactive and proactive uses. Regarding macro-reactive uses, several modifications to conventional micro-reactive techniques have been made, as highlighted in Figure 5.1, related to: model selection, zonal unit of analysis, a new ranking score, a dual indicator diagnostic process, area-wide safety strategies, multi-thematic safety remedies, and macro-CMF estimation. Regarding proactive model use guidelines, three reasons are offered to justify the contribution's significance. First, the development of documented proactive model use guidelines was in itself new, as none had been found previously.
Second, guidelines were developed for regional and neighbourhood safety planning, with documented differences related to zones of influence, data extraction, and interpretation of results. Third, to address issues related to macro-level model transferability, this research also developed recommended guidelines for calibration of transferred macro-level CPMs for use in other time-space regions, including a slightly refined process to that previously recommended in conventional (i.e. micro-level CPM) transferability guidelines. For these reasons, the development of these model-use guidelines has been proposed as being a significant contribution. The next step towards addressing the identified research gap was to test these guidelines, together with the newly developed models, in several safety applications, in order to demonstrate their validity as practical empirical tools for use by practitioners.

8.3.3 Demonstration of the Validity of Macro-Level Collision Prediction Models by Testing of Recommended Model-Use Guidelines in Macro-Reactive and Proactive Safety Applications

To conduct an effective demonstration of the newly developed macro-level CPMs and their proposed model-use guidelines, it was necessary to apply them in actual road safety contexts, ideally those involving road safety practitioners familiar with conventional empirical tools and methods. In view of the limited research on demonstrating macro-level CPM use, this thesis has proposed that the results of conducting safety application case studies involving guidelines in Chapter Five, and macro-level CPMs in Chapters Six and Seven constitute a significant contribution, for several reasons. First, this research conducted six case studies that covered most of the safety application areas for which model use guidelines were developed, providing a comprehensive test of validity.
Second, each case study used actual data acquired from ICBC, TransLink, Census Canada, and GVRD agencies, to simulate as closely as possible actual conditions. Third, the results of these six case studies demonstrated that the models and proposed model use guidelines appeared to be practical for use by practitioners. Fourth, each particular case study provided significant results, including: early warning of CPZs, innovative multi-thematic safety countermeasure strategies, recognition of the congestion variable (VC) importance, and safer neighbourhood road networks. For these reasons, these case studies have been proposed as a significant contribution. However, as with any road safety research, the intent of this thesis was to provide sound research contributions on which to base future research efforts, in order to further reduce the social and economic burdens associated with road collisions.

8.4 Recommendations for Future Research

In addition to proposed research contributions, several topics have been recommended for future road safety research, related to model development and model use.

8.4.1 Macro-Level Collision Prediction Model Development

The many newly developed models appeared to have great potential to be applied in a wide variety of safety evaluations and planning contexts. However, while this research may have confirmed the existence of a statistically significant mathematical relationship between the level of road safety and certain community descriptors, several ways have been identified to improve these newly developed empirical tools. First, while the model goodness of fit was reasonable, it could be improved. It would be reasonable to assume that there were systematic as well as random effects affecting model goodness of fit. While the contribution to model fit by random effects cannot usually be accounted for in models, systematic effects can be.
Systematic effects can be accounted for in models by considering contributions from factors related to the quality of data, and the omission of explanatory variables.

Regarding data quality, in model development it is important to systematically extract data and verify its quality, in order to maximize consistency in valuation of variables used in the multivariate regression process. The extraction and aggregation processes used in this research for both collision and explanatory data used several methods, including manual, modelled (i.e. Emme/2), and automated (i.e. GIS) techniques. Where no automated techniques were available, manual extraction was done, including valuation for core size (CORE) and shortcut capacity (SCC, SCVC) variables, and to determine which zones were predominantly urban or rural. This manual extraction was a very time consuming and somewhat subjective process that could be shortened and improved by automation if GIS extraction techniques could be developed. Moreover, although the modeled exposure data was acquired in a semi-automated manner using Emme/2 and GIS software, the process was complex and multi-platform, involving several software packages and manual interventions to integrate the dataset. Ideally, to maximize data quality, all data extraction would involve automated techniques, including those currently done by manual and modelled techniques. Automation would reduce at least one avenue for error in that it would minimize manual intervention in the data extraction process. Therefore, it is recommended that one of the future research focuses be on the identification and refinement of automated methods for the data extraction process.

Regarding omitted variables, based on a review of the number of explanatory variables used (29) versus the over 200 possible identified, there is a high likelihood that this systematic factor did influence the model fit results.
With further efforts to extract and refine the data for these additional variables, including proper stratification, several additional causal associations may be revealed. From the literature and the results of this research, four variables were identified as having promise for future research, including:
o Average home-work travel time (HWT) and trip distance (HWD) were suggested to be inversely associated with collisions, and had been collected in the 1996 Census; however, the data had not been released in time for this research.
o The number of bicycle commuters (BIKE) was found in this research to be associated with increased bicycle collisions, but no association with vehicular collisions has yet been revealed. Additional effort to extract data and develop additional CPMs that include BIKE would verify whether this result also holds in urban areas, and whether there is an association between zonal bicycle use and total collisions (i.e. including vehicular). This topic is already being debated widely in sustainable transportation research programs (e.g. Kyoto GHG reduction and TDM programs).
o Network variables related to the number and degree of horizontal and vertical road curves in each zone were explored in this research, and initially found to be significant. However, their valuation required extensive manual data extraction and in-situ observations. As the models were refined, curve-related variables were among the last variables dropped to meet goodness of fit tests. Further research could be pursued to refine and automate road curve data extraction, and to confirm whether a CPM using road curve data is possible.
o The geographic scope and scale of TAZs has been identified as having the potential to significantly influence collision models.
Specifically, a measure of the efficiency of community development patterns is needed to assess how significant this collision association is, and whether there are more or less efficient community development patterns with regard to road collisions.

The benefits of improved data quality and added variables include further revealing of underlying causal mechanisms, which could then be applied in more refined road safety applications.

8.4.2 Macro-Reactive Applications

Although initial results in macro-reactive applications suggested that the guidelines and the models had potential to complement and enhance conventional black spot programs, this research was limited by two factors: a lack of 1) actual assessments, and 2) macro-level CMFs. First, although the limited actual assessments done in this research did confirm some potential value for macro-reactive uses, future research is recommended to expand these actual assessments to more practitioners inside and outside the GVRD. The focus of these assessments could be on the practical value of using the models, and on the clarity of the guidelines, with suggestions for their improvement, including other possible applications and variables. Second, to augment model use, these actual assessments could also identify opportunities to pursue research on additional macro-CMF estimations, perhaps starting with further refinement of the traffic calming CMF estimates calculated in this thesis.

8.4.3 Proactive Applications

Although the results of initial case studies appeared to confirm the potential of the developed models and proposed guidelines as reliable empirical tools for road safety planning, three main safety planning areas were identified for further research, related to regional planning, neighbourhood planning, and vulnerable road users.
Regarding regional safety planning, in addition to pursuing a simpler modelled exposure data extraction process as was suggested previously, it is recommended that future research also focus on incorporation of highway collisions in regional planning estimates. To estimate all regional safety impacts in absolute terms, additional collision estimates for regional highways need to be added into the analyses. One approach to do this would be to simply augment the zonal macro-level CPM results with micro-level CPM predictions for each highway segment in the zone. Alternatively, calibration of the macro-level CPMs could be attempted including data from the highway links.

Regarding neighbourhood safety planning, research efforts should be focused on verifying the results from this research found in the road network safety study. Although those results suggested that a neighbourhood using a 3-way offset road network could have significantly fewer collisions than neighbourhoods built with conventional road network patterns, additional research would help to verify not only these results, but also whether the 3-way offset network is sustainable with regard to other community objectives such as mobility, accessibility, and liveability.

Regarding vulnerable road user levels (i.e. the number of pedestrians and bicyclists), recent research not associated with this thesis using subjective methods found an inverse association between auto insurance claims and bicycle and walking trip levels (Litman, 2002). That is, for each 1.0% that walking and bicycle mode splits increased, auto collisions decreased by 1.2%. While that research and others have suggested an inverse association in support of sustainable transportation initiatives, results of the models and significant variables developed in this research suggested that increased bicycle use was associated with increased auto-bicycle collisions.
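As a rough numerical illustration of two ideas in this subsection, the sketch below (in Python, and not part of the thesis) first adds micro-level highway segment predictions to a zonal macro-level prediction, and then applies the Litman (2002) figure. It assumes that the Litman association is linear and that the 1.0% refers to percentage points of mode split; the function names and all input values are hypothetical and are not results from this research.

```python
def combined_zone_estimate(macro_prediction, highway_segment_predictions):
    """Zonal macro-level CPM prediction plus micro-level CPM predictions
    for each highway segment in the zone (simple additive approach)."""
    return macro_prediction + sum(highway_segment_predictions)


def projected_auto_collisions(baseline, mode_split_increase_points,
                              pct_decrease_per_point=1.2):
    """Linear reading of the Litman (2002) association: each 1.0 point
    increase in walking and bicycle mode split is taken to lower auto
    collisions by 1.2%. Linearity is an assumption made for this sketch."""
    fraction = pct_decrease_per_point / 100.0 * mode_split_increase_points
    return baseline * (1.0 - fraction)


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    print(combined_zone_estimate(40.0, [3.5, 1.5]))
    print(round(projected_auto_collisions(1000.0, 2.0), 1))
```

The additive form corresponds to the first approach described above; the alternative of recalibrating the macro-level CPMs with highway link data would require the original datasets and is not sketched here.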
One reason for the contradictory results may have been data-related, as the data used to valuate bicycle use (BIKE) and walking (WALK) variables in this research was based on census surveys, which focused on commute modes to work only (i.e. only 25% of the total daily trips made by residents in a community). To resolve this apparent contradiction, further research could focus on quantifying an association between total (vehicular) collisions and vulnerable road users. One approach could involve improved data extraction methods, including different data stratifications. Another approach might look at a variation in model form, as was done, for example, in developing the lone bicycle CPM in this research, wherein the lead exposure variable was dropped.

With further research, macro-level CPMs and their guidelines for use can be further refined, leading to enhanced proactive road safety planning, and to continued reductions in the number and severity of road collisions, for the long term benefit of all communities.

REFERENCES

American Association of State Highway and Transportation Officials (AASHTO, 2001). "A policy on geometric design of highways and streets", Fourth Edition, AASHTO, Washington, D.C., USA.
Abdelwahab, W., and Sayed, T. (1993). "Some observations on the use of accident rates in the identification of accident prone locations", Internal Report, Highway Safety Branch, Ministry of Transportation and Highways, British Columbia, Canada.
ARRB Transport Research (1999). "Evaluation of Road Safety Audits, Progress Report 1: Literature Review", Draft report, unpublished, ARRB, Australia.
Austroads (2001). "Road Safety Audit", 2nd Edition, Sydney, Australia.
Baker, R. J., and Nelder, J.A. (1978). "Generalized Linear Interactive Modeling", Release 3.77 Manual, Oxford: Royal Statistical Society - GLIM Manual.
Benjamin, Jack R., and Cornell, C. Allin (1970). "Probability, Statistics, and Decision for Civil Engineers", McGraw-Hill Book Company, New York, Chapters 4, 5, pp. 370-594.
Bonneson, J.A., and McCoy, P.T. (1993). "Estimation of safety at two-way stop-controlled intersections on rural highways", Transportation Research Record 1401, Journal of the Transportation Research Board (TRB), TRB, National Research Council, Washington, D.C., pp. 83-89.
Box, Paul C. (1976). "Accident Pattern Evaluation And Countermeasures", Traffic Engineering, 46(8), pp. 38-43.
Brown, Ivan (1992). "Conflicts Between Mobility, Safety and the Environmental Preservation Expressed as a Hierarchy of Social Dilemmas", IATSS Research, 16(2), pp. 124-8.
Buchanan, Colin (1963). "Traffic in Towns; a study of the long term problems of traffic in urban areas", Reports of the steering group and working group appointed by the Minister of Transport, Ministry of Transport, London, Great Britain.
Bureau of Transportation & Communications Economics (BTCE, 1992). "Report 90, Appendix VII: Accident Migration", Commonwealth of Australia, Canberra, Australia, pp. 251-264.
Canadian Council of Motor Transport Administrators (CCMTA, 1998). "Road Safety Vision: Making Canada's Roads the Safest in the World", Annual Report, Transport Canada, Ottawa, Ontario.
Centre for Sustainable Transportation (CST, 1998). "Sustainable Transportation: Reflections on the movement of people and of freight, with special attention to the role of the private automobile", Ottawa, Canada.
Census Canada (1996). Statistics from the 1996 Census, Government of Canada, Ottawa, Canada.
Centre for Research and Contract Standardization in Civil Engineering (CROW, 1997). "Functional requirements for road categorisation", Functionele eisen voor de categorising van wegen, Ede, Netherlands.
Centre for Research and Contract Standardization in Civil Engineering (CROW, 1998). "Record 15: Recommendations for traffic provisions in built-up areas", Ede, Netherlands.
Centre for Research and Contract Standardization in Civil Engineering (CROW, 2000). "Sustainable Safety in Built-Up Areas", Duurzaam veilige inrichting van wegen binnen de bebouwde kom een gedachtevorming, Ede, Netherlands.
Chatterjee, Arun, Everett, Jerry, Reiff, Bud, Schwetz, Thomas, Seaver, William, and Wegmann, Frederick (2003). "Tools for Assessing Safety Impact of Long-Range Transportation Plans in Urban Areas", Report prepared for USDOT by Centre for Transportation Research, University of Tennessee, Knoxville, US.
Cox, W. (2003). "How higher density makes traffic worse", The Public Purpose, No. 57, May 2003, www.publicpurpose.com/pp57-density.htm, November 10, 2003.
Cross, Frank B. (1998). "Facts and values in risk assessment", Reliability Engineering and System Safety, Volume 59, Elsevier, Northern Ireland, pp. 27-40.
Curtis, C., and Aulabaugh, B. (2001). "Does zero road toll make un-liveable neighbourhoods?", Australasian Transport Research Forum, 24th, Tasmania Department of Infrastructure, Energy and Resources, Hobart, Tasmania, Australia, 17 pages.
Davis, Gary A. (2004). "Possible aggregation biases in road safety research and a mechanism approach to accident modeling", Accident Analysis & Prevention, Vol. 36, Elsevier Ltd, Amsterdam, The Netherlands, pp. 1119-1127.
de Leur, Paul (1998). "Managing Places: Engineering Safety", Recovery, 9(1), Insurance Corporation of British Columbia, North Vancouver, Canada, pp. 8-9.
de Leur, Paul (2001). "Improved Approaches to Manage Road Safety Infrastructure", Thesis submitted for Degree of Doctor of Philosophy, Faculty of Graduate Studies, Department of Civil Engineering, University of British Columbia, Vancouver, Canada.
de Leur, Paul, and Sayed, Tarek (2001). "The Development of an Auto Insurance Claim Prediction Model for Road Safety Evaluation in British Columbia", Proceedings of The Fourth International Conference on Accident Investigation, Reconstruction, Interpretation and the Law, Navin, F.P.D. editor, August 13-16, University of British Columbia, Vancouver, B.C.
de Leur, Paul, and Sayed, Tarek (2002). "Development of a Road Safety Risk Index", Transportation Research Record 1784, Washington, D.C., pp. 33-42.
de Leur, Paul, and Sayed, Tarek (2003). "A framework to proactively consider road safety within the road planning process", Canadian Journal of Civil Engineering, Volume 30(4), National Research Council, Canada, pp. 711-719.
Depue, Leanna, Zogby, John J., Knipling, Ron R., and Werner, Thomas C. (2000). "Transportation Safety Issues", TRB Committee A3B01: Transportation Safety Management, Millennium Paper, TRB, Washington, DC.
Dobson, Annette J. (1990). "An Introduction to Generalized Linear Models", 2nd Edition, Chapman & Hall/CRC, Washington, D.C.
Elvik, Rune (2001). "Improving road safety in Norway and Sweden: analysing the efficiency of policy priorities", tec, Road Safety, January, pp. 9-16.
Elvik, Rune (2001). "Area-Wide Urban Traffic Calming Schemes: A Meta-Analysis of Safety Effects", Accident Analysis & Prevention, Vol. 33 (www.elsevier.com/locate/aap), Elsevier Ltd, Amsterdam, The Netherlands, pp. 327-336.
Evans, Leonard (1999). "Transportation Safety", Handbook of Transportation Science, Chapter 4, Randolph W. Hall (ed.), Kluwer, pages 63-108.
Ewing, Reid, and Cervero, Robert (2001). "Travel and the Built Environment: A Synthesis", Transportation Research Record 1780, Washington, D.C., pp. 87-114.
Ewing, Reid, Pendall, Rolf, and Chen, Don (2003). "Measuring Urban Sprawl and Its Transportation Impacts", Transportation Research Record 1831, Journal of the Transportation Research Board, TRB, Washington, D.C., pp. 175-183.
Federal Highway Administration (FHWA) (1981). "Highway Safety Engineering Studies Procedural Guide", U.S. Department of Transportation, Washington, D.C., pp. 306-326.
Fotheringham, A.S., and Wegner, M. (2000). "Spatial Models and GIS: New Potential and New Models", Taylor & Francis Inc., London, U.K.
Freedman, D. (1997). "From association to causation via regression", In: McKim, V., Turner, S. (Eds.), Causality in Crisis, University of Notre Dame Press, Notre Dame, IN, pp. 113-162.
Fricker, Jon D., and Whitford, Robert K. (2004). "Fundamentals of Transportation Engineering: A Multimodal Systems Approach", Prentice Hall, New Jersey, USA, pages 305-368.
Fridstrom, Lasse, Ifver, Jan, Ingebrigtsen, Siv, Kulmala, Risto, and Thomsen, Lars K. (1995). "Measuring the Contribution of Randomness, Exposure, Weather, and Daylight to the Variation in Road Accident Counts", Accident Analysis & Prevention, Vol. 27 (1), Elsevier Ltd, Amsterdam, The Netherlands, pp. 1-20.
Gaspers, Karen (2004). "On the road to danger: WHO call traffic injuries a 'global public health problem'", Safety + Health, June, National Safety Council, Itasca, Illinois, USA.
Geddes, Erica, Hemsing, Suzanne, Locher, Brian, and Zein, Sany (1996). "Safety Benefits of Traffic Calming", Report prepared by Hamilton Associates for the Insurance Corporation of British Columbia, Hamilton Associates, ICBC, Vancouver, BC, Canada.
Greater Vancouver Regional District (GVRD, 1993). "TRANSPORT 2021 Report: A Long-Range Transportation Plan for Greater Vancouver", Communications & Education Department, Burnaby, Canada.
Greater Vancouver Regional District (GVRD, 1998). "GVRD Emme/2 Transportation Planning Manual", Strategic Planning Department, Burnaby, Canada.
Greater Vancouver Regional District (GVRD, 2002). "Geo-coded land use files", Strategic Planning, Burnaby, Canada.
Greibe, Poul (2003). "Accident prediction models for urban roads", Accident Analysis & Prevention, Vol. 35, Elsevier, Amsterdam, Netherlands, pp. 273-285.
Hadayeghi, Alireza, Shalaby, Amer S., and Persaud, Bhagwant N. (2003). "Macro-Level Accident Prediction Models for Evaluating the Safety of Urban Transportation Systems", Presented at Transportation Research Board Annual Meeting, January, TRB, Washington, D.C.
Hadayeghi, Alireza (2002). "Accident Prediction Models for Safety Evaluation of Urban Transportation Network", Masters Thesis, Graduate Department of Civil Engineering, University of Toronto, Toronto, Canada.
Hauer, Ezra (1982). "Traffic Conflicts and Exposure", Accident Analysis & Prevention, 14, Elsevier Ltd, Amsterdam, The Netherlands, pp. 352-362.
Hauer, E., Ng, J.C.N., and Lovell, J. (1988). "Estimation of Safety at Signalized Intersections", Transportation Research Record 1185, Journal of the TRB, TRB, Washington, D.C., pp. 48-61.
Hauer, Ezra (1992). "Empirical Bayes Approach to the Estimation of 'Unsafety': The Multivariate Regression Method", Accident Analysis & Prevention, 24(5), Elsevier Ltd, Amsterdam, The Netherlands, pp. 457-477.
Hauer, E. (1995). "On exposure and accident rate", Traffic Engineering and Control, 36(3), pp. 134-138.
Hauer, Ezra, and Persaud, B. (1996). "Safety Analysis of Roadway Geometric and Ancillary Features", Research Report prepared for the Transportation Association of Canada (TAC), Ottawa.
Hauer, Ezra (1997). "Observational Before-After Studies in Road Safety - Estimating the Effect of Highway and Traffic Engineering Measures on Road Safety", Elsevier Science Incorporated, Tarrytown, NY, USA.
Hauer, E., Harwood, D.W., Council, F.M., and Griffith, M.S. (2002a). "Estimating Safety by the Empirical Bayes Method: A Tutorial", Transportation Research Record 1784, Journal of the TRB, TRB, Washington, D.C., pp. 126-131.
Hauer, E., Kononov, J., Allery, B., and Griffith, M.S. (2002b). "Screening the Road Network for Sites with Promise", Transportation Research Record 1784, Journal of the TRB, TRB, Washington, D.C., pp. 27-32.
Herbel, S.B. (2004). "Planning It Safe to Prevent Traffic Deaths and Injury", North Jersey Planning Authority Inc., (http://www.nitpa.org/planning/rtp2030/safety study/safety study documents/safety_article.pdf), pp. 7-27.
Higle, J.L., and Witkowski, J.M. (1988). "Bayesian Identification of Hazardous Locations", Transportation Research Record 1185, Journal of the TRB, TRB, Washington, D.C., pp. 24-36.
Hinde, J. (1996). "Negative Binomial Macro Description", Version 1.1, GLIM 4. MSOR Department, University of Exeter, UK.
Hirst, W.M., Mountain, L.J., and Maher, M.J. (2004). "Sources of error in road safety scheme evaluation: a method to deal with outdated accident prediction models", Accident Analysis and Prevention, 36, Elsevier Ltd, Amsterdam, The Netherlands, pp. 717-727.
Ho, Geoffrey, and Guarnaschelli, Marco (1998). "Developing a Road Safety Module for the Regional Transportation Model, Technical Memorandum One: Framework", ICBC, December, Vancouver, Canada.
Ho, Geoffrey, Nepomuceno, John, and Zein, S.R. (1998). "Introducing Road Safety Audits and Design Safety Reviews", Draft Discussion Paper prepared for ICBC, August, ICBC, Vancouver, Canada.
INRO Consultants Inc. (INRO) (2003). "Emme/2 User's Guide Manual", Release 9, July, Montreal, Canada.
Insurance Corporation of British Columbia (ICBC, 2003). "Crash Claim Statistics for 1996, 1997, 1998: ArcGIS file format", ICBC Road Safety Program, North Vancouver, Canada.
Insurance Corporation of British Columbia (2004). "Traffic Collision Statistics: Police-attended Injury and Fatal Collisions", British Columbia, Canada.
Johnson, R.A. (2005). "Miller & Freund's Probability & Statistics for Engineers", 7th edition, Pearson Prentice Hall, Upper Saddle River, NJ, USA.
Jordan, Phillip (2001). "Road Safety Audit - Low Costs, Big Benefits", Editor F.P.D. Navin, Proceedings from the Fourth International Conference on Accident Investigation, Reconstruction, Interpretation and the Law, hosted by the Department of Civil Engineering, University of British Columbia, Vancouver, Canada, pp. 179-183.
Khisty, C. Jotin, and Lall, B. Kent (1998). "Transportation Engineering: An Introduction", 2nd Edition, Prentice Hall, Upper Saddle River, NJ, USA, pp. 664-696.
Kim, Karl, and Yamashita, Eric (2002). "Motor Vehicle Crashes and Land Use: Empirical Analysis from Hawaii", Transportation Research Record 1784, Journal of the TRB, TRB, Washington, D.C., pp. 73-79.
Kmet, Leanne, Brasher, Penny, and Macarthur, Colin (2003). "A small area study of motor vehicle fatalities in Alberta, Canada", Accident Analysis & Prevention, Volume 35, Elsevier Ltd, Amsterdam, The Netherlands, pp. 177-182.
Koltzow, K. (1993). "Road Safety Rhetoric Versus Road Safety Politics", Accident Analysis and Prevention, 25(6), Elsevier Ltd, Amsterdam, The Netherlands, pp. 647-657.
Koornstra, Matthijs J. (1992). "The Evolution of Road Safety and Mobility", IATSS Research, 16(2), pp. 129-148.
Krishnamurthy, Sriram, and Kockelman, Kara Maria (2003). "Propagation of Uncertainty in Transportation Land Use Models", Transportation Research Record 1831, Journal of the TRB, TRB, Washington, D.C., pp. 219-229.
Kulmala, R., and Roine, M. (1988). "Accident prediction models for two-lane roads in Finland", Proceedings from a Conference on traffic safety theory and research methods, April 1988, hosted by SWOV, Session 4: Statistical analysis and models. SWOV, Amsterdam, Netherlands, pp. 89-103.
Kulmala, R. (1995). "Safety at rural three- and four-arm junctions: Development and application of accident prediction models", Dissertation for the degree of Doctor of Technology, Technical Research Centre of Finland, VTT Publication 233, Espoo, Finland.
Ladron de Guevara, Felipe, Washington, Simon P., and Oh, Jutaek (2004). "Forecasting Crashes at the Planning Level: A Simultaneous Negative Binomial Crash Model Applied in Tucson, Arizona", Presented at Transportation Research Board 2004 Annual Meeting, January, TRB, Washington, D.C.
LaScala, Elizabeth A., Gruenewald, Paul J., and Johnson, Fred W. (2003). "An ecological study of the locations of schools and child pedestrian injury collisions", Accident Analysis & Prevention, Volume 36, Elsevier Ltd, Amsterdam, The Netherlands, pp. 569-576.
Lawless, J.F. (1987). "Negative binomial and Poisson regression", The Canadian Journal of Statistics, 15(3), Statistical Society of Canada, Edmonton, Canada, pp. 209-225.
Levine, N., Kim, K.E., and Nitz, L.H. (1995). "Spatial Analysis of Honolulu Motor Vehicle Crashes: II. Zonal Generators", Accident Analysis & Prevention, 27(5), Elsevier Ltd, Amsterdam, The Netherlands, pp. 675-685.
Lim, Clark (2004). "Application of the Lovegrove/Sayed Macro Collision Prediction Models", Unpublished term paper prepared for UBC Civil Engineering 582 course, UBC Civil Engineering, Vancouver, Canada.
Lin, Fred C., and Navin, Francis P. D. (1999). "Errors in Transportation Planning", Unpublished research paper, Civil Engineering Department, University of British Columbia, Vancouver, Canada.
Litman, Todd (2002). "Transportation Demand Management As A Traffic Safety Strategy", unpublished research paper, Victoria Transport Policy Institute, Victoria, Canada.
Lord, Dominique, and Persaud, Bhagwant (2000). "Accident Prediction Models With and Without Trend", Transportation Research Record 1717, Journal of the TRB, TRB, Washington, D.C., pp. 102-108.
Lord, Dominique, and Persaud, Bhagwant N. (2004). "Estimating the safety performance of urban road transportation networks", Accident Analysis & Prevention, 36, Elsevier Ltd, Amsterdam, The Netherlands, pp. 609-620.
Luk, James, Rosalion, Natalia, Brindle, Ray, and Chapman, Raeburn (1998). "Reducing road demand by land-use changes, public transport improvements and TDM measures - a review", ARR 313, January, ARRB Transport Research Ltd, Vermont South, Victoria, Australia.
Mackay, Murray (1993). "The Safety Dimension of the Mobility, Environment, Safety Triangle", Proceedings from a Conference on Safety, Mobility and the Environment: Striking the Balance, hosted by the Parliamentary Advisory Council for Transport Safety, U.K., pp. 25-37.
Miaou, S., and Lum, H. (1993). "Modelling vehicle accident and highway geometric design relationships", Accident Analysis & Prevention, 25(6), Elsevier Ltd, Amsterdam, The Netherlands, pp. 689-709.
Miaou, S.P. (1996). "Measuring the Goodness-of-Fit of Accident Prediction Models", Report No. FHWA-RD-96-040, Federal Highway Administration, McLean, VA.
McCullagh, P., and Nelder, J.A. (1989). "Generalized Linear Models", Chapman and Hall, New York.
McGuigan, D.R.D. (1982). "Non-junction accident rates and their use in 'black-spot' identification", Traffic Engineering & Control, 23(February), pp. 60-65.
Mercer, W. (1995). "Traffic Crash Frequencies and Costs: British Columbia, 1995", submitted to the Value of Life Committee, B.C. Ministry of Transportation & Highways, Victoria, British Columbia, Canada.
Miller, Ted R. (1988). "Benefit-Cost Analysis: Past and Future Directions", Proceedings of a Conference on Highway Safety: At the Crossroads. Sponsored by the Highway Division of the American Society of Civil Engineering, Texas, March.
Miller, Ted R. (1992). "Crash Costs for British Columbia", Economic Analysis Working Paper Prepared for Planning Services Branch, B.C. Ministry of Transportation & Highways, February, Victoria, Canada.
Mountain, L., Maher, M., and Bachir, F. (1998). "The Influence of Trend on Estimates of Accidents at Junctions", Accident Analysis and Prevention, 30(5), Elsevier Ltd, Amsterdam, The Netherlands, pp. 641-649.
National Association of Australian State Road Authorities (NAASRA) (1988). "Guide to traffic engineering practise", Part 4: Road Crashes, pp. 23-33.
Navin, F.P.D., Ho, E., and Johnson, M. (1999). "A Model for Road Safety Planning: The Theory", Department of Civil Engineering, University of British Columbia, Vancouver, Canada, pp. 1-16.
Navin, F.P.D. (1999). "Model for Road Safety Planning: Theory and Policy Example", Transportation Research Record 1695, Journal of the TRB, TRB, Washington, D.C., pp. 49-54.
Nicholson, A.J. (1980). "Identification of hazardous locations", PRU Newsletter No. 66, Ministry of Transport, New Zealand.
Nijkamp, Peter (1994). "Roads Toward Environmentally Sustainable Transport", Transportation Research - A, 28 (4), Elsevier, Amsterdam, Netherlands, pp. 261-271.
Noland, Robert B. (2002). "Traffic Fatalities and Injuries: The Effect of Changes in Infrastructure and Other Trends", Dept. of Civil & Environmental Engineering, Imperial College of Science, Technology & Medicine, London, U.K.
Norden, Monroe, Orlansky, Jesse, and Jacobs, Herbert (1956). "Application of Statistical Quality-Control Techniques to Analysis of Highway-Accident Data", Highway Research Board, Bulletin 117, National Research Council, pp. 17-31.
Numerical Algorithms Group (NAG) (1994). "The GLIM System, Release 4 Manual", Royal Statistical Society, Oxford, Great Britain.
Openshaw, S. (1984). "The Modifiable Areal Unit Problem", GeoBooks, Norwich, U.K.
Oppe, S. (1982). "Detection and analysis of black spots with even small accident figures", SWOV, Leidschendam, The Netherlands.
Oppe, S. (1992). "A comparison of some statistical techniques for road accident analysis", Accident Analysis and Prevention, 24(4), Elsevier Ltd, Amsterdam, Netherlands, pp. 397-423.
Persaud, B.N., Lord, D., and Palmisano, J. (2002). "Calibration and Transferability of Accident Prediction Models for Urban Intersections", Transportation Research Record 1784, Journal of the TRB, TRB, Washington, D.C., pp. 57-64.
Petch, R.O., and Henson, R.R. (2000). "Child road safety in the urban environment", Journal of Transport Geography, Volume 8, Pergamon, pp. 197-211.
Poppe, F. (1995). "Risk Figures in the Traffic and Transport Evaluation Module (EVV): A Contribution to the Definition Study Traffic Safety in EVV", SWOV Report No. R-95-21, Leidschendam, Netherlands.
Poppe, F. (1997a). "Traffic Models: Inner Areas and Road Dangers", SWOV Report No. R-97-10, Leidschendam, Netherlands.
Poppe, Frank (1997b). "Cost-benefit analysis of a sustainably safe road traffic system", SWOV Report D-97-17, Leidschendam, Netherlands.
Poppe, Frank (1997c). "'Sustainably safe' traffic system and accessibility: a pilot project for the Central Netherlands", SWOV Report R-97-40, Leidschendam, Netherlands.
Poppe, Frank, and Galjaard, Robert (1997). "A Sustainably Safe Network Structure and Accessibility", Een 'duurzaam-veilige' netwerkstructuur en bereikbaarheid, Colloquium Vervoersplanologisch Speurwerk, SWOV, Rotterdam, DEEL III 1997, Netherlands.
Qin, Xiao, Ivan, John N., and Ravishanker, Nalini (2004). "Selecting exposure measures in crash rate prediction for two-lane highway segments", Accident Analysis & Prevention 36, Elsevier Science Ltd, Amsterdam, Netherlands, pp. 183-191.
Richardson, Barbara C. (1999). "Toward a Policy on a Sustainable Transportation System", Transportation Research Record 1670, Journal of the TRB, TRB, Washington, D.C., pp. 27-34.
Roberts, Kelvin (1998). "Designing Places: Planning Consciousness", Recovery, 9(1), Insurance Corporation of British Columbia, North Vancouver, Canada, pp. 9-10.
Rodriguez, L.F. (1998). "Accident Prediction Models for Unsignalized Intersections", Dissertation for the degree of Master of Applied Science, Faculty of Graduate Studies, Department of Civil Engineering, University of British Columbia, Vancouver, Canada.
Rothengatter, J.A. (1992). "Controlling the Consequences of Mobility", IATSS Research, 16 (2), pp. 158-161.
Sawalha, Z., and Sayed, T. (1999). "Accident Prediction and Safety Planning Models for Urban Arterial Roadways", Draft Report for the Insurance Corporation of British Columbia by the Department of Civil Engineering, University of British Columbia, Vancouver, Canada.
Sawalha, Z., and Sayed, T. (2001). "Evaluating Safety of Urban Arterial Roadways", Journal of Transportation Engineering, 127(2), ASCE, USA, March/April, pp. 151-158.
Sawalha, Z.A. (2002). "Traffic Accident Modeling: Statistical Issues and Safety Applications", Dissertation for the degree of Doctor of Philosophy, Faculty of Graduate Studies, Department of Civil Engineering, University of British Columbia, April, Vancouver, Canada.
Sawalha, Ziad, and Sayed, Tarek (2005a). "Traffic Accident Modeling: Some Statistical Issues", (in publication) CSCE Journal, CSCE, Canada.
Sawalha, Ziad, and Sayed, Tarek (2005b). "Transferability of accident prediction models", (in publication) Safety Science, Elsevier Ltd.
Sayed, T. (1995). "The Highway Safety Expert System: A New Approach to Safety Programs", Doctoral Thesis, Faculty of Graduate Studies (Civil Engineering), University of British Columbia, Vancouver, Canada.
Sayed, T., Abdelwahab, W., and Navin, F. (1995). "Application of fuzzy pattern recognition to the identification of accident prone locations", Journal of Transportation Engineering, ASCE, USA, 121(4), pp. 352-358.
Sayed, Tarek (1997). "The Highway Safety Expert System: Diagnosing Accident Prone Locations", Civil Engineering Systems, 14, Overseas Publishers Association, Amsterdam, Netherlands, pp. 251-267.
Sayed, T., and Abdelwahab, W. (1997). "Using Accident Correctability to Identify Accident-Prone Locations", Journal of Transportation Engineering, 123(2), ASCE, USA, pp. 107-113.
Sayed, T., Navin, F., and Abdelwahab, W. (1997). "A countermeasure-based approach for identifying and treating accident prone locations", Canadian Journal of Civil Engineering, 24, CSCE, Canada, pp. 683-691.
Sayed, Tarek (1998). "New and Essential Tools, Part 1: Selecting Sites for Treatment", Proceedings from Seminar on New Tools for Traffic Safety, Section 3, hosted by the Institute of Transportation Engineers, August 9th, ITE, Toronto, Canada, pp. 1-13.
Sayed, T., and Rodriguez, F. (1999). "Accident Prediction Models for Urban Unsignalized Intersections in British Columbia", Transportation Research Record 1665, Journal of the TRB, TRB, Washington, D.C., pp. 93-99.
Sayed, T., and de Leur, P. (2001a). "Program Evaluation Report: Road Improvement Program", Prepared for the Insurance Corporation of British Columbia, ICBC, North Vancouver, Canada.
Sayed, T., and de Leur, P. (2001b). "Forecasting Traffic Safety", Proceedings from the Fourth International Conference on Accident Investigation, Reconstruction, Interpretation and the Law, hosted by the Civil Engineering Department at the University of British Columbia, August 13-16, 2001, Vancouver, Canada, pp. 187-196.
Sayed, Tarek (2002). "Measuring and Evaluating Safety", Proceedings from Course on Traffic Safety Evaluation Techniques, Session 1: Background, hosted by the Insurance Corporation of British Columbia, August, ICBC, Vancouver, Canada, pp. 1-4.
Sayed, Tarek, and de Leur, Paul (2004). "Predicting the Safety Performance Associated With Highway Design Decisions: A Case Study of the Sea to Sky Highway", Presented at the 2004 Annual Meeting of the Transportation Research Board, January, TRB, Washington, D.C.
Schermers, G. (1999). "Sustainable Safety - A preventative road safety strategy for the future", Ministry of Transport, Rotterdam, Netherlands, December 3.
Shen, Joan, and Gan, Albert (2003). "Development of Crash Reduction Factors: Methods, Problems, and Research Needs", Transportation Research Record 1840, Journal of the TRB, TRB, Washington, D.C., pp. 50-56.
Swedish National Road Administration (SNRA) (1997). "Zero Vision - from concept to action", SNRA Article No. 88223, Borlange, Sweden.
Tanner, J.C. (1953). "Accidents at rural 3-way junctions", Journal of Institution of Highway Engineers, Volume 2, Issue 11, pp. 56-67.
Taylor, Michael A.P., and Ampt, Elizabeth S. (2003). "Travelling smarter down under: policies for voluntary travel behaviour change in Australia", Transport Policy, Volume 10, Pergamon, pp. 165-177.
TransLink (2002). "Regional Transportation Model Input / Output Files and Digital Road Atlas, geo-coded", Scenario 1996, Strategic Planning Department, Burnaby, Canada.
TransLink (2004). "TransLink 2005-2007 Three-Year Plan", Burnaby, B.C., Canada.
Transport Canada, Health Canada (2004). "Fact Sheet: Road Safety in Canada - An Overview", March, ISBN 0-662-36440-6, Ottawa, Canada.
Transportation Research Board (TRB) (1998). "TCRP Report 39: The Costs of Sprawl - Revisited", Washington, D.C.
Turner, S. (1997). "Net effects: a short history", in: McKim, V., Turner, S. (Eds.), Causality in Crisis, University of Notre Dame Press, Notre Dame, IN, pp. 23-46.
Turner, S., and Nicholson, A. (1998). "Intersection Accident Estimation: The Role of Intersection Location and Non-Collision Flows", Accident Analysis and Prevention, 30(4), Elsevier Ltd, Amsterdam, The Netherlands, pp. 505-517.
US Department of Transportation (USDOT, 2000). "Traffic Facts 2000-Overview", National Highway Traffic Safety Administration, Washington, D.C.
US Department of Transportation (USDOT, 2001). "Safety Conscious Planning", Parts I, II, and Appendices, Federal Highway Administration, www.fhwa.dot.gov/planning/scp/ec041 scp 1 .htm.
US Department of Transportation (USDOT, 2003a). "Traffic Facts 2003-Overview", National Highway Traffic Safety Administration, Washington, D.C.
US Department of Transportation (USDOT, 2003b). "Tools for Assessing Safety Impact of Long-Range Transportation Plans in Urban Areas", Office of Metropolitan Planning and Programs, Federal Highway Administration, Washington, D.C.
van Minnen, J. (1999). "The suitable size of residential areas: a theoretical study with testing to practical experiences", SWOV Report No. R-99-25, Leidschendam, Netherlands.
van Schagen, Ingrid, and Janssen, Theo (2000). "Managing Road Transport Risks: Sustainable Safety in the Netherlands", Risk Management in Transport, IATSS Research, Volume 24(2), pp. 18-27.
van Wee, Bert (2000). "Land use and transport: challenges for research and policy making", National Institute of Public Health and the Environment (RIVM), Utrecht University, Geographical Sciences, Utrecht, Netherlands.
Victoria Transport Policy Institute (VTPI) (2002). "Online TDM Encyclopedia", Victoria, Canada, (www.vtpi.org).
Volk, Kevin, Felipe, Emmanuel, Ho, Geoffrey, and Guarnaschelli, Marco (1999). "Developing a Road Safety Module for the Regional Transportation Model", ICBC, November, Vancouver, Canada.
Wallace, Brett, Mannering, Fred, and Rutherford, G. Scott. "Evaluating Effects of Transportation Demand Management Strategies on Trip Generation by Using Poisson and Negative Binomial Regression", Transportation Research Record 1682, Journal of the TRB, TRB, p. 70.
Waller, P. (2000). "Introduction to Safety", Presented at the Transportation Research Board Safety-Conscious Planning Meeting, TRB, Washington, D.C.
Wedley, W.C. (1990). "Combining Qualitative and Quantitative Factors: An Analytical Hierarchy Approach", Socio-Economic Planning Science, Vol. 24 (1), Elsevier, Amsterdam, Netherlands, pp. 57-64.
Weich, Gotz (1992). "Is it possible to manage mobility?", (IATSS) Research 16(2), Journal of the International Association of Traffic and Safety Sciences, Tokyo, Japan, pp. 167-169.
Wegman, Fred (1996). "Sustainable Safety in the Netherlands", SWOV Institute for Road Safety Research, Leidschendam, The Netherlands.
Wegman, F. (1997a). "The Concept of a Sustainably Safe Road Traffic System", SWOV Report D-97-2, SWOV Institute for Road Safety Research, Leidschendam, Netherlands.
Wegman, F. (1997b).
"Cost effectiveness of a sustainably safe road traffic system in the Netherlands", SWOV, D-97-23, Leidschendam, Netherlands. World Health Organization (WHO) (2004). "World report on road traffic injury prevention: summary", Geneva, (www.who.int/world-health-day/2004/infomaterials/world report/en/). Zegeer, C.V., and Dean, R.C. (1977). "Identification of hazardous locations on city streets", Traffic Quarterly 31, ENO Foundation for Transportation, Connecticut, USA, pp. 549-570. Zegeer, Charles V. (1982). "Highway accident analysis systems", NCHRP, 91, Transportation Research Board, Washington, D.C. Zein, Sany R., Geddes, Erica, Hemsing, Suzanne, and Johnson, Mavis (1998). "Safety Benefits of Traffic Calming", Transportation Research Record 1578, Journal of the TRB, TRB, Washington, D.C, pp. 3-7. Zein, Sany R., and Navin, Frank (2000). "Road Safety Engineering: Role for Insurance Companies?" Transportation Research Record 1734, Journal of the TRB, TRB, Washington, D.C, pp. 7-11. Lovegrove 212 APPENDIX A . LIST OF POSSIBLE M A C R O - L E V E L C P M V A R I A B L E S Rating System 2=Gcc4 l=Neutral, 0=8ad A Collisions Souroe Decision 1996 Cheap Quality Srrde PredctarJe UsefU 1 Total Cdlison^yr ICBC Yes 2 2 2 2 2 2 2 Severe (Elisions ICBC Yes 2 2 2 2 2 2 3 Rush hour cdlisions (6:30-9:30 am; 3-6 pm) ICBC Yes 2 2 2 2 2 2 4NorrRushCdlision3 ICBC Yes 2 2 2 2 2 2 5AMTotalCblisionsM ICBC Yes 2 2 2 2 2 2 6 Bcyde/veride Cdlisions ICBC Yes 2 2 2 2 2 2 7 Pedestrian/vfeHde Cdlisions ICBC Yes 2 2 2 2 2 2 8 AM Severe Gdlisions/yr ICBC No 2 2 2 2 2 2 9 Injuy Collisions Police? No 0 0 1 2 2 2 lOMotorcydeCdlisicns ICBC No 2 1 2 2 2 1 11 Fatal Cdlisicns Police? No 0 0 1 2 2 2 12 Acrident location ccen soace fkrrP) Mri? No 0 0 1 0 2 2 B. 
Scdcr-Derrrjcjraphics Souroe Decision 1996 Cheap Quality Srrde Predictable Ufeeftj 1 Total Population (#, density) Census Yes 2 2 2 2 2 2 2 Households (#, density, per road km) Census Yes 2 2 2 2 2 2 3 Total husbendvvife families by fanrily structure Census Yes 2 2 2 2 2 2 4 Average nunrber of persons per census farrily Census Yes 2 2 2 2 2 2 5Tctal population 15 years and a w l^labcu-force activity Census Yes 2 2 2 2 2 2 6 In the labour force Census Yes 2 2 2 2 2 2 7 6rdcyed(#,% density) Census Yes 2 2 2 2 2 2 8 Uherrdoyed Census Yes 2 2 2 2 2 2 9 Net in trelabour force Census Yes 2 2 2 2 2 2 10 Participation rate Census Yes 2 2 2 2 2 2 11 Urenplo/rtHt rate Census Yes 2 2 2 2 2 2 12Tctalerrrjlcyed labour force 15 years are) ever by rrcde Census Yes 2 2 2 2 2 2 13 Car,tnjck,vanasoh'ver Census Yes 2 2 2 2 2 2 14 Car, mxk, van as passenger Census Yes 2 2 2 2 2 2 15 Public transit Census Yes 2 2 2 2 2 2 16Total (RJITirre&PartTime) Census Yes 2 2 2 2 2 2 17Ftert1ime Census Yes 2 2 2 2 2 2 18 RJI Time Census Yes 2 2 2 2 2 2 19 Average income $ Census Yes 2 2 2 2 1 2 20 Ntedanincorre$ Census Yes 2 2 2 2 1 2 21 wakedtowork Census Yes 2 2 2 2 2 2 22 Bcyde Census Yes 2 2 2 2 2 2 23 Total popJatJon 15 years and ever Census No 2 2 1 2 2 1 24 rjTidcyrrBt-pcpdaticn ratio Census No 2 1 1 2 1 2 25 Total income of pcpJaticn 15 years and ever Census No 2 2 2 0 1 26 Ccnstructjcn (Major Group F) GVRD cry No 1 2 1 1 1 2 27 Retail (Major Group J) GVRDorN No 1 2 1 1 1 2 28 GcMemrrert Services (Major Group N) GVRDorN No 1 2 1 1 1 2 29AcaxnPx4^0^Util(MajaQxx|BQ+R) GVRDorF No 1 2 1 1 1 2 30 Dvarjed(%) Census No 2 2 2 0 0 31 Matorcyde Census No 2 1 1 2 0 2 32 Rirrery (Major Groups A4£tOtO) GVRDaK No 1 1 1 1 1 2 33 Mrufaduring (Major Group E) GVRDorN No 1 1 1 1 1 2 34 Transp /Storage /Cerrmn / Other Util (Nfeocr Groups G+H) GVRDor^ No 1 1 1 1 1 2 35 Wxlesale (Major Group I) GVRDcr^ No 1 1 1 1 1 2 36 Finance, Insurance, Real Hate (Major Groups K-H.) 
GVRDaN No 1 1 1 1 1 2 37 Business Services (Major Group M) GVRDorN No 1 1 1 1 1 2 38 Population under Age 17/rdW Census No 2 0 0 2 2 1 39 Total famlies of rrwm^rried couples Census No 2 2 2 1 0 0 40 Tctalfam'liesrfcorrrrcrHawoauples Census No 2 2 2 1 0 0 41 Total lone-parent ferrilies by sex of parent Census No 2 2 2 1 0 0 42 high Incorre Rarrilies(%) Census No 1 1 1 1 1 1 Lovegrove 213 Rating System 2=Gaod, l=Neutral, 0=Bad Source Dedsicn 1996 Cheap Cuality Sirrple Predctable Usefu TransLink Yes 2 2 2 2 2 2 TransLink Yes 2 2 2 2 2 2 TransLink Yes 2 2 2 2 2 2 TransLink Yes 2 2 2 2 2 2 TransLink Yes 2 2 2 2 2 2 TransLink Yes 2 2 2 2 2 2 TransLink Yes 2 2 2 2 2 2 TransLink Yes 2 2 2 2 2 2 TransLink Yes 2 2 2 2 2 2 TransLink Yes 1 2 2 2 1 2 TransLink Yes 2 0 1 2 2 2 TransLink Yes 2 0 1 2 2 2 TransLink Yes 2 0 1 2 2 2 TransLink Yes 2 0 1 2 2 2 TransLink Yes 2 1 1 2 2 2 field Trip Yes 2 0 1 2 2 2 TransLink Yes 2 1 1 2 1 2 TransLink Yes 2 1 2 2 2 1 TransLink Yes 2 1 2 2 2 1 Muni's Na 2 1 2 2 1 1 TransLink No 1 1 1 1 2 2 Mri's No 0 0 0 2 2 2 Held Trip No 0 0 2 2 1 2 TransLink No 2 0 1 1 1 2 field Trip No 2 0 1 1 1 2 Mri's No 1 1 1 1 1 2 field Trip No 1 0 2 2 1 1 MJI'S No 1 1 1 0 1 2 field Trip No 2 0 • 1 0 2 2 field Trip No 1 0 1 2 0 2 Muni's No 1 0 0 2 0 1 field Trip No 0 0 1 2 0 2 Mri's No 0 0 0 2 0 2 field Trip No 0 0 2 2 0 1 TransLink No 1 0 1 1 1 1 field Trip No 0 0 0 1 0 2 field Trip No 0 0 2 1 0 1 field Trip No 0 0 2 1 0 1 CNetvok 1 Total Read Klorrelres 2 Undassified Road Klorretres 3 Classified Road Wlorretres 4 Major Road Klometres 5 Mies of Urban Cdlectar (%) 6 Mnor Read KilorrEtres 7 IrtErsecBcn Density (/area) 8Urban reads 9 RLRI roads 10 Sojrals(#, density) 11 Nurrtierof Intersections 12 Irtersecrjcns per road kilcrretre 13 Na of Intersection legs (3 legs) 14 No. 
of Intersection legs (4 legs) 15 Horizontal Curves (rr/lcn 2V#) 16No./Typeoflanes 17 Major/Mnor intersections IS Soft Horizontal Curve (< 45 degrees) 19 Hard Horizontal Curve (> 45 aegises) 20 Cnr^a//b/\o-way traffic 21 Qid, rrn-gid network within zone 22 No. of rcuTdabcut5/3+ arm inr/ns 23 Scrt dstance == 300,400m (%) 24 Vertical Curves (nylon 7P/d#) 25 Approach grade (%) 26 Road lighting (durrrry) 27 Madan (width, raised, dumrry) 28 Dstance between street lights 29 Topography 30 No. of m'd c^arJ<cro3SV\alks(sigi3li2ed, Uhsigialized) 31Paverrerttype 32 Driveways (#) 33 Lane, Shoulder, Approach widths 34Stop5gns 35 Intersecticn angje 36 Roadside hazard rating 37 Yields 38|Ccutesy Comers D. Exposure EsibteSar Dad si on Cheap Cuality Srrrjle Predictable Useful Known 1 Accessibility (speed cn road segrrerts) TransLink Yes 2 2 2 2 2 2 2 Operating Speed (from Brme/2) TransLink Yes 2 2 2 2 2 2 3Area TransLTnk Yes 2 2 2 2 2 2 4 VKT- AM Peak Period only per day Muri orTn Yes 1 1 2 2 2 2 5M±ility(veW<rrB) TransLink Yes 1 1 2 2 2 2 6 Level of Service (oxgesrjcn, v/Q MricrTn Yes 2 1 1 2 2 2 7 VehdeKDistenoe-Travelled / person TransLink Yes 1 1 2 2 2 1 8 Rual lard use adjacent MriorQ/ Yes 2 1 2 2 2 0 9 Urban land use adjacent MriorQi Yes 2 1 2 2 2 0 10 Jarre/ Length Census No 0 1 2 2 2 1 11 Posted Speed Mri orTn No 0 1 2 1 2 2 12 VeHde Klcmstres Travelled per yeer MricrTn No 0 2 2 2 2 13 Vehde-Dsta-ce-Travelled / household TransLink No 1 1 1 2 2 1 14 Daly buses (%) MricrTn No 1 1 2 2 2 1 15AADT MriaTr; No 0 1 2 1 2 2 16 Regjcnal location |TransLink No 1 1 1 1 1 1 17 Accessibility (delays at intns) MriorTr; No 0 1 1 2 2 1 Lovegrove 214 Rating System: 2 = Good, l = Neutral, 0 = Bad E. 
T D M ssible Sourc Decision 1996 Cheap Quality Sim pie Predictable Useful 1 Improved Transportat ion Choices 1 Mode split - transit Muni, Cens Yes 2 2 2 2 2 2 2 Mode split - carpool passenger Muni, Cens Yes 2 2 2 2 2 2 3 Mode split - drive Muni, Cens Yes 2 2 2 2 2 2 4 Bus stops TransLink Yes 1 2 1 2 2 2 5 Vehicle Occupancy Field Trip Yes 2 2 2 2 1 1 6 Transit rides/capita TransLink, No 2 2 0 2 2 1 7 Mode split - bikes ^uni, Cens No 2 2 2 2 0 1 8 Mode split - walker Muni, Cens No 2 2 2 2 0 1 9 Bike/transit integration TransLink No 0 2 2 2 2 10 Number or Vehicles ITE No 2 0 0 2 1 11 Bicycle routes, facilities Muni Field No 1 1 1 2 1 1 12 Transit service level CMBC? No 1 2 1 1 1 1 13 Rail substitutes for trucks CN, TransL No 1 2 1 1 1 1 14 Car sharing CAN No 1 2 2 2 0 1 15 Ridership per bus stop, rail station CMBC? No 0 0 0 2 1 1 16 Taxi mode share ?? No 0 1 1 1 0 1 17 Pedestrian path continuity, convenience Field Trip No 1 0 0 1 1 1 18 Shuttle services TransLink No 1 0 1 1 1 1 19 Flexible work weeks Employers No 0 0 1 1 0 1 20 Street frontage with trees, empty lots Field Trip No 0 0 0 0 0 1 21 Sidewalk width, benches Field Trip No 0 0 0 0 0 1 22 Tele-work, tele-shop, tele-study, tele-conference ?? 
No 0 0 0 0 0 1 23 Guaranteed ride home Employers No 0 0 1 1 0 0 Pricing Incent ives 24 Congestion pricing TransLink No 2 2 2 2 1 2 25 Fuel tax Increases TransLink No 2 2 2 2 0 0 26 High-Occupancy-Toll lanes TransLink No 0 2 2 2 0 1 27 Vehicle Sales Tax by size, efficiency Province No 0 2 2 2 0 0 28 U-Pass, universal transportation pass programs TransLink No 0 2 2 1 0 1 29 Parking pricing TransLink, No 2 0 0 2 0 1 30 Parking cash out Employers No 0 0 0 1 0 1 31 Pay-as-you-drive vehicle insurance ICBC No 0 0 0 0 0 1 Land Use M a n a g e m e n t 32 Core residential area TransLink Yes 2 2 1 2 2 1 33 Shortcut capacity TransLink Yes 2 2 1 2 1 2 34 Shortcut attractiveness TransLink Yes 2 2 1 2 1 2 35 Jobs/Housing ratio within zone GVRD No 2 2 1 1 1 2 36 LU • Industrial / residential / neighbourhood GVRD No 2 1 2 2 2 37 Residential low-speed zones (<50km/h) Muni, Field No 2 2 1 1 1 2 38 Cars per sq. km ITE, Censu No 1 1 1 2 1 2 39 Number of Vehicles per Household ITE No 1 1 1 2 1 2 40 Car free zones Muni No 1 1 2 1 2 41 Smart Growth Muni No 1 2 2 1 1 2 42 Parking spaces/worker Muni No 1 1 1 2 1 2 43 Parking spaces/resident Muni No 1 1 1 2 1 2 44 Bicycle/Pedestrian-friendly development Muni, Field No 1 1 2 2 1 2 45 Arterial-km near rail station TransLink No 1 1 2 2 1 1 46 Quadrilateral block shape (%) TransLink No 2 2 2 1 1 47 Transit oriented development Muni No 1 2 2 1 2 48 Traffic calming Muni, Field No 1 1 1 1 1 2 49 Intersections near bus stops TransLink No 1 0 1 2 1 1 50 Cul-de-sacs/dead-ends near bus stops TransLink No 1 0 1 2 1 1 51 Grid street network near bus stop TransLink No 1 0 1 2 1 1 52 Discontinuous street network near bus stop TransLink No 1 0 1 2 1 1 53 On-street Parking (length, % , dummy) TransLink No 0 0 1 2 1 1 54 No-car households (%) ?? No 0 0 0 2 1 2 54 No-car households (%) ?? 
No 0 0 0 2 1 2 55 Compact, dense, mixed-use Development GVRD No 0 0 1 0 0 2 56 Homes, workers, jobs, stores near bus, rail Field Trips No 0 0 1 0 0 2 57 LU - intensity, mix, building height(proxy for density) Muni, field No 1 0 1 1 0 1 58 Trips/person - drive, bus, bike, pool, walk ITE NO 0 0 1 0 0 2 59 Distance to nearest park, store, gas station Field Trip NO 0 0 0 0 0 2 60 Employment within 30 min by auto, transit Field Trip No 0 0 0 0 0 2 61 Services within Vi , 3 miles of home Field Trip NO 0 0 0 0 0 2 62 Employees living within Vi , 3 miles of work Field Trip, i No 0 0 0 0 0 2 63 Location-efficient development Muni No 0 1 1 0 0 1 64 LU - campus, downtown, tourist Field trip? No 0 0 0 0 0 1 1 Education & Vehicle Technology 65 New Fuels NRC No 2 2 2 0 0 0 66 Driver Education ICBC, Yello No 1 0 1 1 1 2 67 GPS / GIS automated monitoring TransLink, No 2 1 1 1 0 1 68 Transit marketing TransLink No 1 1 1 1 0 1 69 Commute trip reduction programs TransLink No 1 1 1 1 0 1 70 Non-motorized transportation encouragement Muni No 1 1 1 1 0 1 71 Advanced Traveller Information Systems TransLink No 0 1 2 1 0 1 72 New Vehicle Technologies Automaker No 1 0 0 1 0 1 1 Management Systems 73 Intelligent Transportation Systems TransLink No 2 2 2 0 0 2 74 Smart Highways Province, T No 2 2 2 0 0 2 75 Collision/Incident Management TransLink, No 2 2 2 0 0 2 76 Smart growth policy reforms Muni's No 2 2 2 1 0 2 77 HOV parking, road priority Muni's, Pro No 2 1 1 2 0 2 78 Least-cost planning Muni's No 2 2 2 0 0 1 79 Peak spreading TransLink No 1 1 2 0 0 2 80 Access Management Muni's, Pro No 1 0 0 1 1 2 81 Traffic Flow Improvements Muni's NO 1 0 1 1 1 2 82 Smart Vehicles Automaker No 0 0 1 0 0 2 83 Freight transport management Employers No 1 0 0 1 0 1 Lovegrove 215 APPENDIX B. M O D E L D E V E L O P M E N T GLIM4 OUTPUT S A M P L E . 
For Model Development in Group 2: Urban, Measured, Exposure

Note the input file included commands for two runs, calculating Cook's Distance to refine models, using first a Theta value fixed at previous run value, and then Theta = 0 for final run to confirm goodness of fit and final parameter values. Comments added by GRL into this log file (i.e. not part of GLIM run commands) are shaded in yellow.

[o] GLIM 4, update 8 for IBM etc. 80386 PC / DOS on 23-Apr-2004 at 17:36:29
[o] (copyright) 1992 Royal Statistical Society, London
[o]
[e] $C Model "c" Refinement: Exposure, Measured, s3$
[e] $C $
[o] ML Estimate of THETA = 1.246
[o] Std Error = ( 0.0755)
[o]
[o] 2 x Log-likelihood = 524302. on 475 df
[o] 2 x Full Log-likelihood = -5458.
[o]
[o] Scaled deviance is 545. on 475 d.f. from 479 observations
[o] change is -37968. for 0 d.f.
[o]
[o]      estimate   s.e.      parameter
[o]  1   1.128      0.2857    1
[o]  2   0.7632     0.06939   LTLKM
[o]  3   1.099      0.1752    INTD
[o]  4   4.841      0.8012    SIGD
[o] scale parameter 1.000

Final model formulation:

$$\text{AM Crashes / 3 Years} = 3.089471 \cdot TLKM^{0.7632} \cdot e^{(1.099\,INTD + 4.841\,SIGD)}$$

Note: Ln(constant) = 1.128, so constant = Exp[Ln(constant)] = 3.089471 in above equation

[o]
[o]      %PE      %SE       TSTAT    CHI2    TARGET
[o]  1   1.1278   0.28566    3.948   467.9   526.8
[o]  2   0.7632   0.06939   10.998   467.9   526.8
[o]  3   1.0993   0.17524    6.273   467.9   526.8
[o]  4   4.8407   0.80116    6.042   467.9   526.8
[o] correlations between parameter estimates   "C" COMMAND
[o]  1    1.0000
[o]  2   -0.9584    1.0000
[o]  3   -0.2181   -0.0177    1.0000
[o]  4   -0.3213    0.4028   -0.5194    1.0000
[o]       1         2         3         4
[o]
[o] linear model:
[o] terms: 1+LTLKM+INTD+SIGD

OUTLIER ANALYSIS, USING COOK'S DISTANCE

[Plot: Cook's distance (%CD) against ZONE (0 to 8000); plotted points not recoverable from the text.]

[o]        %CD
[o]    1   1.134e-03   1000.
[o]    2   2.954e-03   1010.
[o]    3   2.270e-03   1030.
[o]    4   1.107e-03   1040.
[o]    5   2.305e-03   1050.
[o]    6   2.630e-03   1060.
[o]    7   6.915e-04   1070.
(Remaining outliers have been removed from this appendix, but are still in original log file)
[o]  473   8.918e-04   8530.
[o]  474   2.499e-04   8540.
[o]  475   2.657e-04   8550.
[o]  476   3.533e-03   8620.
[o]  477   8.493e-04   8690.
[o]  478   2.282e-04   8830.
[o]  479   5.899e-04   8840.
[e] $Finish$

APPENDIX C. SAMPLE LISTING OF URBAN CPZ RANKINGS.

Urban CPZs, by model group number:
Rank | Exposure Modelled (1) | Exposure Measured (2) | SocD Modelled (5) | SocD Measured (6) | TDM Modelled (9) | TDM Measured (10) | Network Modelled (13) | Network Measured (14)
24 | 6400 | 1060 | 4820 | 1610 | 6230 | 1610 | 4720 | 4000
23 | 3990 | 6240 | 3970 | 2920 | 6280 | 5730 | 4710 | 5850
22 | 4820 | 4340 | 6230 | 2220 | 2920 | 4250 | 6360 | 4230
21 | 7100 | 2150 | 4140 | 2270 | 6260 | 6140 | 4250 | 2210
20 | 5510 | 5890 | 4080 | 4250 | 6240 | 7720 | 6260 | 2280
19 | 3780 | 1560 | 7100 | 1010 | 3010 | 6230 | 3130 | 8300
18 | 4140 | 6280 | 5540 | 6220 | 4710 | 1050 | 3240 | 3030
17 | 1690 | 4250 | 2040 | 2250 | 6150 | 6930 | 6340 | 2240
16 | 4150 | 2310 | 3060 | 4230 | 3430 | 6270 | 6000 | 2160
15 | 6000 | 6350 | 4560 | 2040 | 3030 | 1560 | 2240 | 2920
14 | 4030 | 6340 | 3110 | 8620 | 3140 | 6320 | 2310 | 5890
13 | 3110 | 2240 | 4710 | 2000 | 6400 | 6330 | 3430 | 6260
12 | 4710 | 6330 | 7110 | 5890 | 6360 | 1680 | 2040 | 3240
11 | 2230 | 2210 | 2090 | 6230 | 4150 | 1580 | 3140 | 6230
10 | 3410 | 2230 | 7160 | 2090 | 3270 | 1060 | 2160 | 2120
9 | 6190 | 2160 | 1690 | 3430 | 3240 | 8620 | 2210 | 2200
8 | 6480 | 2200 | 2060 | 6280 | 3110 | 3240 | 2230 | [illegible]
7 | 4080 | 2140 | 6000 | 6330 | 3060 | 6280 | 2200 | 6360
6 | 7110 | 6320 | 6480 | 2260 | 6480 | 5890 | 2120 | 8620
5 | 5540 | 2120 | 2300 | 2020 | 2910 | 3030 | 2270 | 4250
4 | 7160 | 2270 | 2070 | 2010 | 1110 | 3060 | 6320 | 2290
3 | 2000 | 2170 | 2010 | 2060 | 2000 | 1010 | 2170 | 2170
2 | 2300 | 2320 | 1110 | 3420 | 4140 | 1600 | 2290 | 2320
1 | 1110 | 2290 | 2000 | 2070 | 2320 | 2320 | 2320 | 6320

APPENDIX D. SAMPLE GLM SOFTWARE OUTPUT ON TRANSFERABILITY.
$C MODEL: Exposure - modelled$
ML Estimate of THETA = 2.217
   estimate   s.e.      parameter
   0.3714     0.06358   1
   %PE        %SE       TSTATA   CHI2    TARGET
   0.3714     0.06358   5.842    93.45   136.6
a0 = 1.450

$C MODEL: Exposure - measured$
ML Estimate of THETA = 2.132
   estimate   s.e.      parameter
   5.031      0.06481   1
   %PE        %SE       TSTATB   CHI2    TARGET
   5.031      0.06481   77.63    84.11   136.6
a0 = 153.1

$C MODEL: Socio-Demographic - modelled$
ML Estimate of THETA = 2.127
   estimate   s.e.      parameter
   0.7903     0.06490   1
   %PE        %SE       TSTATC   CHI2    TARGET
   0.7903     0.06490   12.18    89.45   136.6
a0 = 2.204

$C MODEL: Socio-Demographic - measured$
ML Estimate of THETA = 1.931
   estimate   s.e.      parameter
   4.601      0.06810   1
   %PE        %SE       TSTATD   CHI2    TARGET
   4.601      0.06810   67.56    92.11   136.6
a0 = 99.57

$C MODEL: TDM - modelled$
ML Estimate of THETA = 2.166
   estimate   s.e.      parameter
   0.5833     0.06432   1
   %PE        %SE       TSTATE   CHI2    TARGET
   0.5833     0.06432   9.069    98.37   136.6
a0 = 1.792
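
The derived quantities in the Appendix B and Appendix D listings can be checked by hand. The sketch below is illustrative only and is not part of the GLIM runs: it evaluates the final model formulation from Appendix B using the coefficients printed in that listing, and recomputes a t-statistic (estimate divided by standard error) and a constant (the exponential of the estimate) from the Appendix D listing. The function names and the example zone inputs are hypothetical; TLKM, INTD and SIGD are used exactly as named in the listing.

```python
import math

# Coefficients copied from the Appendix B listing (Group 2: urban, measured, exposure).
LN_A0 = 1.128    # parameter 1, the natural log of the model constant
B_TLKM = 0.7632  # parameter 2 (LTLKM), which becomes the exponent on TLKM
B_INTD = 1.099   # parameter 3 (INTD)
B_SIGD = 4.841   # parameter 4 (SIGD)


def predict_am_crashes(tlkm, intd, sigd):
    """AM crashes per 3 years: exp(ln a0) * TLKM^b1 * exp(b2*INTD + b3*SIGD)."""
    return math.exp(LN_A0) * tlkm ** B_TLKM * math.exp(B_INTD * intd + B_SIGD * sigd)


def t_statistic(estimate, std_error):
    """Ratio reported in the TSTAT columns of the listings."""
    return estimate / std_error


def calibration_constant(ln_a0):
    """Back-transform a log-scale estimate to the multiplicative constant a0."""
    return math.exp(ln_a0)


if __name__ == "__main__":
    # Hypothetical zone values, chosen only to exercise the formula.
    print(predict_am_crashes(tlkm=50.0, intd=0.5, sigd=0.05))
    # Appendix D, "Exposure - modelled": estimate 0.3714, s.e. 0.06358.
    print(t_statistic(0.3714, 0.06358), calibration_constant(0.3714))
```

With TLKM = 1 and both densities set to zero, the prediction reduces to the model constant, which is a quick way to confirm the back-transformation quoted in the note under the final model formulation.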
