UBC Theses and Dissertations


Using big data and machine learning for microscopic delay modelling at construction work zones

Morshedzadeh, Yeganeh

Abstract

Work zones are lane closures that protect workers during highway maintenance and construction; however, they also cause traffic congestion. To mitigate delay effectively, it is essential to understand the factors that affect delay and to predict delays accurately. Given the growing availability of big data, machine learning (ML) models are becoming increasingly popular in traffic flow prediction. Their ability to capture complex non-linear patterns can improve on predictions obtained with traditional tools. This thesis employs ML models to predict delays observed at work zones by developing data-driven predictive models that leverage the growing use of large-scale data in transportation. The dataset was assembled by integrating 15 million travel time observations from probe vehicles over a 700-km corridor with factors such as weather conditions, work zone design, traffic conditions, and temporal elements gathered from a wide range of sources. The data were used to pursue two primary objectives: (i) identify relationships between these factors and work zone delay and (ii) compare ML algorithms on how accurately they predict work zone delays. To achieve the first objective, unsupervised clustering as well as ordinal logistic and mixed-effects regression were used. The findings highlight the proportional impact of the length of transition areas and work zones and the significant effects of environmental factors on work zone delays. To assess the use of ML models in predicting delays, a comparative evaluation of multiple techniques was conducted, highlighting the strengths and weaknesses of each approach. The techniques included artificial neural networks (ANN), decision trees, gradient boosting regression, support vector regression, and ridge regression.
Among the models evaluated, the decision tree and ANN achieved the best overall performance, with the lowest root-mean-square error (RMSE) values of 0.05 and 0.150 minutes and the highest R-squared values of 0.99 and 0.95, respectively. With accurate delay predictions, transportation agencies can devise effective traffic management plans, such as implementing detours or adjusting signal timings, before delays occur. Furthermore, understanding delay-influencing factors enables agencies to better mitigate delays after they occur through measures targeted at the specific influencing variables.
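
For readers unfamiliar with the two metrics used to rank the models, the following is a minimal sketch of how RMSE and R-squared are commonly computed from observed and predicted delays. It is not taken from the thesis: the function names and the four delay values are hypothetical and chosen only for illustration.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error, in the same units as the inputs (here, minutes)."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical delays in minutes (illustrative values, not data from the thesis).
observed_delay = [1.0, 2.0, 3.0, 4.0]
predicted_delay = [1.1, 1.9, 3.2, 3.8]

print(f"RMSE: {rmse(observed_delay, predicted_delay):.3f} min")
print(f"R-squared: {r_squared(observed_delay, predicted_delay):.3f}")
```

A lower RMSE means predictions sit closer to the observed delays on average, while an R-squared nearer to 1 means the model explains more of the variation in delay.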


Rights

Attribution-NonCommercial-ShareAlike 4.0 International