UBC Theses and Dissertations

Egocentric monocular construction worker pose estimation for intelligent construction

Suen, Wun Ki (Christine)

Abstract

Construction workers face hazardous working environments that lead to fatal accidents, and balancing worker safety with productivity remains challenging. Egocentric (i.e., first-person view) pose estimation is effective in identifying, localizing, and tracking worker poses in construction scenarios. It can serve as a human-centric approach to reducing fatalities and increasing productivity through downstream applications within Intelligent Construction, such as safety monitoring and human-robot collaboration. However, most egocentric pose estimation models are trained on daily-life poses and have limited ability to generalize to construction poses. The scarcity of publicly available datasets designed for egocentric worker pose estimation further impedes the training and evaluation of existing and newly developed models. To this end, this thesis presents a dual-module egocentric worker pose estimation approach that addresses the major issues of self-occlusion and depth ambiguity in worker pose estimation. The proposed approach employs a supervised 2D module to address self-occlusion in construction poses and an unsupervised 3D module to handle depth ambiguity using only monocular-view input. It achieves a PCKh@0.5 of 98.7% in 2D pose estimation and an N-MPJPE of 96.43 mm in 3D pose estimation, and it also demonstrates strong generalization ability in the construction domain. To address the significant gap in egocentric pose estimation datasets for construction environments, this thesis also introduces ICON-Pose, the first open dataset that includes thousands of egocentric images and corresponding 2D and 3D worker pose data across 63 action types. Using the proposed approach, ICON-Pose accurately depicts complex construction worker poses and shows a high level of diversity in action types, data forms, participants, and construction site settings.
This dataset has the potential to serve as a general benchmark not only for subsequent analyses in Intelligent Construction but also for the broader field of egocentric vision, particularly for capturing and analyzing out-of-distribution poses. Together with the egocentric worker pose estimation approach, it is expected to support artificial intelligence research in the construction domain.
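For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how PCKh@0.5 and N-MPJPE are commonly computed in the pose estimation literature. It is not taken from the thesis: the function names `pckh` and `n_mpjpe`, the single-frame input layout, and the least-squares scale normalization are assumptions made for illustration, and the thesis's exact evaluation protocol may differ.

```python
import math

def pckh(pred, gt, head_size, alpha=0.5):
    """Fraction of 2D joints whose error is within alpha * head_size.

    pred, gt: lists of (x, y) joint positions for one frame (assumed layout).
    head_size: head segment length of the ground-truth pose, same units.
    """
    threshold = alpha * head_size
    correct = sum(1 for p, g in zip(pred, gt) if math.dist(p, g) <= threshold)
    return correct / len(gt)

def n_mpjpe(pred, gt):
    """Mean per-joint position error after scale normalization.

    The prediction is multiplied by the least-squares optimal scalar
    s = <pred, gt> / <pred, pred> before the mean Euclidean error is taken.
    pred, gt: lists of (x, y, z) joint positions for one frame (assumed layout).
    """
    num = sum(pc * gc for p, g in zip(pred, gt) for pc, gc in zip(p, g))
    den = sum(pc * pc for p in pred for pc in p)
    scale = num / den
    scaled = [[scale * c for c in p] for p in pred]
    return sum(math.dist(p, g) for p, g in zip(scaled, gt)) / len(gt)
```

Under these definitions, a higher PCKh@0.5 is better (more joints fall within half a head length of the ground truth), while a lower N-MPJPE is better (smaller residual error once global scale is factored out).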

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International