UBC Theses and Dissertations

3D pose estimation from 2D images with applications in Parkinson's disease

Bidulka, Luke

Abstract

Estimating 3D human body and hand poses from 2D images is a foundational technology with wide-ranging downstream applications in robotics, human-computer interaction, and healthcare. While the advent of deep learning methods has revolutionized the field, real-world applications are impeded by poor performance on in-the-wild data. This thesis proposes an innovative method for improving 3D pose model predictions, particularly on out-of-distribution (OOD) data, and investigates applying 3D pose estimation to automated motor severity analysis for Parkinson's disease (PD). Previous works have proposed Test-Time Adaptation (TTA) to bridge the train-test domain gap by refining network parameters at inference. However, the absence of ground-truth 3D annotations makes this highly challenging, and existing methods typically increase inference times by one or more orders of magnitude. We propose a self-consistency loss that requires no inference-time annotations and introduce a lightweight correction and selective adaptation framework that applies a fast correction to most data, reserving costly TTA for OOD data. Our approach is significantly faster than existing methods, and we demonstrate that it significantly improves the performance of multiple popular pose estimation models on benchmark datasets. Subsequently, we apply 3D pose estimation to video-based PD severity scoring. Previous approaches have mostly combined handcrafted, clinically interpretable features with traditional machine learning algorithms. Deep learning's potential has not yet been realized in video-based PD motor severity analysis because of the high cost of obtaining clinical expert annotations and the resulting acute lack of public datasets. We first investigate a weakly-supervised approach, combining classical triangulation methods with a multi-view network to fine-tune a 3D body pose estimation model on "timed-up-go" videos. We obtain preliminary results, discuss the difficulties encountered in detail, and outline recommendations for future work. Next, we consider the "hand movement" PD evaluation task, extracting pose sequences with a state-of-the-art 3D hand pose estimation method and fusing black-box deep learning models with medically interpretable handcrafted features. In addition to fully-supervised methods, we investigate a recent self-supervised time-series modeling approach. While the deep learning methods show promise, we find that the traditional machine learning approach based on handcrafted features performs best, and we discuss our findings.

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International