3D pose estimation from 2D images with applications in Parkinson's disease

UBC Theses and Dissertations

Bidulka, Luke

Abstract

Estimating 3D human body and hand poses from 2D images is a foundational technology with wide-ranging downstream applications in robotics, human-computer interaction, and healthcare. While the advent of deep learning methods has revolutionized the field, real-world applications are impeded by poor performance on in-the-wild data. This thesis proposes an innovative method for improving 3D pose model predictions, particularly on out-of-distribution (OOD) data, and investigates applying 3D pose estimation to automated motor severity analysis for Parkinson's disease (PD). Previous works have proposed Test-Time Adaptation (TTA) to bridge the train-test domain gap by refining network parameters at inference, but the absence of ground-truth 3D annotations makes this highly challenging, and existing methods typically increase inference times by one or more orders of magnitude. We propose a self-consistency loss that requires no inference-time annotations and introduce a lightweight correction and selective adaptation framework that applies a fast correction to most data, reserving costly TTA for OOD data. Our approach is significantly faster than existing methods, and we demonstrate that it significantly improves the performance of multiple popular pose estimation models on benchmark datasets.

Subsequently, we apply 3D pose estimation to video-based PD severity scoring. Previous approaches have mostly proposed handcrafted, clinically interpretable features in conjunction with traditional machine learning algorithms; deep learning's potential has not yet been realized in video-based PD motor severity analysis because of the high cost of obtaining clinical expert annotations and the resulting acute lack of public datasets. We first investigate a weakly supervised approach, combining classical triangulation methods with a multi-view network to fine-tune a 3D body pose estimation model on "timed-up-go" videos. We obtain preliminary results, provide a thorough discussion of the difficulties encountered, and outline recommendations for future work. Next, we consider the "hand movement" PD evaluation task, extracting pose sequences with a state-of-the-art 3D hand pose estimation method and fusing black-box deep learning models with medically interpretable handcrafted features. In addition to fully supervised methods, we investigate a recent self-supervised time-series modeling approach. While the deep learning methods show promise, we find that the traditional machine learning approach based on handcrafted features performs best, and we discuss our findings.

Item Metadata

Title	3D pose estimation from 2D images with applications in Parkinson's disease
Creator	Bidulka, Luke
Supervisor	Wang, Z. Jane; McKeown, Martin J.
Publisher	University of British Columbia
Date Issued	2024
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2024-10-15
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivatives 4.0 International
DOI	10.14288/1.0445563
URI	http://hdl.handle.net/2429/89407
Degree (Theses)	Master of Applied Science - MASc
Program (Theses)	Electrical and Computer Engineering
Affiliation	Applied Science, Faculty of; Electrical and Computer Engineering, Department of
Degree Grantor	University of British Columbia
Graduation Date	2024-11
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/4.0/
Aggregated Source Repository	DSpace
