Targeted feature extraction : a deep learning approach

UBC Theses and Dissertations

Targeted feature extraction : a deep learning approach Tsai, Yiting

Abstract

This thesis details the progressive development of a Machine Learning workflow aimed at multi-class problems in the engineering and clinical fields. We select Deep Learning models as the basis of this modelling framework, as their universal approximation property renders them agnostic to the type of underlying data structure. We propose an optimal Deep Learning model which extracts interpretable features, capturing the decisive, salient characteristics of each data class. This is accomplished by revising the traditional Deep Learning objective, introducing an additional term which enhances class separation and identity. Using mathematical properties of the discovered latent space, we introduce a Feature Extractor based on weight traceback, which connects the decisive class-specific neurons to the raw variables in the input layer. The efficacy and necessity of the proposed strategy are demonstrated across six case studies. The first two studies highlight the inconsistency across clusters discovered by traditional Unsupervised Learning models, as well as the misconception of traditional Deep Learning as a magical solution to every problem. The following two studies demonstrate proof-of-concept for the proposed strategy on two Machine Learning benchmark datasets, showing visible improvements in both classification accuracy and feature extraction compared to baseline models. Finally, the remaining two studies explore clinical applications concerning the diagnosis of COVID-19 and Scleroderma patients. In each case, the proposed Machine Learning strategy is compared against traditional, state-of-the-art models with respect to class cluster separability, prediction accuracy, and biomarker discovery. The results show clear improvements in each of these areas; moreover, computational complexity analysis shows that our method scales linearly with the number of samples in the dataset, and in a linearithmic fashion with respect to the number of raw variables. The main practical contributions of this thesis include a significant improvement in prediction accuracy through the reduction of false discovery rates, as well as the discovery of signature variables which allow for targeted mitigation of undesired conditions.
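
The abstract does not spell out the traceback rule itself, so the following is only a rough sketch of what a weight traceback from a latent neuron to the input layer might look like. It assumes a plain fully connected network and scores each raw variable by propagating absolute weight magnitudes backwards; both that scoring rule and the names `weight_traceback` and `top_variables` are illustrative assumptions, not the thesis's actual Feature Extractor.

```python
def weight_traceback(weights, neuron):
    """Score each input variable for one latent neuron.

    ASSUMPTION: relevance is the product of absolute weights along
    every path from an input variable to the chosen neuron. The
    thesis may use a different rule.

    weights: list of layer matrices ordered input -> latent, where
             weights[k][i][j] links unit i of layer k to unit j of
             layer k + 1.
    neuron:  index of the class-specific neuron in the last layer.
    """
    # Start with all relevance placed on the selected latent neuron.
    relevance = [0.0] * len(weights[-1][0])
    relevance[neuron] = 1.0

    # Walk back through the layers, spreading relevance by |weight|.
    for layer in reversed(weights):
        relevance = [
            sum(abs(w) * r for w, r in zip(row, relevance))
            for row in layer
        ]
    return relevance


def top_variables(scores, names, k=3):
    """Return the k variable names with the highest traceback scores."""
    ranked = sorted(zip(names, scores), key=lambda p: p[1], reverse=True)
    return [name for name, _ in ranked[:k]]
```

Given the trained weight matrices of such a network, one would call `weight_traceback(weights, neuron)` for a neuron associated with a particular class and pass the resulting scores to `top_variables` to list candidate signature variables. Note that a simple magnitude product like this ignores activation functions and weight signs, so it is best read as a starting point for discussion.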

Item Metadata

Title	Targeted feature extraction : a deep learning approach
Creator	Tsai, Yiting
Supervisor	Gopaluni, Bhushan
Publisher	University of British Columbia
Date Issued	2023
Genre	Thesis/Dissertation
Type	Text
Language	eng
Date Available	2023-06-22
Provider	Vancouver : University of British Columbia Library
Rights	Attribution-NonCommercial-NoDerivatives 4.0 International
DOI	10.14288/1.0433729
URI	http://hdl.handle.net/2429/85137
Degree	Doctor of Philosophy - PhD
Program	Chemical and Biological Engineering
Affiliation	Applied Science, Faculty of; Chemical and Biological Engineering, Department of
Degree Grantor	University of British Columbia
Graduation Date	2023-11
Campus	UBCV
Scholarly Level	Graduate
Rights URI	http://creativecommons.org/licenses/by-nc-nd/4.0/
Aggregated Source Repository	DSpace
