UBC Theses and Dissertations


Predicting new disease activity from deep grey matter on MRI in early multiple sclerosis using random forests and neural networks : feature selection and accounting for class label uncertainty

Tayyab, Maryam


Multiple sclerosis (MS) is an autoimmune disease of the central nervous system with a heterogeneous disease course, which makes patient-specific clinical outcomes difficult to predict. Machine learning can potentially improve predictions through feature extraction and/or by learning complex relationships. Morphological change in deep grey matter (DGM) structures is a consistent feature of all MS phenotypes, yet the value of DGM imaging features for clinical prediction is largely unexplored. In this thesis, I evaluated the contribution of DGM volumes and deep-learned (DL) features to predicting new disease activity within 24 months of a first clinical demyelinating event. The data set had two challenging characteristics: its feature types were highly heterogeneous, requiring a thoughtful exploration of feature selection methods, and 32 of the 140 patient samples had uncertain ground truth labels. I implemented and evaluated 1) random forest (RF) models trained on clinical and demographic features and DGM volumes, 2) four feature selection methods for RF training and their impact on model performance, 3) DL models trained on 3D segmentations of DGM nuclei with and without user-defined features, and 4) strategies to account for label uncertainty while training an RF. In a 7-fold nested cross-validation experiment, the best result without accounting for uncertainty (F1-score = 77.57%, SD = 6.60%) was achieved with an RF trained on manually selected features, which outperformed common automated feature selection methods such as iterative RFs (F1-score = 72.35%, SD = 6.91%). The neural network using only deep-learned DGM features achieved a slightly lower F1-score (73.02%, SD = 4.70%), which decreased further when user-defined features were added. When accounting for label uncertainty, the highest performance on the 108 samples with confirmed labels was produced by a probabilistic RF (F1-score = 89.62%, SD = 4.90%) trained on all available samples, which was higher than that obtained by training only on the confirmed labels.
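
The evaluation protocol named in the abstract (7-fold nested cross-validation, reported as a mean F1-score with a standard deviation) can be illustrated with a minimal, standard-library Python sketch. It is not the thesis code. The function names, the choice of 7 inner folds, the unstratified shuffling, and the use of the sample standard deviation are all assumptions made for illustration; the abstract specifies only the 7-fold nested design and the F1/SD reporting.

```python
import random
import statistics


def f1_score(y_true, y_pred, positive=1):
    """F1 = 2*TP / (2*TP + FP + FN) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def k_fold(indices, k, rng):
    """Shuffle the indices and deal them into k nearly equal folds."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def nested_splits(n_samples, k_outer=7, k_inner=7, seed=0):
    """Yield (outer_train, outer_test, inner_splits) for nested CV.

    inner_splits is a list of (inner_train, inner_val) pairs drawn only
    from outer_train, so the outer test fold never influences feature
    or hyperparameter selection. k_inner=7 is an assumption.
    """
    rng = random.Random(seed)
    outer = k_fold(range(n_samples), k_outer, rng)
    for i, test in enumerate(outer):
        train = [idx for j, fold in enumerate(outer) if j != i for idx in fold]
        inner = k_fold(train, k_inner, rng)
        inner_splits = [
            ([idx for b, fold in enumerate(inner) if b != a for idx in fold], val)
            for a, val in enumerate(inner)
        ]
        yield train, test, inner_splits


def summarize(fold_scores):
    """Mean and sample standard deviation across the outer folds."""
    return statistics.mean(fold_scores), statistics.stdev(fold_scores)
```

In use, a model would be selected on each set of inner splits, refit on the outer training indices, scored with `f1_score` on the outer test indices, and the seven outer scores passed to `summarize`.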



Attribution-NonCommercial-NoDerivatives 4.0 International