Open Collections
UBC Theses and Dissertations
Evaluating the performance of between-sample heterogeneity identification algorithms in large-scale flow cytometry data analysis
Chen, Yixuan
Abstract
The development and application of machine learning (ML) models for automated flow cytometry (FCM) data analysis require efficient handling of large datasets. Detecting heterogeneity and removing highly redundant information in FCM training sets are crucial for decreasing ML training time and for improving the performance of ML algorithms by reducing overfitting. Our research introduces "flowTypeFilter" and "flowSim," novel computational tools aimed at improving the detection of heterogeneity and reducing data redundancy in FCM training sets. To address shortcomings in the state of the art in this area, I contributed to the development and evaluation of flowTypeFilter and to the evaluation and application of flowSim. By optimizing the flowType algorithm into flowTypeFilter, we enhanced cell subset detection in line with the HIPC Project's automated gating requirements. Additionally, flowSim is engineered for near-duplicate detection (NDD) in bivariate FCM images, focusing on the heterogeneity that is crucial for cell population identification. After testing on annotated and co-mixed datasets and extensive data collection, our algorithms demonstrated high accuracy, sensitivity, and efficiency. Notably, flowSim's filtering flagged 92.6% of half a million entries as redundant and removed them, underscoring the importance of our computational strategies in large-scale FCM analysis.
Item Metadata
Title | Evaluating the performance of between-sample heterogeneity identification algorithms in large-scale flow cytometry data analysis
Creator | |
Supervisor | |
Publisher | University of British Columbia
Date Issued | 2024
Genre | |
Type | |
Language | eng
Date Available | 2024-08-28
Provider | Vancouver : University of British Columbia Library
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International
DOI | 10.14288/1.0445201
URI | |
Degree | |
Program | |
Affiliation | |
Degree Grantor | University of British Columbia
Graduation Date | 2024-11
Campus | |
Scholarly Level | Graduate
Rights URI | |
Aggregated Source Repository | DSpace