UBC Theses and Dissertations
Towards accurate compound annotation in mass spectrometry-based global metabolomics
Xing, Shipei
Abstract
Metabolomics is an emerging omics field that aims to characterize the entire metabolome of a biological system. Mass spectrometry (MS) is a preferred analytical technique for metabolomics research owing to its high sensitivity and the highly specific structural information it provides. However, accurately translating MS signals into chemical language remains a longstanding challenge that hinders downstream biological interpretation.
This dissertation presents computational strategies that contribute to the interpretation of tandem mass (MS/MS) spectra with the aid of machine learning and statistical approaches. Chapter 1 provides a holistic introduction to MS-based metabolomics and to the bioinformatic tools developed for uncovering unidentified metabolic features in untargeted metabolomics. Chapter 2 describes a novel MS/MS spectral comparison algorithm, Core Structure-based Search (CSS), which searches for structural analogs of unknown MS/MS spectra within existing MS/MS reference libraries. CSS shows improved correlations with structural similarity in large-scale benchmarking. In Chapter 3, a deep learning-based tool is developed for the automated extraction of steroid-like metabolic features from untargeted metabolomics data by classifying MS/MS fragmentation patterns. This biology-driven metabolomics pipeline enables metabolite characterization and discovery at the compound class level. Chapter 4 describes the purification of chimeric MS/MS spectra using a random forest model. Purified MS/MS spectra are demonstrated to yield better spectral matching results against MS/MS reference libraries. Chapter 5 describes a systematic analysis of radical fragment ions in MS/MS through MS/MS database mining. Larger-than-expected percentages of radical ions are present in collision-induced dissociation-based MS/MS; the relationships between radical ion percentages and compound classes, chemical substructures, and collision energies are also investigated. Chapter 6 discusses a standalone platform, BUDDY, for molecular formula discovery via bottom-up MS/MS interrogation and experiment-specific global peak annotation. BUDDY further integrates machine-learned ranking and significance control, showing improved formula annotation accuracy and lower computational cost than the other tools benchmarked.
Applying BUDDY to repository-scale recurrent unidentified MS/MS spectra, we discovered, with high confidence, >5,000 molecular formulae that are not archived in chemical databases. Overall, this dissertation demonstrates computational contributions that enrich the structural insights obtainable from MS-based untargeted metabolomics data, thus paving the way for understanding the biological mechanisms behind various health disorders and diseases from the perspective of small molecules.
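The abstract refers several times to matching MS/MS spectra against reference libraries. As a point of orientation only, the sketch below shows a generic binned cosine similarity, a common baseline for spectral matching. It is not the CSS algorithm or any other method from the dissertation; the function names, the 0.01 bin width, and the two-entry library are all hypothetical and chosen purely for illustration.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=0.01):
    """Sum the intensities of (m/z, intensity) peaks into fixed-width m/z bins."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[round(mz / bin_width)] += intensity
    return binned


def cosine_score(query, reference, bin_width=0.01):
    """Cosine similarity between two binned spectra (0 = no shared bins, 1 = identical shape)."""
    a = bin_spectrum(query, bin_width)
    b = bin_spectrum(reference, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query, library):
    """Return (name, score) of the library spectrum most similar to the query."""
    scored = ((name, cosine_score(query, peaks)) for name, peaks in library.items())
    return max(scored, key=lambda pair: pair[1])


# Hypothetical toy data: peak lists of (m/z, intensity) pairs.
library = {
    "compound_A": [(85.03, 100.0), (127.04, 50.0)],
    "compound_B": [(60.02, 80.0), (143.11, 20.0)],
}
query = [(85.03, 90.0), (127.04, 60.0)]

name, score = best_match(query, library)
print(name, round(score, 3))
```

Fixed-width binning keeps the sketch short; practical tools usually match peaks within an m/z tolerance and often reweight intensities before scoring.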
Item Metadata
Title | Towards accurate compound annotation in mass spectrometry-based global metabolomics |
Creator | |
Supervisor | |
Publisher | University of British Columbia |
Date Issued | 2023 |
Genre | |
Type | |
Language | eng |
Date Available | 2023-04-20 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International |
DOI | 10.14288/1.0431329 |
URI | |
Degree | |
Program | |
Affiliation | |
Degree Grantor | University of British Columbia |
Graduation Date | 2023-05 |
Campus | |
Scholarly Level | Graduate |
Rights URI | |
Aggregated Source Repository | DSpace |