Open Collections
BIRS Workshop Lecture Videos
Principal Component Analysis for microbiome data by correcting the measurement errors and sequencing depths
Gu, Hong
Description
Exploratory data analysis methods such as Principal Component Analysis (PCA) cannot be applied directly to microbiome data because of sampling errors and differences in sequencing depth. Under the assumption of Poisson sampling errors, we study the problem of computing a PCA of the underlying Poisson means or of a nonlinear transformation of the latent Poisson means. We develop a semiparametric approach to correct the bias of variance estimators, both for untransformed and transformed (with particular attention to log-transformation) Poisson means, without any assumptions on the underlying distribution of these means. Furthermore, we incorporate methods for correcting for different exposure or sequencing depth in the data. In addition to identifying the principal components, we also address the non-trivial problem of computing the principal scores in this semiparametric framework. Most previous approaches tend to take a more parametric line, for example the Poisson-log-normal (PLN) model approach. We compare our method with the PLN approach and find that our method is better at identifying the main principal components of the latent log-transformed Poisson means and, as a further major advantage, takes far less time to compute. Comparing methods on real data, we see that our method also appears to be more robust to outliers than the parametric method.
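To illustrate why a bias correction is needed under Poisson sampling, the sketch below uses a standard identity rather than the method presented in the talk: if the counts X are Poisson with latent means Λ, then Cov(X) = Cov(Λ) + diag(E[Λ]), so subtracting the column means from the diagonal of the sample covariance gives an estimate of the covariance of the untransformed latent means. The function names, the simulated data, and the choice of NumPy are assumptions made for this illustration; it does not cover the log-transformed case, sequencing-depth correction, or the computation of principal scores described in the abstract.

```python
import numpy as np


def latent_mean_pca(counts):
    """Illustrative PCA of untransformed latent Poisson means.

    Hypothetical helper, not the speaker's estimator: it applies the
    textbook identity Cov(X) = Cov(Lambda) + diag(E[Lambda]).
    """
    counts = np.asarray(counts, dtype=float)
    # Sample covariance of observed counts (rows = samples, columns = taxa).
    sample_cov = np.cov(counts, rowvar=False)
    # Remove the Poisson sampling variance from the diagonal.
    corrected = sample_cov - np.diag(counts.mean(axis=0))
    # The corrected matrix is symmetric, so eigh applies; sort descending.
    eigvals, eigvecs = np.linalg.eigh(corrected)
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]


def simulate_counts(n_samples=5000, seed=0):
    """Simulate counts whose latent means vary along a single direction."""
    rng = np.random.default_rng(seed)
    direction = np.array([3.0, 2.0, 1.0, 0.0])
    baseline = np.full(4, 20.0)
    t = rng.uniform(-1.0, 1.0, size=n_samples)
    latent = baseline + 4.0 * np.outer(t, direction)  # stays positive
    counts = rng.poisson(latent)
    return latent, counts, direction / np.linalg.norm(direction)


if __name__ == "__main__":
    latent, counts, unit_direction = simulate_counts()
    eigvals, eigvecs = latent_mean_pca(counts)
    naive_top = np.linalg.eigvalsh(np.cov(counts, rowvar=False))[-1]
    latent_top = np.linalg.eigvalsh(np.cov(latent, rowvar=False))[-1]
    print("naive top eigenvalue:    ", naive_top)
    print("corrected top eigenvalue:", eigvals[0])
    print("latent top eigenvalue:   ", latent_top)
```

In this simulation the naive sample covariance inflates every variance by roughly the mean count, while the corrected matrix tracks the covariance of the latent means more closely. Because the correction is a plain subtraction, the corrected matrix can have small negative eigenvalues in finite samples.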
Item Metadata
Title |
Principal Component Analysis for microbiome data by correcting the measurement errors and sequencing depths
|
Creator |
Gu, Hong
|
Publisher |
Banff International Research Station for Mathematical Innovation and Discovery
|
Date Issued |
2019-09-17T09:06
|
Extent |
39.0 minutes
|
Subject | |
Type | |
File Format |
video/mp4
|
Language |
eng
|
Notes |
Author affiliation: Dalhousie University
|
Series | |
Date Available |
2020-03-16
|
Provider |
Vancouver : University of British Columbia Library
|
Rights |
Attribution-NonCommercial-NoDerivatives 4.0 International
|
DOI |
10.14288/1.0389575
|
URI | |
Affiliation | |
Peer Review Status |
Unreviewed
|
Scholarly Level |
Faculty
|
Rights URI | |
Aggregated Source Repository |
DSpace
|