Open Collections
BIRS Workshop Lecture Videos
Bayesian matrix factorization with side information and application to drug-target activity prediction and gene prioritization Moreau, Yves
Description
Matrix factorization/completion methods provide an attractive framework for handling sparsely observed data, such as the biological activity of chemical compounds against drug targets, where only 0.1% to 1% of all compound-target pairs are measured. Matrix factorization searches for latent representations of compounds and targets that allow an optimal reconstruction of the observed measurements. These methods can be combined with linear regression models to create multitask prediction models. In our case, fingerprints of chemical compounds are used as "side information" to predict target activity. In contrast to classical Quantitative Structure-Activity Relationship (QSAR) models, matrix factorization with side information naturally accommodates the multitask character of compound-target activity prediction. This methodology can be extended to a fully Bayesian setting to handle uncertainty optimally, which is of great value in the pharmaceutical setting, where experiments are costly. Our main contribution in this setting is a reformulation of the Gibbs sampler used for Markov Chain Monte Carlo (MCMC) Bayesian inference in the multilinear model of matrix factorization with side information. The reformulation shows that executing the Gibbs sampler only requires performing a sequence of linear regressions with a specific noise injection scheme. This allows the MCMC scheme to scale to millions of compounds, thousands of targets, and tens of millions of measurements, as demonstrated on a large industrial data set from a pharmaceutical company. We have developed a Python/C++ library, called Macau, that implements this method and can be applied to many modeling tasks well beyond our pharmaceutical setting. We discuss the application of our method to drug-target activity prediction using compound structure fingerprints as side information.
We also discuss the application of this method to drug-target activity prediction using high-content imaging assays as side information. Our results suggest that high-content imaging assays can be broadly repurposed for drug-target activity prediction and for the broad exploration of chemical space.
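The idea that a Gibbs step can be carried out as a linear regression with injected noise can be illustrated with a small NumPy sketch. This is not Macau's code or API. It is a generic illustration under stated assumptions: a single regression with unit noise variance and a zero-mean Gaussian prior with precision `lam` on the weights. The function name `sample_ridge_posterior` and the toy data are hypothetical. Solving the ridge system with perturbed targets and a perturbed prior term yields an exact draw from the Gaussian posterior, without forming or factorizing the posterior covariance separately.

```python
import numpy as np

def sample_ridge_posterior(X, y, lam, rng):
    """Draw one sample from N(A^{-1} X^T y, A^{-1}) with A = X^T X + lam * I.

    Assumes unit noise variance and a zero-mean Gaussian prior with
    precision lam on the weights (illustrative assumptions only).
    """
    n, d = X.shape
    e_obs = rng.standard_normal(n)    # noise injected into the targets
    e_prior = rng.standard_normal(d)  # noise injected through the prior term
    A = X.T @ X + lam * np.eye(d)
    # The perturbed right-hand side has covariance X^T X + lam * I = A,
    # so the solution has covariance A^{-1} A A^{-1} = A^{-1}.
    rhs = X.T @ (y + e_obs) + np.sqrt(lam) * e_prior
    return np.linalg.solve(A, rhs)

# Toy data (hypothetical): 200 observations, 3 features.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + rng.standard_normal(200)
lam = 1.0

# Draw many samples and compare with the closed-form posterior.
samples = np.array([sample_ridge_posterior(X, y, lam, rng) for _ in range(5000)])
A = X.T @ X + lam * np.eye(3)
post_mean = np.linalg.solve(A, X.T @ y)
post_cov = np.linalg.inv(A)
```

The empirical mean and covariance of `samples` should agree with `post_mean` and `post_cov` up to Monte Carlo error. In a large-scale setting the dense solve would typically be replaced by an iterative solver, which is one plausible reason such a formulation scales, although the abstract does not specify the solver used.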
Item Metadata
| Title | Bayesian matrix factorization with side information and application to drug-target activity prediction and gene prioritization |
| Creator | |
| Publisher | Banff International Research Station for Mathematical Innovation and Discovery |
| Date Issued | 2017-03-28T14:25 |
| Extent | 27.0 |
| Subject | |
| Type | |
| File Format | video/mp4 |
| Language | eng |
| Notes | Author affiliation: KU Leuven |
| Series | |
| Date Available | 2019-03-10 |
| Provider | Vancouver : University of British Columbia Library |
| Rights | Attribution-NonCommercial-NoDerivatives 4.0 International |
| DOI | 10.14288/1.0376752 |
| URI | |
| Affiliation | |
| Peer Review Status | Unreviewed |
| Scholarly Level | Faculty |
| Rights URI | |
| Aggregated Source Repository | DSpace |