BIRS Workshop Lecture Videos



Statistical methods for automated analysis of high-throughput protein localization data
Moses, Alan


Advances in high-throughput genetics and automated microscopy have led to biological image collections of unprecedented size and scale. These data are biologically rich, but also high-dimensional and highly heterogeneous. Open data analysis questions include: How do we compare quantitative measurements between experiments? How do we identify rare patterns where few (or no) known examples are available as training data? How do we obtain statistical confidence in non-independent measurements? I will describe our efforts to use unsupervised approaches to address several of these challenges. We have developed quantitative, biologically interpretable image-based measurements (features) that we can make for each cell in a microscope image, which allows us to quantitatively compare patterns and perform analysis in the feature space. We found that most previously known subcellular localization patterns can be identified in unsupervised analysis, and that rare, complex patterns of localization can also be identified. We have also explored kernel-based approaches to model cell-cell variability in image data, and have used these to perform the first systematic search for genes with cell-cell variability in subcellular localization. In general, I believe that putting the subcellular localization data from images in an unbiased quantitative framework will facilitate discovery and integration with other large-scale biological data.



Attribution-NonCommercial-NoDerivatives 4.0 International