UBC Theses and Dissertations


Feature analysis and in silico prediction of lower solubility proteins in three eukaryotic model systems

Chan, Gerard


Regulation of protein solubility, the ability of proteins to remain soluble within the cell, is an important part of protein homeostasis. Its importance is highlighted by the association of disrupted protein homeostasis and dysregulated solubility with various neurodegenerative diseases. Using quantitative mass spectrometry and computational analyses, we identify low solubility proteins under unstressed conditions in three eukaryotic model systems: yeast cells, human neuroblastoma cells, and mouse brain tissue. Using an internal reference, we account for protein abundance, which allows proteins to be analyzed based on their partitioning between the soluble and insoluble fractions rather than purely on their abundance within the insoluble fraction. We identify several intrinsic traits, such as length, disorder, abundance, molecular recognition features, and low complexity regions, that are correlated with protein solubility. These features have previously been shown to be associated with protein-protein interactions, which suggests that, under unstressed conditions, lower solubility may be linked to functional rather than aberrant aggregation. We then present two predictors of in vivo protein solubility, built using the many traits examined in this work. The linear regression model gives estimates of protein solubility, although proteins near the threshold between low and normal solubility may be misclassified. The Support Vector Machine reliably distinguishes between low and high solubility proteins, but does not reliably distinguish between low and normal solubility proteins. In summary, we have identified several traits that distinguish low solubility proteins from other proteins and developed two models that estimate protein solubility.
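
The idea of scoring proteins by partitioning rather than by raw insoluble abundance, and then regressing that score on an intrinsic trait, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the thesis: the score is taken to be a log2 ratio of insoluble-fraction signal to a reference signal (higher score meaning lower solubility), the function names `partition_score`, `fit_line`, and `classify` are hypothetical, the cutoff of 1.0 is arbitrary, and the three proteins are invented toy data. The actual models use many traits; this sketch uses only protein length.

```python
import math


def partition_score(insoluble, reference):
    """Hypothetical score: log2 ratio of insoluble-fraction signal to reference signal.

    Dividing by the reference removes the effect of overall abundance,
    so a higher score means a larger share of the protein is insoluble.
    """
    if insoluble <= 0 or reference <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(insoluble / reference)


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept with one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def classify(score, low_cutoff=1.0):
    """Label a protein 'low' solubility when its score reaches an arbitrary cutoff."""
    return "low" if score >= low_cutoff else "normal"


# Invented toy data: name -> (length in residues, insoluble signal, reference signal)
proteins = {
    "A": (200, 10.0, 10.0),
    "B": (400, 20.0, 10.0),
    "C": (800, 40.0, 10.0),
}

scores = {name: partition_score(ins, ref) for name, (_, ins, ref) in proteins.items()}

# Regress the partition score on log2(length), one of the traits named in the abstract.
slope, intercept = fit_line(
    [math.log2(length) for length, _, _ in proteins.values()],
    list(scores.values()),
)

labels = {name: classify(score) for name, score in scores.items()}
```

In this toy data the score rises with length by construction, so the fitted slope is positive. A protein whose score sits right at the cutoff shows why a regression estimate can misclassify proteins near the boundary between low and normal solubility.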



Attribution-NonCommercial-NoDerivs 2.5 Canada