Open Collections

UBC Theses and Dissertations


Meta-analyses of expression profiling data in the postmortem human brain Mistry, Meeta 2012


META-ANALYSES OF EXPRESSION PROFILING DATA IN THE POSTMORTEM HUMAN BRAIN

by Meeta Mistry

B.Sc., McMaster University, 2005

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE FACULTY OF GRADUATE STUDIES (Bioinformatics)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

July 2012

© Meeta Mistry, 2012

Abstract

Schizophrenia is a severe psychiatric illness for which the precise etiology remains unknown. Studies using postmortem human brain have become increasingly important in schizophrenia research, providing an opportunity to directly investigate the diseased brain tissue. Gene expression profiling technologies have been used by a number of groups to explore the postmortem human brain and seek genes which show changes in expression correlated with schizophrenia. While this has been a valuable means of generating hypotheses, there is a general lack of consensus in the findings across studies. Expression profiling of postmortem human brain tissue is difficult due to the effect of various factors that can confound the data. The first aim of this thesis was to use control postmortem human cortex for identification of expression changes associated with several factors, specifically: age, sex, brain pH and postmortem interval. I conducted a meta-analysis across the control arm of eleven microarray datasets (representing over 400 subjects), and identified a signature of genes associated with each factor. These genes provide critical information towards the identification of problematic genes when investigating postmortem human brain in schizophrenia and other neuropsychiatric illnesses. The second aim of this thesis was to evaluate gene expression patterns in the prefrontal cortex associated with schizophrenia by exploring two methods of analysis: differential expression and coexpression.
Seven schizophrenia microarray studies of prefrontal cortex were combined for a total of 153 subjects with schizophrenia and 153 healthy controls. Meta-analysis was conducted with careful consideration for the effects of covariates, revealing a robust list of 98 differentially expressed ‘schizophrenia genes’. Using the same seven schizophrenia datasets, coexpression networks were generated for control and schizophrenia cohorts within each dataset and then combined across studies using a rank aggregation approach. Topological properties of our ‘schizophrenia genes’ were evaluated in the context of each network, highlighting differences in correlation structure of these genes in the control and schizophrenia brain. Together these results converge towards a general conclusion, emphasizing that the integration of postmortem human brain expression profiling data improves statistical power and is particularly useful in detecting subtle yet consistent changes in expression associated with schizophrenia.

Preface

Together with my supervisor, Paul Pavlidis, I was responsible for the identification and design of the research program described in this thesis. I was the primary author of every chapter and the corresponding publications. My supervisor, Paul Pavlidis, contributed study design, supervision, concepts, text and editorial suggestions for all chapters.

A version of Chapter 2 has been published (Mistry M, Pavlidis P (2010). A cross-laboratory comparison of expression profiling data from normal postmortem human brain. Neuroscience 167(2):384-95. doi:10.1016/j.neuroscience.2010.01.016).

A version of Chapter 3 has been published (Mistry M, Gillis J, Pavlidis P (2012). Genome-wide expression profiling of schizophrenia using a large combined cohort. Molecular Psychiatry. doi:10.1038/mp.2011.172). Jesse Gillis contributed to Chapter 3 and is a co-author of the corresponding publication.
Specifically, Jesse contributed network analysis, interpretation and editorial suggestions for Chapter 3.

For Chapter 4, Jesse Gillis was responsible for the construction of the rank aggregated coexpression matrices. Jesse also contributed significantly to this chapter by advising on subsequent analyses and interpretation of results and providing guidance and editorial suggestions for this chapter.

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
List of Abbreviations and Gene Definitions
Acknowledgements
Chapter 1: Introduction
  1.1 Thesis Overview
  1.2 Neuropsychiatric Illness
  1.3 Schizophrenia
    1.3.1 Theories of Pathophysiology
    1.3.2 Genetic and Environmental Factors
    1.3.3 Insights from Human Brain Studies
  1.4 Postmortem Human Brain Tissue
    1.4.1 Tissue Heterogeneity
    1.4.2 Tissue Quality
    1.4.3 Clinical Quality
    1.4.4 Demographic Data
  1.5 Gene Expression Profiling
    1.5.1 Profiling Technologies
    1.5.2 RNA Quality
    1.5.3 Limitations of Microarrays
    1.5.4 Differential Expression
    1.5.5 Coexpression
    1.5.6 Network Analysis
  1.6 Meta-analysis
    1.6.1 Meta-analysis of Differential Expression
    1.6.2 Meta-analysis of Coexpression Networks
  1.7 Thesis Chapters Summary
Chapter 2: Meta-analysis of the normal human postmortem brain
  2.1 Introduction
  2.2 Methods
    2.2.1 Data Collection
    2.2.2 Regression Analysis
    2.2.3 Meta-analysis of Differential Expression
    2.2.4 Validation Analysis
  2.3 Results
  2.4 Discussion
Chapter 3: Genome-wide expression profiling of schizophrenia using a large combined cohort
  3.1 Introduction
  3.2 Methods
    3.2.1 Data Collection
    3.2.2 Data Pre-processing
    3.2.3 Data Quality Control
    3.2.4 Statistical Modeling
    3.2.5 Literature-derived Signatures
    3.2.6 Enrichment Analysis
    3.2.7 Network Analysis
  3.3 Results
  3.4 Discussion
Chapter 4: Gene coexpression network analysis of schizophrenia
  4.1 Introduction
  4.2 Methods
    4.2.1 Data Processing and Quality Control
    4.2.2 Gene Coexpression Networks
    4.2.3 Random Coexpression Networks
    4.2.4 Network Properties
    4.2.5 Schizophrenia Meta-signature Network Analysis
    4.2.7 Network Clustering
    4.2.8 Enrichment Analysis
  4.3 Results
  4.4 Discussion
Chapter 5: Conclusion
  5.1 Summary of Major Findings
  5.2 Contribution to Field of Study
  5.3 Strengths and Limitations
  5.4 Interpretation of Findings
  5.5 Potential Applications and Future Directions
References
Appendix
  Appendix A: ‘Core’ meta-signature lists for age, brain pH, PMI and sex
    Age Down-regulated

List of Tables

Table 1: DSM-IV-TR Diagnostic criteria for schizophrenia
Table 2: Candidate genes in schizophrenia
Table 3: Human postmortem brain datasets included in control brain meta-analysis
Table 4: Sample characteristics for control human postmortem brain datasets
Table 5: Significant genes (q<0.01) identified within each individual dataset
Table 6: Top meta-signature genes for age, pH, sex and PMI
Table 7: Comparison of meta-signature profiles against validation gene sets
Table 8: Schizophrenia candidate gene analysis
Table 9: Rank correlations between sample information
Table 10: Evaluating gene overlap between meta-signatures
Table 11: Schizophrenia datasets
Table 12: Summary of demographic variables across combined cohort
Table 13: Probe model selection across schizophrenia signatures
Table 14: Schizophrenia meta-signatures
Table 15: Comparison of meta-signatures with findings from original studies
Table 16: Evaluating meta-signatures against brain-specific gene coexpression modules
Table 17: Whole network properties of the control and schizophrenia brain networks
Table 18: Schizophrenia gene set network properties
Table 19: Gene Ontology enrichment of modules identified by WGCNA-based clustering
Table 20: Gene Ontology enrichment of modules identified by MCODE clustering

List of Figures

Figure 1: A simple schematic of mesolimbic and mesocortical circuitry
Figure 2: Distribution of dataset p-values across meta-signature q-values
Figure 3: Distribution of dataset p-values for individual genes: a magnified view
Figure 4: Top genes down-regulated with age
Figure 5: GO enrichment analysis
Figure 6: Investigating the relationship between age and brain pH
Figure 7: Example of consistent expression changes for a gene across data sets
Figure 8: Expression changes in the ‘core signatures’
Figure 9: Connectivity distribution of control and schizophrenia networks
Figure 10: Shared edges between networks
Figure 11: Comparison to random network distributions
Figure 12: Comparison of gene set properties to functional GO groups
Figure 13: Jackknifed network measures
Figure 14: Network representation of within gene set interactions for schizophrenia meta-signature genes
Figure 15: Comparison of modules between networks (WGCNA)
Figure 16: Enrichment of cell type markers in WGCNA modules
Figure 17: Enrichment of cell type markers in MCODE modules
Figure 18: Cluster comparison between WGCNA and MCODE clustering algorithms

List of Abbreviations and Gene Definitions

ABCA1  ATP-binding cassette, sub-family A (ABC1), member 1
AIC  Akaike information criterion
AMPA  α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid
ANOVA  analysis of variance
ANP32A  acidic nuclear phosphoprotein 32 family, member A
APBA2  amyloid beta (A4) precursor protein-binding, family A, member 2
APOD  apolipoprotein D
ATP5C1  ATP synthase, H+ transporting, mitochondrial F1 complex, gamma polypeptide 1
AUC  area under the curve
BA47  Brodmann area 47
BA9  Brodmann area 9
BAZ1A  bromodomain adjacent to zinc finger domain, 1A
BBX  bobby sox homolog (Drosophila)
BDNF  brain-derived neurotrophic factor
CBFA2T2  core-binding factor, runt domain, alpha subunit 2; translocated to, 2
cDNA  complementary DNA
CNS  central nervous system
CNV  copy number variation
COMT  catechol-O-methyltransferase
COPS7B  COP9 constitutive photomorphogenic homolog subunit 7B
COQ4  coenzyme Q4 homolog (S. cerevisiae)
CRHR  corticotropin releasing hormone receptor 1
CRYM  crystallin, mu
CYP26B1  cytochrome P450, family 26, subfamily B, polypeptide 1
DAOA  D-amino acid oxidase activator
DCAF8  DDB1 and CUL4 associated factor 8
DISC1  disrupted in schizophrenia 1
DLGAP1  disks large-associated protein 1
DNA  deoxyribonucleic acid
DSM-IV  diagnostic and statistical manual of mental disorders version 4
DTI  diffusion tensor imaging
DTNBP1  dystrobrevin binding protein 1
EIF2C3  eukaryotic translation initiation factor 2C, 3
EIF3E  eukaryotic translation initiation factor 3, subunit E
eQTL  expression quantitative trait loci
ERBB3  v-erb-b2 erythroblastic leukemia viral oncogene homolog 3
FBXO9  F-box protein 9
FDR  false discovery rate
FEM  fixed effects model
fMRI  functional magnetic resonance imaging
GABA  gamma-aminobutyric acid
GABBR1  gamma-aminobutyric acid (GABA) B receptor, 1
GABRG2  gamma-aminobutyric acid (GABA) A receptor, gamma 2
GAD65  glutamate decarboxylase 2 (pancreatic islets and brain, 65kDa)
GAD67  glutamate decarboxylase 1 (brain, 67kDa)
GAPDH  glyceraldehyde-3-phosphate dehydrogenase
GBA  guilt-by-association
gcRMA  robust multiarray averaging with GC-content background correction
GEO  gene expression omnibus
GFAP  glial fibrillary acidic protein
GNAL  guanine nucleotide binding protein (G protein), alpha activating activity polypeptide, olfactory type
GNB1L  guanine nucleotide binding protein (G protein), beta polypeptide 1-like
GNB5  guanine nucleotide binding protein (G protein), beta 5
GO  gene ontology
GPCR  G-protein-coupled receptor
GSR  gene score resampling
GTPase  enzyme that hydrolyzes guanosine triphosphate
hiPSCs  human induced pluripotent stem cells
KCNK1  potassium channel, subfamily K, member 1
LCM  laser capture microdissection
LC-MS  liquid chromatography-tandem mass spectrometry
LPL  lipoprotein lipase
MAG  myelin associated glycoprotein
MAPK1  mitogen-activated protein kinase 1
MAQC  microarray quality control
MAS5.0  microarray analysis suite 5
MEM  mixed effects model
MHC  major histocompatibility complex
MPSS  massively parallel signature sequencing
MRI  magnetic resonance imaging
mRNA  messenger ribonucleic acid
NGS  next-generation sequencing
NMDA  N-methyl-D-aspartate
NOVA1  neuro-oncological ventral antigen 1
NRG1  neuregulin 1
NSF  N-ethylmaleimide-sensitive factor
OLIG2  oligodendrocyte lineage transcription factor 2
OPCML  opioid binding protein/cell adhesion molecule-like
OPN3  opsin 3
ORA  over-representation analysis
PAIP2B  poly(A) binding protein interacting protein 2B
PCP  phencyclidine
PCSK2  proprotein convertase subtilisin/kexin type 2
PFC  prefrontal cortex
PKP4  plakophilin 4
PLOD2  procollagen-lysine, 2-oxoglutarate 5-dioxygenase 2
PLP1  proteolipid protein 1
PMI  postmortem interval
PPI  protein-protein interaction
RGS12  regulator of G-protein signaling 12
RGS17  regulator of G-protein signaling 17
RGS4  regulator of G-protein signaling 4
RGS6  regulator of G-protein signaling 6
RGS7  regulator of G-protein signaling 7
RIN  RNA integrity number
RMA  robust multiarray averaging
RNA  ribonucleic acid
RNS  reactive nitrogen species
ROC  receiver operating characteristic
ROS  reactive oxygen species
rRNA  ribosomal ribonucleic acid
RT-PCR  reverse transcriptase PCR
SAGE  serial analysis of gene expression
SAM  significance analysis of microarrays
SLC25A12  solute carrier family 25 (mitochondrial carrier, Aralar), member 12
SLC25A15  solute carrier family 25 (mitochondrial carrier; ornithine transporter) member 15
SMG1  smg-1 homolog, phosphatidylinositol 3-kinase-related kinase (C. elegans)
sMRI  structural magnetic resonance imaging
SMRI  Stanley Medical Research Institute
SNN  stannin
SNP  single nucleotide polymorphism
SYN2  synapsin II
SYNJ1  synaptojanin 1
TACC2  transforming, acidic coiled-coil containing protein 2
TOM  topological overlap matrix
USP19  ubiquitin specific peptidase 19
VCFS  velo-cardio-facial syndrome
VTA  ventral tegmental area
WGCNA  weighted gene coexpression network analysis
WNK1  WNK lysine deficient protein kinase 1
XIST  X (inactive)-specific transcript (non-protein coding)
ZDHHC8  zinc finger, DHHC-type containing 8

Acknowledgements

I would like to thank my graduate supervisor Paul Pavlidis, whose kindness, support and guidance throughout the duration of my graduate studies have been invaluable to me. With his encouragement, patience and his great efforts to explain things clearly, he has helped make this a wonderful learning experience for me. Thanks also to my thesis committee members: Robert Holt, Clare Beasley and Wyeth Wasserman, whose support and advice have successfully guided me through my graduate career.

I am greatly indebted to the groups and institutions that made their data available, including Karoly Mirnics (Vanderbilt), Vahram Haroutunian (Mt Sinai), the Stanley Medical Research Institute (SMRI) and the Harvard Brain bank. This thesis would not have been possible without their generosity. I would also like to thank Nicole Berchtold (University of California, Irvine), Mehmet Somel (Max Planck Institute for Evolutionary Anthropology), Alice Chen-Plotkin (Center for Neurodegenerative Disease Research), and Elizabeth Thomas (Scripps Research Institute) for providing additional information on their data sets.

I would particularly like to acknowledge Jesse Gillis for his seasoned advice and guidance for much of the work done in Chapter 4.
His insightful thoughts on my work and his ability to explain network theory have been incredibly helpful and a large contribution towards the completion of this thesis. Thank you to all my fellow labmates and friends at CHiBi for helping me through the process. The support and encouragement of many fellow graduate students and friends have been indispensable; specific thanks go to: Kelsey Hamer, Leon French, Thomas Sierocinski, Audrey Houillier, Olena Morozova, Shabnam Tavassoli, Katayoon Kasaian, Misha Kanji, Shreena Desai, Audrey Cherryl Mogan, Warren Cheung, Simon Chan, Luke McCarthy, Carri-Lyn Mead, Ben Vandervalk, Reena Grewal and Anna Johnson. I am also very grateful to the Canadian Institutes of Health Research and the MIND Foundation of BC for awarding me funding throughout my graduate career.

Finally, heartfelt thanks are owed to my family for their continued understanding and support throughout the process. I would like to thank my sister and her partner, Sonal and Neil Ghosh, my brothers Mandip Mistry and Bhavik Mistry, and most of all my parents for their love and support throughout everything I do in my life.

Chapter 1: Introduction

Like any other biological organ system, the function of the human brain is ultimately determined by the function of the genome, in a complex interplay with the organism’s environment and life history. Therefore it has long been known that normal and disease processes in the brain reflect processes which are either under direct control of the genome or are influenced by genetic variability. As molecular neuroscience has matured, we have become able to make the transition from high level structural organization of the brain to detailed maps of genetic influences. The power to investigate and discover genetic and genomic mechanisms underlying the health and disease of the human brain has increased dramatically.
Gene expression profiling by use of microarrays has frequently been applied to studying the postmortem human brain [1, 2]. These studies enable detailed investigation of gene expression patterns related to human brain circuitry and aid in our understanding of the precise spatiotemporal regulation of the brain’s transcriptome [3, 4]. Establishing a better understanding of normal human brain processes is critical for elucidating the pathophysiology and etiology of schizophrenia amongst other major mental illnesses. In this thesis, I sought to explore gene expression changes in the normal and diseased human brain by leveraging the large amounts of available microarray data. Combining datasets across studies enabled more powerful statistical analyses to generate findings not observable in single dataset studies. In the diseased human brain, I focused on expression alterations associated with subjects diagnosed with schizophrenia. Schizophrenia is a complex psychiatric disorder that is highly heritable with a strong genetic component [5, 6]. From existing gene expression studies of schizophrenia we are faced with a mixture of diverse and concordant findings. While microarrays provide a valuable means of generating hypotheses, there are challenges that persist in identifying truly reliable gene expression changes associated with psychiatric illnesses. Because the neuropathology being sought is not obvious, we must correctly control for possible factors confounded with the disease to carefully distinguish signal from noise. In my analysis of the normal human brain, I sought to evaluate expression changes associated with such factors.

In this chapter, I introduce the background for my research which combines two challenging topics: (1) the analysis of expression profiling data in the face of substantial sources of noise and small signals and (2) the continuing pursuit to uncover the etiological components of schizophrenia.
This introduction reviews the issues of expression profiling and postmortem human brain, the statistical methods for integration of profiling studies, and past and current research on the pathophysiology of schizophrenia.

1.1 Thesis Overview

In this thesis, I apply statistical methods for integrating microarray data across studies, with a particular focus on human postmortem brain tissue. The current literature contains numerous studies using postmortem brain tissue, and the number of studies that attempt to integrate these data is also on the rise [7-13]. I emphasize that the integration of data across studies increases statistical power and facilitates the identification of more robust changes in expression. The first part of this thesis (Chapter 2) focuses on the normal human cortex, examining gene expression patterns associated with various potentially confounding factors across eleven independent microarray studies. Expression changes associated with age, sex, postmortem interval, and brain pH are evaluated within each dataset and then combined across datasets using meta-analytical methods. I aim to identify lists of genes significantly associated with each factor. These genes are of interest in their own right, and careful consideration of these effects will also help researchers interpret future postmortem brain studies. The second part of this thesis (Chapter 3) focuses on evaluating gene expression changes in the prefrontal cortex associated with schizophrenia. My goal was to find statistically significant ‘schizophrenia genes’ that show consistent patterns across seven independent microarray studies. Expression changes associated with schizophrenia are subtle and often masked by expression changes due to extraneous factors. I hypothesized that by combining data across studies, I could increase statistical power to identify small changes associated with the illness.
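
As a rough illustration of how per-dataset evidence for a single gene might be pooled, the sketch below applies Fisher's combined probability test to a list of p-values. The overview above does not name the combination method, so the choice of Fisher's method, the function name `fisher_combined_p` and any input values are assumptions made for illustration only, and the test presumes the datasets are independent.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values for one gene across datasets.

    Illustrative sketch of Fisher's combined probability test; this is an
    assumed method, not necessarily the procedure used in the thesis.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    # Under the null hypothesis, -2 * sum(ln p) follows a chi-square
    # distribution with 2k degrees of freedom. We work with half of it.
    half_stat = -sum(math.log(p) for p in p_values)

    # Closed-form chi-square survival function for even degrees of freedom:
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total
```

For example, `fisher_combined_p([0.05, 0.05])` is smaller than 0.05, showing how two modest but concordant results can reinforce each other, whereas a single p-value is returned unchanged.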
I also incorporate results from the normal brain analysis to ensure the correct control of various factors that can be confounded with the disease effect. While the first two projects of this thesis are concerned with differential expression, in the last section (Chapter 4) I turn the focus to coexpression. Using the same seven schizophrenia datasets used in Chapter 3, I perform a meta-analysis of gene coexpression. Separate networks are generated for the control and schizophrenia brain, and the network properties of each are evaluated. The differentially expressed ‘schizophrenia genes’ obtained through meta-analysis in Chapter 3 are also interrogated in the context of each network. I aim to identify characteristic network properties of the ‘schizophrenia genes’ and evaluate how they differ from other functional gene groups. I hypothesize that evaluating the interactions of these genes in the context of each network will reveal unknown relationships between them and possibly highlight functional differences in the control and schizophrenia human brain.  1.2 Neuropsychiatric Illness Neuropsychiatric illnesses are complex brain disorders that arise from a combination of genetic and environmental influences. Examples of such illnesses include schizophrenia, bipolar disorder, major depressive disorder and autism spectrum disorder. The development of new treatment approaches for psychiatric illnesses is grounded in a fundamental understanding of disease etiology and pathophysiology, something we have long been searching for. Over the past decade there has been a vast amount of research conducted in this area, but despite these efforts, there has been little success in psychiatry compared to other areas of medicine [14]. Difficulty in isolating the exact cause for these disorders stems from a number of challenges. A first obstacle occurs at the stage of diagnosis due to the phenotypic heterogeneity of the affected population. 
There is often a lack of definitive symptoms that consistently manifest in all affected subjects. In general, disease is investigated as a set of binary traits where people are considered to be with or without disease, but this simple binary system does not apply well to psychiatric illness which can consist of a spectrum of phenotypes. A second challenge is that it has been established that psychiatric disorders are generally not single gene disorders and do not show a simple Mendelian pattern of inheritance [15]. They implicate a large number of genes, suggestive of multiple affected pathways which result in selective failures of normal brain function. Finally, the brain is a highly complex organ, with proper function dependent on the coordinated activity of different cell types across several brain regions. We are still at an early stage of understanding normal brain function and the precise inner workings at the level of neuronal circuits. Thus, a critical barrier to finding better diagnosis and treatment for psychiatric illness lies within these challenges. By addressing these issues we can gain a deeper understanding of the causal pathways and make progress with these debilitating disorders.  1.3 Schizophrenia Schizophrenia is a complex psychiatric illness that affects about one percent of the population worldwide [16]. The disorder presents itself with a combination of signs and symptoms that vary across affected patients, but is predominantly defined by observed signs of psychosis. Diagnosis of schizophrenia is based on a set of specified criteria found in the Diagnostic and Statistical Manual of Mental Disorders fourth edition (DSM-IV-TR) [17]. These criteria are also listed in Table 1 of this thesis. Five categories of clinical symptoms are assessed and the patient must have at least two for a 1-month period. 
Further, these symptoms must associate with disturbance in work, interpersonal relations or self-care of the patient for at least a 6-month period. Symptoms are typically clustered into three domains: positive, negative and cognitive. Positive symptoms are those that appear to be in excess of the individual’s normal functioning. Some common examples, and perhaps the most dramatic clinical aspects, include paranoid delusions and auditory hallucinations. In contrast, negative symptoms reflect the absence of certain functions that are present in the normal individual; for example lack of emotion or motivation, and social isolation. The cognitive domain of symptoms arises not from observations of clinical symptoms, but from clinical neuropsychology whereby standardized tests are used to assess the level of function (e.g. Wisconsin Card Sorting Test, N-back test) [18]. Studies using such approaches have shown that affected patients show deficits in a variety of cognitive domains including executive function, attention, working memory and language [19]. The onset of psychosis typically occurs in young adulthood (18-25 years of age), with patients experiencing a slow and gradual development of signs earlier in life, also known as the prodromal phase. In the prodromal phase, patients experience non-specific behavioural changes manifesting mostly negative symptoms, for example social withdrawal or sudden outbursts of anger. The current literature indicates sex differences in the symptomology and prognosis of schizophrenia, although results are not always consistent [20]. For example, schizophrenia tends to occur in women 3-4 years later in life than men, and women also tend to have milder forms of the disease in their younger years than their male counterparts. Further, epidemiological studies have found higher incidence rates in males than in females [21].  
1.3.1 Theories of Pathophysiology Several neurotransmitter systems have been found to exhibit dysfunction in the brains of schizophrenia patients [22]. Here, I discuss the role of three neurotransmitters: dopamine, glutamate and gamma-aminobutyric acid (GABA) and their involvement in schizophrenia pathophysiology.  Dopamine is a catecholamine neurotransmitter that has many functions in the brain including important roles in behavior and cognition. Historically, dopamine receptors were divided into two major subtypes referred to as D1 and D2, but it is now recognized that additional receptors D3, D4, and D5 also exist [23]. The first formulation of the ‘dopamine hypothesis’ suggested an excess of dopaminergic activity associated with schizophrenia. This hypothesis evolved from the antipsychotic effects observed when the D2 receptor antagonist chlorpromazine (as part of its early clinical testing as an antihistamine) was administered to patients in a French mental hospital. Furthermore, psychosis-inducing effects were observed from dopamine releasing drugs such as amphetamines [24]. Given the predominant localization of D2 receptors in subcortical regions of the brain, much of the ensuing research focused attention on the mesolimbic dopamine pathway composed of dopaminergic neurons in the ventral tegmental area (VTA) and their projections to the nucleus accumbens, regions of the hippocampus, and the mesial components of the frontal, anterior cingulate and entorhinal cortices (Figure 1). Due to the conceptualization of schizophrenia as a disorder of increased dopamine transmission, the treatment of schizophrenia remained unchanged for many decades. First-generation or ‘typical’ antipsychotics such as chlorpromazine and haloperidol remained a first choice for patients, despite their unpleasant side effects. 
A reformulation of the ‘dopamine hypothesis’ eventually emerged, implicating the D1 receptor and the mesocortical dopamine pathway (Figure 1), whereby the neocortex (mainly the prefrontal cortex (PFC)) receives a dense dopaminergic innervation from the VTA. D1 receptors are localized in the PFC, and have been found to be associated with cognitive dysfunctions, especially working memory [25, 26]. Moreover, brain imaging studies show abnormalities of the D1 receptor density in the frontal cortex of persons with schizophrenia [27, 28]. These lines of evidence suggest that hypoactivity of dopamine transmission in the mesocortical pathway is associated with negative and cognitive symptoms of schizophrenia, whereas hyperactivity of dopamine transmission in the mesolimbic pathway is attributed to positive symptoms of schizophrenia. A case has also been made proposing a link between the two hypotheses, in which the mesolimbic dopaminergic function in schizophrenia is a secondary phenomenon of a functionally compromised mesocortical system [29].  Another hypothesis suggests that hypofunctioning of the glutamatergic system may be involved in the pathogenesis of schizophrenia. Glutamate is the primary excitatory neurotransmitter in the mammalian brain and is thought to be utilized by 40% of all synapses [30]. Glutamate mediates its actions via four different post-synaptic receptors: N-methyl-D-aspartate (NMDA), alpha-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid (AMPA), kainate, and metabotropic receptors. Of the different receptor types, NMDA is the one that has received the most attention. Low doses of dissociative anesthetics such as phencyclidine (PCP) or ketamine, when administered to healthy individuals, were observed to elicit both positive symptoms (e.g. paranoia) and negative symptoms (e.g. blunted affect) [31], and were also capable of inducing schizophrenia-like cognitive effects (e.g. attention and memory problems) [32]. 
It was discovered later that these compounds function by blocking the NMDA receptor [33]. More recently, there is evidence that glutamatergic drugs can be helpful in the treatment of schizophrenia. Agents that enhance NMDA function by modulating the glycine modulatory site on the NMDA receptor (e.g. sarcosine) have been reported to reduce some symptoms in patients with schizophrenia [34]. Similarly, a selective agonist of metabotropic receptors has been shown to reverse PCP effects in animal models and is also effective in treating some positive and negative symptoms [35].  Abnormalities of the GABA system are also implicated in schizophrenia. GABA is the major inhibitory neurotransmitter in the brain, and is synthesized from its precursor L-glutamate via the enzyme glutamate decarboxylase (GAD). Benes et al. illustrated a deficit in GABA-ergic (GABA producing) interneurons in the prefrontal and cingulate cortices of schizophrenia patients [36], consistent with a great deal of work from other postmortem brain studies, as reviewed in [37]. Moreover, messenger RNA (mRNA) expression of major GABA-synthesizing enzyme GAD67 was also found to be decreased in prefrontal cortex of subjects with schizophrenia relative to controls [38, 39].  While each of these hypotheses has been generally considered independently, a more recently proposed paradigm suggests that they may be intertwined [40]. Anomalies in one neurotransmitter system often lead to a dysfunction in another. For example, NMDA antagonists are also potent activators of dopamine release which cause marked psychotic symptoms in healthy humans and exacerbate symptoms in schizophrenics [41]. Thus, dopaminergic dysfunction in schizophrenia may be secondary to an underlying glutamatergic dysfunction. 
GABA-ergic interneurons were found to be more sensitive to NMDA receptor inhibition than pyramidal neurons (an abundant cell type in cortical structures) [30], suggesting that hypofunction of the NMDA receptor results in a subtle loss of GABA-ergic inhibition and interferes with localized processing. GABA also modulates the dopaminergic mesocortical system, whose disturbance could theoretically create dopamine dysregulation. Alterations of other neurotransmitter systems (i.e. serotonergic, cholinergic, and opioid) also provide evidence of synaptic involvement leading to the pathophysiology of schizophrenia; however, the case for their involvement is not as strong. Additional support for a multiple neurotransmitter hypothesis comes from the increased efficacy of clozapine and other second generation neuroleptics, also referred to as ‘atypical’ antipsychotics, which act by antagonizing a wide variety of receptors (i.e. most dopamine receptors, norepinephrine receptors, and many cholinergic and serotonergic receptors) [42].  1.3.2 Genetic and Environmental Factors Schizophrenia is characterized by a genetic component working together with various environmental influences to trigger the onset of symptoms. There is a substantial genetic contribution demonstrated by high heritability estimates (up to 80%), and > 50% concordance in monozygotic twins [43-45]. However, unlike monogenic diseases where a single mutation may cause the disease phenotype, schizophrenia involves the influence of multiple genes. To date there are no definitive universal genetic markers associated with schizophrenia, despite numerous efforts focused on their identification. Candidate-gene and genome-wide association studies (GWAS) have been conducted in attempts to identify single nucleotide polymorphisms (SNPs) of genetic loci underlying susceptibility to schizophrenia. A list of common genes identified from genetic studies of schizophrenia is provided in Table 2 of this chapter. 
The evaluation of genetic influences on schizophrenia was for many years guided primarily by the ‘common disease-common allele’ model, which states that common diseases are caused by multiple common alleles with relatively modest effects that contribute to increasing risk [46]. In contrast, the ‘common disease-rare allele’ model asserts that disease is caused by multiple variants that are highly penetrant, individually rare, and even specific to single cases or families [47]. While the two are competing theories, it is thought that both could be true for a single susceptibility gene for a complex disease, such as schizophrenia.  One of the most replicated findings of candidate genes in schizophrenia is the dysbindin (DTNBP1) gene, first reported by Straub et al. [48] from association mapping across the linkage region on 6p22.3. Numerous other association studies followed, providing evidence in favour of the DTNBP1 gene; however, there were inconsistencies in the specific risk alleles identified [49-51]. The dysbindin protein is involved in many functions, but of particular interest is its involvement in vesicle trafficking and the potential implication with schizophrenia [52]. Another strong candidate gene is neuregulin 1 (NRG1), showing suggestive linkage of schizophrenia to chromosome 8p, in not only the original Icelandic population [53], but also a number of other populations (see review [54]). The NRG1 gene gives rise to at least fifteen isoforms that encode different proteins carrying out a range of functions in the brain. Similar to DTNBP1, the way in which altered function of this gene would lead to schizophrenia is unclear.  In addition to reports on SNPs, other avenues of genetic abnormalities have been explored with respect to their association with schizophrenia. One example of this is the DISC1 gene, also known as disrupted-in-schizophrenia 1. 
This gene was identified based on a balanced chromosomal translocation (1;11)(q42.1;q14.3) found to co-segregate with schizophrenia and other psychiatric disorders in a large Scottish pedigree [55, 56]. The translocation was found to disrupt two genes on chromosome 1: DISC1 and DISC2. Association studies that followed sought to identify DISC gene polymorphisms in another population. Positive findings were reported in a large Finnish sample [57] and in a US sample comprising additional subjects with bipolar disorder and schizoaffective disorder [58]. Another prominent example is the presence of a rare hemizygous microdeletion at chromosome 22q11.2, also referred to as velo-cardio-facial syndrome (VCFS) [59]. The VCFS phenotype is complex, including multiple congenital abnormalities affecting several tissues and organs, with at least 20-fold excess risk of developing psychosis [60]. The 3 Mb deletion region contains more than 45 genes, of which roughly 30 are usually lost in patients with VCFS with no ‘critical region’ of deletion identified. VCFS represents an unequivocal genetic ‘subtype’ of schizophrenia, but the extent to which its underlying mechanism generalizes to schizophrenia remains unknown. Non-deletion variants (polymorphisms) of individual genes within the 22q11 region have also been studied, as they may make a larger contribution to schizophrenia susceptibility in the wider population. Examples of such genes include G-protein subunit beta-like protein (GNB1L) [61], zinc finger DHHC domain containing protein 8 (ZDHHC8) [62], and catechol-O-methyl transferase (COMT) [63]. More recently, the availability of whole genome scanning methods has made it possible to interrogate genomic structural variants in schizophrenia on a larger scale, enabling us to move beyond the classic examples of DISC1 and 22q11. There is now convincing evidence for association between schizophrenia and a number of specific rare copy-number variants (CNVs) [64].  
The last decade of genetic studies in schizophrenia has suggested many candidate genes, but a problem that persists is the failure to replicate across studies. Some studies may be underpowered to reliably detect genes of very minor effect size. To address the issue of sample size, Stefansson et al. [65] carried out a comprehensive meta-analysis of genome-wide SNP data from several large independent studies of schizophrenia. They identified seven replicable associations including a large number of genes of the major histocompatibility complex (MHC), with follow-up on the most significant signals. Using slightly less stringent criteria on the same cohorts they were able to identify two additional common variants [66], one of which was later confirmed in a Han Chinese population [67]. These findings suggest that it is possible to detect significant genetic effects associated with schizophrenia, but a move towards even larger sample sizes will be highly beneficial.  While there is strong evidence for a genetic contribution, there is also a large body of evidence indicating a role for environmental risk factors in the etiology of schizophrenia. Examples of proposed environmental factors include stress, drug use, history of trauma, head injury, low socioeconomic class, and a number of prenatal factors such as late winter/early spring birth, maternal infections in the second trimester, and obstetric complications [68]. These factors vary in the relative risk conferred, although the mechanism by which each influences schizophrenia is not clear. Further support for the role of environmental insults comes from the fact that some of the identified genetic candidates could provide a source of explanation. One study suggests a link between susceptibility genes that are pertinent to normal cell function, but also code for proteins involved in the life cycles of pathogens, by virtue of their multifunctionality [69]. 
If genes are implicated in the life cycles of pathogens (which are environmental risk factors), the polymorphisms of these genes may well affect the virulence of the pathogen. In line with prenatal risk factors, studies have also revealed a number of schizophrenia susceptibility genes to have key roles during neurodevelopment (e.g. DISC1, NRG1) [70, 71].  1.3.3 Insights from Human Brain Studies Abnormalities in a number of different brain regions have been reported in schizophrenia, but an exclusive region of pathology has not been identified [72]. From the literature, it is evident that reports of pathologic change in the hippocampus and the PFC predominate. Here, I detail only implications of schizophrenia in the PFC, as it is the brain region of greatest relevance to the work presented in my dissertation. The PFC is a neocortical region of the brain, serving specific cognitive functions involved in selective attention, working memory and behavioural inhibition [73]. These are brain functions that are critically impaired in patients with schizophrenia. Moreover, the PFC is one of the last cortical regions to develop structurally and functionally, with evidence of grey matter reduction (indicating synaptic pruning) and increased myelination continuing into early adulthood [74]. Thus, early adulthood may be a critical period of vulnerability of the PFC for psychosis. An early conceptualization of the neurodevelopmental model of schizophrenia was proposed by Weinberger [29], suggesting that the behavioural outcome of psychosis in late adolescence is the result of an early insult that begins long before onset. This model posits that the insult is a congenital static brain ‘lesion’ in the PFC. The effects of the lesion remain silent during much of development as the brain region is not yet functionally mature. It is during late adolescence when key maturational events take place in the PFC and clinical impact is observed. 
An alternative theory proposed by Feinberg is termed the ‘late neurodevelopmental model’, which suggests that the onset of psychosis in late adolescence is due to an abnormality in normal maturational processes of the cerebral cortex during this time independent of any ‘lesions’ [75]. There is possibly excessive synaptic pruning in patients with schizophrenia, though the exact mechanism by which this occurs remains unclear [76].  Modern brain imaging techniques provide non-invasive approaches to observe brain structural changes in schizophrenia, most recently using magnetic resonance imaging (MRI). There is a large body of research in this area, with in-depth coverage by various literature reviews [77, 78]. Here, I highlight from these findings patterns that emerge in relation to the neurodevelopmental model. Longitudinal MRI studies have consistently shown progressive increases in lateral ventricle volume, and reduced grey matter volume in schizophrenia [78]. Progression is evaluated in adult onset schizophrenia patients by imaging high-risk prodromal patients and assessing changes after the first psychotic episode (i.e. 12 month period) [79], or by comparing patients in the early stages of schizophrenia with those in chronic stages (i.e. 4 year period) and evaluating change in the context of clinical severity of the illness [80]. Although the normal trajectory of brain development involves loss of grey matter, the progressive loss observed in schizophrenia patients is significantly more than the decrease in healthy controls. This suggests an ‘exaggeration’ of normal brain development, corroborating existing theories of excess synaptic pruning in schizophrenia [76].  Traditional MRI can also be used to look at white matter, though it is limited and can only give a gross overview. A relatively new imaging modality called diffusion tensor imaging (DTI) allows for a more detailed analysis of the integrity of the white matter tracts in the brain. 
DTI findings in schizophrenia are not entirely consistent, but the overall picture is one of reduced white matter integrity [81]. The most marked differences in white matter tracts that are observed in both chronic and first-episode patients are found in the cingulate and the frontal lobes [81]. White matter tracts in the brain are the basis for large-scale connectivity between functionally related but anatomically disparate regions of the brain, thus these findings make a case for disrupted neural connectivity in the pathophysiology of schizophrenia. DTI findings have also been reported in the context of clinical correlates and cognitive functioning related to schizophrenia. For example, reduced white matter integrity in the anterior cingulum has been found to correlate with deficits on the Wisconsin Card Sorting Test (WCST) [82]. Behavioural deficits in schizophrenia have also been evaluated by integrating cognitive paradigms with functional MRI (fMRI), an imaging modality which measures cerebral blood flow as a proxy for neural activity. A recent meta-analysis of 41 functional neuroimaging studies reported consistently decreased activity in the dorsolateral PFC and the anterior cingulate cortex during discrete PFC dependent executive function tasks [83], reflective of hypofrontality in schizophrenia.  Studies of the postmortem human brain have been beneficial in elucidating features of schizophrenia that remain beyond the reach of neuroimaging. In general, schizophrenia lacks the presence of major identifiable neuropathological lesions and in earlier times was quoted as being the ‘graveyard of neuropathology’ [84]. More recently, however, microscopic features of pathologic change have been identified in various regions of the limbic cortex and the neocortex, with a focus here on the latter. There have been reports of reduced dendritic spine density [85, 86] and smaller cell bodies [87] on prefrontal cortical pyramidal neurons in schizophrenia. 
Also, concordant with MRI findings, grey matter volume reductions have been reported [88], though it is thought that this is not due to loss of neurons but rather because of reduced neuropil [89]. A wide range of methods and techniques have been applied to postmortem tissue to investigate alterations at the cellular level. Protein-based schizophrenia studies have been conducted using postmortem tissue [90], ranging from standard Western blot analyses and two-dimensional gel electrophoresis to newer methods such as liquid chromatography-tandem mass spectrometry (LC-MS/MS). High-throughput approaches such as microarrays and RNA-Seq have enabled us to distinguish molecular features such as levels of gene expression change associated with schizophrenia; and the study of epigenetics provides a window into the regulation of those transcriptional changes. Several studies have reported alterations in DNA cytosine methylation and histone methylation at specific genes and promoters in the postmortem brain of subjects with schizophrenia, often in conjunction with changes in the levels of corresponding RNAs [91, 92]. There is evidence for epigenetic modifications influencing the GABAergic system (i.e. GAD1 [93] and RELN [94, 95]), the dopaminergic system (i.e. COMT [96]), and myelination-related processes based on the methylation of a transcription factor important for myelination and oligodendrocyte function (i.e. SOX10 [97]).  Transcriptome analysis of the postmortem human brain has been of substantial interest in schizophrenia research as it holds the promise of identifying a signature of genes representing a pathology at the molecular level. Measurements of gene expression reflect the transcription processes in a cell at any given point in time. 
Changes observed between normal and diseased brain tissue (or cells that comprise that tissue) capture maladaptation of cell function in the diseased environment by identifying a signature of over- and under-expressed quantities of RNA. However, similar to genetic studies of schizophrenia, few findings from gene expression analysis have been reliably replicated. Here, I will summarize some of the major findings pertaining to the PFC, but recent reviews such as [98] and [99] should be consulted for a more comprehensive overview. The first microarray analysis of postmortem human brain in schizophrenia was conducted by Mirnics et al. (2000) [98], investigating expression patterns in the PFC. Overall, genes involved with presynaptic function were shown to have lower expression in schizophrenia, including N-ethylmaleimide-sensitive factor (NSF), synapsin II and synaptojanin 1. Further support for the involvement of presynaptic genes was provided by later studies of the hippocampus [100] and the entorhinal cortex [101]. Also related to dysregulation at the synapse, a recent analysis using two large cohorts found consistent expression changes in gene sets associated with synaptic vesicle recycling, neurotransmitter release and cytoskeleton dynamics [102]. In line with the neurotransmitter hypotheses associated with schizophrenia, Mirnics et al. also identified altered expression of transcripts involved in GABA-ergic and glutamatergic neurotransmission. Notably, the decreased expression of GAD1 mRNA was of particular interest, and was later confirmed in another microarray study [103] and by other techniques [38, 104, 105]. Given the importance of myelination in the process of brain maturation and its implication in schizophrenia, reduced expression of oligodendrocyte transcripts found by Hakak et al. (2001) [106] was particularly relevant. 
Decreased expression of myelination-related genes such as myelin-associated glycoprotein (MAG) and oligodendrocyte lineage transcription factor 2 (OLIG2) has also been reported in neocortical regions and other brain areas [107]. While a large majority of studies have reported reduced expression of transcripts, there are also findings of up-regulated genes. Several studies have reported increased expression of immune and stress-response genes in patients with schizophrenia, using mostly PFC samples [99, 108, 109]. Despite all of these studies there is no signature of genes that is reliably found across all studies. This lack of consensus can be attributed to various sources of limitation. First, many of these studies suffer from low statistical power. Typically sample sizes in these studies are small (N ≈ 40 or less), as postmortem brain tissue is a limited resource. Second, schizophrenia is associated with relatively small changes in gene expression and discriminating real differences from experimental noise is difficult. Finding small effects from small sample sizes is difficult, leading researchers to use lax criteria for identification of targets, at the risk of substantial false positive rates. Finally, differences between studies also arise due to reasons related to the nature of postmortem work; a topic which is covered in detail in the next section (Section 1.4). For my thesis, I have combined schizophrenia expression profiling data across studies, increasing the power to find more significant changes in gene expression. I hope that the identification of a robust schizophrenia signature of genes will aid in forming further hypotheses on the complex etiology of schizophrenia.  1.4 Postmortem Human Brain Tissue The use of postmortem human brain tissue is a crucial element for understanding the pathological processes of psychiatric illness. 
While animal models can mimic certain aspects of human pathology, we will never be able to fully recapitulate the disorder in an animal. Moreover, major psychiatric illnesses are disorders of the brain, thus direct examination at the source can better reveal details on the mechanisms underlying disease. However, postmortem human brain tissue poses several challenges to researchers [110]. Unlike animal models, whose genetic makeup and environmental factors can be controlled and influenced, human tissue comes from relatively uncontrolled sources. Variability of pre- and postmortem conditions among individuals can potentially influence the quality of tissue and consequently patterns of gene expression. In this section, I discuss some of the factors which can affect the integrity of conclusions made from gene expression profiling studies of postmortem human brain, as this is a focus of my thesis. It should be noted though, that these factors affect all postmortem brain studies regardless of the technique being applied.  1.4.1 Tissue Heterogeneity The human brain is made up of two major types of cells called neurons and glia, with the latter accounting for the majority [111]. Neurons are cells involved in processing and transmitting information through the brain via electrical and chemical signaling. Neurons exist in different shapes and sizes and can be classified by their morphology and function. Glia also have multiple subtypes (e.g. oligodendrocytes and astrocytes), each involved in a wide variety of functions providing structural, metabolic, and trophic support to neurons. More recently, glia have also been identified as active members of synaptic transmission, modulating the information flow between neurons [111]. It is the coordinated activity and cross-talk between these different cells that allow for proper functioning of the brain.  Given the intricate organization of the brain, it is difficult to obtain samples of homogeneous cell type. 
Common approaches for postmortem brain tissue dissection include obtaining brain tissue blocks by use of a cryostat, whereby the brain is frozen and then sliced at variable thickness from which selected brain areas can be punched out. These samples comprise a heterogeneous collection of cell types, each of which is characterized by distinct molecular compositions. Moreover, each cell type could react to perturbations (such as disease) differently. A large fold change in the expression of a particular gene can be diluted considerably if the cell type expressing the gene represents only a fraction of the overall population of cells being studied. Furthermore, a relevant expression change in one direction (i.e. up-regulation) in one cell type can be masked by regulation in the opposite direction (i.e. down-regulation) by other cell types in the same sample. To reduce the effects of tissue heterogeneity on gene expression, some recent studies of schizophrenia [112-114] have employed Laser Capture Microdissection (LCM) [115], a technique which allows the investigation of specific cell types within tissue sections.  1.4.2 Tissue Quality  Because of the concern that tissue degrades after death, it is routine to assess the integrity of the tissue using parameters such as the brain pH, postmortem interval and neuropathological assessment. Brain pH is often used as a surrogate measure for RNA quality. It has been well demonstrated that brain pH positively correlates with mRNA preservation in postmortem tissue [116-118], and as such some reports have suggested using a pH cutoff (< 5.9-6.0) to select samples for further analyses [119]. However, postmortem brain is a limited resource and removal of samples is often considered a last resort. Postmortem interval (PMI) is defined as the time that has elapsed between the time of death and the time at which the samples are collected from the deceased.
As the PMI increases, RNA degradation in the sample is thought to be more likely, although in human studies PMI has not yet been shown to have a clear relationship with the quality of RNA [120]. This confusion may be due to the fact that PMI can sometimes be an estimated value, particularly in situations in which there is uncertainty of the time of death. In addition to these measures, tissue samples typically undergo a neuropathological examination to rule out the presence of any abnormalities of other brain diseases (e.g. Alzheimer’s disease) that may mimic the clinical features of psychiatric disorders. Neuropathological assessment can also characterize the presence of brain abnormalities that might interfere with downstream interpretation. Protocols for this assessment generally include gross and microscopic examination, but can vary across brain banks [121].  1.4.3 Clinical Quality  Another consideration for postmortem brain studies is the clinical state of the subjects pre-mortem. An accurate diagnosis of subjects is dependent on multiple sources of clinical information, which are not always readily available. Such resources include psychiatric records coupled with medical records, interviews with medical professionals, interviews with family and semi-structured diagnostic assessment tools.  Toxicology reports of medication use and illicit drug and alcohol abuse are another important component of clinical quality. These substances elicit their effects by binding receptors in the brain which could interfere with dependent measures of interest in postmortem studies. Such screening is not always conducted on subjects, and therefore not reported. In the case of medication analysis, costs can be prohibitive for antipsychotics which are not part of routine assays [121]. Tests that are conducted at time of death give data that is not necessarily representative of lifetime usage.
Typically, the effects of antipsychotics are handled in one of several ways, each of which is listed here with its associated limitation. 1) Lifetime exposure to antipsychotics can be estimated based on medical records of prescription and incorporated into analyses, but is not necessarily accurate as there are often problems with patient adherence. 2) Studies aim to use subjects that are ‘off’ medications for a period before death [122], although being off medication for a certain period of time cannot reverse the potential changes induced by medication received prior to the ‘off’ phase. 3) The effects of medication are evaluated in an animal model (usually rodents), in which medication is administered for a given number of weeks [123]. Expression changes found are then cross-referenced with gene lists obtained from the postmortem brain study to identify genes altered secondary to a medication effect. The limitation here is that treatment in the animal model is of short duration and is not exactly comparable to treatment courses in humans.  Cause of death is also a clinical measure that can profoundly affect the integrity of brain tissue. It is a measure which is less of a concern for subjects who died suddenly outside a medical care setting (e.g. automobile accident), compared to individuals who died following a prolonged illness that involved medical interventions (e.g. cancer). Prolonged agonal states have been shown to yield lower tissue pH, decreased expression of genes involved in energy metabolism and proteolytic activities, and increased expression of stress response genes and transcription factors [124]. Agonal states have been clearly associated with pH and RNA integrity [125], and subsequently a rating system based on agonal duration has even been developed [126].  1.4.4 Demographic Data  Gene expression will inevitably vary from one individual to another in ways that are not related to neuropsychiatric status.
Subjects used in postmortem studies have diverse genetic backgrounds, different lifestyles and have been exposed to various environmental influences. Thus there are a number of extraneous factors that cannot be controlled for no matter how well thought out the experimental design. However, two demographic factors that are commonly measured and considered are age and sex. These factors have been found to be associated with large expression changes in the brain [127, 128], and thus can potentially mask or masquerade as the disease effect if not properly controlled for. It is routine to match samples across conditions by these factors, which can result in the removal of samples from the cohort. Alternatively, one could reduce the effects by adjusting for these factors during statistical analysis. Other demographic factors that have more recently been incorporated in postmortem brain studies include race/ethnicity of the subjects, and measures of lifetime smoking.  1.5 Gene Expression Profiling Genome-wide RNA expression profiling enables a large-scale approach to identifying molecular changes associated with a given condition. The RNA expression profile of a sample represents a static view of global gene expression which can then be used to compare against profiles from a different environment (diseased vs. control, treated vs. non-treated, etc.). These changes can be primary or secondary. Primary changes occur in response to sequence level changes (i.e. SNPs or mutations in regulatory or coding region) or environmental changes. Secondary expression changes are transcript level changes that are a consequence of the primary genetic and environmental factors. Thus, gene expression data can reveal unanticipated biological relationships from which we can then generate hypotheses. Expression profiling is frequently applied to the study of the human central nervous system [1, 2, 129].
With the accumulation of expression data on human brain tissue and its potential impact on psychiatric research, it is important to evaluate the current standing of the available technologies.  1.5.1 Profiling Technologies  Expression profiling technologies can be classified into one of two categories: 1) sequencing-based approaches or 2) hybridization-based approaches. The focus of this thesis is on the latter; however, both are briefly discussed here. Serial analysis of gene expression (SAGE) [130] was the first reported sequencing-based high-throughput method for expression profiling, followed by massively parallel signature sequencing (MPSS) [131]. These methods work by generating a short sequence tag for transcripts. The tag is a short stretch of nucleotides adjacent to the 3’-most site of a specific restriction enzyme in a transcript. Gene expression is then measured by counting these tags. MPSS can generate a larger number of signatures in a single run, providing better coverage than SAGE. Though both of these methods have several advantages, neither was as well adopted as microarrays at the time of my research. DNA microarrays are based on probes which are immobilized in an ordered two-dimensional pattern on substrates, such as nylon membranes or glass slides. Probes are usually designed to be specific for an organism and can cover the whole genome. Measuring the amount of hybridization between immobilized probes and mRNA sequences in the sample indicates the amount of signal or gene expression.  1.5.2 RNA Quality  It is critical to obtain samples with a high level of RNA quality, irrespective of the technology being used. While RNA in brain is relatively stable, RNA quality does vary and must be controlled for [132, 133]. A classic measure of RNA quality is the 28S/18S ratio in which ribosomal RNAs are quantified, and 28S is expected to be twice that of 18S. However, this has been called into question as unrepresentative for a couple of reasons.
First, since rRNA and mRNA differ structurally, it is likely that the two also differ in their in situ stability. Also, this approach is subjective as it relies on human interpretation of gel images and is therefore not comparable from one lab to another. Another measure involves the signal from the 3’ and 5’ ends of a housekeeping gene transcript (e.g. GAPDH) and taking the ratio of the expression levels. Generally, with this measure a ratio close to one indicates good integrity of the sample. A recently developed and more commonly used measure is the RNA integrity number (RIN), generated using a software tool [134]. The RIN value is calculated using the entire electrophoretic trace of the RNA sample, including the presence or absence of degradation products.  1.5.3 Limitations of Microarrays  Microarrays have become a popular tool and come in different flavours, including arrays with probes representing only coding regions, exons, SNPs, and the option of custom arrays. However, they do have their limitations. First, the effectiveness of a microarray is directly affected by the quality of genome annotations. Without a priori knowledge of the genome, analysis can be severely hampered. Second, non-specific cross-hybridization can occur between similar sequences, although there have been efforts to design probes less prone to cross-hybridization [135] and tools to infer the extent of cross-hybridization in resulting data [136]. Quality control methods are applied to ensure removal of outlier assays that contribute to noise in the data. To assess the data quality, various methods have been proposed by the chip manufacturers and have become standard protocol for microarray technology. Some examples include a consistent scaling factor (related to the overall intensity of the chip), the use of internal and external spiked-in controls, and inter-sample correlation analysis.
These, among other measures, allow us to gauge detection level and sensitivity after hybridization on the chip.  The multiplicity of different platforms for measuring expression presents further challenges in comparing results across labs. The platforms are built on the same general principle but differ from one another in their building strategy (i.e. in situ synthesis vs. deposition), probe length (i.e. short oligonucleotides vs. long cDNA arrays), probe labeling (radioactivity, fluorochrome incorporation, etc.), sequence representation (i.e. assay different genes), and array hybridization strategies (i.e. one colour vs. two colour). Array hybridization is a sensitive procedure involving reagents and hardware of which the quality can vary between labs. Differences in lab conditions and protocols for sample preparation can also contribute to the variation observed between datasets, creating potential ‘lab effects’. The MicroArray Quality Control (MAQC) project [137] was initiated to address concerns surrounding technical differences that may arise due to platform variation as well as the reliability of cross-laboratory microarray studies, and other performance and data analysis issues. They showed that when comparable methods were applied, a high level of both intra-platform consistency and inter-platform concordance resulted with respect to the genes identified as differentially expressed [137]. There is also concern regarding differences in the methods used for data processing, such as image segmentation, signal intensity measurement (accounting for background signal), probeset summarization, and normalization of data (within the array or across all arrays used in the experiment). For each of these steps there are multiple algorithms (for example, MAS5.0 present/absent calls [138], RMA [139], gcRMA [140]) that can be applied. Different published studies utilize combinations of methods which may contribute to different outcomes.
Another concern is the issue of non-biological variation caused by ‘batch effects’. Practical considerations limit the number of samples that can be hybridized at one time, thus samples from the same experiment can be run several days or weeks apart. Arrays run on the same day may share preparation conditions, making arrays run in different batches (different days) not directly comparable. It is necessary to detect and control for these effects [141].  A possible fix to some of the drawbacks of using microarrays is a move towards next-generation sequencing (NGS) based approaches. One example that is commonly used is RNA-Seq, a method that analyzes cDNA by means of NGS methods and subsequently maps short sequence reads onto the reference genome. The sample is sequenced directly and is thereby not dependent on user-defined sequences, removing issues of cross-hybridization and experimental bias from the data. Moreover, quantification of signal is based on counting sequence tags rather than relative measures between samples. Notably, NGS approaches also come with their own limitations (e.g. effective rRNA removal, and development of appropriate data analysis tools).  1.5.4 Differential Expression  Differential expression refers to the identification of meaningful changes in levels of gene expression across two or more conditions. In a simple case, we would look at one gene across two different conditions (e.g. control versus disease). This type of analysis is usually performed using a two-sample t-test or a Wilcoxon rank-sum test. A more complex scenario would involve evaluating expression of a gene across multiple conditions, each having different factor levels (e.g. drug dosage and case/control). This type of analysis would usually require more sophisticated statistical approaches, for example linear modeling using an analysis of variance (ANOVA) to identify significance of change with each factor.
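As an illustration of the simple two-condition case described above, the following is a minimal sketch of a per-gene two-sample comparison. It is not the analysis code used in this thesis: the gene names and expression values are hypothetical, and the unequal-variance (Welch) form of the t statistic is an assumption chosen for the example. Only the test statistic is computed; obtaining a p-value would additionally require the t distribution from a statistics library.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(control, disease):
    """Two-sample t statistic (Welch form, unequal variances) for one gene."""
    n_control, n_disease = len(control), len(disease)
    # Standard error of the difference between the two group means.
    standard_error = sqrt(variance(control) / n_control
                          + variance(disease) / n_disease)
    return (mean(disease) - mean(control)) / standard_error


# Hypothetical log2 expression values for two genes, four subjects per group.
expression = {
    "GENE_A": {"control": [7.1, 7.3, 6.9, 7.2], "disease": [6.1, 6.4, 6.0, 6.3]},
    "GENE_B": {"control": [5.0, 5.4, 4.8, 5.2], "disease": [5.1, 5.3, 4.9, 5.1]},
}

# One statistic per gene; a negative value means lower expression in disease.
t_stats = {gene: welch_t(groups["control"], groups["disease"])
           for gene, groups in expression.items()}
```

In this toy input the first gene shows a large negative statistic (lower expression in the disease group), while the second gene has identical group means and a statistic of zero.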
High-throughput genome-wide datasets, such as microarrays, contain thousands of genes on which the statistical test must be applied. A major issue that arises when testing multiple genes is inflation of the false-positive rate; this is called the “multiple testing” problem. To deal with this problem, p-values must be adjusted upwards to compensate and reduce the number of false positives. Different methods for multiple test correction exist; some examples include Bonferroni correction, Benjamini-Hochberg [142] and q-value [143]. Microarray studies of schizophrenia and other psychiatric illnesses in the current literature have not applied multiple test correction to their datasets, as it often results in uninformative, small gene lists. These studies instead use extended gene lists with less stringent criteria to allow for a more inclusive understanding of affected pathways of pathology.  1.5.5 Coexpression  Coexpression refers to genes that have correlated expression patterns across a set of samples. Similarity is measured by the Pearson correlation coefficient (or some other metric) across all possible gene pairs within a microarray dataset, generating a symmetric matrix of correlation values. Often the associations established from coexpression analysis are represented as a network. A gene coexpression network is a graph in which nodes represent genes and edges between nodes indicate the two genes are coexpressed. Unlike other biological networks, whose edges represent well-defined biological interactions, the edges in a coexpression network reflect the correlation structure of the data. A connection between two genes should not be mistaken to suggest a physical interaction between them. The network is specified by an adjacency matrix, with values of the matrix corresponding to edge weights. Network construction requires an important decision to be made in considering which correlations are relevant enough to constitute a connection between two genes [144].
For unweighted networks, hard thresholding is applied, which involves discarding edges (gene pairs) below a given threshold. Selection of the threshold can be based on the actual similarity values, or rank-transformed similarities [145]. The threshold can be chosen arbitrarily, or by controlling for statistical significance of similarities [146, 147]. For gene pairs that meet the threshold, correlation values are converted to a value of 1 and all other gene pairs are assigned zero, resulting in a sparse binary matrix. Alternatively, by raising correlation values to a power β ≥ 1 one can generate a weighted network, an approach referred to as soft-thresholding [148]. By raising correlations to a power, each gene pair retains a value, with an emphasis on high correlations at the expense of low correlations. A common next step is to extract gene relationships from the coexpression network to uncover gene function. Genes that are highly similar are thought to reflect a functional relationship; a concept that has been termed guilt-by-association (GBA) [149]. By evaluating subtle but coordinated changes in expression across multiple genes in the network, we can extract highly connected clusters and characterize them into functional groups based on over-representation of pathway-specific genes. Moreover, for uncharacterized genes that associate with these groups we may be able to assign putative function.  1.5.6 Network Analysis  A network representation of gene coexpression data makes it amenable to mathematical analysis. We can then apply tools of graph theory to characterize various structural properties of the network. Here I discuss only those properties that are most relevant to my thesis. An important measure of a node in a network is how many other nodes it is directly connected to, also known as the node degree. If the distribution of node degrees in a network follows a power law, the network is defined as ‘scale-free’ [150].
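The construction steps described above (pairwise Pearson correlation, hard or soft thresholding, and node degree) can be sketched as follows. This is an illustrative sketch only: the four expression profiles, the threshold of 0.9 and the power β = 6 are hypothetical values, and the use of the absolute correlation as the similarity is an assumption made for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical expression profiles of four genes across five samples.
profiles = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [2.0, 4.1, 5.9, 8.2, 9.9],
    "g3": [5.0, 3.9, 3.1, 2.0, 1.1],
    "g4": [2.0, 1.0, 3.0, 1.5, 2.5],
}
genes = sorted(profiles)

# Similarity for every unordered gene pair (assumption: absolute correlation).
similarity = {(a, b): abs(pearson(profiles[a], profiles[b]))
              for a in genes for b in genes if a < b}


def hard_threshold(sim, tau):
    """Unweighted network: 1 for pairs meeting the threshold, 0 otherwise."""
    return {pair: 1 if s >= tau else 0 for pair, s in sim.items()}


def soft_threshold(sim, beta):
    """Weighted network: raise each similarity to the power beta >= 1."""
    return {pair: s ** beta for pair, s in sim.items()}


def node_degree(adjacency, gene):
    """Sum of edge weights over all pairs that involve the gene."""
    return sum(w for pair, w in adjacency.items() if gene in pair)


unweighted = hard_threshold(similarity, tau=0.9)
weighted = soft_threshold(similarity, beta=6)
```

With these toy profiles, `g1`, `g2` and `g3` are strongly correlated (positively or negatively) and are connected to one another in the unweighted network, while `g4` has degree zero. In the weighted network `g4` keeps small non-zero weights, which shows how soft-thresholding suppresses weak correlations without discarding them.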
Many quantities in nature follow a unimodal distribution, whereby there is a characteristic scale that is embodied by the mean and the single mode. The significance of the scale-free topology is that there is no characteristic node degree; most nodes in the network are scarcely connected, while few are highly connected ‘hubs’. The hubs in a network represent particularly important nodes, dominating the structure of networks in which they are present. In a biological network, they often reflect genes that are involved in multiple processes and are crucial to the functioning of the cell. Thus, the alteration of hub genes can have more severe effects than changes made to lower degree nodes. The presence of high degree nodes can also impose ‘small-world’ connectivity of the network, whereby most nodes in the graph can be reached by another through a small number of steps. Since hubs have links to an unusually large number of nodes in the network, they create shorter paths between any two nodes. Small-world networks tend to contain cliques or highly clustered sub-networks, modeled by small shortest path lengths between nodes and high cluster coefficients [151]. Shortest path length is defined as the minimal number of edges that need to be traversed to travel from one node to another, typically computed using Dijkstra’s algorithm [152]. The cluster coefficient describes the degree to which the neighbours of a node tend to cluster together. Values range from 0 (indicating none of the neighbours are connected to each other) to 1 (indicating all neighbours of a node are also connected to each other). Together, these measures give us an idea of the overall organization of the network. A number of studies have analyzed these topological properties in gene coexpression networks and have shown that they exhibit the ‘small-world’ and ‘scale-free’ properties [153-155], as do many other biological networks (e.g. protein-protein interaction (PPI) networks).
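The two topological measures defined above can be sketched on a small example. The network below is hypothetical, and because its edges are unweighted the sketch uses a breadth-first search in place of Dijkstra’s algorithm; for unweighted graphs the two give the same shortest path lengths.

```python
from collections import deque
from itertools import combinations

# Hypothetical unweighted coexpression network stored as adjacency sets.
network = {
    "hub": {"a", "b", "c", "d"},
    "a": {"hub", "b"},
    "b": {"hub", "a"},
    "c": {"hub"},
    "d": {"hub", "e"},
    "e": {"d"},
}


def clustering_coefficient(graph, node):
    """Fraction of a node's neighbour pairs that are themselves connected."""
    neighbours = graph[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # no neighbour pairs exist, so nothing can cluster
    links = sum(1 for u, v in combinations(neighbours, 2) if v in graph[u])
    return links / (k * (k - 1) / 2)


def shortest_path_length(graph, source, target):
    """Minimal number of edges between two nodes (breadth-first search)."""
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, distance = queue.popleft()
        if node == target:
            return distance
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, distance + 1))
    return None  # target is not reachable from source
```

In this toy network the hub has a low cluster coefficient (only one of its six neighbour pairs is connected), yet it keeps paths short: node `a` reaches node `e` in three steps through the hub.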
Recent literature suggests, however, that the degree distribution of such biological networks is heavy-tailed but does not necessarily fit a ‘scale-free’ model [156]. Furthermore, dissection of networks into smaller substructures can reveal greater insight into biological function. These sub-networks, also called communities or modules, represent a group of nodes that are more densely connected to each other than to nodes outside the group. Searching for modules within a network is a difficult task and while there are available methods there is no efficient algorithm for doing so. A number of different methods have been proposed for identifying sub-networks; some that incorporate expression data with network topology [157, 158] and others that rely strictly on network structure and node properties [148, 159, 160]. Once sub-networks are identified, we can assess them for enrichment of genes that can be attributed to a specific characteristic, for example biological function or specific pathways.  In this thesis, I exploit both frameworks of evaluating gene expression. In Chapters 2 and 3 the focus is on evaluating gene expression changes in the normal and schizophrenia brain by use of differential expression analyses; the focus is then turned to coexpression and network analysis in Chapter 4. While each of these analyses stands alone in its findings, a key feature of my thesis is the integration of the two frameworks by taking the expression changes observed through differential expression and evaluating them in the context of functional modules derived from the coexpression networks.  1.6 Meta-analysis Meta-analysis provides an integrative data analysis method, enabling us to extract more value from a collection of individual studies. It is commonplace in a single dataset study to use the microarray as a screening tool and then validate a few differentially expressed genes of interest using techniques such as reverse-transcription PCR (RT-PCR).
By conducting a meta-analysis of multiple datasets, we can essentially validate and statistically assess all of the positive results simultaneously to yield a significant gene set. Data are combined across studies and evaluated in a single study to yield a more precise estimate of effect. The overwhelming accumulation of available transcriptomic data (particularly microarrays) in the last decade [161-163] has resulted in a corresponding spike in the number of meta-analyses being conducted across these data [164]. Statistical methods can be applied to combine information across datasets to increase power and find results with increased sensitivity. There are many proposed approaches to meta-analysis, each depending on the type of data being used and the biological purpose. In this section I focus on methods specific to my thesis and describe below the current literature for meta-analysis of differential expression and meta-analysis of coexpression.  1.6.1 Meta-analysis of Differential Expression  There are two general approaches of meta-analysis that are commonly used for cross-study microarray data comparisons, ‘relative’ and ‘absolute’ [165]. A ‘relative’ meta-analysis involves the aggregation or comparison of per-gene result values across multiple studies to estimate an overall summary statistic of the effect. The gene result value reflects the relationship between a gene and the phenotype(s) of interest within the dataset. One example of this is Fisher’s inverse chi-squared method [166]. P-values representing significance of differential expression for each gene are combined across studies to generate a list of summary statistics. A variation on Fisher’s method has been proposed, in which p-values are weighted and only those that meet a specified threshold are considered in the computation [167]. Another example is integrative correlation analysis, demonstrated by Parmigiani et al. [168] in a study using various histological types of lung cancer samples.
This method is based on the notion that consistency of correlation across datasets should reflect overall consistency of datasets. The ‘rank product’ approach proposed by Breitling et al. [169] provides a non-parametric method whereby fold change values are computed for all genes and converted to ranks within each dataset. Ranks are then aggregated to produce an overall score for each gene across datasets. Choi et al. [170] describe an approach to combine estimated effect sizes across datasets. The effect size, a standardized index measuring the magnitude of effect between case and control, is computed per gene within a dataset. Effect sizes are then combined across datasets using either a fixed effects model (FEM) or random effects model (REM) to enable modeling of inter-study variation. Other ‘relative’ approaches for meta-analysis of microarray data are described in more detail in [165].  The ‘absolute’ approaches for meta-analysis involve combining the raw or transformed data from multiple studies. Multiple datasets are thus considered as a single merged dataset for further analyses. Traditional microarray methods for differential expression can then be applied to the merged dataset. With a large combined sample size, there is added power to the statistical tests that are performed on the merged data to find expression changes with increased reliability. Moreover, when sample-specific covariates need to be considered, these approaches are more effective than the ‘relative’ class of methods. An example of an ‘absolute’ approach in the literature is demonstrated by Dawany and Tozeren [171], in which the significance analysis of microarrays (SAM) test [172] was the choice of differential expression measure applied to a merged dataset of normal and cancer tissue types. Other examples include a two-stage ANOVA model approach [173], and a linear mixed effects model (MEM) [174, 175].
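Two of the ‘relative’ approaches described above, Fisher’s inverse chi-squared method and the combination of effect sizes under a fixed effects model, can be sketched as follows. This is a minimal sketch under stated assumptions rather than the implementation used in the cited studies: the function names and example inputs are hypothetical, Fisher’s statistic is taken as −2 Σ ln p with 2k degrees of freedom for k studies, and the fixed effects summary is assumed to be the standard inverse-variance weighted mean.

```python
from math import exp, log


def fisher_combined_p(p_values):
    """Combine k independent p-values for one gene with Fisher's method.

    The statistic X2 = -2 * sum(ln p) follows a chi-squared distribution with
    2k degrees of freedom under the null. For even degrees of freedom the
    upper tail has the closed form exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    k = len(p_values)
    half_stat = -sum(log(p) for p in p_values)  # equals X2 / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return exp(-half_stat) * total


def fixed_effect_summary(effects, variances):
    """Inverse-variance weighted mean of per-study effect sizes for one gene."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    pooled_variance = 1.0 / sum(weights)
    return pooled, pooled_variance


# Hypothetical per-study results for a single gene.
combined_p = fisher_combined_p([0.05, 0.05])
pooled_effect, pooled_var = fixed_effect_summary([0.2, 0.4], [0.01, 0.04])
```

In the second call the more precise study (variance 0.01) receives four times the weight of the other, so the pooled effect lies closer to 0.2 than to 0.4. A random effects model would additionally add an estimate of between-study variance to each study’s variance before weighting.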
In Chapter 3 of this thesis I apply FEM and MEM to my data, therefore I briefly outline each in the paragraph below.  Ideally one wants a statistical model that explains the data well, with a minimum number of parameters and assumptions. Often the disease effect will vary as a function of study-level covariates such as age, sex etc. A proper synthesis requires one to understand how the disease effect varies as a function of these variables. Using the FEM we are trying to model observed gene expression values in terms of covariates that are treated as if the quantities were non-random. These variables are termed ‘fixed effects’, influencing only the mean of the expression data as they are sampled from a defined set of quantities. In a MEM, some of the variables are treated as fixed effects, while others are treated as if they arise from random causes drawn from a larger population. These random effects often have uninformative factor levels, and there is no need to estimate means of a small subset of factor levels. Therefore for ‘random effects’ we estimate the influence of variance around the true mean value of expression. Deciding on how to model explanatory variables can be tricky, and we discuss some of the challenges we encountered in Chapter 3.  There are good motivations to apply meta-analysis of differential expression to psychiatric studies. Sample sizes of typical postmortem brain microarray datasets are fairly small, as brain tissue is generally hard to obtain. Additionally, the disease effect is small, making it difficult to distinguish biological signal from noise. Combining data across studies allows for greater statistical power to more reliably estimate an average effect or highlight subtle variation not easily evaluated in single dataset studies.  1.6.2 Meta-analysis of Coexpression Networks  Gene coexpression networks represent relationships between genes that are based on a matrix of pairwise correlations between genes in the dataset.
Because microarray data are noisy, there has been much interest in the reproducibility of coexpression patterns between microarray datasets. Lee et al. [147] demonstrate that patterns of coexpression that can be confirmed across multiple studies are more likely to be functionally relevant. Thus, it seems only natural to extend coexpression networks from the single dataset level to the meta-analysis scenario. One approach to combining coexpression evidence across studies involves vote counting. Reliability of gene pairs is assessed by confirmations across multiple datasets, and statistical significance is estimated based on randomized networks [147, 176]. Other studies have adapted meta-analytical approaches originally applied to differential expression analysis, for use on coexpression data. For example, Fisher’s method can be applied to correlation coefficient p-values [177], or effect size based methods can be employed by converting correlation values to z-scores as demonstrated by Choi et al. [178]. Moreover, a number of studies apply meta-analytical approaches by incorporating a priori knowledge of gene sets with some expected functional relationships, for example Gene Ontology (GO) annotations [160], pathway annotations or tissue specificity [179].  Another commonly observed protocol is to merge datasets and construct a network as if it were a single study [180, 181]. Datasets are combined at the level of expression data to obtain a merged matrix of samples across different studies using the same platform. Pearson correlation values are computed for all probe pairs, and a network graph of the data is generated based on a user-defined threshold [180]. Ucar et al. [181] apply a rank-based methodology to the resulting coexpression matrix to generate probe-pair reliability scores. A network is then constructed using probe pairs with reliability scores above a given threshold that is determined by the false discovery rate (FDR) cutoff.
The method I use in Chapter 4 of this thesis constructs a single network across studies by aggregating data at the level of coexpression matrices. Individual coexpression matrices are noisy and thus by aggregating data we can obtain a clearer signal, improving performance of the resulting network [182]. For each dataset, a similarity matrix is computed for each cohort by taking the absolute value of the Pearson correlation between all possible gene pairs. Correlation values are then replaced by ranks. These similarity rank matrices are aggregated across datasets by taking the mean rank for each gene pair. The aggregated matrix is then thresholded at 0.5% sparsity to obtain network connections. Gillis and Pavlidis [182] showed that this aggregation procedure is a robust method for producing high-quality coexpression networks.  An extension of coexpression network analysis is to identify sets of genes for which coherence of expression profiles is altered between different conditions, termed ‘differential coexpression’. This sort of analysis allows us to exploit condition-specific patterns of coexpression. Coexpressed pairs are identified across samples representing the normal state and are compared to patterns observed in the diseased state. Differential coexpression patterns could indicate disruption of a common regulatory mechanism, or dysregulation of a particular cellular process, amongst other things. Coexpression networks have been used in this context in a few postmortem human brain studies, to identify disease-mediated changes in network connectivity associated with neuropsychiatric illnesses such as depression [10], schizophrenia [13], and autism spectrum disorder [183].  Table 1: DSM-IV-TR Diagnostic criteria for schizophrenia A. Characteristic Symptoms: Two or more of the following, each present for a significant portion of time during a 1-month period (or less if successfully treated).
1) delusions 2) hallucinations 3) disorganized speech 4) grossly disorganized or catatonic behaviour 5) negative symptoms, i.e., affective flattening, alogia, or avolition (lack of drive)  B. Social/Occupational Dysfunction: For a significant portion of time since the onset of the disturbance, one or major areas of functioning such as work, interpersonal relations, or self-care are markedly below the level achieved prior to the onset (or there is a failure to achieve expected level)  C. Duration: Continuous signs of the disturbance persist for at least 6 months. This 6-month period must include at least one month of symptoms that meet Criterion A (i.e. active phase symptoms) and may include periods of prodromal or residual symptoms. During these prodromal or residual periods, the signs of the disturbance may be manifested by only negative symptoms or two or more symptoms listed in Criterion A present in an attenuated form.  D. Schizoaffective and Mood Disorder exclusion These disorders can be ruled out because either 1) no Major Depressive, Manic or Mixed Episodes have occurred concurrently with the active-phase symptoms or 2) if mood episodes have occurred during active-phase symptoms, their total duration has been brief relative to the active and residual periods.  E. Substance/general medical condition exclusion The disturbance is not due to the direct physiological effects of a substance (e.g., a drug of abuse, a medication) or a general medical condition  F. 
Relationship to a Pervasive Developmental Disorder If there is a history of Autistic Disorder or another Pervasive Developmental Disorder, the additional diagnosis of schizophrenia is made only if prominent delusions and hallucinations are also present for at least a month  Subtypes of Schizophrenia: 1) Paranoid type  2) Undifferentiated type  3) Disorganized type  4) Residual type  5) Catatonic type  29  Table 2: Candidate genes in schizophrenia Gene  Description  Function  Cytogenic band  * NRG1  neuregulin 1  Signaling protein with critical roles in growth and development  DTNBP1  dysbindin  Vesicle trafficking  6p22.3  RGS4  regulator of G-protein signaling 4  Signal transduction of GPCR; modulate neurotransmission  1q23.3  COMT  catechol-O-methyltransferase  Degradation of catecholamine transmitters (i.e. dopamine)  22q11.21  DISC1  disrupted in schizophrenia 1  Neurite outgrowth and cortical development  1q42.1  AKT1  v-akt murine thymoma viral oncogene homolog 1  Serine-threonine protein kinase  14q32.32  PPP3CC  protein phosphatase 3, catalytic subunit, gamma isozyme  Protein phosphatase involved in the downstream regulation of dopaminergic signal transduction  8p21.3  DRD2  dopamine receptor D2  D2 subtype of dopamine receptor  11q23  DAOA/G72  D-amino acid oxidase activator  Activates DAO which degrades the gliotransmitter D-serine  NRGN  neurogranin  Post-synaptic protein kinase substrate that binds calmodulin in the absence of calcium  11q24  PGBD1  piggyback transposable element derived 1  Transposase specifically expressed in the brain  6p22.1  PRSS16  protease, serine, 16 (thymus)  Role in alternative antigen presenting pathway  6p21  PDE4B  phosphodiesterase 4B  Regulation of second messengers  1p31  TCF4  transcription factor 4  Transcription factor with possible role in nervous system development  18q21.1  DRD4  dopamine receptor D4  D4 subtype of dopamine receptor (GPCR)  11p15.5  NOTCH4  notch4  Controlling cell fate decisions  6p21.3  TPH1  
tryptophan hydroxylase 1  Biosynthesis of serotonin  11p15.3  HTR2A  5-hydroxytryptamine (serotonin) receptor 2A  Serotonin receptor  13q14  MDGA1  MAM domain containing glycosylphosphatidylinositol anchor 1  Possible brain development role  6p21  APOE  apolipoprotein E  Catabolism of lipoprotein constituents  8p12  13q33.2  19q13.2  A list of candidate schizophrenia genes identified based on genetic studies as listed in the top 45 list of the SZGene database. Genes highlighted in grey are of high epidemiological credibility on the basis of amount of evidence, consistency of replication, and protection from bias. Credibility is assigned based on meta-analysis results found at . *Not in SZGene top list.  30  Figure 1: A simple schematic of mesolimbic and mesocortical circuitry This figure has been used with permission from Piomelli, Nature Medicine 2001 [184] to illustrate the mesolimbic and mesocortical dopamine pathways in the brain. The mesocortical pathway can be seen as projecting from the ventral tegmental area (VTA) to the prefrontal cortex (PFC). The mesolimbic pathway begins in the VTA and connects to the nucleus accumbens via limbic structures including the hippocampus, amygdala.  31  1.7 Thesis Chapters Summary The general aim of this thesis is to identify changes in gene expression in the normal and diseased postmortem human brain, by applying a variety of meta-analytical techniques. In each chapter I have conducted a cross-laboratory meta-analysis across a number of independent microarray studies, carefully controlling for sources of variation where possible. Combining studies increases the total sample size and subsequently increases statistical power to find changes that might not have been considered significant in any one single study.  In Chapter 2, I describe gene expression changes associated with age, sex, brain pH and PMI using eleven independent microarray studies of the normal human cortex. 
Each dataset was first analyzed independently, and the results were combined across studies using Fisher’s method [166]. For each factor a characteristic meta-signature of genes was identified, highlighting specific transcriptional changes which implicate an assortment of critical cellular processes. We found a significant overlap between the meta-signatures and independent gene lists extracted from the literature, but also identified a large proportion of genes that were significantly changed only in the meta-analysis. In addition, many previously proposed schizophrenia candidate genes appear in the meta-signatures, reinforcing the idea that studies must be carefully controlled for interactions between these factors and disease.

Chapter 3 focuses on differential expression patterns in the prefrontal cortex of individuals with schizophrenia compared to unaffected controls. Expression data were combined across seven microarray datasets, forming a final cohort of 153 affected and 153 control individuals. Using an FEM, disease-associated changes were extracted on a probe-by-probe basis with careful control for the factors investigated in Chapter 2. The combined analysis revealed a schizophrenia meta-signature of 39 probes up-regulated in schizophrenia and 86 down-regulated. We observed gene expression changes associated with aspects of neuronal communication, and alterations of processes affected as a consequence of changes in synaptic functioning. Some of these genes have been previously identified in expression profiling studies, while others are novel to our analysis. A network analysis using a large protein-protein interaction network predicts previously unidentified functional relationships among the signature genes.

Chapter 4 builds on the findings from Chapter 3, with the major goal being to explore schizophrenia from a network perspective.
Coexpression was evaluated using the same seven microarray datasets, with samples split into cohorts of subjects with schizophrenia and unaffected controls. Coexpression matrices for each cohort were then aggregated across datasets using a rank-based approach, to generate a network representation of the control and schizophrenic prefrontal cortex. Differences between the two networks at a global level are small, suggesting that the overall coexpression structure is retained in the brain between normal and diseased states. Using the two networks we analyzed differential coexpression by evaluating network properties of the schizophrenia meta-signature identified in Chapter 3. The meta-signature genes exhibit coexpression network properties not observed for other functional gene groups or other brain-related disease gene groups. Finally, each of the networks was clustered into high density sub-networks and we evaluated meta-signature genes in the context of functionally distinct gene complexes.

Chapter 2: Meta-analysis of the normal human postmortem brain¹

2.1 Introduction

Many studies have applied genome-wide expression analysis to human postmortem brain tissue with the aim of identifying changes in gene expression associated with neuropsychiatric disease [185]. Human brain tissue presents a particular challenge for the analysis of gene expression. The variability between individuals and the heterogeneity of the tissue (different cell types) make the detection of small expression changes difficult. It is routine to match samples across conditions and check for confounding effects of sex, age and other factors. However, this is not always easy, as postmortem brain tissue is a limited resource and sample sizes are often small. Another common method of reducing the effects of these factors involves adjustment during data analysis.
These methods include stratification of samples or implementation of statistical techniques based on observed covariate distributions in the compared populations. However, many studies are underpowered to detect genes so affected. This greatly complicates the detection of molecular changes associated with neuropsychiatric disorders such as schizophrenia and bipolar disorder [125].

It is therefore important to understand the effects of factors such as age, sex, brain pH and PMI on gene expression in the postmortem brain. This information will allow us to control for confounding sources of variability when seeking disease effects, and provide a means of elucidating biologically interesting patterns due to the factors themselves. A number of studies have examined expression differences associated with age [127, 186], sex [128, 187, 188] and brain pH [124, 189]. Because of small sample sizes and the presence of noise, our knowledge of gene expression changes associated with these factors is likely to be incomplete.

¹ A version of this chapter has been published (Mistry M, Pavlidis P (2010). A cross-laboratory comparison of expression profiling data from normal postmortem human brain. Neuroscience 167:2. 384-95 doi:10.1016/j.neuroscience.2010.01.016).

One approach to detecting weak patterns is to use meta-analysis. In a meta-analysis, the results from multiple studies are statistically pooled to provide an overall estimate of the significance of an effect. While meta-analysis has been increasingly used in the study of gene expression data [190-192], to our knowledge only a few studies have done so with postmortem human brain data [7-9].

In this chapter, I have conducted a large cross-laboratory meta-analysis of human postmortem brain data by integrating expression data from multiple studies, rather than a simple comparative analysis of published gene lists.
The primary focus of this chapter is to examine gene expression changes in the normal human brain with respect to four factors: age, sex, PMI and brain pH. While many studies treat these factors as a nuisance and attempt to limit their range or control for them, we show that considerable variability in gene expression exists due to these factors. The results from this chapter provide new information on gene expression changes attributable to these factors, and will be useful for future postmortem brain expression studies of neuropsychiatric illness.

2.2 Methods

2.2.1 Data Collection

Genome-wide expression data sets were selected on the basis of public availability, inclusion of normal subjects, use of neocortical tissue, and the availability of sample characteristic data. Details on each of the eleven datasets, including the source citation, can be found in Table 3. Sources include the Stanley Medical Research Institute (SMRI), the Harvard Brain Bank, and the Gene Expression Omnibus (GEO). GEO studies were identified by extensive manual and keyword searches. From the available 12 SMRI studies, only two were selected, representing the two SMRI brain collections, as the additional data sets represent repeated runs of samples from the same subjects. Sample characteristics for the normal subjects within each dataset were collected (see Table 4 for a summary). Datasets consisted of single-channel intensity data generated from various Affymetrix platforms and one dataset from the Illumina HumanRef-8 BeadArray platform. For 8 out of the 11 datasets we obtained pre-processed data in which the expression levels were summarized, log transformed and normalized by using the ‘rma’ function in the R Bioconductor ‘affy’ package [193]. Where possible, we obtained the raw data (.cel files) for the remaining datasets and reprocessed them using the ‘rma’ function. For one study in which the raw data were not available, we used the data in its given format.
2.2.2 Regression Analysis

Gene expression for each probe, in each dataset, was modeled as a function of each of the factors (age, sex, pH, and PMI). P-values were computed using one-sided tests, performed independently for each of the two directional alternatives (an increase or a decrease in expression). To make a fair comparison across studies, we re-annotated probe sequences for each array and mapped them to the corresponding GenBank gene using the Gemma database ( ). Probes which were annotated to more than one gene were removed from consideration. For cases in which multiple probes mapped to a single gene, we combined p-values by retaining only the minimum p-value. Analyses were conducted in R [193] for which the code is available at

2.2.3 Meta-analysis of Differential Expression

The following meta-analysis was carried out for each of the four factors, and each hypothesis independently. We computed a summary statistic S across n studies for each gene t using Fisher’s method [194], which has been used previously in other microarray meta-analyses [195, 196]:

$$S(t) = -2 \sum_{i=1}^{n} \ln(p_i),$$

where $p_i$ is the regression p-value in the $i$th experiment. A given gene was included in the analysis provided it was measured in at least three datasets for which the particular sample characteristic (i.e. age, pH) was reported. A p-value for S(t) is computed by observing that, under the null hypothesis of uniform p-values within each study, S(t) has a $\chi^2$ distribution with 2n degrees of freedom. The meta-analysis p-values for each signature were processed with the R ‘qvalue’ package to control the false discovery rate, yielding a q-value measure for each gene [143].

2.2.4 Validation Analysis

We extracted gene lists from the postmortem brain gene expression literature for age [186], sex [187] and brain pH [189]. Each set consisted of a list of probes (Affymetrix probe sets) differentially expressed in the human postmortem brain as reported in their respective studies, which were then split based on direction of change (i.e.
up-regulated or down-regulated). Each probe was mapped to its corresponding gene using Gemma. Genes were removed if they were not included in our meta-analysis. Agreement of the meta-signature ranking with the respective validation set was assessed using receiver operating characteristic (ROC) curve analysis. A meta-signature with an area under the ROC curve (AUC) score closer to 1.0 indicates that many genes in the validation set are near the top of the respective ranked list. On the other hand, a score closer to 0.5 reflects that the validation gene set is randomly distributed across the ranking.

Each meta-signature was further analyzed for functional enrichment of GO terms [197], using the ‘overrepresentation analysis’ (ORA) method in ErmineJ [198]. ORA evaluates the genes that meet a specified selection criterion (meta-q < 0.001) and determines if there are gene sets which are statistically overrepresented. Probabilities were computed using the binomial approximation to the hypergeometric distribution and then corrected for multiple testing using the Benjamini-Hochberg procedure.

2.3 Results

We first assessed global levels of gene expression across datasets by assigning rank values to each gene based on its mean expression value within a dataset. While we observed variation between studies, there still emerged a clear pattern of genes which were consistently strongly or weakly expressed in the brain, supporting the feasibility of comparing studies to one another.

To evaluate gene expression changes with respect to four factors (age, sex, brain pH, and PMI), we used linear regression within each dataset, for each factor individually. We considered both directions of change (up- and down-regulation) for each factor, creating up to eight different scenarios for each dataset. Although the focus of this chapter is to report on the results from the meta-analysis across datasets, we briefly summarize here the results from individual studies.
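The combination step of Section 2.2.3 can be illustrated with a short Python sketch. The analyses in this chapter were conducted in R; the code below is not that code, only a minimal re-expression of Fisher's method under stated assumptions. It uses the closed form of the chi-square upper tail for even degrees of freedom, so no statistics library is required. The function name and the example p-values are hypothetical.

```python
from math import exp, log


def fisher_combine(p_values):
    """Combine independent p-values for one gene with Fisher's method.

    Returns (S, p), where S = -2 * sum(ln p_i) and p is the upper-tail
    probability of a chi-square distribution with 2n degrees of freedom.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    n = len(p_values)
    s = -2.0 * sum(log(p) for p in p_values)
    # For 2n degrees of freedom the chi-square survival function is
    # exp(-s/2) * sum_{k=0}^{n-1} (s/2)^k / k!
    half = s / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, n):
        term *= half / k
        total += term
    return s, exp(-half) * total


# Hypothetical gene with a weak effect in each of three studies.
s_stat, meta_p = fisher_combine([0.04, 0.06, 0.05])
```

In this toy case three p-values near 0.05 combine to a meta-analysis p-value below 0.01, which is the behavior that lets the meta-analysis recover genes with weak effects in individual datasets. The sketch stops at the combined p-value; the q-value step and the requirement that a gene be measured in at least three datasets are not shown.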
The numbers of genes that show evidence of change with each of the factors (q < 0.001) are given in Table 5. Not surprisingly, the datasets with smaller sample sizes showed fewer statistically significant changes associated with the factors. Overall, the factors associated with the most differential expression were age and brain pH.

To examine changes in gene expression that were consistent across all datasets, or supported by evidence from multiple data sets, we implemented a cross-study meta-analysis approach. The output of this analysis was eight meta-signatures (up and down for each of the four factors). The top ten genes from each meta-signature can be found in Table 6 and full lists (at meta-q < 0.001) can be found in Appendix A. To examine the results, we first extracted the corresponding p-values from each individual dataset and visualized them as (smoothed) plots in the order determined by the meta-signature (Figure 2). We observed that genes with low meta-q-values tended to have low p-values in multiple, but not necessarily all, studies. More detailed results are plotted for some example genes in Figure 3, illustrating that p-values for a given gene can vary across individual datasets. These plots demonstrate that the meta-analysis is capable of identifying significant genes even if they show weak or non-significant effects in some data sets. For example, in Figure 3, for age genes GFAP and RGS4, we observed weak changes in expression level (up and down, respectively) that are not significant after multiple test correction in those studies. On the other hand, we also found genes that show significant effects in most if not all studies (e.g., XIST, Figure 3). In Figure 4, we assembled the top 50 genes down-regulated with age, and plotted the expression levels within each dataset, with samples ordered by increasing age.
For most of the studies, we observe a gradient across the dataset as gene expression decreases from high to low levels, illustrating that the meta-analysis recovers many genes which show fairly consistent trends across data sets.

While we have presented results from an analysis which treated each factor independently (linear regression), we also performed a meta-analysis which models gene expression based on the factors simultaneously in an analysis of covariance. This analysis yielded meta-signatures very similar to those identified when factors were modeled independently, with correlations of q-values ranging from 0.79 to 0.99. We also attempted to model interactions amongst factors, but for some datasets there was insufficient data. The majority of the data sets used in our analysis are small in sample size (≤ 30 samples), and lack the power to reliably model gene expression with so many predictors.

We tested the robustness of our meta-signatures by using a jackknife procedure. This involved sequentially removing a dataset, performing the meta-analysis on the remaining datasets, and then selecting genes at a slightly more lenient significance threshold of meta-q < 0.01. This procedure was repeated for each data set in turn, and genes found in all rounds were retained as a ‘core’ signature. Each of the ‘core’ signatures encompasses more than half of the genes found in the corresponding meta-signature, with the pH meta-signatures as an exception. The ‘core’ signatures can be found in Appendix A of this thesis.

The studies we selected for meta-analysis were, in general, not designed to test the effects of age, sex, pH or PMI; in fact, attempts may have been made to limit the range of these factors (especially in the case of PMI and pH). However, even across a small range there is inevitable variability of expression, enough to enable us to perform a meaningful meta-analysis.
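The leave-one-dataset-out (jackknife) procedure described above can be sketched in Python as follows. This is an illustration under assumptions, not the code used for the thesis: `jackknife_core` and `toy_signature` are hypothetical names, and `toy_signature` is a deliberately simple stand-in for the full meta-analysis and meta-q < 0.01 selection.

```python
def jackknife_core(datasets, signature_fn):
    """Intersect the signatures obtained with each dataset left out in turn.

    datasets:     dict mapping dataset name -> per-gene results
    signature_fn: callable taking such a dict and returning the set of
                  genes called significant on those datasets
    """
    core = None
    for left_out in datasets:
        remaining = {k: v for k, v in datasets.items() if k != left_out}
        signature = set(signature_fn(remaining))
        core = signature if core is None else core & signature
    return core or set()


def toy_signature(remaining, alpha=0.05, min_hits=2):
    """Hypothetical stand-in for the meta-analysis: select a gene if its
    p-value is below alpha in at least min_hits of the remaining datasets."""
    hits = {}
    for p_by_gene in remaining.values():
        for gene, p in p_by_gene.items():
            if p < alpha:
                hits[gene] = hits.get(gene, 0) + 1
    return {gene for gene, count in hits.items() if count >= min_hits}


# Hypothetical per-dataset p-values for three genes.
toy = {
    "A": {"g1": 0.001, "g2": 0.01, "g3": 0.5},
    "B": {"g1": 0.002, "g2": 0.6, "g3": 0.4},
    "C": {"g1": 0.01, "g2": 0.02, "g3": 0.7},
}
core = jackknife_core(toy, toy_signature)
```

Here g1 is selected in every round and so forms the core, whereas g2 depends on dataset A being present and is dropped. Passing the selection step in as a function keeps the jackknife loop independent of how significance is defined.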
We still questioned the extent to which our results would agree with more targeted studies, and therefore sought to validate findings from our approach. We identified independent gene lists from the literature for age, sex and brain pH (we could not find a comprehensive validation set for PMI). Each validation gene list was then separated into two groups based on the direction of change, to correspond with the meta-signatures obtained from our meta-analysis. None of these validation gene lists can be considered a true gold standard, but they do help put our results in the context of previous findings.

To quantify the predictive power of our meta-signatures with respect to the corresponding validation sets, we first performed a standard receiver operating characteristic (ROC) analysis. The score reported for each signature and its respective validation set is the area under the ROC curve (AUC), a value between 0 and 1, where 1.0 is perfect agreement with the external list and 0.5 would reflect a random order. The AUC values for each meta-signature are reported in Table 7. We also tested the effect of using a specific statistical threshold for selecting genes from the meta-signatures, by collecting genes at two significance levels (meta-q < 0.01 and meta-q < 0.001). The overlap with the validation set was significant (p < 0.001, Fisher’s exact test; Table 7) for all signatures. We also found a comparable overlap between each of the ‘core’ signatures and the validation sets.

A brain pH validation set of genes was obtained from Vawter et al. [199], determined from fold change of controls with no agonal factors and high pH (> 6.87), compared to controls with agonal factors and pH below 6.87. The study was carried out on two cortical regions, from which we used the dorsolateral PFC pH-sensitive genes for validation of our brain pH meta-signatures.
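For readers unfamiliar with the AUC score used in this validation, the following Python sketch shows one standard way to compute it from a ranked gene list and a validation set: as the fraction of (validation gene, other gene) pairs in which the validation gene is ranked higher. The function name and toy gene labels are hypothetical, ties in the ranking are assumed absent, and this is not the software used to produce Table 7.

```python
def auc_from_ranking(ranked_genes, validation_set):
    """Area under the ROC curve for a ranking against a validation set.

    ranked_genes:   genes ordered from most to least significant
    validation_set: genes reported by the external study
    """
    positives = 0          # validation genes seen so far
    negatives = 0          # non-validation genes seen so far
    ordered_pairs = 0      # pairs where the validation gene ranks higher
    for gene in ranked_genes:
        if gene in validation_set:
            positives += 1
        else:
            negatives += 1
            ordered_pairs += positives
    if positives == 0 or negatives == 0:
        raise ValueError("ranking must contain both kinds of genes")
    return ordered_pairs / (positives * negatives)


# Hypothetical ranking of four genes by meta-q-value.
ranking = ["a", "b", "c", "d"]
perfect = auc_from_ranking(ranking, {"a", "b"})   # validation genes on top
mixed = auc_from_ranking(ranking, {"a", "c"})     # one validation gene lower
```

A validation set concentrated at the top of the ranking gives a score of 1.0, and the score falls toward 0.5 as the validation genes become scattered through the list.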
The ROC analysis for brain pH gave high AUC scores of 0.91 and 0.86 (for up- and down-regulated genes, respectively). Additionally, we obtained reasonably high AUC scores of 0.88 and 0.89 (for up- and down-regulated genes, respectively) using a smaller independent pH gene list obtained from Mexal et al. [189], despite a difference in the brain region used between the validation study and the meta-analysis. Because pH itself probably covaries across brain regions [189], our results are consistent with the hypothesis that pH-related changes in gene expression are similar across brain regions. The age signatures, on the other hand, exhibited slightly lower AUC scores than those obtained for brain pH. Erraji-Benchekroun et al. [186] used samples from dorsolateral PFC Brodmann area 9 (BA9) and orbitofrontal PFC Brodmann area 47 (BA47) from each subject to evaluate age expression differences, showing comparable changes in both brain regions. As such, our validation set consisted of genes showing age expression changes collectively within both neocortical brain regions BA9 and BA47. While many of these genes appear at the top of our ranked lists, some are also dispersed throughout our ranking. Finally, the sex meta-signatures from our analysis also scored high when validated with a set of genes from Galfalvy et al. [187]. Although most of the validation genes appeared at the top of the ranking, it should be noted that this validation set had far fewer genes than the others.

We compared the significant genes (meta-q < 0.001) from each of our meta-signatures with genes known to be associated with schizophrenia. We extracted a list of 34 schizophrenia candidate genes provided in a comprehensive literature review [200], and searched for these genes within each of our meta-signatures. We found that 12 of these genes were present in at least one of our meta-signatures, although the majority of overlap was observed with the age and pH meta-signatures (Table 8).
The overlap between schizophrenia genes and each of the age meta-signatures was significant at p < 0.01.

To derive a high-level biological interpretation of our meta-signatures we performed a GO [197] enrichment analysis using ErmineJ [198]. We extracted the ‘top’ GO categories for each of the signatures and compared them with each other. Various ‘biological processes’ were found to be unique to each meta-signature. In Figure 5, we have displayed the top ten categories for each meta-signature by depicting each GO category and the associated p-value (corrected for multiple testing). For the age and pH meta-signatures we found GO terms to appear with greater significance.

For genes increasing in expression with the progression of age, top GO categories included those involved in cell growth and proliferation, and cell-cell interaction, consistent with previous studies [186, 201]. Other processes included the insulin receptor signaling pathway, encompassing a number of genes involved in longevity and aging [202]. The age down-regulated genes presented an enrichment in synaptic and/or receptor activity with GO categories such as “neuron recognition” (GO:0008038), “neurotransmitter transport” (GO:0006836), “neurotransmitter secretion” (GO:0007269), and “regulation of neurotransmitter levels” (GO:0001505). This finding is concordant with existing aging studies in mouse and human [186, 203]. Similarly, we found an enrichment of genes involved in neuropeptide signaling in the pH up-regulated meta-signature, in addition to genes implicated in metabolism and a different array of pathways (e.g. G-protein signaling). The female and male meta-signatures identified enrichment of different terms, including some sex-specific processes with the female meta-signature such as “female gamete generation” (GO:0007292) and “female pregnancy” (GO:0007565).
Functional enrichment analysis of our meta-signatures does not provide hard cellular evidence, but still serves as a useful indication of the biological processes altered by each factor and contributes some insight at the molecular level.  Because we analyzed each factor independently, we wished to check whether the values for each factor were correlated with each other across the 415 samples (Table 9). Age and PMI displayed the highest correlation of 0.35, consistent with a positive correlation reported in [187]. Age and brain pH displayed a  41  slight negative correlation of -0.2. Further investigation of these two factors revealed that categorizing the values of age into ‘young’ (< 50 years of age) and ‘old’ (≥ 50 years of age) groups resulted in a lower mean pH in the ‘old’ group versus the ‘young’. This was the general trend within each dataset as observed in Figure 6. Due to these correlations, we expected that individual genes in some meta-profiles might overlap with other meta-profiles (Table 10). Accordingly, the two factors that displayed the highest number of overlapping genes were those in age (up or down) and those with brain pH (down or up), respectively. However, these effects were weak and were even weaker in the ‘core’ signatures. We also found that a number of genes up-regulated with PMI were also identified amongst the profiles for brain pH and age in both directions, but we were unable to extract any definite patterns.  42  Table 3: Human postmortem brain datasets included in control brain meta-analysis Dataset  Reference  Description  Microarray Platform  Brain region(s)  A  Stanley Chen  n/a  Schizophrenia, Bipolar, depression  HG-U133A/B (RMA)  DLPFC  13  B  GSE1572  Lu et al. (2004)  Aging study  HG-U95vA (RMA)  Frontal lobe  30  C  GSE2164  Vawter et al. (2004)  Gender differences in expression  HG-U95vA (RMA)  DLPFC  10  D  GSE3790  Hodges et al. 
(2006)  Huntington’s  HG-U133A/B (MAS 5.0)  Frontal cortex  36  E  GSE11882  Berchtold et al. (2008)  Gender and aging  HG-U133 Plus 2.0 (GC-RMA)  Superior Frontal Gyrus  47  F  GSE11512  Somel et al. (2009)  Transcriptional neoteny  HG-U133 Plus 2.0 (RMA)  Frontal cortex  15  G  GSE8919  Myers et al. (2007)  Cortical gene expression  Illumina Sentrix BeadChip  Cerebral cortex, temporal lobe, frontal lobe, parietal lobe  193  (Illumina Software)  No. of normal Subjects  H  Stanley Kato  Iwamoto et al. (2005)  Schizophrenia, Bipolar  HG-U133A (RMA)  DLPFC  34  I  GSE13162  Chen-Plotkin et al. (2008)  Frontotemporal lobar degeneration  HG-U133A (RMA)  Frontal cortex  8  J  GSE5390  Lockstone et al. (2007)  Down Syndrome  HG-U133A (RMA)  DLPFC  8  K  McLean PFC  n/a  Schizophrenia, Bipolar  HG-U133A (RMA)  PFC  27  43  Table 4: Sample characteristics for control human postmortem brain datasets A B C D E F G H I J K  Dataset Stanley Chen GSE1572 GSE2164 GSE3790 GSE11882 GSE11512 GSE8919 GSE13162 Stanley Kato GSE5390 McLean PFC  Age Range 25-60 26-106 50-82 19-70 20-99 16-47 65-90 47-92 30-60 30-60 30-80  Male : Female 9:4 18 : 12 5:5 21 : 9 23 : 24 10 : 5 107 : 86 5:3 15 : 9 7:1 19 : 8  PMI Range (hrs) 8 - 60 1 - 21 9.8 - 30.75 n/a 2 - 12 4 - 25 1.17 - 54 3.5 - 21 9 - 60 32 - 61 7.42 – 28.83  PH Range 6.0 - 6.6 n/a 6.12 – 6.98 n/a n/a 6.49 – 6.96 n/a n/a 6 – 7.03 n/a n/a  Table 5: Significant genes (q<0.01) identified within each individual dataset Dataset Stanley Chen Stanley Kato GSE1572 GSE2164 GSE8919 GSE11512 GSE11882 GSE13162 GSE5390 McLeanPFC GSE3790  Age Down 0 0 162 0 153 0 428 0 1 0 0  Sex  pH  Up 0 0 64 0 133 0 0 0 1 0 0  Female 2 1 1 1 6 3 2 0 n/a 1 1  Male 7 10 6 2 19 13 14 0 n/a 5 13  Down 0 703 n/a 0 n/a 1 n/a n/a n/a 0 n/a  Up 0 0 n/a 0 n/a 0 n/a n/a n/a 0 n/a  PMI Down 0 0 0 0 2 0 1 0 0 n/a n/a  Up 0 0 0 0 278 0 0 0 0 n/a n/a  ‘Union’ Signature  689  198  10  33  704  0  2  278  Meta-signature Overlap  415  102  6  19  191  0  1  46  Q-values were 
calculated from regression p-values for each factor within each dataset. The number of genes reported here are significant at q < 0.01. The ‘union’ signature represents the union of unique genes identified for each factor across all datasets. The meta-signature overlap indicates the number of union signature genes overlapping with the corresponding meta-signature.  44  Table 6: Top meta-signature genes for age, pH, sex and PMI Age Down-regulated  Age Up-regulated  OLFM1  olfactomedin 1  NEBL  nebulette  KCNF1  potassium voltage-gated channel, subfamily F, member 1  MED12  mediator complex subunit 12  RGS4  regulator of G-protein signaling 4  BCL2  B-cell CLL/lymphoma 2  PPP3CB  protein phosphatase 3 (formerly 2B), catalytic subunit, beta isoform  GMPR  guanosine monophosphate reductase  ADCY2  adenylate cyclase 2 (brain)  GFAP  glial fibrillary acidic protein  SVOP  SV2 related protein homolog (rat)  ADH1B  alcohol dehydrogenase 1B (class I), beta polypeptide  EFNB3  ephrin-B3  WWOX  WW domain containing oxidoreductase  ATP2B2  ATPase, Ca++ transporting, plasma membrane 2  PLEC1  plectin 1, intermediate filament binding protein 500kDa  HPCA  Hippocalcin  VCAN  versican  CALB1  calbindin 1, 28kDa  AHCYL1  S-adenosylhomocysteine hydrolase-like 1  pH Down-regulated  pH Up-regulated  FGF2  fibroblast growth factor 2 (basic)  AHCYL1  S-adenosylhomocysteine hydrolase-like 1  DTNA  dystrobrevin, alpha  MAPKAPK2  mitogen-activated protein kinase-activated protein kinase 2  TJP1  tight junction protein 1 (zona occludens 1)  S100A13  S100 calcium binding protein A13  RBBP6  retinoblastoma binding protein 6  GNG12  guanine nucleotide binding protein (G protein), gamma 12  ANP32E  acidic (leucine-rich) nuclear phosphoprotein 32 family, member E  BAALC  brain and acute leukemia, cytoplasmic  Female Up-regulated  SLC1A1  solute carrier family 1 (neuronal/epithelial high affinity glutamate transporter, system Xag), member 1  LARGE  like-glycosyltransferase  C17orf81  chromosome 
17 open reading frame 81  HISPPD2A  histidine acid phosphatase domain containing 2A  PRKCD  protein kinase C, delta  DLG3  discs, large homolog 3 (Drosophila)  KCNAB1  potassium voltage-gated channel, shaker-related subfamily, beta member 1  SLC8A1  solute carrier family 8 (sodium/calcium exchanger), member 1  GABRA5  gamma-aminobutyric acid (GABA) A receptor, alpha 5  RIT2  Ras-like without CAAX 2  Male Up-regulated  XIST  X (inactive)-specific transcript (non-protein coding)  JARID1D  jumonji, AT rich interactive domain 1D  HDHD1A  haloacid dehalogenase-like hydrolase domain containing 1A  USP9Y  ubiquitin specific peptidase 9, Y-linked (fat facets-like, Drosophila)  UTX  ubiquitously transcribed tetratricopeptide repeat, X chromosome  EIF1AY  eukaryotic translation initiation factor 1A, Y-linked  JARID1C  jumonji, AT rich interactive domain 1C  CYorf15B  chromosome Y open reading frame 15B  TSIX  XIST antisense RNA (non-protein coding)  DDX3Y  DEAD (Asp-Glu-Ala-Asp) box polypeptide 3, Y-linked  USP9X  ubiquitin specific peptidase 9, X-linked  UTY  ubiquitously transcribed tetratricopeptide repeat gene, Y-linked  LOC554203  alanyl-tRNA synthetase domain containing 1 pseudogene  RPS4Y1  ribosomal protein S4, Y-linked 1  STS  steroid sulfatase (microsomal), isozyme S  TTTY15  testis-specific transcript, Y-linked 15  ZFX  zinc finger protein, X-linked  CYorf15A  chromosome Y open reading frame 15A  45  PNPLA4  patatin-like phospholipase domain containing 4  PMI Down-regulated BRD8  bromodomain containing 8  RBM5  RNA binding motif protein 5  PUM2  pumilio homolog 2 (Drosophila)  ARHGEF7  Rho guanine nucleotide exchange factor (GEF) 7  TMSB4Y  thymosin beta 4, Y-linked  PMI Up-regulated GOSR2 CYB5B SNF8 GRLF1 C6orf1 MGMT MAX ST3GAL2 PTGER3 EXT1  golgi SNAP receptor complex member 2 cytochrome b5 type B (outer mitochondrial membrane) SNF8, ESCRT-II complex subunit, homolog (S. 
cerevisiae) glucocorticoid receptor DNA binding factor 1 chromosome 6 open reading frame 1 O-6-methylguanine-DNA methyltransferase MYC associated factor X ST3 beta-galactoside alpha-2,3-sialyltransferase 2 prostaglandin E receptor 3 (subtype EP3) exostoses (multiple) 1

For each meta-signature we have listed the top ten genes, as ranked by meta-q-value (all at q < 0.001). For each gene we have listed the gene symbol and gene name. Complete meta-signature lists for each factor can be found in Supplementary Table 6.

Table 7: Comparison of meta-signature profiles against validation gene sets

Validation set             No. of profile   Overlap with     No. of profile   Overlap with     AUC score
                           genes            validation set   genes            validation set
                           (q < 0.001)      (q < 0.001)      (q < 0.01)       (q < 0.01)
Age: Genes from Erraji-Benchekroun et al, 2005
  Up-regulated (268)       404              40               1241             69               0.74
  Down-regulated (260)     1134             113              2247             136              0.76
pH: Genes from Vawter et al, 2006
  Up-regulated (497)       25               11               368              131              0.91
  Down-regulated (294)     215              55               1018             122              0.86
Sex: Genes from Galfalvy et al, 2003
  Male (13)                14               7                38               7                0.90
  Female (1)               19               1                128              1                n/a
PMI Genes
  Up-regulated             75               n/a              691              n/a              n/a
  Down-regulated           4                n/a              49               n/a              n/a

Table 8: Schizophrenia candidate gene analysis

Meta-profile   Schizophrenia genes identified in meta-profiles
Age  Down      Opcml, Pldn, Nrg1, Rgs4, Bdnf, Dlg4, Gad67
Age  Up        Ntrk2, Ppp1r1b, Erbb3
pH   Down      Ntrk2, Slc1a2
pH   Up        Rgs4, Gad67
PMI  Up        Nrg1

Table 9: Rank correlations between sample information

        Age        Sex        pH        PMI
Age
Sex     -0.12**
pH      -0.2*      0.17
PMI     0.35***    -0.2***    -0.06

Spearman rank correlations were computed using sample characteristic information for individual subjects. * indicates a p-value of ≤ 0.05; ** indicates a p-value of ≤ 0.01; *** indicates a p-value of << 0.001.

Table 10: Evaluating gene overlap between meta-signatures

               No. of          Age    Age    pH     pH     PMI    PMI    Sex      Sex
               profile genes   Up     Down   Up     Down   Up     Down   Female   Male
Age  Up        404
Age  Down      1134            2
pH   Up        25              0      18
pH   Down      215             75     1      0
PMI  Up        75              2      32     0      2
PMI  Down      4               1      2      0      0      0
Sex  Female    19              4      1      0      1      0      0
Sex  Male      14              0      1      0      0      0      0      0

Using genes for each meta-profile at q < 0.001, we compared them against one another to evaluate the overlap and potential relationships between the factors. We note that the age meta-signature returned genes changing in both directions. This is a consequence of multiple probe-to-gene mappings resulting in the selection of two probes of different specificity for each direction of expression change.

Figure 2: Distribution of dataset p-values across meta-signature q-values
For each dataset used, gene p-values were plotted against the corresponding meta-q value and a loess fit was computed to generate a smooth curve between points. The fact that most data sets show a rise in p-values correlated with the meta-q-values indicates the contribution of signals of varying strengths to the meta-signatures. The distorted curves for gender are due to the strong effects of a small number of genes with very small meta-q-values (note the difference in scale of D compared to A-C).

Figure 3: Distribution of dataset p-values for individual genes: a magnified view
For selected genes from each of the meta-signatures we have plotted the log regression p-values from each dataset. Open circles represent the datasets for which the gene was found to be significant after multiple test correction (q < 0.01). Dashed line indicates a per-study p-value significance level of 0.05 for reference.

Figure 4: Top genes down-regulated with age
The top 50 age down-regulated genes were selected based on meta-analysis q-value ranking. For each gene, the corresponding data from each study was extracted and converted to a heat map. Expression values were normalized across samples within each dataset, and ordered by age.
Age is plotted at the top of each heat map. Light values in heat map indicate higher expression. Grey bars indicate missing values. All data sets are at approximately the same horizontal scale except the last, which is compressed to fit on the page.

Figure 5: GO enrichment analysis
For each of the eight meta-signatures, we have displayed the top 10 GO terms identified using a GO over-representation analysis. The y-axis displays the given ‘biological process’ GO term category, while each column on the x-axis represents a meta-signature. The color scale depicts the significance of the term by the negative log10 of the corrected p-value. GO terms were collapsed to parent term if parent and child both appeared in the top ten. Grey bars indicate the absence of the term for the analysis.

Figure 6: Investigating the relationship between age and brain pH
Subjects from each dataset were categorized by age into ‘young’ (< 50 years of age) and ‘old’ (≥ 50 years of age) groups and pH levels were plotted. The general trend was lower pH levels in the ‘old’ group. In A and B, we have plotted values within each dataset. In C, we have plotted the two datasets against each other, as each contains subjects from only one group (GSE11512 = ‘young’; GSE2164 = ‘old’). We see a more pronounced difference between the groups using subjects across all datasets in D.

2.4 Discussion

In this chapter, I have conducted a meta-analysis of gene expression in the human cortex, by examining changes that occur with respect to sex, age, postmortem interval and brain pH. This meta-analysis was made possible by the fact that many gene expression analyses have useful data for each of these factors, even though they were originally considered potential “confounds” to be controlled for. The results from this chapter have at least two potential uses for future studies.
First, the results of our meta-analysis provide new information on the effects of each of the factors on gene expression and can be studied further independently or used to bolster support for other studies. Second, the identification of signatures associated with these factors will provide a ‘watch list’ of genes which might be viewed cautiously if they are found to be implicated in neuropsychiatric disease by expression studies. To facilitate the use of these lists in future studies, they are provided at, with the top ten from each list displayed in Table 6 and significant gene lists (q < 0.001) provided in Appendix A.

There are some limitations to the work presented in this chapter. First, we used a relatively simple meta-analysis method, and acknowledge that there are other techniques which may provide higher sensitivity. Second, we combined datasets generated using different platforms, which may contribute noise and potentially reduce the power of our meta-analysis. The MAQC project recently initiated a number of studies to specifically address these concerns, and in general, reported a high agreement between platforms [137, 204, 205]. A number of other studies have been conducted to this end, showing agreement between platforms [206-208], and a high concordance between the top functions identified by each platform [102, 209]. While we acknowledge that small differences between studies remain, we focus only on the consistencies, combining them to extract more robust expression changes than can be derived from single dataset studies. To maximize total sample size in our study, we accepted studies of any neocortical brain region. All of our datasets utilized samples from the frontal cortex, but we also included one dataset which included samples from the temporal and parietal cortices.
There are groups that have studied regional patterns of gene expression in the postmortem human brain, revealing that cortical regions tend to cluster together, indicating a shared global expression profile [210-212]. Finally, we only considered linear model fits to age, pH and PMI. Future studies can address some of these issues, and also include more data as studies become available.

Comparison of our results to the validation sets strongly supported the relevance of the meta-signatures. The overlap of ‘top genes’ between meta-signatures and their respective validation sets, while statistically significant, identified only a subset of the genes in the validation lists. There are several possible explanations for this effect. One is that most of the studies we used in our analysis treated these factors as nuisances to be eliminated, which may have reduced our power to find real changes. For example, more than half of the datasets have no subjects under the age of 30 at the time of death, and most are over 40. In particular the Kato and Chen data sets, which use samples from the SMRI, have a particularly well-controlled (narrow) age range. In contrast the age validation set used a broader age range (13 to 79 years of age) [186]. Additionally, we expect biological variation among sample groups (and therefore studies). Strong signals in any given data set, including the validation sets, may be specific to that study. That is, none of the validation sets are truly gold standards. Further examination of genes on the validation lists within each individual dataset supports this notion. The agreement between the meta-analysis and the validation lists is better than the agreement between the validation genes and results from any single dataset, with only a few to none of the validation genes being identified in each dataset.
In summary, the limited overlap of the meta-signatures with the validation sets may simply be contingent on the data we used, and does not call into question the validation sets or the meta-analysis.

Using a jackknife analysis, we obtained ‘core’ signatures for each of the factors. We found a large proportion of the meta-signature genes to overlap with the ‘core’ signatures, illustrating the ability of our meta-analysis to extract gene profiles robust to influences from individual datasets. The exception was brain pH, for which the ‘core’ signatures consisted of 13 up-regulated genes and only one down-regulated gene. This was due to a large pH effect in the Kato dataset. However, examination of the other data sets revealed that many genes showing large effects in the Kato data set also show trends with pH. Thus, even though the pH meta-signatures are arguably biased by strong effects from the Kato dataset, these genes also show weak signals with pH in the other data sets.

The inverse relationship we observed between our age and brain pH meta-signatures is in agreement with a previous study [116]. A review of the literature also reveals that results from independent gene expression studies examining changes with age are strikingly similar to results derived from brain pH profiling studies [8, 124, 186, 201, 213], further supporting this relationship. It has been suggested that the relationship between age and pH is likely a result of slower modes of death experienced by elderly subjects [116], but this has not yet been fully explored. Previous studies, however, have found brain pH to be a proxy for agonal stress [214]. A longer terminal phase of death results in lower brain pH levels than would be observed in subjects experiencing a sudden death. We were unable to obtain cause of death information for the majority of our datasets, and thus were unable to incorporate this information into the meta-analysis.
This raises the question of whether the reasoning behind the inverse relationship is as hypothesized by Harrison et al. [116], or if the process of aging actually results in a general decline of brain pH.

To this point we have focused on evaluating the meta-analysis in light of other data sets, but clearly one of the reasons to do a meta-analysis is to integrate information on weak patterns. Indeed, our meta-analysis has confirmed previous findings and also added to them. By assembling the significant genes (q < 0.01) from each individual dataset, we were able to generate a ‘union’ signature for each of the factors (Table 5). A comparison of the union signatures against the corresponding meta-signatures revealed that greater than 50% of the genes in each of our meta-signatures are novel (not found in any of the individual studies, and only revealed by the meta-analysis). These novel genes span a broad range of cellular functions, implicating various biological processes with each of the different factors. An example is alterations of the GABA-related transcriptome found with age. We found two GABA receptor genes (GABBR and GABRG2) and two glutamic acid decarboxylase genes (GAD67 and GAD65) to be downregulated with age. Animal studies have demonstrated that GABA receptors are markedly decreased with age, and there is evidence to suggest that this may play a role in age-related cognitive changes [215, 216]. Reduced inhibitory neurotransmission in the human brain with ageing is supported by evidence from a recent study [217] using the Lu et al. [127] dataset, and is also observed in the results of our meta-analysis. Also consistently altered in our age meta-signature are members of the regulator of G-protein signaling (RGS) family of genes. RGS family members are expressed in the brain and periphery.
Their gene products play a critical role in signal transduction by negatively regulating G-protein-coupled receptors (GPCR) by means of their GTPase accelerating activity. These proteins have been implicated in neuronal function and many have been identified as vulnerability factors for several CNS disorders such as addiction, Parkinson’s disease, schizophrenia and mood disorders [218]. In the brain, they function to modulate neurotransmission resulting from the activation of metabotropic GPCRs. RGS4, a member of this family, has been previously shown to be down-regulated with age [200]. In our study, we confirm this finding and additionally report four other members of the family (RGS6, RGS7, RGS12 and RGS17) that display an age-related decline in expression. Alterations in expression of RGS genes present a possible molecular mechanism that could affect neuronal functioning during aging.

One motivation of this chapter was to identify gene expression changes which need to be accounted for when studying potential expression changes in neuropsychiatric disorders such as schizophrenia. This is important because changes in expression due to the factors we studied can be large compared to the reported effects of psychiatric disease [185]. Thus, even a mild bias in age might cause a change in expression which is potentially larger than the effect of disease. Therefore a gene which is known to change expression with age (for example) must be analyzed very carefully if it is to be considered a candidate marker for disease, because it is difficult to control perfectly for age. We searched our meta-signatures for a list of schizophrenia associated genes and found that 12 of these genes were identified in at least one of our meta-signatures (Table 8). One such example is RGS4, a gene that has been extensively characterized in schizophrenia both as a susceptibility allele and from expression studies [219]. It is also a gene which we find to be down-regulated with age.
Our results confirm previous work showing that RGS4 is down-regulated with age [200]. We also identified the receptor-ligand pair ERBB3 (v-erb-b2 erythroblastic leukemia viral oncogene homolog 3) and NRG1 in our age up- and down-regulated meta-signatures, respectively. NRG1 and its receptor ERBB3 are implicated in key neurodevelopmental processes in the nervous system [220], and have also been implicated in schizophrenia [221]. Evidence of a role for NRG1-ERBB3 in schizophrenia includes reduction in the level of ERBB3 expression in human postmortem PFC samples and genetic association evidence linking NRG1 to schizophrenia [221].

Another notable candidate is GAD67, a gene that is down-regulated across our meta-signatures for age (consistent with some reports in the literature [222, 223]), and up-regulated with pH. The reduction of GAD67 expression in schizophrenia is arguably one of the best established changes for this disorder [185]. Together these findings of expression alterations of genes implicated in schizophrenia with respect to these factors contribute an added complexity to their pre-existing relationships with the disorder.

Looking specifically at findings from microarray studies of schizophrenia [108, 224-226], we find additional overlap with the results from our meta-analysis. Synaptic machinery transcripts such as SYN2 and SYNJ1, reported as down-regulated in subjects with schizophrenia [98], are also down-regulated with age in our meta-signatures. We see similar patterns between our data and other genes reported to be downregulated in schizophrenia such as MAPK1, KCNK1, and CRYM. Careful analysis of such genes will allow us to explore the potential of interrelationships between these factors and schizophrenia, and reveal the underlying factors driving the changes in gene expression.

In summary, the results from this chapter show that meta-analysis of postmortem human brain gene expression data is both feasible and informative.
We have provided a list of gene expression changes associated with four factors that can potentially have confounding effects in postmortem brain studies. The identification of disease associated genes amongst our meta-signatures indicates that the transcriptional response of these genes may warrant special consideration when studying disease effects in neuropsychiatric illness.

Chapter 3: Genome-wide expression profiling of schizophrenia using a large combined cohort2

3.1 Introduction

Schizophrenia is a severe psychotic disorder that affects approximately one percent of the population worldwide [16]. Many groups have attempted to identify changes in gene expression in the brains of individuals with schizophrenia, often focusing on the prefrontal cortex [185, 225, 227]. Such studies have identified alterations in genes implicating various molecular processes. Some examples include (but are not limited to) synaptic machinery and mitochondrial-related transcripts [98, 224, 228, 229], immune function genes [108] and a reduction in oligodendrocyte and myelination-related genes [106, 230, 231]. The variety and scope of these processes, found in different subject cohorts, raises the question as to whether there are underlying commonalities in molecular signatures in schizophrenia. Such commonalities are presupposed by most genetic studies, which look for alleles overrepresented in large numbers of schizophrenic individuals [232-234]. It is important to establish if there are any common features of the disease at the molecular level.

The diversity of results in transcriptome studies can be attributed to many sources. Besides differences in the sampled cohorts and disease heterogeneity, discrepancies between transcriptome studies can be due to methodological differences in sample preparation, choice of platform, and data analysis. There are also issues that are especially pertinent to the analysis of postmortem human brain tissue.
One is the confounding effect of factors such as age, sex and medication. Such factors are often associated with relatively large gene expression changes [235], while psychiatric illnesses such as schizophrenia are associated with small effect sizes. If these factors are not correctly controlled for, they can mask or masquerade as expression patterns associated with the disease. Standard practice involves minimizing the effects of such factors either in the experimental design by sample matching or treating these factors as covariates in regression models. It is also increasingly appreciated that technical artifacts such as ‘batch effects’ can result in substantial variability [13, 236-238]. In addition, postmortem brain tissue is a limited resource, leading to small sample sizes with low statistical power. For this reason, most studies have not applied multiple test correction, and perform validation only on the same RNA samples that were used for profiling. All of these issues are likely to contribute to the differences in findings across studies. I hypothesize that a good way to address these problems is to re-analyze and meta-analyze the studies in question, a task I undertake in this chapter.

2 A version of this chapter has been published. (Mistry M, Gillis J, and Pavlidis P (2012). Genome-wide expression profiling of schizophrenia using a large combined cohort. Molecular Psychiatry. doi: 10.1038/mp.2011.172)

Meta-analysis to combine high-throughput genomics studies has become increasingly common in neuropsychiatry [7, 13, 233, 239, 240]. Combining datasets across studies increases power and facilitates the identification of gene expression changes that are consistent and reliable, reducing false positives. In a meta-analysis, multiple studies are statistically pooled to provide an overall estimate of significance of an effect, highlighting important yet subtle variations.
While meta-analysis has been used in the study of gene expression data [171, 192, 241], to our knowledge only a few studies have done so with postmortem human brain data [7-9, 13]. A cross-study analysis of psychosis was conducted across seven datasets using samples from the SMRI postmortem brain collections [7], in which subjects were divided into groups based on the presence or absence of psychotic features. As such, the control group consisted of patients with bipolar disorder (without psychotic features) and depression in addition to normal healthy controls. Additionally, the SMRI reports results from a cross-study analysis across schizophrenia datasets in their online genomics database, computing ‘consensus’ fold changes while adjusting for confounding variables. However, the studies used in these analyses use samples from the same two brain collections and are therefore not entirely independent. More recently, a comparative analysis was conducted across two independent schizophrenia cohorts; probes were identified as differentially expressed within each study and the intersecting probes between the two studies were reported [102]. Thus, while there have been attempts to meta-analyze schizophrenia expression profiling data, there has not yet been an integration using the primary data of more than two independent microarray studies.

In this chapter I present a cross-study analysis of seven microarray datasets comprising a total of 153 schizophrenia samples and 153 normal controls. We applied a linear modeling approach to control for factors such as age, brain pH and batch effects, and applied multiple testing corrections to control the false discovery rate. We show that we are able to detect small yet consistent and statistically significant changes. Careful control of extraneous factors using probe-specific statistical modeling results in gene expression changes associated with the disease effect.
Our results from this chapter confirm some previously reported expression changes in schizophrenia in addition to identifying potential new targets which suggest alterations in synaptic function.

3.2 Methods

3.2.1 Data Collection

Genome-wide expression data sets were selected on the basis of microarray platform, use of prefrontal cortex (BA 9, 10 or 46), the availability of information on covariates such as age, and finally the availability of the raw data. Each dataset comprises a cohort of neuropathologically normal subjects and a cohort of schizophrenia subjects, as diagnosed and reported in their respective studies (Table 11). Sources for data include the SMRI, the Harvard Brain Bank, and the Gene Expression Omnibus (GEO). GEO studies were identified by extensive manual and keyword searches. While the SMRI has additional data sets, these represent repeated runs of the samples from the same subjects, so we selected one dataset to represent each of the two SMRI brain collections. Two additional studies were obtained from the authors [107, 242]. Sample characteristics for the subjects were collected and are summarized in Table 12. Batch information was obtained using the ‘scan date’ stored in the CEL files; chips run on different days were considered different batches. Datasets consisted of single-channel intensity data generated from two Affymetrix platforms, but only probe sets on the HG-U133A chip from each dataset were used for analysis. Probe sets were re-annotated at the sequence level by alignment to the hg19 genome assembly, using methods essentially as previously described [238], and also cross-referenced with problematic probe lists provided by The final data matrix consisted of expression values for 22,215 probe sets and 306 samples.
3.2.2 Data Pre-processing

The raw data (“CEL”) files from all the datasets were pooled together and expression levels were summarized, log transformed and normalized by using the R Bioconductor [243] ‘affy’ package using default settings for the RMA algorithm. Data were also processed using four other pre-processing methods for evaluating the robustness of our meta-signatures. We decided to retain standard RMA as the method on which to centre the analysis, because RMA has been shown independently to be a high performer on gold standard data sets [139, 244, 245]. The four methods are as follows: 1) RMA with quantile normalization applied after summarization, 2) RMA using MAS5-style mean adjustment rather than quantile normalization, 3) RMA using MAS5-style mean adjustment rather than quantile normalization (applied after summarization as is typically done in MAS5) and 4) MAS5. We compared the results from each of these methods to our original meta-signatures using Pearson correlations and compared the significant probes (q < 0.1) by computing overlaps.

3.2.3 Data Quality Control

Sample outliers were then identified and removed from each dataset based on inter-sample correlation analysis, resulting in the removal of 13 samples (2 of these are the same outliers identified in a previous analysis of SMRI data). Briefly, a sample-by-sample correlation matrix was generated for each dataset by reducing each sample into a vector of probe expression values and taking all pair-wise Pearson correlations. Outlier samples were identified as those showing correlations less than 0.8 with all the other samples, and were removed from the dataset and not included in subsequent analyses.

3.2.4 Statistical Modeling

Gene expression values for each probe set were modeled using a standard FEM framework.
We also employed a model selection procedure, in which each probe set was modeled using the full model including all five factors, as well as various sub-models (an approach similar to that used previously [245]). For the full model, we treated Disease, Age, Brain pH, Batch date and Study as fixed effects for which unknown constants are to be estimated from the data. We generated four sub-models by inclusion/exclusion of selected parameters (Table 13). Each sub-model fit was compared to the full model fit using an ANOVA, whereby an F-statistic was computed to assess whether the loss of a parameter resulted in a substantial loss of explanatory power. Model comparison was also repeated using the Akaike Information Criterion (AIC), a measure of the relative goodness of fit between the two models. AIC is computed using the likelihood function for the estimated model and incurs a penalty for each parameter included. Results using the AIC method were highly correlated with those obtained using the ANOVA measure (0.99 Pearson correlation). For each probe set, the t-statistic for the disease effect was then extracted from the best model fit and p-values were computed using one-sided tests, performed independently for the two alternative null hypotheses (i.e. gene expression does not increase with schizophrenia and gene expression does not decrease with schizophrenia). The resulting p-values for the up- and down-regulated signatures were further adjusted for multiple testing using the q-value method [143] to control the FDR. Alternatively, we also explored mixed-effect models (MEM), treating either Study and/or Batch as a random effect. For each probe the goodness of fit was compared across the different models by using the AIC. For the majority of the probes the FEM fit resulted in the lowest AIC value, indicating best fit of the data to the FEM model.
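The model comparison and one-sided testing steps described in this section can be illustrated with a minimal Python sketch. This is not the code used in the thesis (the analysis was carried out in R); the function names, the Gaussian least-squares form of the AIC, and all numbers are assumptions introduced purely for illustration. The sketch takes residual sums of squares as given, computes the nested-model F-statistic and AIC for a full model and one sub-model, and splits a two-sided p-value into the two one-sided p-values using the sign of the t-statistic.

```python
import math


def f_statistic(rss_sub, rss_full, p_sub, p_full, n):
    """F-statistic for comparing a nested sub-model against the full model."""
    df_num = p_full - p_sub   # number of parameters dropped from the full model
    df_den = n - p_full       # residual degrees of freedom of the full model
    return ((rss_sub - rss_full) / df_num) / (rss_full / df_den)


def aic_gaussian(rss, n, k):
    """AIC of a least-squares fit with Gaussian errors, up to an additive
    constant shared by all models fitted to the same n observations."""
    return n * math.log(rss / n) + 2 * k


def one_sided_p_values(t_stat, p_two_sided):
    """Return (p_up, p_down) for a symmetric t-test, given the two-sided p-value."""
    half = p_two_sided / 2.0
    if t_stat > 0:
        return half, 1.0 - half
    return 1.0 - half, half


# Hypothetical values for a single probe set (not taken from the thesis data).
n = 105
rss_full, p_full = 100.0, 5
rss_sub, p_sub = 120.0, 3

f_value = f_statistic(rss_sub, rss_full, p_sub, p_full, n)
aic_full = aic_gaussian(rss_full, n, p_full)
aic_sub = aic_gaussian(rss_sub, n, p_sub)
p_up, p_down = one_sided_p_values(t_stat=-2.5, p_two_sided=0.02)
print(f_value, aic_full < aic_sub, p_up, p_down)
```

In this toy case the full model has the lower AIC and the F-statistic is large, so both criteria would favour retaining the dropped parameters. The negative t-statistic yields a small p-value only for the down-regulated test.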
3.2.5 Literature-derived Signatures

Our signatures were compared to probe lists obtained from the original publication for each of the datasets used in our analysis. As the two SMRI datasets were unpublished, gene lists were compiled from the SMRI online genomics database. For the Mclean dataset we used the list of ‘significant probes’ as reported in [102]. For the Haroutunian data set we chose to use probes selected at the ‘low stringency criteria’ described in [107]. Details on each of these gene sets can be found in Table 15 (probes were excluded if they were not on the HG-U133A chip). Additional signatures for comparison were obtained for published schizophrenia expression profiling studies, and a list of the top 45 candidate schizophrenia genes reported in the SZGene database [232]. Agreement of the meta-signature ranking with each validation gene set was assessed using ROC analysis. A meta-signature with an AUC score closer to 1.0 indicates many genes in the validation set are near the top of the ranked list. On the other hand, a score closer to 0.5 reflects that the validation gene set is randomly distributed across the ranking.

3.2.6 Enrichment Analysis

To characterize each meta-signature we looked for enrichment of GO terms [197] using the gene score re-sampling (GSR) method in ErmineJ [198, 246], and we evaluated CNS cell type enrichment by cross-referencing the genes in each cluster with published lists of neuron, oligodendrocyte and astrocyte marker genes [247]. We also evaluated each meta-signature against modules of coexpressed genes in the human brain as reported in [12]. We obtained the module membership data pertaining to the cortex dataset (CTX) consisting of 67 samples representing four cortical areas, and analyzed using the U133A array. The CTX gene coexpression network identifies a total of 19 modules to which we compared the probes from each of our meta-signatures by computing overlaps.
Significance of overlap was corrected for multiple testing by use of the Benjamini-Hochberg method [142].  3.2.7 Network Analysis We evaluated the path-length and node degree (number of associations) properties of the meta-signature genes in a large human PPI network obtained by aggregating data from multiple sources [248-253]. The network contains 100,623 unique interactions among 11,697 genes. Path lengths in the network were measured using Dijkstra’s algorithm [152]. Statistical significance was assessed by reference to an empirical null distribution obtained by randomly sampled 10000 gene sets of similar size and node degree.  65  3.3 Results For each of our samples we obtained information pertaining to age, sex, brain pH, and PMI. These factors were assessed for significant differences between the control and schizophrenia cohorts to help determine the selection of factors used as fixed effects for our model. We observed no significant differences in age and PMI, and the number of males and females between the groups were well matched (Table 12). Brain pH, however, was significantly different between the two cohorts (t-test; p = 0.001). P-value distributions for each demographic variable indicated considerable differential expression for age and pH and PMI, but a fairly uniform distribution was observed for sex. We also found it was necessary to correct for “batch effects” (technical artifacts caused by running chips on different days or even years [238]), as they contributed the vast majority of variance in gene expression. Based on these observations we chose to include only age, pH and batch (in addition to disease) as fixed effects in our model.  Each probe was considered in a model selection procedure, to identify probe sets that were differentially expressed between schizophrenia and control samples. After multiple test correction we identified a meta-signature of 39 up-regulated and 86 down-regulated probes at an FDR of 0.1 (Table 14). 
If we assess the number of unique genes that appear in each signature we obtain a list of 25 up-regulated and 73 down-regulated genes. These numbers reflect several cases in which a gene appears in our signature more than once, suggesting higher confidence in the finding of expression changes for those genes. Figure 7 shows the expression levels of the top down-regulated probe we identified (mapping to the gene NECAB3). As expected, expression changes were small (~15% expression change), and more evident in some datasets. As required by our modeling procedure, the direction of expression changes is mostly consistent.

While our linear modeling approach controlled for the effects of age and brain pH, we checked our signatures against gene lists for pH and age from our study of normal postmortem human brain in Chapter 2 [235]. The overlap was significant only for our down-regulated signature, which contains 32 genes previously identified to be down-regulated by age. Because our profiles are age-corrected and our cohorts age-matched, this suggests overlap in expression changes in age and schizophrenia rather than a confounding effect. We also cross-referenced our schizophrenia signatures with gene lists for sex and PMI (from Chapter 2), the two factors excluded from our model selection approach. We observed a total of three overlapping genes, suggesting the effects from these factors are likely subtle and do not dominate our results. We also sought to address other factors that we were unable to account for in our approach, such as medication effects and alcohol and drug abuse. Using gene lists provided by the SMRI Online Genomics Database, we extracted significant gene lists (p < 0.001; FC > 1.2) pertaining to the effects of lifetime alcohol use (23 genes), lifetime drug use (26 genes), and lifetime antipsychotics (69 genes) in subjects with schizophrenia. A comparison of each of these lists to our meta-signatures identified only two overlapping genes.
We found that KCNK1, which is present in our down-regulated signature, also increases with lifetime alcohol use. From the up-regulated signature, the gene LPL appears to increase with lifetime antipsychotic use and decrease with increased drug use.

To test the robustness of these findings, we used a jackknife procedure, sequentially removing one of the seven studies and performing the meta-analysis on the remaining six, for each study in turn. We expected that results highly influenced by a single data set would not be stable across jackknife runs. Each leave-out iteration resulted in a new meta-signature, which was then ranked by q-value and compared against the final meta-signature. The range of rank correlations among jackknife iterations (0.87 - 0.99) illustrates the robustness of our meta-signatures, demonstrating that our results are not highly biased by any single dataset. The lowest correlations were observed upon removal of the Bahn and GSE21138 datasets (0.88 and 0.87, respectively), suggesting that these datasets may be contributing a slightly stronger signal, particularly to the up-regulated signature. The lack of significant genes at q < 0.1 in the signature for those jackknife runs corroborates this finding. Finally, the top 100 probes were taken from each jackknife signature and an intersection set was retained to form a ‘core signature’ of 16 down-regulated and 14 up-regulated probes (highlighted in Table 14). We consider these probes to be the most reliable findings from our study as they are relatively insensitive to the choice of data sets used. In Figure 8, we have assembled the ‘core signatures’ and plotted expression levels within each dataset with samples separated into control and schizophrenia groups. For some studies we observe a more obvious gradient between the two groups illustrating expression change, and for others the difference is more subtle.
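The jackknife procedure above can be sketched as follows. This is an illustration under stated assumptions, not the code used for the analysis: the `combine` function is a hypothetical stand-in that averages per-study scores in place of the actual meta-analysis, rankings are by that score instead of by q-value, and the study names, probe names and scores are invented.

```python
def combine(scores_by_study):
    """Hypothetical stand-in for the meta-analysis: mean score per probe across studies."""
    probes = next(iter(scores_by_study.values())).keys()
    n = len(scores_by_study)
    return {p: sum(s[p] for s in scores_by_study.values()) / n for p in probes}

def ranked(scores):
    """Probes ordered from highest to lowest score (ties broken by probe name)."""
    return sorted(scores, key=lambda p: (-scores[p], p))

def spearman(order_a, order_b):
    """Spearman rank correlation between two orderings of the same probes."""
    n = len(order_a)
    pos_b = {p: i for i, p in enumerate(order_b)}
    d2 = sum((i - pos_b[p]) ** 2 for i, p in enumerate(order_a))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

def jackknife(scores_by_study, top_n):
    """Leave each study out in turn; return rank correlations and the core probe set."""
    full = ranked(combine(scores_by_study))
    correlations, core = {}, None
    for left_out in scores_by_study:
        rest = {k: v for k, v in scores_by_study.items() if k != left_out}
        order = ranked(combine(rest))
        correlations[left_out] = spearman(full, order)
        top = set(order[:top_n])
        # Core signature: probes in the top_n of every leave-one-out run.
        core = top if core is None else core & top
    return correlations, core

# Invented scores for three studies and four probes.
toy = {
    "A": {"p1": 4.0, "p2": 3.0, "p3": 2.0, "p4": 1.0},
    "B": {"p1": 3.9, "p2": 2.1, "p3": 3.2, "p4": 0.8},
    "C": {"p1": 2.7, "p2": 4.4, "p3": 1.1, "p4": 2.3},
}
correlations, core = jackknife(toy, top_n=2)
```

A study whose removal produces a noticeably lower correlation is one that contributes a stronger signal to the combined ranking, which is the interpretation applied to the Bahn and GSE21138 datasets above.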
To assess the sensitivity of our results to the choice of pre-processing algorithm we re-analyzed our data with four different methods (see Methods). We obtained good agreement between the results of each method and our final meta-signatures despite dramatic changes to the pre-processing procedure. Additionally, we took the intersection of significant probes from each of the different methods to assemble a list of probes that are completely insensitive to the choice of pre-processing method. This list comprises a total of 5 up-regulated and 8 down-regulated probes, highlighting novel genes and genes that have been previously implicated in independent studies (marked in Table 14).

The set of 98 differentially expressed genes identified from our analysis implicates a variety of genes and functional groups, many of which have been previously reported in the literature. For example, down-regulation of mu-crystallin (CRYM), potassium channel subfamily K member 1 (KCNK1), and F-box protein 9 (FBXO9) and up-regulation of lipoprotein lipase (LPL) and lysyl hydroxylase 2 (PLOD2) are concordant with findings from previous studies [106, 108, 229, 254]. We manually evaluated each differentially expressed gene according to literature reports and Uniprot definitions in order to assign genes to high-level functional categories. In the down-regulated signature we found genes to cluster into functional groups pertaining to various molecular mechanisms of neuronal communication. On the pre-synaptic side we found genes involved in cell adhesion (for example, OPCML), and neurotransmitter secretion (for example, APBA2, PCSK2). We also observed genes involved in signalling pathways that elicit metabotropic effects (for example, GNAL, OPN3, CRHR1, RGS7, GNB5). Concordant with previous studies, we also identified various genes involved in oxidative phosphorylation (for example, CYP26B1, COQ4, SLC25A15, ATP5C1, SLC25A12) and ubiquitination (for example, FBXO9, COPS7B, USP19, TACC2, DCAF8). From our up-regulated signature we found a number of transcription-related genes (for example, BAZ1A, CBFA2T2, BBX, ANP32A) and genes involved in translation (for example, EIF3E, EIF2C3, PAIP2B). Other genes include cell organization/maintenance factors (for example, PKP4, PLOD2) and various stress response genes (for example, SMG1). Additionally, for both signatures we found a small group of genes with unknown function.

We performed a functional analysis to systematically detect enrichment of biological processes, using GO annotations. After multiple test correction, we were unable to identify any significant terms using the over-representation analysis (ORA) method, but significant terms were found using the threshold-free GSR algorithm [198]. For the 73 genes with decreasing expression levels in schizophrenia, the top GO categories included those involved in energy metabolism, ubiquitination, neurotransmitter transport and various metabolic processes. The 25 schizophrenia up-regulated genes showed enrichment in various immune-related GO categories in addition to terms related to cellular localization. While some of these categories corroborate findings from the above manual evaluation, others do not (e.g. immune response). It should be noted that the GSR algorithm provides a functional representation of top ranking genes, but not necessarily the significant ones (q < 0.1) that we discussed previously.

Because the genes we identified were functionally diverse, we hypothesized there might be additional insight gained at the level of gene networks. In particular we asked whether the signature genes had any unusual properties in their protein interaction patterns, compared to carefully selected groups of background genes (see Methods).
Taking all 98 genes together, we specifically looked at within-group connectivity, node degree (the number of connections) and path lengths between genes. Our most striking finding is that the genes within our set were significantly closer to one another in the network than expected by chance (p < 0.02). This relationship suggests a higher likelihood of functional relationships among the signature genes [160, 249]. In contrast, the signature genes did not possess a particularly high node degree within the network (23rd percentile in the whole network); that is, they tend not to be ‘hubs’.

We also evaluated each meta-signature against modules of coexpressed genes in the human cortex as reported in [12]. Our up- and down-regulated signatures significantly overlap with the “turquoise” and “brown” modules (p < 0.01 and p < 0.05 respectively; Table 16). These are modules of interest as they display a notable extent of preservation across datasets in [12], suggesting that differential expression of our signature genes may be disrupting core networks in the human brain. This also reinforces the importance of gene network structure analysis in determining the basis of this disorder.

To characterize our schizophrenia signatures with respect to cellular organization in the cortex we cross-referenced our ranked meta-signatures with published lists of CNS cell type markers [247]. An ROC analysis of the meta-signatures for astrocytes, oligodendrocytes and neurons revealed no preferential association with our ranked meta-signatures. However, evaluating only the significant probes (q < 0.1) in our signatures, we find an enrichment of probes mapping to neuronal markers in the down-regulated signature.

Each meta-signature was evaluated against the top 45 candidate schizophrenia genes reported in the SZGene database. Agreement of the meta-signature ranking with the SZGene set was assessed using receiver operating characteristic (ROC) curve analysis. The SZGene list appeared to be randomly distributed across our ranking. We also computed a simple overlap between the 45 candidate genes and our results, identifying OPCML as the only common gene.

We were interested in comparing our re-analysis of these seven data sets to the “hit lists” provided by the data set providers. We first tested whether our meta-signature gene rankings were enriched for genes reported by the original study, using ROC analysis (Table 15). We observed high AUC scores for most gene sets; however, the Haroutunian and GSE21138 studies exhibited exceptionally low scores, possibly in part because the original studies have an added dimension of variability, as gene sets were generated for stratified cohorts as opposed to a case versus control comparison. While high AUCs suggest some similarity in the results, a more sensitive analysis examines just the very top of the rankings. We therefore computed the overlap of each reference gene set with the meta-signature of genes collected at q < 0.1. This reveals a handful of probes in each study that also show up in our significant gene lists (Table 15). We also re-analyzed each individual dataset using our linear modeling approach. This allowed a fairer evaluation of the contribution of each to the final meta-signatures, since the original studies used a variety of methods for gene selection. After correcting for multiple testing, only two of the data sets (Altar and Haroutunian) yielded significant genes at q < 0.1. We therefore considered the top 100 probes from each dataset, and computed overlaps with our meta-signatures. The overlap is highest with the Bahn and GSE21138 datasets, which is in accord with the finding that these datasets contribute a stronger signal to the meta-signature than the others.
Despite being the only two data sets which have significant differential expression after multiple test correction, the Altar and Haroutunian results showed very little overlap with the final meta-signature. We note that considering the seven data sets independent of our meta-signature, there was no overlap among their top 100 probes. Similarly, there was little correlation of the overall rankings of probes among the data sets (< 0.3 correlation, with most values closer to zero). Overall these results suggest that our re-analysis is concordant with the analysis conducted by the original study authors, subject to important differences likely attributable to our analytic approach (for example, correction for batch effects), and that the meta-analysis reveals commonalities which contribute only weakly to the findings of the individual studies.

Table 11: Schizophrenia datasets

Dataset | Reference | Microarray Platform | Brain region(s) | No. of Subjects (CTL:SZ)
Stanley Bahn | SMRI database | HG-U133A | Frontal BA46 | 31 : 34
Stanley AltarC | SMRI database | HG-U133A | Frontal BA46/10 | 11 : 9
Mclean | HBTRC | HG-U133A | Prefrontal cortex (BA9) | 26 : 19
Mirnics | Garbett K. et al, 2008 [242] | HG-U133A/B | Prefrontal cortex (BA46) | 6 : 9
Haroutunian | Katsel P. et al, 2005 [107] | HG-U133A/B | Frontal (BA10/46) | 29 : 31
GSE17612 | Maycox P. et al, 2009 [102] | HG-U133 Plus 2.0 | Anterior prefrontal cortex (BA10) | 21 : 26
GSE21138 | Narayan S. et al, 2008 [255] | HG-U133 Plus 2.0 | Frontal (BA46) | 29 : 25

SMRI, Stanley Medical Research Institute; HBTRC, Harvard Brain Tissue Resource Centre (Mclean66 collection)

Table 12: Summary of demographic variables across combined cohort

Variable | Control | Schizophrenia | P-value
Number of Subjects | 153 | 153 |
Age | 56.25 ± 20 | 55.27 ± 19 | p = 0.67
Sex | 101M : 52F | 113M : 40F | p = 0.1
Brain pH | 6.5 ± 0.28 | 6.39 ± 0.29 | p = 0.001
PMI | 21.95 ± 15.3 | 22.65 ± 15.2 | p = 0.69

F, female; M, male; PMI, postmortem interval. There were 319 samples collected across seven datasets, of which 306 passed quality control analysis. The summary demographics (mean ± standard deviation) and t-test p-values for group differences are shown for those subjects used in the analysis. For sex we report the p-value generated from a chi-squared test for equality of proportions.

Table 13: Probe model selection across schizophrenia signatures

Model | Description | Up-regulated Signature (q < 0.1) | Down-regulated Signature (q < 0.1)
Full Model | Disease + Age + pH + Batch + Study | 5 | 15
Model 2 | Disease + Age + Batch + Study | 22 | 42
Model 3 | Disease + pH + Batch + Study | 7 | 20
Model 4 | Disease + I(Age + pH) + Batch + Study | 5 | 6
Model 5 | Disease + Batch + Study | 0 | 3

I(Age + pH) models the case in which the effect of age and pH on expression is the same. The full model and each of the sub-models used in model selection are described above. Disease, Batch and Study are factors retained in each sub-model; the inclusion/exclusion of Age and pH being the distinguishing factors. The numbers reported indicate the number of probes in the final meta-signature (q < 0.1) that were best fit to each model.
73  Table 14: Schizophrenia meta-signatures A: Up-regulated in schizophrenia  Model  Fold Change  Q-value  Overlapping Factor  Probe Specificity  2  1.11  9.58E-03  age  Insensitive  4  1.04  9.58E-03  Non-specific  2  1.11  2.05E-02  Non-specific  216048_s_at  Multiple gene mappings Rho-related BTB domain containing RHOBTB3 3  2  1.11  2.05E-02  213187_x_at  FTL  2  1.11  2.18E-02  202619_s_at  PLOD2  fullModel  1.12  5.07E-02  age, pH  202975_s_at  RHOBTB3  fullModel  1.16  6.31E-02  age, pH  204060_s_at  Multiple gene mappings  2  1.11  7.16E-02  209747_at  Multiple gene mappings  212788_x_at  FTL  ferritin, light polypeptide  213501_at  ACOX1  acyl-CoA oxidase 1, palmitoyl  216762_at  Unknown  218345_at  TMEM176A  transmembrane protein 176A  219156_at  SYNJ2BP  221503_s_at  KPNA3  synaptojanin 2 binding protein karyopherin alpha 3 (importin alpha 4)  59625_at  NOL3  202506_at 203549_s_at 207543_s_at  P4HA1  209144_s_at  CBFA2T2  219426_at  EIF2C3  Probe  Gene Symbol  Gene Description  203548_s_at  LPL  210057_at  SMG1  lipoprotein lipase SMG1 homolog, phosphatidylinositol 3-kinase-related kinase  209069_s_at  ferritin, light polypeptide procollagen-lysine, 2-oxoglutarate 5dioxygenase 2 Rho-related BTB domain containing 3  2  1.11  7.16E-02  fullModel  1.09  7.16E-02  2  1.06  7.16E-02  2  1.05  7.16E-02  fullModel  1.10  7.16E-02  4  1.03  7.16E-02  3  1.02  7.16E-02  nucleolar protein 3  2  1.08  7.16E-02  SSFA2  sperm specific antigen 2  2  1.11  7.20E-02  LPL  lipoprotein lipase prolyl 4-hydroxylase, alpha polypeptide I myeloid translocation gene-related protein 1 eukaryotic translation initiation factor 2C, 3  2  1.14  7.46E-02  3  1.12  7.46E-02  3  1.04  7.46E-02  2  1.05  7.46E-02  age, pH  Insensitive  Insensitive Non-specific  Mis-targeted  age  74  Model  Fold Change  Q-value  2  1.21  7.46E-02  2  1.07  8.22E-02  bobby sox homolog (Drosophila)  2  1.05  8.22E-02  crumbs homolog 1 (Drosophila)  2  1.09  8.22E-02  Multiple gene mappings  2  1.07  
8.51E-02  Non-specific  4  1.05  8.60E-02  Non-specific  208697_s_at  Multiple gene mappings eukaryotic translation initiation factor EIF3E 3, subunit E  4  1.05  8.60E-02  213016_at  BBX  2  1.11  8.60E-02  201051_at  ANP32A  3  1.06  8.81E-02  204032_at  BCAR3  3  1.06  8.81E-02  age  211994_at  WNK1  2  1.12  8.81E-02  age  216520_s_at  4  1.05  8.81E-02  Non-specific  fullModel  1.12  8.98E-02  Insensitive  217985_s_at  Multiple gene mappings ATP-binding cassette, sub-family A, ABCA1 member 1 ATP-dependent chromatin BAZ1A remodeling protein  2  1.05  9.43E-02  218826_at  SLC35F2  solute carrier family 35, member F2  2  1.05  9.43E-02  201927_s_at  PKP4  plakophilin 4  3  1.11  9.46E-02  201929_s_at  PKP4  plakophilin 4  3  1.16  9.46E-02  220532_s_at  TMEM176B  transmembrane protein 176  2  1.13  9.46E-02  Probe  Gene Symbol  221868_at  PAIP2B  211992_at  WNK1  Gene Description poly (A) binding protein interacting protein 2B WNK lysine deficient protein kinase 1  213015_at  BBX  220522_at  CRB1  211997_x_at 200063_s_at  203504_s_at  bobby sox homolog (Drosophila) Acidic (leucine-rich) nuclear phosphoprotein 32 fammily, member A breast cancer anti-estrogen resistance 3 WNK lysine deficient protein kinase 1  Overlapping Factor  Probe Specificity  age Insensitive  Non-specific  75  B: Down-regulated in schizophrenia Probe  Gene Symbol  Gene Description  Model  Fold Change  Q-value  210720_s_at  NECAB3  N-terminal EF-hand calcium binding protein  2  0.92  6.17E-03  212646_at  RFTN1  raftlin, lipid raft linker 1  2  0.91  6.17E-03  213924_at  Unknown  2  0.89  6.17E-03  Overlapping Factor  Probe Specificity Non-specific  age Insensitive  206355_at  GNAL  guanine nucleotide binding protein G(olf) subunit alpha  2  0.87  7.42E-03  220807_at  HBQ1  hemoglobin, theta 1  2  0.91  7.42E-03  220741_s_at  PPA2  pyrophosphatase 2  3  0.92  9.55E-03  205694_at  TYRP1  tyrosinase-related protein 1  2  0.88  1.20E-02  Insensitive  219032_x_at  OPN3  opsin3  2  0.89  2.71E-02  
Insensitive  205510_s_at  Multiple gene mappings  2  0.93  2.74E-02  212987_at  FBXO9  F-box protein 9  fullModel  0.88  3.28E-02  age  Insensitive  202596_at  ENSA  endosulfine alpha  fullModel  0.90  4.20E-02  age  Insensitive  203719_at  ERCC1  DNA excision repair protein  2  0.94  4.20E-02  218328_at  COQ4  4  0.93  4.20E-02  218653_at  SLC25A15  coenzyme Q4 homolog solute carrier family 25 (ornithine transporter) member 15  2  0.94  4.20E-02  206290_s_at  RGS7  2  0.89  4.36E-02  age  Non-specific  218262_at  RMND5B  2  0.95  4.36E-02  219825_at  CYP26B1  fullModel  0.86  4.36E-02  206356_s_at  GNAL  regulator of G-protein signaling 7 required for meiotic nuclear division 5 homolog B cytochrome P450, family 26, subfamily B guanine nucleotide binding protein G(olf) subunit alpha  2  0.90  4.81E-02  219982_s_at  3  0.89  4.81E-02  203851_at  Multiple gene mappings insulin-like growth factor binding IGFBP6 protein 6  2  0.91  5.00E-02  203349_s_at  ETV5  2  0.92  6.76E-02  age, sex  205413_at  MPPED2  fullModel  0.89  6.76E-02  age  ets variant 5 metallophosphoesterase domain containing 2  Insensitive age  age  Insensitive  76  Probe  Gene Symbol  Gene Description  219997_s_at  COPS7B  COP9 constitutive photomorphogenic homolog subunit 7B  206209_s_at  CA4  206215_at  OPCML  207949_s_at  ICA1  209871_s_at  APBA2  215003_at  DGCR5  201190_s_at  PITPNA  201310_s_at  C5orf13  201694_s_at  EGR1  202688_at  TNFSF10  204000_at  GNB5  205489_at  CRYM  221983_at  FAM134A  202322_s_at  GGPS1  crystallin mu family with sequence similarity 134 , member A geranylgeranyl diphosphate synthase 1  203769_s_at  STS  205003_at  DOCK4  208870_x_at  ATP5C1  212942_s_at  KIAA1199  Model  Fold Change  Q-value  Overlapping Factor  4  0.97  7.28E-02  carbonic anhydrase IV opioid binding protein/cell adhesion molecule like  fullModel  0.92  7.41E-02  age  2  0.94  7.41E-02  age  islet cell autoantigen 1 amyloid beta (A4) precursor protein binding, family A, member 2 DiGeorge syndrome 
critical region gene 5 (non-protein coding) phosphatidylinositol transfer protein alpha chromosome 5 open reading frame 13  2  0.95  7.41E-02  age  5  0.95  7.41E-02  fullModel  0.93  7.41E-02  fullModel  0.93  7.54E-02  age  2  0.93  7.54E-02  age  2  0.89  7.54E-02  3  0.83  7.54E-02  4  0.94  7.54E-02  age, pH  fullModel  0.87  7.54E-02  age  3  0.93  7.54E-02  5  0.95  8.55E-02  steroid sulfatase, isozyme S  3  0.91  8.55E-02  dedicator of cytokinesis 4 ATP synthase subunit gamma, mitochondrial  3  0.95  8.55E-02  fullModel  0.93  8.55E-02  2  0.92  8.55E-02  2  0.93  8.55E-02  age  3  0.90  8.60E-02  sex  4  0.87  8.66E-02  early growth response 1 tumor necrosis factor (ligand) superfamily, member 10 guanine nucleotide binding protein, beta 5  34726_at  CACNB3  201328_at  ETS2  calcium channel voltage-dependent subunit beta 3 v-ets erythroblastosis virus E26 oncogene homolog 2  203339_at  SLC25A12  solute carrier family 25, member 12  Probe Specificity  Non-specific  Non-specific  77  Probe  Gene Symbol  Gene Description  Model  Fold Change  Q-value  Overlapping Factor  204679_at  KCNK1  212252_at  CAMKK2  213366_x_at  ATP5C1  202159_at  FARSA  203188_at  B3GNT1  potassium channel, subfamily K member 1 calcium/calmodulin-dependent protein kinase beta ATP synthase subunit gamma, mitochondrial phenylalanyl-tRNA synthetase, alpha subunit Beta-1,3-Nacteylglucosaminyltransferase 1  2  0.88  8.66E-02  age  2  0.94  8.66E-02  age  fullModel  0.93  8.66E-02  2  0.95  8.76E-02  age  204002_s_at  ICA1  islet cell autoantigen 1  2  0.95  8.76E-02  age  2  0.96  8.76E-02  age  204869_at  PCSK2  neuroendocrine convertase 2  2  0.89  8.76E-02  age  205794_s_at  NOVA1  neuro-oncological ventral antigen 1  3  0.94  8.76E-02  age  209093_s_at  GBA  glucosidase beta acid  2  0.95  8.76E-02  209699_x_at  Multiple gene mappings  2  0.93  8.76E-02  210638_s_at  FBXO9  F-box protein 9  fullModel  0.88  8.76E-02  218125_s_at  CCDC25  coiled-coil domain containing 25  2  0.92  
8.76E-02  220031_at  OTUD7B  3  0.96  8.76E-02  202852_s_at  FLJ11506  OTU domain containing 7B alpha- and gamma-adaptin binding protein  3  0.91  8.79E-02  age, PMI  206490_at  DLGAP1  PSD-95 binding protein  3  0.94  8.79E-02  age  205114_s_at  Multiple gene mappings  2  0.88  8.97E-02  203476_at  TPBG  trophoblast glycoprotein  2  0.89  9.61E-02  204676_at  TMEM186  transmembrane protein 186  3  0.94  9.61E-02  205381_at  LRRC17  Leucine rich repeat containing 17  3  0.91  9.61E-02  210874_s_at  HYAL3  hyaluronoglucosaminidase 3  2  0.94  9.61E-02  211038_s_at  Multiple gene mappings  3  0.95  9.61E-02  214285_at  FABP3  fatty acid binding protein 3  fullModel  0.90  9.61E-02  age  217946_s_at  SAE1  SUMO1 activating enzyme subunit 1  fullModel  0.95  9.61E-02  age  221921_s_at  CADM3  cell adhesion molecule 3  2  0.97  9.61E-02  218569_s_at  KBTBD4  kelch repeat and BTB domain 4  2  0.94  9.69E-02  Probe Specificity  Insensitive  Non-specific age  Non-specific age  Non-specific Non-specific  78  Probe  Gene Symbol  Gene Description  Model  Fold Change  Q-value  Overlapping Factor  201410_at  PLEKHB2  202250_s_at  DCAF8  202289_s_at  TACC2  202667_s_at  SLC39A7  204685_s_at  ATP2B2  solute carrier family 39, member 7 plasma membrane calciumtransporting ATPase 2  206023_at  NMU  neuromedin U  211796_s_at  214619_at  Multiple gene mappings proteasome 26S subunit, nonPSMD14 ATPase regulatory subunit 14 putative splicing factor, SFRS14 arginine/serine-rich 14 corticotropin releasing hormone CRHR1 receptor 1  pleckstrin homology domain  3  0.94  9.82E-02  age  DDB1 and CUL4 associated factor 8  3  0.94  9.82E-02  transforming acidic colied-coil containing protein 2  2  0.94  9.82E-02  3  0.95  9.82E-02  2  0.94  9.82E-02  age  fullModel  0.90  9.82E-02  age  fullModel  0.92  9.82E-02  3  0.95  9.82E-02  3  0.97  9.82E-02  4  0.95  9.82E-02  214883_at  THRA  4  0.96  9.82E-02  216970_at 64488_at  Unknown  2  0.96  9.82E-02  Mis-targeted  Multiple gene mappings  2  
0.95  9.82E-02  Non-specific  218032_at  SNN  stannin  2  0.96  9.84E-02  205061_s_at  EXOSC9  exosome component 9  3  0.96  9.97E-02  206435_at  B4GALNT1  glycolipid synthesis  2  0.96  9.97E-02  214674_at  USP19  ubiquitin specific peptidase 19  5  0.96  9.97E-02  218968_s_at  ZFP64  zinc finger protein 64 homolog  2  0.94  9.97E-02  212296_at 214092_x_at  thyroid hormone receptor, alpha  Probe Specificity  Non-specific  age Non-specific age  Each probe is listed with its associated gene symbol, gene description, linear model of best fit, and fold change. Score from meta-analysis is provided by q-value, an FDR adjusted p-value, see [143]. The overlapping factor column identifies genes that also appear in our previously reported age, pH, PMI and sex gene lists. Finally, we have also included a column for probe specificity. Probes identified as ‘mis-targeted’ or ‘non-specific’ were found to overlap when cross-referenced with lists extracted from Probes identified as ‘insensitive’ are robust expression changes that are completely insensitive to the choice of pre-processing algorithm. Rows highlighted in green indicate ‘core’ signature genes retained after jackknife validation.  
Table 15: Comparison of meta-signatures with findings from original studies

Dataset | Significance Criteria | Down: Probes | Down: AUC | Down: Overlap | Up: Probes | Up: AUC | Up: Overlap
Stanley AltarC | p < 0.05 | 848 | 0.70 | 14 | 34 | 0.78 | 0
Stanley Bahn | p < 0.05 | 69 | 0.85 | 6 | 91 | 0.89 | 5
Mclean [102] | p < 0.05, intensity > 30 | 570 | 0.75 | 13 | 300 | 0.76 | 7
Mirnics [242] | p < 0.05, |ALR| > 0.58 | 7 | 0.55 | 0 | 4 | 0.94 | 1
Haroutunian (BA10) [107] | p < 0.05, |FC| > 1.4, present calls > 60% | 5 | 0.5 | 0 | 14 | 0.66 | 1
Haroutunian (BA46) [107] | p < 0.05, |FC| > 1.4, present calls > 60% | 50 | 0.21 | 0 | 11 | 0.59 | 0
GSE17612 [102] | p < 0.05, intensity > 30 | 548 | 0.74 | 22 | 466 | 0.71 | 7
GSE21138 [255] (Short DOI) | p < 0.05; |FC| > 1.25 | 482 | 0.50 | 1 | 173 | 0.57 | 1
GSE21138 [255] (Int DOI) | p < 0.05; |FC| > 1.25 | 132 | 0.60 | 0 | 78 | 0.69 | 0
GSE21138 [255] (Long DOI) | p < 0.05; |FC| > 1.25 | 37 | 0.63 | 1 | 89 | 0.63 | 1

DOI, duration of illness; AUC, area under the curve; FC, fold change. The findings from each dataset are summarized including only probes used in our analysis. The ‘Probes’ column indicates the number of probes reported by the study based on study-specific significance criteria. AUC values were computed from an ROC analysis of each gene set against the corresponding ranked meta-signature. Overlap values report the number of probes in each gene set that overlap with probes from the meta-signatures at q < 0.1.
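The AUC values in Table 15 summarize how close a reference gene set sits to the top of a ranked meta-signature. A minimal sketch of that calculation is given below, using the equivalence between the area under the ROC curve and the fraction of (member, non-member) pairs in which the member is ranked higher. The function name and the toy ranking are hypothetical, and the sketch assumes a strict ranking without ties.

```python
def ranking_auc(ranked_probes, gene_set):
    """AUC of a gene set against a ranking ordered from best to worst.

    Equals the fraction of (member, non-member) pairs in which the member
    is ranked above the non-member; 1.0 means all members are at the top,
    0.5 means they are spread as expected by chance.
    """
    members = [probe in gene_set for probe in ranked_probes]
    n_pos = sum(members)
    n_neg = len(members) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one member and one non-member in the ranking")
    neg_below = n_neg  # non-members not yet passed while walking down the ranking
    wins = 0
    for is_member in members:
        if is_member:
            wins += neg_below
        else:
            neg_below -= 1
    return wins / (n_pos * n_neg)

# Toy ranking of five probes, best first.
ranking = ["a", "b", "c", "d", "e"]
auc_top = ranking_auc(ranking, {"a", "b"})
auc_mixed = ranking_auc(ranking, {"a", "e"})
auc_bottom = ranking_auc(ranking, {"d", "e"})
```

Because the AUC weighs the whole ranking equally, a gene set can score well without any member reaching the q < 0.1 threshold, which is why the table reports the overlap counts alongside it.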
80  Table 16: Evaluating meta-signatures against brain-specific gene coexpression modules A: Genes down-regulated in schizophrenia Module Module Probe Size Overlap  Corrected p-value (BH)  Genes  *brown  868  9  0.019  turquoise  1115  9  0.06  PITPNA,ENSA,DOCK4,CRYM, OPCML,FBXO9,FABP3 C5orf13,IGFBP6,GNB5,MPPED2 RGS7,CAMKK2,RFTN1,CACNB3  grey  158  1  0.26  WDR42A  pink  98  1  0.15  FLJ11506  black  188  1  0.28  B3GNT1  *darkolivegreen  28  2  2.27E-03  SLC25A12,TPBG  yellow  837  2  0.72  DLGAP1,APBA2  green  502  2  0.45  USP19,CADM3  B: Genes up-regulated in schizophrenia Module Module Probe Size Overlap  Corrected p-value (BH)  Genes  *turquoise  1115  8  4.47E-04  ANP32A,PKP4,SSFA2, WNK1,BBX  *black  188  4  2.06E-04  *brown  868  7  4.5E-04  PLOD2,SYNJ2BP,EIF2C3 RHOBTB3,ABCA1,BCAR3, FTL, ACOX1,TMEM176B  grey  158  1  0.053  blue  1037  1  0.616  EIF3E  Each meta-signature was cross-referenced against modules of coexpressed genes as reported in [12]. Modules showing an overlap with meta-signature genes are listed. The total number of probes contained in each module, and the number of module probes overlapping with each meta-signature is also presented. Additionally, we have included the gene names for the signature probes contained in these modules. Benjamini-Hochberg (BH) correction was applied, and corrected p-values are reported. Modules highlighted with an asterisk (*) indicate significant overlaps (p < 0.05).  81  Figure 7: Example of consistent expression changes for a gene across data sets Expression data within each dataset after covariate correction is presented for the top down-regulated gene NECAB3. Plots are labeled with the associated dataset. Samples were separated into disease and control cohorts and expression was plotted as a boxplot. Individual sample values were overlaid on with red squares representing control individuals and blue triangles representing schizophrenics.  
Figure 8: Expression changes in the ‘core signatures’ For each probe in the core signatures (meaning they are retained as significant even after the removal of any single study), the corresponding data from each study was extracted and converted to a heat map. Expression values were normalized across all samples within each dataset, and as in Figure 7 the data are corrected for covariates such as batch and age. Rows represent probes and are labeled with their unique gene mapping if one exists. Columns represent samples. Grey bars represent the control brain samples, and black bars represent the schizophrenia samples. Light values in the heat map indicate higher expression values.

3.4 Discussion

In this chapter I present expression changes associated with schizophrenia which are consistent across up to seven independent cohorts of subjects. To my knowledge, the degree of validation and confirmation inherent in this analysis is unprecedented. Unlike previous studies, which use PCR assays to check results on the same RNA samples used for microarrays, or which compare at most two cohorts, I have identified changes in expression that are shared across independent subject cohorts, analyzed by laboratories distributed around the world. The results of this chapter provide a new window into the molecular changes that might underlie schizophrenia.

The larger number of down-regulated probes is in agreement with previous reports [98, 108, 227]. Many of the genes we have identified have been previously reported to be expressed in the brain, with some genes showing neuronal specificity. Some of the genes we report as differentially expressed have been previously implicated in schizophrenia, either through expression profiling studies of schizophrenia (KCNK1, CRYM, FBXO9), or genetic association studies (OPCML [234]).
We also identify three genes in our signature (up-regulated genes WNK1 and ABCA1 and down-regulated gene SNN) that overlap with results from a comparative analysis of two of the studies we used [102]. Additionally, we found functional gene groups discussed in previous expression studies of schizophrenia. Many of the same metabolic processes were observed in a study of 71 different metabolic gene groups in schizophrenia [229]. Also in agreement, genes related to energy pathways and mitochondrial function were found previously in dorsolateral PFC studies of schizophrenia [254, 256]. Over-expression of immune responses from our GSR analysis is also concordant with recent findings of over-expression in genes related to immune function in schizophrenia [108, 109, 255]. Thus, our results are supportive of at least some previous findings and reveal a previously unrecognized similarity across studies.

Our meta-signatures contain a number of interesting new candidate genes, particularly our down-regulated meta-signature, which potentially reflects alterations in neuronal communication. NOVA1 is a regulator of RNA splicing recently found to inhibit splicing of exon 6 from the dopamine receptor D2 gene, resulting in D2L, the long isoform of the receptor [257]. With NOVA1 decreasing in expression in schizophrenia, inhibition may be repressed, leading to higher than normal levels of the spliced D2S isoform, which is involved in neuron firing and dopamine release. The DLGAP1 gene encodes a protein interacting with PSD-95 and a complex of other proteins in the postsynaptic density. Decreased expression of this scaffold protein may have consequences for anchoring and organizing receptors and signaling molecules on the postsynaptic side. Moreover, we have identified several genes associated with calcium signaling (CACNB3), binding (SLC25A12, NECAB3) and homeostasis (CCL3, ATP2B2), processes of likely relevance to schizophrenia [258].
We have also identified genes that associate with the GPCR signaling pathway. One example is GNAL, a gene encoding the alpha subunit of the G-protein Golf, expressed in many regions of the brain. Given the critical roles of G-proteins, it is plausible that GNAL (and other GPCR-related genes) may have a role in the pathophysiology of schizophrenia [259]. GNAL expression has not been previously shown to be affected by schizophrenia, but it is located in a chromosomal region (18p.2) that has been linked to schizophrenia and bipolar disorder. More specifically, a di-nucleotide repeat in intron 5 of the GNAL gene has been linked to schizophrenia in some families [260]. These expression changes concerning synaptic function may reduce neuronal energy demand in the brains of affected patients, thus providing an explanation for the down-regulation of various oxidative phosphorylation and energy metabolism genes that we observe.

We also sought to examine whether our signature genes could be inferred to share some previously unknown function, by making use of gene network analysis. One way to do this is by the principle of “guilt by association”, which states that genes with shared function are more likely to interact [149]. However, the meta-signature genes have a fairly low number of interaction partners, making “guilt” difficult to ascertain. Another property to examine is path length in the network, where genes that have short paths between them might be more functionally related. In general, low node degrees would imply higher path lengths among the genes, but this was not the case for our gene set. That is, the signature genes are linked by unusually short paths in the network. Additionally, we found that each of our meta-signatures showed a significant overlap with previously identified gene coexpression modules in the human cortex [12].
This suggests a relationship among the genes that is not reflected in current annotations, and a network analysis of these schizophrenia genes will be investigated in greater detail in the next chapter.

We found that some of the down-regulated schizophrenia genes overlap with genes that decrease in expression with age. Many of the biological processes affected by age (for example, oxidative phosphorylation) also tend to appear as affected processes in schizophrenia, both in this study and existing profiling studies [106, 108, 229, 254, 256]. These findings suggest that many genes affected by age are also affected by schizophrenia, but also raise the possibility of confounding effects. A comparison of pH genes with schizophrenia meta-signature genes did not reveal much of an overlap, although we did observe a significant difference in pH levels between the control and schizophrenia samples. Our finding that patients with schizophrenia display decreased brain pH levels relative to controls is a common feature observed in postmortem brain studies of schizophrenia [118, 256]. The cause of this decrease is unclear. It has been proposed that lower pH together with increased lactate levels in schizophrenia implicates oxidative stress and energy metabolism as a possible pathology of the illness [256]. However, studies of the frontal cortex of rats treated with antipsychotics report similar changes in pH and lactate levels, suggesting that decreased pH is secondary to medication effects [261]. As these age and pH effects could be confounded with the disease effect, one could filter the list of schizophrenia candidate genes from our results by simply removing known age- and pH-affected genes from the final signature (leaving 31 up- and 51 down-regulated probes) to investigate these effects more thoroughly.

The results from this chapter should be interpreted in the context of several caveats.
First, the approach employed is specifically designed to find concordant results across studies, and does not detract from the findings from the original single data set studies. We do suggest, however, that genes found to be commonly differentially expressed by multiple studies are of particularly high value in identifying underlying etiological influences in schizophrenia. As is the case for all postmortem brain studies, we also cannot be sure that the expression changes we have identified are direct effects of the illness or are secondary effects of the disease, medication or other external factors. An additional caveat is that because we were unable to obtain medication or illicit drug use information for all subjects, we were not able to incorporate this information into our analysis. To help address this we compared our signatures against gene lists derived from a recent review on convergent antipsychotic mechanisms [262]. We observed no overlap with our signatures. In addition to antipsychotics, smoking and the use and abuse of recreational drugs can also confound the study of disease-related gene expression. Due to a lack of sufficient information on these factors we were unable to strictly control for them in our analysis. However, using gene lists provided by the SMRI Online Genomics Database, we were able to make comparisons to address some of these factors and identified two overlapping genes. While the small number of overlapping genes suggests that we have identified genes in our signature that are not affected by such extraneous factors, we acknowledge that we cannot entirely exclude the possibility that the gene expression changes we have identified are still in some way influenced.

In conclusion, I have contributed the most comprehensive meta-analysis of schizophrenia expression profiling studies to date.
The most striking finding is that despite the heterogeneity of the disorder, we were able to detect a common signature of schizophrenia. Additionally, I elaborate on the biological relevance of our gene list, illustrating a need for further genetic study to fully understand how these changes in expression relate to the illness. The signatures we identified are consistent with current hypotheses of molecular dysfunction in schizophrenia, including alterations in synaptic transmission and energy metabolism. However, the diversity of genes we found suggests that systems biology approaches, exemplified by the analysis of gene network structure, will be of value in determining the basis of this disorder. The approaches used in this chapter of work should be applicable to other neuropsychiatric disorders if sufficient data are available.

Chapter 4: Gene coexpression network analysis of schizophrenia

4.1 Introduction

Schizophrenia is a severe psychotic disorder for which a comprehensive biological understanding remains elusive. Evidence from neuroimaging, neurocognitive and postmortem brain studies of schizophrenia demonstrates that multiple cellular pathways are related to the pathophysiology [72]. Notably, there is a lack of concerted integration of our knowledge of molecular deficits, which can hinder systems-level interpretation of schizophrenia. Gene expression profiling of postmortem human brain (e.g. microarrays) has been increasingly used as a means to investigate patterns of molecular disruption in the brains of patients with schizophrenia. One of the most common types of analysis applied to expression profiling data is differential expression, which is used to identify over- or under-expressed genes associated with the illness.
Candidate genes identified from expression profiling studies in schizophrenia have implicated alterations in different cellular systems, including myelination, synaptic transmission, metabolism, and ubiquitination [106, 108, 224, 228, 229, 231, 256]. These findings are not always replicated across individual studies, nor have they been successfully integrated into a comprehensive biological framework.

As presented in Chapter 3, I performed a meta-analysis of differential expression using seven independent expression profiling datasets, to identify a set of candidate genes which are consistently differentially expressed in the prefrontal cortex of patients with schizophrenia [263]. The functions reflected in our ‘meta-signature’ of schizophrenia genes are diverse and the interactions between them are largely unexplored. Because gene function is partly defined by interactions with other genes (at the biochemical, physical interaction, genetic or regulatory levels), it is attractive to apply gene networks as an aid in the interpretation of gene function in the nervous system. In Chapter 3, I evaluated our ‘meta-signature’ of schizophrenia genes in the context of a PPI network, revealing a shared relationship not reflected in the current annotations of these genes. In this chapter I evaluate the ‘meta-signature’ of genes within gene coexpression networks to establish further evidence of relationships among them. My aim is not to replicate our findings from the PPI network, as the two types of networks are not directly comparable. Rather, I ask whether our ‘meta-signature’ of schizophrenia genes also collectively exhibits unusual properties in brain coexpression networks and how those properties may differ between healthy controls and individuals with schizophrenia.

A gene coexpression network is an undirected graph, which is constructed from expression profiling data using correlation-based inference methods.
The graph nodes correspond to genes, and the edges (or connections) between two genes represent significant coexpression based on thresholded values of the pairwise correlation coefficient calculated from the expression data [147]. In PPI networks, the edges correspond to binary values indicating the presence/absence of a known interaction between two proteins. The edges in a coexpression network, on the other hand, reflect the correlation structure of the data. While the edges in a coexpression network do contain some biological meaning, a connection between two genes should not be mistaken to suggest a physical interaction between them. In general, gene networks can be analyzed to identify higher-level features of gene-gene relationships based on graph theoretic considerations such as node degree (the number of connections a gene has, i.e., to identify “hubs”) or clustering coefficient (how well connected the neighbours of a gene are to one another) [148, 159, 264]. Evaluating the broader network structure allows us to detect modularity in the graph, or groups of densely connected nodes with sparse connections between groups [265]. Characterization of these tightly knit ‘modules’ can convey useful information as they may be associated with specific molecular complexes or functions, yielding hypotheses that would be difficult to ascertain based on a gene-by-gene analysis.

An especially attractive feature of coexpression analysis compared to PPI networks is that it can exploit data that are condition-specific. Given samples from two different conditions we can consider the occurrence of ‘differential coexpression’ which might reflect changes in regulatory network wiring [266]. We can evaluate modularity between condition-specific networks to elucidate similarities and differences in network topology.
Network analyses have recently been applied to a number of postmortem human brain expression profiling datasets for examining general transcriptome patterns of the CNS [12], and to interrogate the molecular basis of neuropsychiatric [10, 13, 183] and neurodegenerative diseases [11, 267]. A recent study by Torkamani and colleagues [13] conducted a network analysis by combining two independent schizophrenia expression profiling datasets. Expression data were merged across studies and separated into control and schizophrenia cohorts. Weighted gene coexpression network analysis (WGCNA) based methods [148] were then applied to the merged data to create two networks to represent the control and schizophrenia brain. Modules of coexpressed genes were identified and characterized by disease association, cell type specificity and functional enrichment. Similar module composition was observed in both schizophrenia and control networks, highlighting relevant biological themes such as oxidative phosphorylation, energy production and metabolism.

In this chapter, we apply coexpression network analysis to the seven schizophrenia microarray datasets used in Chapter 3, and evaluate differences in network properties. We used a rank aggregation approach [182] to combine coexpression data across studies and generate separate networks to represent the control and schizophrenia brain. The two networks exhibit a similar structural topology, suggesting that the overall coexpression structure of the prefrontal cortex is normal in individuals with schizophrenia. We then examined properties of our ‘meta-signature’ of schizophrenia genes within each network, and identified features that were not observed with other functionally similar groups of genes or other brain-related disease gene sets.
Clustering of our networks into high density sub-networks, and association of our ‘meta-signature’ genes with these clusters, suggests the disruption of processes previously implicated in schizophrenia. In particular, our results provide evidence for dysfunction in oligodendrocytes and myelination-related processes in schizophrenia. Finally, we also discuss the challenges to be addressed for future studies of gene coexpression networks in schizophrenia.

4.2 Methods

4.2.1 Data Processing and Quality Control

Expression profiling data sets were selected on the basis of criteria required for inclusion in the meta-analysis of differential expression described in Chapter 3 and in [263]. Details on each of the seven datasets, including the source citation, can be found in Table 11 of Chapter 3. Data were preprocessed as described; briefly, expression levels were summarized, log2 transformed and normalized for each individual dataset using the R Bioconductor ‘affy’ package [243], with default settings for the RMA algorithm. Sample outliers were removed from each dataset based on an inter-sample correlation analysis, resulting in a total across the seven data sets of 306 samples (153 from schizophrenia subjects, 153 from unaffected controls). For each of the seven data sets, batch information was obtained using the ‘scan date’ stored in the CEL files; chips run on different days were considered different batches and batch effects for each dataset were removed using the ComBat algorithm [214].

4.2.2 Gene Coexpression Networks

For each dataset, samples were separated into control and schizophrenia cohorts. Probes were mapped to genes using annotations provided in Gemma, which are based on stringent methods described in [238]. For genes mapping to multiple probes, the average expression value was retained. Only genes that were represented in all seven datasets were considered, leaving a total of 12,582 genes.
This yielded seven expression data matrices for schizophrenia and seven for controls (one for each study). Separate networks were constructed for the schizophrenia and control groups based on previously described methods [182]. Briefly, a gene expression profile similarity matrix was computed for each cohort by taking the absolute value of the Pearson correlation between all possible gene pairs. Correlation values in the similarity matrix were replaced by ranks. These similarity matrices were aggregated by cohort across datasets by taking the mean rank for each gene pair. We previously showed that this aggregation procedure is a robust method for producing high-quality coexpression networks. In keeping with previous work [182], the aggregated matrix was thresholded at 0.5% sparsity, resulting in an adjacency matrix of 392,606 connections for each of the control and schizophrenia cohorts.

4.2.3 Random Coexpression Networks

To evaluate the significance of network measures across the whole network, the formulation of appropriately randomized null models is required. We devised a procedure that results in a random network with the same number of genes and the same node degree distribution as the original data. Additionally, the node degree for each individual gene is preserved (i.e. each gene still has the same number of connections, but the specific genes to which it is connected are scrambled). All gene pairs were assembled into an adjacency list (2 columns, 392,606 rows) and genes on one side of the edge were permuted. The resulting edges that represent self-connections and/or duplicate gene pairs (“problem edges”) were isolated and permutation was re-applied to them. This was done iteratively on the subset of problem gene pairs until the number of “problems” was reduced to ten or fewer. These remaining “problems” were removed from the final random network.
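As an illustration only, the rank aggregation of Section 4.2.2 and the degree-preserving randomization of Section 4.2.3 can be sketched in Python. This is a minimal sketch under stated assumptions and not the code used for the analyses in this thesis: the function names, the dictionary-based data layout, the arbitrary handling of tied correlations, and the `max_iter` safeguard are all hypothetical choices made for readability. It also assumes that no gene has a constant expression profile.

```python
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def rank_pairs(expr):
    """Map each gene pair to the normalized rank of its |r| (1.0 = strongest).

    `expr` maps gene name -> list of expression values for one dataset.
    Ties are ordered arbitrarily (an assumption of this sketch).
    """
    pairs = list(combinations(sorted(expr), 2))
    sims = {p: abs(pearson(expr[p[0]], expr[p[1]])) for p in pairs}
    ordered = sorted(pairs, key=sims.get)
    return {p: (i + 1) / len(ordered) for i, p in enumerate(ordered)}


def aggregate_network(datasets, sparsity=0.005):
    """Average pair ranks across datasets; keep the top `sparsity` fraction."""
    ranked = [rank_pairs(d) for d in datasets]
    mean_rank = {p: sum(r[p] for r in ranked) / len(ranked) for p in ranked[0]}
    n_keep = max(1, round(sparsity * len(mean_rank)))
    return sorted(mean_rank, key=mean_rank.get, reverse=True)[:n_keep]


def degree_preserving_shuffle(edges, max_problems=10, max_iter=1000, seed=0):
    """Permute one side of the edge list, re-permuting only problem edges.

    Problem edges are self-connections or duplicate gene pairs. Iteration
    stops once at most `max_problems` remain (or after `max_iter` rounds,
    a safeguard added for this sketch); leftovers are dropped.
    """
    rng = random.Random(seed)
    left = [a for a, _ in edges]
    right = [b for _, b in edges]
    todo = list(range(len(edges)))  # the first pass permutes every edge
    for _ in range(max_iter):
        targets = [right[i] for i in todo]
        rng.shuffle(targets)
        for i, gene in zip(todo, targets):
            right[i] = gene
        seen, todo = set(), []
        for i, pair in enumerate(zip(left, right)):
            key = frozenset(pair)
            if len(key) == 1 or key in seen:
                todo.append(i)  # self-connection or duplicate pair
            else:
                seen.add(key)
        if len(todo) <= max_problems:
            break
    bad = set(todo)
    return [(a, b) for i, (a, b) in enumerate(zip(left, right)) if i not in bad]
```

Because only the right-hand column of the edge list is permuted, every gene keeps its original number of edge endpoints; degrees can change only through the few problem edges that are dropped at the end.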
4.2.4 Network Properties

We explored four different network properties, each of which is briefly described below.

Node Degree
Each gene can be characterized by the number of connections it has, that is, the number of other genes it is significantly coexpressed with. This property is called the node degree. Node degrees were characterized by their distribution. For many biological networks the degree distribution has been characterized as ‘scale-free’, or at least ‘heavy tailed’. This can be observed by the quality of a linear fit of the distribution on log-log scale [268].

Shortest Path
The shortest path length measures the shortest distance to get from one gene to another gene by traversing edges in the network. In an un-weighted network this is the least number of edges traversed to get between the two genes. We computed shortest paths using Dijkstra’s algorithm [152]. A value is obtained for a gene against every other gene in the network, and presented as the mean shortest path length across all genes. Genes without any direct neighbours are treated as missing values.

Clustering Coefficient
The clustering coefficient of a gene indicates how connected the direct neighbours of a gene are to one another. It is the ratio of the number of connections in the neighbourhood of a node to the number of connections if the neighbourhood was fully connected. The clustering coefficient ranges from zero to one. A value of 1 would indicate that all the neighbours of a node are connected to each other, or ‘cliquish’ in nature. A value of 0 would indicate that none of the neighbours of a node are connected to each other. This measure can only be computed for nodes that interact with more than one other node.

Assortativity
A property that is computed at the whole network level is assortativity. It is defined as a preference for a network’s nodes to attach to other nodes that are similar in some way, most often evaluated in the context of node degree.
In the example of a highly assortative network, highly connected nodes or ‘hubs’ would be connected to other hubs and nodes with few connections or ‘provincial’ nodes would be connected to other provincial nodes. Using simulated scale-free networks it has been proposed that increasing assortativity is correlated with higher path lengths and changes in the behaviour of the clustering coefficient [269]; however, these relationships remain to be validated in real biological networks.

4.2.5 Schizophrenia Meta-signature Network Analysis

The meta-signature gene sets of 25 up- and 73 down-regulated schizophrenia genes were obtained from the results of our meta-analysis of differential expression in Chapter 3 [263]. Four genes were removed from the down-regulated gene set as they were not present in the network, leaving a total of 94 ‘schizophrenia genes’. Throughout this chapter we will refer to these gene sets as SZUP and SZDOWN for the genes up- and down-regulated, respectively. Average values of shortest path length and clustering coefficient for the SZUP and SZDOWN gene sets were evaluated within each network. To estimate the relevance of the network measures for SZUP and SZDOWN, we implemented three essential controls described below.

Random gene set comparison. For each meta-signature gene set, the average values of shortest path length and clustering coefficient were compared to a background distribution in each network. The background distribution was generated by randomly selecting 1000 gene sets with size and node degree matched to the meta-signature gene set. To ensure a well-matched node degree for each random gene set, selection was done on a per-gene basis by choosing a random gene within ± 50 of its node degree rank. Z-scores were then computed to quantify the difference between the mean of the background distribution and the observed values for each network measure of SZUP and SZDOWN (Table 18).
For positive z-scores a p-value was computed reflecting how many random gene sets have values higher than the observed value. For negative z-scores a p-value was computed reflecting how many random gene sets have values less than the observed value.

Functional gene set comparison. Although our meta-signature of schizophrenia genes spans a range of cellular functions, the genes possess a shared functional feature of altered expression in schizophrenia. Thus, it is important to assess whether the network properties we observe with our meta-signature gene sets are simply a property of gene groups that have shared functional features. To control for this, we generated functionally characterized gene sets using the Gene Ontology (GO). From the GO database, we obtained 3,230 GO terms for which the associated gene set size ranged from 10-1000 genes. For each GO term we retrieved all human genes that were annotated with that term to compile a gene group for each GO term, also referred to as a functional gene set. Each of these functional gene sets was evaluated individually by comparison to a background distribution of randomly selected gene sets (of equivalent size and node degree), within each network. The distribution of z-scores obtained from 3,230 functional gene groups was plotted for each network and used to evaluate network properties of the meta-signature gene sets in reference to other functionally related gene sets.

Disease gene set comparison. To assess the network properties of our schizophrenia meta-signature genes in relation to other sets of disease-associated genes, we compiled disease gene lists for five different brain-related disorders. Gene sets were assembled for Alzheimer’s disease, Parkinson’s disease, multiple sclerosis, and schizophrenia from their respective gene databases. Each database has been compiled based on findings from genetic association studies and provides gene lists on its website.
The schizophrenia list obtained from SZGene comprised only the top 45 of the most reliable gene associations based on findings from an SZGene in-house meta-analysis. We also compiled an Autism spectrum disorder gene list from Toro et al. [270]. Average values of shortest path length and clustering coefficient were computed for all five disease gene sets. Network measures were compared to a background distribution of randomly selected gene sets and z-scores were compared against functional gene set z-score distributions.

4.2.7 Network Clustering

To extract clusters (i.e. groups of densely connected nodes) from the control and schizophrenia networks we implemented two different algorithms, both of which are described below.

WGCNA-related methods. Each adjacency matrix was transformed into a distance matrix by computing the topological overlap between all probe pairs [271]. The topological overlap measure (TOM) between two genes is calculated by comparing the direct connections of each. If two nodes connect to the same group of other nodes they are said to have ‘high topological overlap’. We used a generalization of this measure that enriches TOM’s sensitivity to longer ranging connections between nodes by incorporating the number of m-step neighbours (m=2) that a pair of nodes share [271]. The TOM matrices were subjected to WGCNA-based methods [148], whereby hierarchical clustering was applied with average linkage, and the resulting tree was used to define network clusters. The clusters generated by WGCNA-based methods are referred to as WC1, WC2, etc.

MCODE Algorithm. The MCODE [159] plugin for the Cytoscape platform [272] begins by computing a score of local density for each node whereby all neighbouring nodes are connected to each other with at least k-specified edges (k=0,1,2,3…). High scoring nodes become seeds which are expanded in a local search procedure, connecting highly scored nodes to the cluster.
The expansion process proceeds until a given score threshold is reached (i.e. % score from the seed node) and re-iterates on the remaining nodes. Network clusters were generated by MCODE at five different k values (k = 2,3,4,5,6) using the default settings. We evaluated the top five clusters generated by MCODE since the gene membership for these clusters remained constant irrespective of choice of k. The clusters generated by MCODE are referred to as MC1, MC2, etc.

4.2.8 Enrichment Analysis

For each network, the top five clusters produced by the two clustering methods were further analyzed for functional enrichment of GO terms using the ROC scoring method in ErmineJ [198]. The ROC method evaluates the ranking of genes in the list to determine if there are gene sets which are statistically overrepresented. P-values for this method are computed as described in [273], and then corrected for multiple testing using the Benjamini-Hochberg procedure. Clusters were also evaluated for CNS cell type enrichment by cross-referencing the genes in each cluster with published lists of neuron, oligodendrocyte and astrocyte marker genes [247]. Hypergeometric probabilities were computed to evaluate the significance of overlap in each cluster. For each meta-signature gene set (SZUP and SZDOWN), we extracted a ranked list of gene coexpression within each network. Briefly, we used a binary vector representation for each gene set and multiplied it against the unsparse matrix (CTL and SZ). This transforms the matrix into a column vector of 12,582 values. Each value is a sum of normalized ranks which reflects the extent of coexpression of a gene with the given gene set. A standard ROC analysis was performed on the ranked column vector to quantify the enrichment of cell type marker genes.
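The final step of Section 4.2.8, scoring every gene by its summed coexpression rank with a gene set and then asking whether marker genes rank highly, can be sketched as follows. This is an illustrative Python sketch rather than the implementation used here: the function names and the list-of-lists matrix layout are assumptions, the area under the ROC curve is obtained from the Mann-Whitney U statistic with average ranks for ties, and whether gene set members themselves are kept in the ranked list is left to the caller.

```python
def gene_set_scores(rank_matrix, genes, gene_set):
    """Multiply the aggregated rank matrix by a 0/1 membership vector.

    `rank_matrix[i][j]` is the normalized coexpression rank of genes i and j
    (the unthresholded matrix); `genes` gives the row/column order.
    Returns one summed-rank score per gene.
    """
    member = [1 if g in gene_set else 0 for g in genes]
    return [sum(r * m for r, m in zip(row, member)) for row in rank_matrix]


def roc_auc(scores, genes, markers):
    """Area under the ROC curve for `markers` ranked by `scores`.

    Assumes at least one marker and one non-marker gene are present.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # extend j over a run of tied scores
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # average 1-based rank
        i = j + 1
    pos = [ranks[i] for i, g in enumerate(genes) if g in markers]
    n_pos, n_neg = len(pos), len(genes) - len(pos)
    u = sum(pos) - n_pos * (n_pos + 1) / 2  # Mann-Whitney U for the markers
    return u / (n_pos * n_neg)
```

An AUC near 0.5 would indicate that marker genes are no more coexpressed with the gene set than other genes, while values approaching 1 would indicate strong enrichment.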
4.3 Results

We constructed two gene coexpression networks; one representing the control human prefrontal cortex and the other representing the prefrontal cortex in schizophrenia (referred to as CTL and SZ throughout the chapter). Each network comprised 12,582 genes (nodes), and 392,606 coexpression ‘links’ among them. The two networks had similar values in the average clustering coefficient (p > 0.1), but average shortest path length across nodes differed slightly (p < 0.01). Positive assortativity values were observed in both networks with slightly higher values in CTL. These network properties are summarized in Table 17. Both networks exhibited a ‘heavy-tailed’ node degree distribution, with most of the genes interacting with few partners and a small proportion of genes displaying ‘hub’-like behaviour, interacting with many genes. In the literature, such distributions are sometimes described as ‘scale free’. We used a linear regression of the log-scale node degree distribution to examine this in our networks (Figure 9). While a linear fit explains over 80% of the variance in the node degree distribution (CTL R² = 0.857; SZ R² = 0.872), the fit is not very good at the extremes. Based on established criteria, our networks are not ‘scale-free’ [156]. However, the ‘heavy-tailed’ nature of the node degree distributions in our networks is typical of other ‘biologically relevant’ coexpression networks cited in the literature.

The small differences in global network properties observed between CTL and SZ suggest that there is an overall coexpression structure of the prefrontal cortex that is retained between the two. Fifty-seven percent of the edges (224,384 links) are the same in the two networks, much higher than expected by chance. The remaining 168,222 edges are not shared between the two networks (Figure 10).
Differences in connectivity between the networks are also indicated by a higher maximum node degree in SZ (935) than CTL (737), and the increased number of non-connected nodes in CTL (2356) compared to SZ (2288). These differences could indicate subtle biological differences between the two networks, but are presumably at least partly due to the effect of noise.

Previously our group showed that both PPI and coexpression networks show a correlation between node degree and gene multifunctionality [274]. Using the large PPI network constructed for use in Chapter 3, the correlation of node degree with multifunctionality is 0.53. Gillis and Pavlidis [274] found that for coexpression networks, the correlation between node degree and multifunctionality tended to be much lower, reaching 0.28 for a network made from aggregating 47 large studies. With individual datasets of smaller size, lower values of correlation are found. In our networks the correlations were low but significant, with 0.039 in CTL and 0.036 in SZ, consistent with the small number of datasets used. The low correlation suggests a biologically-relevant signal in the data.

In addition to comparing average network properties across the SZ and CTL networks to each other, we compared each separately to a node degree-matched random network (see Methods). For features based on connectivity (i.e. shortest path length and clustering coefficient), we found the observed values in both networks to be higher than in random networks. Shortest path length displayed slightly higher values than found in randomized networks (Figure 11). Additionally, genes showed an increased clustering into local communities compared to genes from a randomized network with identical degree distribution (Figure 11). Thus, while the SZ and CTL networks are similar, they are also clearly distinct from random networks with the same node degree distribution.
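For readers who wish to reproduce this kind of comparison, the two connectivity measures defined in Section 4.2.4 can be computed from an edge list as sketched below. This is a hedged illustration, not the code used to produce Figure 11: the function names are hypothetical, and breadth-first search is substituted for Dijkstra's algorithm, which gives the same path lengths when all edges are unweighted. Applying the same functions to an observed network and to its degree-matched randomization would give the two distributions being compared.

```python
from collections import deque


def build_adjacency(edges):
    """Turn an undirected edge list into a dict of neighbour sets."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def clustering_coefficient(adj, node):
    """Fraction of possible links among a node's neighbours that exist.

    Returns None for nodes with fewer than two neighbours, for which the
    measure is undefined.
    """
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return None
    # each neighbour-neighbour link is counted from both ends, hence / 2
    links = sum(1 for u in nbrs for v in adj[u] if v in nbrs) / 2
    return links / (k * (k - 1) / 2)


def mean_shortest_path(adj, source):
    """Mean number of edges from `source` to every gene it can reach.

    Uses breadth-first search; returns None if nothing is reachable.
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    others = [d for n, d in dist.items() if n != source]
    return sum(others) / len(others) if others else None
```

Genes absent from the edge list have no entry in the adjacency dictionary, which mirrors the treatment of unconnected genes as missing values.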
We next investigated network properties at the level of gene groups, focusing on our previously identified meta-signature of genes differentially expressed in schizophrenia [263]. The 25 up-regulated and 69 down-regulated genes of this 94-gene meta-signature will be referred to as SZUP and SZDOWN, respectively. Network properties were assessed for each gene set individually by taking an average across all genes in the group. These results are summarized in Table 18. For each gene set we computed the average values for shortest path length, clustering coefficient and node degree and evaluated differences observed between the control and schizophrenia networks. In general, both gene sets had a low mean node degree with respect to the network degree distribution of CTL and SZ, tending not to be ‘hubs’. For the SZUP gene set, we found higher node degree, shorter path length and an increased clustering coefficient in the SZ network, though these differences were not statistically significant (Wilcoxon rank-sum test, p > 0.05). Conversely, the SZDOWN gene set exhibited a decreased node degree, larger path length and a significantly lower clustering coefficient (Wilcoxon rank-sum test, p < 0.05) in the SZ network. Thus, each gene set displayed properties that differ between the two networks, and the two gene sets compared to one another revealed opposite changes in behaviour.

To assess the relevance of the changes observed for SZUP and SZDOWN between the two networks we implemented three different methods of control. A first control was supplied by comparing observed network measures for SZUP and SZDOWN to a background distribution of 1000 randomly selected gene sets of matched size and node degree (see Methods). The difference between the observed values and background was assessed by computing z-scores and p-values, as reported in Table 18.
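The first control can be sketched as follows. This is an illustrative Python sketch under stated assumptions rather than the procedure given in Methods: degree matching is approximated by drawing, for each gene in the set, a replacement gene whose degree lies within a tolerance `tol` of the original, and the p-value is taken as the empirical fraction of random gene sets that are more extreme than the observed value. All function and parameter names are hypothetical.

```python
import random
import statistics


def set_mean(gene_set, measure):
    """Average of a per-gene network measure over a gene set."""
    return statistics.mean(measure[g] for g in gene_set)


def matched_random_sets(gene_set, degree, n_sets=1000, tol=0.1, seed=0):
    """Draw random gene sets of the same size with similar node degrees.

    Assumption: each gene is replaced by a non-member gene whose degree is
    within +/- tol (as a fraction) of its own; if none exists, the gene
    itself is kept. A random set may contain the same gene more than once.
    """
    rng = random.Random(seed)
    members = set(gene_set)
    pools = []
    for g in gene_set:
        lo, hi = degree[g] * (1 - tol), degree[g] * (1 + tol)
        pool = [h for h in degree if h not in members and lo <= degree[h] <= hi]
        pools.append(pool if pool else [g])
    return [[rng.choice(pool) for pool in pools] for _ in range(n_sets)]


def z_and_p(observed, background):
    """Z-score against the background and a one-sided empirical p-value."""
    mu = statistics.mean(background)
    sd = statistics.stdev(background)  # raises if the background is constant
    z = (observed - mu) / sd
    if z >= 0:
        p = sum(b > observed for b in background) / len(background)
    else:
        p = sum(b < observed for b in background) / len(background)
    return z, p


def compare_to_background(gene_set, measure, degree, n_sets=1000):
    """Z-score and p-value of a gene set's mean measure vs matched random sets."""
    background = [set_mean(s, measure)
                  for s in matched_random_sets(gene_set, degree, n_sets)]
    return z_and_p(set_mean(gene_set, measure), background)
```

Here `measure` would be a dictionary of per-gene values such as the clustering coefficient or shortest path length in one network, so the function is run once per gene set, measure and network.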
P-values represent the probability of a random gene set having a value higher (for positive z-scores) or lower (for negative z-scores) than the observed network measure. The shortest path length for SZUP was not different from the background in either network. However, values for SZDOWN were substantially different from the background in both networks. Thus, for both gene sets we observed no change in path length between networks. Our strongest result was the behaviour of the clustering coefficient for both gene sets. In the previous paragraph we reported that for SZUP the average clustering coefficient showed an increase in the SZ network compared to the CTL network. The p-values indicate that the increased value in SZ is significant when compared to a background distribution (p = 0.005). For SZDOWN, we reported an increased average clustering coefficient in the CTL network compared to the SZ network. A comparison to the background distribution generated a marginally significant p-value (p = 0.06) for the increased value in CTL, indicating a difference from random, albeit small. Together, these results converge to highlight two properties of our gene sets: 1) the SZUP genes become highly interconnected in the SZ network, exhibiting a property that is different from the background distribution, and 2) the SZDOWN genes are more interconnected in the CTL network compared to the background distribution, and lose this property in the SZ network.

A second control was applied to examine whether or not the properties observed for SZUP and SZDOWN are a feature of other functionally grouped sets of genes. This is a more stringent control than simply comparing to random gene sets, because we are interested in properties of our genes that are special compared to functional gene sets. We created 3,230 different functional gene sets based on GO terms and their associated genes.
Network measures were computed for each functional gene set, and compared against values obtained from randomly selected gene sets of the same size and node degree. A z-score was computed for each comparison to the background. The distribution of 3,230 z-scores obtained from the GO groups was plotted and z-scores for SZUP and SZDOWN were evaluated by comparison (Figure 12). The results corroborate our findings from the background distribution control. For shortest path length, the z-scores for SZUP resembled values from GO groups and z-scores for SZDOWN were more distinct from the GO group distribution. Together, the application of control methods revealed that shortest path length is not a special property of the SZUP gene set, and although it is for SZDOWN this feature does not differ between the CTL and SZ networks. For the clustering coefficient, the z-score for SZUP is distinct from functional gene set values in the SZ network but not as much in the CTL network. The opposite is true for the SZDOWN z-score values. Thus, the clustering coefficient is a unique property of our meta-signature genes based on the differences exhibited between the two networks.  A final control was applied to evaluate whether our meta-signature gene sets share properties with gene sets associated with other brain-related disorders. We assembled gene sets for five different illnesses mostly based on findings from genetic association studies. For each disease gene set, z-scores were computed based on a background distribution and compared against the functional gene set z-score distribution generated from GO groups. Of particular interest are the results observed for clustering coefficient in the two networks. Interestingly, the Alzheimer’s disease gene group (red line, Figure 12) exhibited strikingly similar properties to SZUP in both networks despite having only one overlapping gene. 
Notably, the schizophrenia and Parkinson’s disease gene groups follow a similar but more subtle trend as SZDOWN.

We next tested the robustness of the network measures observed for SZUP and SZDOWN using a jackknife procedure. In this process, we removed one of the seven datasets and regenerated aggregate CTL and SZ networks on the remaining six, for each study in turn. This yields seven pairs of jackknife networks. For each jackknifed network, the average shortest path length and clustering coefficient were computed for SZUP and SZDOWN and values were compared between networks (Figure 13). For SZUP, we observed a general agreement of increasing clustering coefficient and consistently decreasing path length between CTL and SZ across all iterations. For SZDOWN, we found that only the clustering coefficient effects were robust to removing single datasets; the path length results proved to be more sensitive. Taken as a whole, these results are consistent with there being subtle network property differences for the SZUP and SZDOWN genes between the two networks.

The clustering coefficient of the meta-signature gene sets is a feature of particular interest as it reveals insight into the community structure of nodes. Genes contained in SZUP and SZDOWN are in more interlinked neighbourhoods in the SZ and CTL networks, respectively, as suggested by their high clustering coefficient. Considering a sub-network of only within-gene-set interactions, the average clustering coefficient roughly doubles in value for both SZUP (CC_CTL = 0.62; CC_SZ = 0.66) and SZDOWN (CC_CTL = 0.55; CC_SZ = 0.46), suggesting that within-gene-set interactions are driving interconnectivity in the full networks. Not surprisingly, these genes are highly linked to one another, unlike random gene sets of matched size and node degree. Further, the within-gene-set connectivity of both meta-signature gene sets is increased in the SZ network.
For SZUP, this is simply a result of the addition of new links, but in SZDOWN there appears to be a rearrangement of topology, with a combination of addition and removal of links (Figure 14).

Our analysis to this point examined either the entire networks or used supervised approaches to select sets of genes for analysis. We complemented this with an unsupervised method based on clustering. This analysis was motivated in part by the observation that the meta-signature gene sets showed significant modularity differences between CTL and SZ. We hypothesized that this might be a more general property, and that in particular we might find ‘modules’ which contain SZUP and/or SZDOWN genes and link them to other cellular functions. To identify modules in each network we first converted our sparse matrices into TOM matrices, which represent the neighbourhood overlap between all gene pairs. Hierarchical clustering was then applied to the TOM matrices and the resulting trees were used to define network modules [148]. Each network resulted in 18 modules of varying sizes. Modules were compared between the two networks by computing the significance of module overlap using the hypergeometric distribution. A matrix of the resulting log10 transformed p-values is plotted using a heatmap representation in Figure 15. Significant overlaps were observed between modules, indicating excellent overlap of gene membership between the control and schizophrenia network modules, concordant with findings reported by Torkamani et al. [13].

In order to characterize network clusters with regard to central nervous system (CNS) function and cellular organization, we cross-referenced total gene membership of the top five clusters (ranked by size) from each network with gene lists of (1) CNS cell type markers [247], and (2) biological process terms from the Gene Ontology. Results are summarized in Figure 16 and Table 19.
Also reported is the number of schizophrenia meta-signature genes that associate with each of the modules. Not surprisingly, a similar enrichment of GO terms was observed in modules of both networks. The WC1 module in both networks associated with terms related to oxidative phosphorylation and energy production. The WC1 module was also highly enriched with neuronal markers and significantly overlapped with SZDOWN genes in both networks. This is consistent with evidence indicating mitochondrial dysfunction and defects in brain metabolism leading to oxidative stress in schizophrenia [224, 256]. The WC2 module associated with only one significant GO term, “regulation of action potential” (GO:0019228), suggesting it is a myelination-related module in both networks. As expected, cell type enrichment of the WC2 module identified a large number of oligodendrocyte marker genes in the control network. Interestingly, in the schizophrenia network the enrichment is altered such that there are more astrocyte marker genes that exhibit a significant overlap, in addition to the few oligodendrocyte marker genes that remain. The WC2 module also overlaps significantly with SZUP genes in the schizophrenia network but not in the control network.

We also applied the MCODE algorithm [159], another clustering method, to each of our networks for comparison to the WGCNA-based results. MCODE is designed to identify densely interconnected groups of genes in networks. It uses a weighted clustering coefficient to score nodes, and then clusters nodes based on similarity of scores. We examined the top five MCODE clusters (ranked by score) for enrichment of GO terms (Table 20), cell type markers, and overlap of SZUP and SZDOWN genes (Figure 17). Similar to the WGCNA-based clustering, we found a myelination-related cluster which was present in both networks (MC2).
The genes in this cluster overlap with genes in the WC2 module in both networks, although a higher significance of overlap is observed in the control network (Figure 18). Also concordant with findings from the WC2 module is a loss of oligodendrocyte marker genes and a shift towards more astrocyte marker genes in MC2 of the schizophrenia network.

The results from clustering methods provide evidence of altered gene coexpression patterns in the myelination-related module between the two networks. The module is largely conserved between networks, but there are small differences, as illustrated by the loss of oligodendrocyte marker genes and the addition of astrocyte marker genes. From the WGCNA-based clustering, we also observed changes in the overlap of the SZUP gene set with the myelination module (WC2). Only two of the SZUP genes displayed an overlap in the control network, but 12 genes overlapped in the schizophrenia network. Thus, we examined coexpression patterns associated with all 25 SZUP genes to explore the possibility of a relationship between SZUP and coexpression alterations of the myelination module (see Methods). A ranked vector of gene coexpression was extracted within each network, with values representing how well each gene in the network is coexpressed with SZUP. A standard ROC analysis was applied to the ranked list to quantify the enrichment of cell type marker genes. We found that some of the astrocyte marker genes are coexpressed with SZUP, as they exhibited an AUC score higher than random (0.60-0.62). If we look at only the 12 overlapping SZUP genes, the AUC score increases to 0.68. It is therefore possible that the up-regulation of SZUP genes in schizophrenia may be contributing to the network alterations of astrocyte genes observed in the myelination module. However, investigation of coexpression patterns at the individual gene level is required to make any definite conclusions.
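For readers less familiar with the ROC analysis used above, the following Python sketch shows how an AUC can be computed from a ranked gene list and a set of marker genes. It is a generic illustration, not the implementation used in this chapter: the function name is hypothetical, the list is assumed to be ordered from most to least coexpressed with SZUP, and ties are assumed to be absent.

```python
def roc_auc(ranked_genes, marker_genes):
    """Area under the ROC curve for marker genes in a ranked gene list.

    ranked_genes is ordered from best to worst. The AUC equals the
    fraction of (marker, non-marker) pairs in which the marker gene
    is ranked above the non-marker gene.
    """
    markers = set(marker_genes) & set(ranked_genes)
    n_pos = len(markers)
    n_neg = len(ranked_genes) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one marker and one non-marker gene")
    neg_seen = 0        # non-marker genes ranked above the current position
    correct_pairs = 0   # pairs where the marker outranks the non-marker
    for gene in ranked_genes:
        if gene in markers:
            correct_pairs += n_neg - neg_seen
        else:
            neg_seen += 1
    return correct_pairs / (n_pos * n_neg)
```

An AUC of 0.5 corresponds to marker genes being scattered at random through the ranking, so the values of 0.60-0.62 and 0.68 reported above indicate a modest shift of astrocyte markers towards the top of the list.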
Table 17: Whole network properties of the control and schizophrenia brain networks

Property                Control    Schizophrenia
Non-connected nodes     2356       2288
Maximum node degree     747        935
Mean node degree        77         76
Assortativity           0.219      0.158
Shortest path length    3.34       3.32
Cluster coefficient     0.29       0.29
Log-log fit (R²)        0.857      0.872

Table 18: Schizophrenia gene set network properties

                                 Up-regulated (25)      Down-regulated (69)
Property                         CTL       SZ           CTL       SZ
Node degree
  Mean                           63.9      83.5         127.4     106.2
  Non-interacting nodes          2         2            7         3
Edges (within gene set)          12        23           129       144
Shortest path
  Mean                           3.28      3.14         3.31      3.48
  Random gene set comparison
    Z-score                      -0.58     -0.91        3.15      4.46
    p-value                      0.23      0.15         0         0
Cluster coefficient
  Mean                           0.35      0.38         0.32*     0.27*
  Random gene set comparison
    Z-score                      1.11      2.47         1.51      -0.607
    p-value                      0.14      0.005        0.06      0.28

*Difference is significant between CTL and SZ at p = 0.05

Table 19: Gene Ontology enrichment of modules identified by WGCNA-based clustering

A: Control Network
Module  GO Term                                                 GO ID        Corrected P-value
WC1     mitochondrial electron transport, NADH to ubiquinone    GO:0006120   1.42E-16
        oxidative phosphorylation                               GO:0006119   2.88E-16
        ATP synthesis coupled electron transport                GO:0042773   1.47E-13
        respiratory electron transport chain                    GO:0022904   1.76E-11
        electron transport chain                                GO:0022900   2.76E-10
WC2     regulation of action potential in neuron                GO:0019228   6.93E-03
WC4     keratinocyte differentiation                            GO:0030216   9.98E-03
        sensory perception of chemical stimulus                 GO:0030216   0.012
        epidermal cell differentiation                          GO:0007606   0.017
        peptide cross-linking                                   GO:0009913   0.018
        cellular defense response                               GO:0006968   0.028

B: Schizophrenia Network
Module  GO Term                                                 GO ID        Corrected P-value
WC1     mitochondrial electron transport, NADH to ubiquinone    GO:0006120   1.42E-16
        oxidative phosphorylation                               GO:0006119   2.88E-16
        ATP synthesis coupled electron transport                GO:0042773   1.47E-13
        respiratory electron transport chain                    GO:0022904   1.76E-11
        electron transport chain                                GO:0022900   2.76E-10
WC2     regulation of action potential in neuron                GO:0019228   6.93E-03
WC4     keratinocyte differentiation                            GO:0030216   9.98E-03
        sensory perception of chemical stimulus                 GO:0030216   0.012
        epidermal cell differentiation                          GO:0007606   0.017
        peptide cross-linking                                   GO:0009913   0.018
        cellular defense response                               GO:0006968   0.028

The top five modules (ranked by size) in each network were characterized by an enrichment of biological process terms from the Gene Ontology (GO). For each module, significant GO terms are listed (pcorr < 0.05) to a maximum of top five terms. GO terms listed are taken from results of an ROC analysis in ErmineJ [198].

Table 20: Gene Ontology enrichment of modules identified by MCODE clustering

A: Control Network
Module  GO Term                                                 GO ID        Corrected P-value
MC1     gamma-aminobutyric acid signaling pathway               GO:0007214   0.032
        glial cell development                                  GO:0021782   0.045
        ATP synthesis coupled electron transport                GO:0042773   0.048
MC2     regulation of action potential in neuron                GO:0019228   2.36E-03
        regulation of action potential                          GO:0001508   0.034
        gamma-aminobutyric acid signaling pathway               GO:0007214   0.045
        ensheathment of neurons                                 GO:0007272   0.047
MC4     nucleotide-excision repair, DNA gap filling             GO:0006297   0.043

B: Schizophrenia Network
Module  GO Term                                                 GO ID        Corrected P-value
MC2     regulation of action potential in neuron                GO:0019228   0.013
        gamma-aminobutyric acid signaling pathway               GO:0007214   0.021
        regulation of action potential                          GO:0001508   0.026
MC3     deoxyribonucleotide catabolic process                   GO:0009264   0.024
        deoxyribonucleotide metabolic process                   GO:0009262   0.029
MC5     ATP synthesis coupled electron transport                GO:0042773   0.014
        mitochondrial electron transport, NADH to ubiquinone    GO:0006120   0.02
        gamma-aminobutyric acid signaling pathway               GO:0007214   0.031

The top five clusters (ranked by score) in each network were characterized by an enrichment of biological process terms from the Gene Ontology (GO).
For each module, significant GO terms are listed (pcorr < 0.05) to a maximum of top five terms. GO terms listed are taken from results of an ROC analysis in ErmineJ [198].

Figure 9: Connectivity distribution of control and schizophrenia networks
The control brain network (A) and the schizophrenia brain network (B) connectivity distribution on a log10-log10 scale. The x-axis shows the number of links and the y-axis shows the number of genes that have the corresponding number of links.

Figure 10: Shared edges between networks
To assess the node degree differences between networks, values of node degree for all 12,582 genes in CTL were plotted against the number of shared edges in SZ. Data were plotted using the ‘hexbin’ R package to display proportions of values rather than the raw data points. The presence of data points that deviate from the identity line indicates differences in gene-to-gene connections for a number of genes between the two networks.

Figure 11: Comparison to random network distributions
For each network, we generated a corresponding random network by swapping edges and maintaining the same node degree distribution. (A, B) Shortest path length distributions of real networks are shifted slightly higher than corresponding random network distributions, but distributions between CTL and SZ are similar. Grey histograms reflect values from the random network, and black histograms represent the real network data. (C, D) Genes cluster into local communities with a high number of interconnections compared to corresponding random networks. Black dots represent real network data and grey dots represent random network data.

Figure 12: Comparison of gene set properties to functional GO groups
Histograms represent z-score distributions for cluster coefficient (A, B) and shortest path length (C, D) computed across 3,230 different GO groups in the control and schizophrenia networks. Z-scores represent the difference between the mean value of the network measure of the GO group compared to the mean of random gene sets of the same size and matched node degree. Lines plotted represent the z-score obtained for the up- and down-regulated meta-signature gene sets and additional disease gene sets as labeled in the legend.

Figure 13: Jackknifed network measures
For each jackknifed network (in which one dataset is removed), we computed shortest path length and clustering coefficient for SZUP and SZDOWN. To summarize trends observed in the jackknife analysis, we plotted the clustering coefficient and shortest path length found in the CTL and SZ networks. Results from SZUP are found in A-B, and SZDOWN in C-D. Each line represents a different jackknifed network, with the legend indicating which dataset was removed.

Figure 14: Network representation of within gene set interactions for schizophrenia meta-signature genes
Genes were extracted from CTL and SZ networks, along with any edges that connected to nodes within the A) up-regulated and B) down-regulated schizophrenia meta-signature gene sets. An addition of nodes and edges is observed for the up-regulated gene set in the SZ network. In the down-regulated gene set, a re-wiring is reflected by the addition and removal of numerous within gene set interactions between the CTL and SZ networks. Yellow nodes represent genes that are present in both networks, purple nodes represent genes added in the SZ network and blue nodes represent genes lost in the SZ network. Black lines represent edges retained in both networks, red lines indicate new edges and blue lines indicate edges lost in SZ.

Figure 15: Comparison of modules between networks (WGCNA)
Each network was clustered using WGCNA-based methods [148]. Modules were compared between networks by computing the number of overlapping genes. Significance of module overlap was assessed using the hypergeometric distribution. The resulting log10 transformed p-values for each overlap are plotted with the color scale provided above.

Figure 16: Enrichment of cell type markers in WGCNA modules
The five largest modules identified by WGCNA in each network were characterized by an enrichment of cell type markers. For each cluster we report the number of genes in each module that overlap with oligodendrocyte (OL), neuron (NEU), and astrocyte (AST) marker genes provided by [247]. Overlap with each meta-signature gene set is also reported. A similar x-axis scale is used for each module for a fair comparison between networks. Hypergeometric probabilities were computed to evaluate significance of overlap. ** p < 0.001; *** p < 1E-10

Figure 17: Enrichment of cell type markers in MCODE modules
The top five modules identified by MCODE in each network were characterized by an enrichment of cell type markers. For each cluster we report the number of genes in each module that overlap with oligodendrocyte (OL), neuron (NEU), and astrocyte (AST) marker genes provided by [247]. Overlap with each meta-signature gene set is also reported. A similar x-axis scale is used for each module for a fair comparison between networks. Hypergeometric probabilities were computed to evaluate significance of overlap. ** p < 0.001; *** p < 1E-10

Figure 18: Cluster comparison between WGCNA and MCODE clustering algorithms
Each network was clustered using two different algorithms: 1) WGCNA-based methods [148] and 2) the MCODE algorithm [159]. The top five clusters were obtained from each method and were compared against each other within each network. The number of overlapping genes was computed and hypergeometric probabilities were computed to assess the significance of module overlap.
The resulting log10 transformed p-values for each overlap are plotted with the color scale provided above.

4.4 Discussion

Our network-based approach for evaluating gene coexpression provides a novel assessment of coexpression patterns across seven large schizophrenia microarray datasets. We implemented a rank aggregation approach for network analysis, revealing interesting patterns of molecular connectivity in the control and schizophrenia postmortem human brain. Overall, the two coexpression networks were very similar to one another. This is consistent with the findings of Torkamani et al. [13], which involved two datasets, both of which were also included in our analysis. The networks shared a similar node degree distribution, and average values of path length and clustering coefficient taken across all nodes in the network were not significantly different. However, two networks that are similar in average network properties can still be quite different with respect to their underlying topology.

To evaluate differences in gene-gene connectivity between networks, we focused on the network properties of a specific group of relevant genes. We used a list of 95 differentially expressed ‘schizophrenia genes’ as reported in our previous meta-analysis [263]. This gene list was divided into two groups: 1) genes which are up-regulated in schizophrenia and 2) genes which are down-regulated in schizophrenia. The network properties of each gene set were examined within the control and schizophrenia networks to identify any distinguishing features of these ‘schizophrenia genes’. Importantly, we applied different controls to ensure that any network features identified were specific to the ‘schizophrenia genes’. These stringent controls are a necessary aspect of network analysis, although not typically observed in existing studies.
The clustering coefficient, a measure which gives us insight into the community structure of nodes, proved to be an interesting characteristic of both gene sets. The SZUP genes exhibited a unique increase in clustering coefficient, indicating a high level of interconnectivity in the schizophrenia network. In contrast, the SZDOWN genes displayed high interconnectivity in the CTL network which diminished in the SZ network. Loss of interconnectivity was demonstrated by the clustering coefficient being reduced to a value representative of the background distribution. Further, we demonstrated that this differential interconnectivity of the ‘schizophrenia genes’ between networks is a feature that is not observed with other functionally similar groups of genes and most other brain-related disease gene groups.

An assessment of modularity across all nodes in each network was performed using unsupervised methods based on clustering. As the modules identified in a network can be contingent on the algorithms applied, we used two different methods and highlighted the similarities observed between them. Both algorithms identified a ‘myelination-related’ module/cluster which consistently appeared in both networks. The module was largely conserved but differences were observed in gene membership. In the schizophrenia network, the number of oligodendrocyte marker genes present in the module decreased to half the amount present in the control network module, suggesting alteration of myelination-related processes. Myelination is the process by which oligodendrocytes envelop axons in their myelin sheath, allowing more rapid action potential conductance and information flow between brain regions. A wide range of white matter abnormalities have been revealed in schizophrenia [275] and genetic studies have contributed a number of myelin- and oligodendrocyte-related genes as candidate genes (e.g. APOD, PLP1, MAG) [276-278].
Moreover, myelination-related genes have also been found to be down-regulated in gene expression studies [106, 279].

Interestingly, the loss of oligodendrocyte marker genes in the myelination-related module was coupled with a large increase in astrocyte marker genes. Also, genes in SZUP were found to be coexpressed with some of the astrocyte markers. Astrocytes are the most abundant glial cell type in the brain, playing multiple roles in organizing and maintaining brain structure and function. An example pertinent to our results is the direct role of astrocytes in promoting myelination, supported by studies in vitro and in vivo. It has been shown that astrocytes secrete factors which influence the rate of myelin ensheathment by oligodendrocytes [280]. Further, astrocytes have a major influence on remyelination, as demonstrated by the observation that oligodendrocytes preferentially remyelinate axons in areas containing astrocytes [281], and transplantation of astrocytes into demyelinated lesions enhanced endogenous remyelination [282]. Taken together with the findings from our network study of a shift towards more astrocyte markers and fewer oligodendrocyte markers, there is evidence for the recruitment of astrocytes in response to abnormal myelination in schizophrenia. This might in turn imply the presence of mild astrogliosis in the PFC of individuals with schizophrenia; however, this issue is still a matter of debate, with studies reporting findings for and against it [283]. Also, a large number of genes in SZDOWN were found to overlap with the oxidative phosphorylation-related module in both networks. Studies into white and grey matter abnormalities of patients with mitochondrial encephalomyopathy have shown that white matter is particularly vulnerable to damage by oxidative stress [284].
The findings from our network analysis demonstrate that although our ‘schizophrenia genes’ are not functionally similar based on current annotations, they are intertwined based on gene-gene relationships derived from coexpression. We have used this comprehensive panel of genes to support a model linking together broad areas of dysfunction which may contribute to the pathophysiology of schizophrenia. However, our interpretation of the network model is based on GO enrichment and should not be considered to be biological evidence; further investigation at the individual gene level will be needed to provide an explanation of higher resolution.

For the remainder of the discussion, we summarize our findings in the context of caveats of coexpression network analysis and propose some possible interpretations and avenues for further research. In this study we have aggregated coexpression networks across seven independent studies to generate a network representation of the postmortem PFC in healthy controls and individuals with schizophrenia. By aggregating data across studies we aimed to increase the reliability of observed interactions whilst reducing the chances of identifying spurious interactions. Although seven datasets is a much larger cohort than found in the current literature of schizophrenia network studies [13], we note that our study could benefit substantially from additional data. Aggregating across a larger number of datasets has been shown to result in networks more comparable to PPI networks [182, 246], and is likely to give more reliable interactions. Another important feature of our study is that we tested the robustness of our results. When conducting meta-analysis across heterogeneous data sources, it is imperative to determine whether the signal observed is being driven by a particular dataset. The jackknife procedure applied to our networks demonstrates that our findings, though subtle, are not overly sensitive to the choice of data used.
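The leave-one-dataset-out logic of the jackknife can be summarized in a few lines of Python. This is a schematic sketch only: the `aggregate_edges` function is a deliberately simple stand-in (keep an edge if it is seen in at least `min_support` datasets) and is not the rank aggregation used to build the networks in this chapter, and all names are hypothetical.

```python
from collections import Counter


def aggregate_edges(edge_sets, min_support):
    """Toy aggregation: keep edges present in at least min_support datasets."""
    counts = Counter(edge for edges in edge_sets for edge in edges)
    return {edge for edge, n in counts.items() if n >= min_support}


def jackknife(datasets, statistic):
    """Recompute a statistic with each dataset left out in turn.

    datasets maps a dataset name to its set of edges; statistic takes the
    list of remaining edge sets and returns a value (for example, the mean
    clustering coefficient of a gene set in the re-aggregated network).
    Returns {left_out_dataset: statistic_value}.
    """
    results = {}
    for left_out in datasets:
        rest = [edges for name, edges in datasets.items() if name != left_out]
        results[left_out] = statistic(rest)
    return results
```

If the direction of an effect (for example, a higher SZUP clustering coefficient in SZ than in CTL) is the same in every entry of the returned dictionary, no single dataset is driving the result.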
A final comment is on the interpretation of coexpression networks and the underlying relevance of the gene-gene interactions we observe in these networks. In contrast with other biological networks (e.g. PPI networks) whose edges represent well-defined biological interactions, the edges in a coexpression network are a representation of the correlation structure of the data. The edges are related to values of the pairwise correlation coefficient that are calculated from the expression data of the genes, and are dependent on the threshold applied to infer those networks. A connection between two genes in a coexpression network does not necessarily correspond with a connection in PPI networks or regulatory networks [285]. Thus, when considering gene coexpression networks it is important not to mistake gene-to-gene connections for direct physical interactions.

We have contributed the largest meta-analysis of gene coexpression in schizophrenia. We evaluated various topological properties of the control and schizophrenia networks to reveal a shared coexpression structure between them. Characterization of functional clusters in each network with cell-type marker genes displayed differences that link together disease-related processes. Differentially expressed genes in schizophrenia also associate with biologically relevant clusters, providing evidence for systems level dysfunction. Further research is required to disentangle these network findings to distinguish primary from secondary disease phenomena, but we hope our study will encourage new directions in the network biology of schizophrenia. Finally, our work suggests that coexpression network analysis is difficult but promising, and to ensure we are not misled by the data, future work in this area should proceed with careful interpretation.
Chapter 5: Conclusion

5.1 Summary of Major Findings

A decade ago, gene expression was investigated on only a few genes at a time; these genes were carefully selected for analysis on the basis of a hypothesis regarding their possible involvement in the disease under study. With the advent of microarrays and other expression profiling technologies, it is now possible to perform genome-wide expression analysis in a hypothesis-free manner. Gene expression profiling of the postmortem human brain represents an active area of research contributing to the sustained effort to understand the neuropathological underpinnings of schizophrenia and other psychiatric illnesses. The focus of my thesis was to use expression profiling data of the postmortem human brain and evaluate gene expression changes across studies by using a meta-analytical framework. The first objective was to evaluate gene expression changes in the control human brain. These results provided a meta-signature of gene expression changes associated with four factors that can have a potentially confounding effect in postmortem human brain research: age, sex, brain pH and PMI. The second objective was to identify differentially expressed genes in the prefrontal cortex of individuals with schizophrenia and to evaluate those genes in the context of gene coexpression networks. I focused on identifying consistencies in expression change across studies to ultimately identify a common biological theme to explain the neural basis of schizophrenia. Remarkably, this second objective resulted in the discovery of a robust signature of 98 ‘schizophrenia genes’ that showed expression changes associated with the illness. Moreover, I showed that these ‘schizophrenia genes’ exhibit unique network properties, providing insight into alterations of relevant molecular processes associated with schizophrenia.

In Chapter 2, I explored gene expression changes associated with four different factors.
These factors are associated with large expression changes that can mask the detection of expression patterns attributable to the psychiatric disease under study. Of the many important issues affecting RNA, I focused specifically on age, sex, brain pH and PMI. The unique aspect of Chapter 2 is that, in comparison to studies that have examined each of these factors using a single dataset [127, 128, 189], I identified expression profiles that were mostly consistent across eleven different datasets comprising over 400 samples. For each factor a meta-signature of up- and down-regulated genes was identified, implicating an assortment of critical and relevant cellular processes. A significant overlap was observed for each meta-signature when validated with independent gene lists extracted from the literature, yet there was also a large proportion of novel findings. Comparisons of the meta-signatures against one another identified shared biological themes which suggest potential relationships between factors. Moreover, I found that previously identified candidate schizophrenia genes appear in the meta-signatures, reinforcing the need for careful consideration of these factors in postmortem human brain research.

In Chapters 3 and 4, I turned my focus to examining gene expression patterns associated with schizophrenia. Using a total of seven schizophrenia microarray datasets, I had the unprecedented opportunity to create a large combined cohort comprising a total of 153 individuals with schizophrenia and 153 healthy controls. This combined cohort represents the largest, most comprehensive integration of schizophrenia expression profiling studies to date. Using this cohort I exploited two types of gene expression analysis: differential expression in Chapter 3 and coexpression network analysis in Chapter 4. In Chapter 3, linear modeling was used to identify differentially expressed genes in schizophrenia with careful consideration of the factors examined in Chapter 2.
To account for the effects of age, pH, study and batch, each was included as a covariate in the model. A meta-signature of 98 genes was found to be significantly differentially expressed at a false discovery rate of 0.1, highlighting several novel genes and a handful of previously identified genes. The results of Chapter 3 make a substantial case for meta-analysis. Existing single-dataset studies in schizophrenia report transcriptome alterations related to processes such as synaptic transmission, energy metabolism, and immune function [98, 108, 224, 256], but the individual genes which show expression change are not often replicated. The reported findings from each individual study used in our meta-analysis also lacked consensus. Yet, when combined in a meta-analytical framework with careful control of covariates, these seven studies together identify a signature of genes that exhibit significant and consistent changes in expression. Moreover, this schizophrenia signature reflected biologically relevant changes including transcripts associated with aspects of neuronal communication, and processes affected as a consequence of changes in synaptic functioning.

In Chapter 4, the schizophrenia signature was examined using network analysis to relate these genes to one another and to functional modules of coexpression. Two aggregated coexpression networks of the brain were generated: one representing normal healthy controls and the other representing individuals with schizophrenia. As previously observed in the literature [13], the two networks were similar in overall structure. However, differences were observed when network properties were examined for the up- and down-regulated ‘schizophrenia genes’ from Chapter 3. A particular property of interest was the clustering coefficient, which reflects the degree of coexpression among neighboring genes.
The up- and down-regulated genes exhibited differences in the average clustering coefficient between networks, illustrating a behavior unlike other functionally similar groups of genes or brain-related disease gene groups. Functional characterization of modules in each network identified a ‘myelination-related’ cluster. While the majority of genes in the myelination-related module were retained between networks, we observed changes in the assignment of cell-type markers and ‘schizophrenia genes’ to this cluster. Specifically, we found an increased coexpression of astrocyte marker genes with the module in schizophrenia coupled with a loss of coexpression with oligodendrocyte markers. This is suggestive of recruitment of astrocytes in response to the abnormal myelination that is a feature of the illness.

For the remainder of this chapter I will discuss the results from Chapters 2-4 in more detail, highlighting contributions to the field and particular strengths and weaknesses of the study. I will also attempt to pull together the findings from my work and provide a coherent, yet speculative, biological interpretation. Finally, I will close with some potential applications for future work.

5.2 Contribution to Field of Study

Existing expression profiling studies of schizophrenia have provided few replicable candidate genes. This is partly explained by technical differences between studies, including cohorts (representing patients of varying duration of illness), microarray platform, pre-processing methods, and statistical methods applied for differential expression analysis. When studies have been conducted in different ways, their results are not directly comparable. The use of the meta-analytical framework demonstrated in Chapter 3 provided a unique opportunity to reduce the effects of some of these sources of technical variation and focus on biological variation. For instance, the raw data were obtained for each dataset and pre-processed using the same algorithm.
Using only probes common to all datasets, the combined data matrix was treated as a single dataset for linear modeling. Sources of biological variation were also carefully partitioned by use of statistical modeling to extract patterns of expression associated specifically with the disease. Importantly, I applied multiple testing correction and identified a set of genes that show significant differential expression in schizophrenia at an FDR of 0.1. I believe that the major contribution of my thesis is the identification of the most reliable set of gene expression changes associated with schizophrenia to date. Most studies of schizophrenia in the postmortem microarray literature have not applied standard FDR correction to the dataset, since doing so usually results in few to no significant genes. It is important to take this into consideration when trying to understand the discordance in findings across studies. If the gene lists are not identified using proper statistical considerations, it is unlikely that one could extract common findings. Thus, using the methods described in my thesis I was able to extract consistencies across studies and draw conclusions that would not have been possible from a simple comparison of published findings. I have provided a robust set of expression changes that reflect relevant biological processes, which will be discussed in detail in later sections of this chapter. Hopefully, these genes will contribute to a line of future research that will seek more direct evidence for their involvement in schizophrenia and lead to novel targets for treatment strategies in this illness.

An additional contribution of this work is the set of results from Chapter 2. The gene lists identified for the four factors provide critical information towards the identification of problematic genes when investigating the postmortem human brain in psychiatric illnesses such as schizophrenia.
To increase the utility of my findings to the scientific community, these meta-signatures are available at to disseminate the results.

5.3 Strengths and Limitations

Schizophrenia is a disease of the brain. Thus, direct investigation of the postmortem brain provides a window into the affected biological mechanisms at the specific site of illness. Such informative findings would not be possible with the sampling of the cerebrospinal fluid (CSF), urine, serum, blood or other tissues. The problem with looking at RNA in the postmortem brain in schizophrenia is that the expression differences associated with the illness are very small. RNA is a useful quantitative phenotype intermediate between DNA and protein, and is particularly advantageous when examined using expression profiling technologies. Gene expression profiling enables whole genome exploration of RNA expression levels between two or more sample types. The differences observed between controls and affected samples, collectively referred to as an expression signature, constitute a useful endophenotype for the condition under study. An expression signature is especially attractive for polygenic, complex disorders such as schizophrenia, as we search beyond a single causative gene. The work presented in my dissertation is based on data obtained from microarray studies of RNA expression in schizophrenia. Microarrays provide an affordable option for gene expression profiling compared to sequence-based methods such as RNA-Seq. However, there has been a move towards NGS-based approaches for characterization of transcriptomes as they offer lower background noise, better sensitivity and quantitative measures [286].

These technologies remain challenging for the study of postmortem brain in schizophrenia as they require high-quality RNA to measure relatively small changes in brain gene expression (~15-20% fold change).
There is uncertainty as to whether the small changes in expression are brought about by limitations of using postmortem brain tissue, or if the subtlety of expression change is a feature of the disease. One limitation is that postmortem human brain specimens can contain variable amounts of grey and white matter, each of which is heterogeneous in cell type [99]. This can result in a dilution of biological signals, whereby genes of interest exhibit changes in expression that might appear even smaller than they really are. The studies included in my analyses utilized bulk cortical tissue, where the cellular complexity of the samples was lost during the harvesting procedure. The molecular deficits identified from our analysis cannot be definitively localized to a distinct cell type, and there may be changes from rare cell types that were undetected. Expression profiling studies are rarely applied in a cell-type specific manner in studies of schizophrenia, but if we are to continue to work with postmortem brain tissue it would be highly beneficial to do so. Cell-type specific transcriptomics relies on the successful identification of the cell type of interest. A commonly used technique is LCM, which uses a laser to excise cells of interest (identified under a microscope) from mounted thin-tissue sections that have been either fixed or frozen. LCM has been used in recent studies of schizophrenia to facilitate the identification of lamina-specific molecular markers [287] and to examine expression differences in the supragranular and infragranular layers of the PFC [112]. However, tissue fixation can degrade nucleic acids and there is a heightened risk of contamination when extracting cells from intact tissue. These single-cell methods can also be more prone to producing false negatives (particularly for low abundance transcripts) due to the small amounts of collected RNA.
In general, the interpretation of results from postmortem studies is problematic because cause and effect are difficult to disentangle. Brain tissue is obtained from patients who have died after having lived with the disease for various lengths of time, often having received medications. There is uncertainty as to whether the expression changes we observe are involved in what is causing the disease, or if the changes are a downstream effect of having the disease. Furthermore, if it is an effect that we are observing, it is likely that it is not exclusively from the disease. For example, individuals were exposed to various environmental influences and have died from different causes. Also, factors such as age, sex, brain pH and PMI, which are associated with large gene expression changes, can mask the disease effect. Standard practices for minimizing the effects of extraneous factors include sample matching or treating these factors as covariates in regression models. In Chapter 2 of my dissertation, I used control human brain data to assess the impact of these factors on gene expression. In Chapter 3, I incorporated these findings to assess possible confounding effects when evaluating gene expression changes associated with schizophrenia. Even with well-matched cohorts and inclusion of age as a covariate, we detected a schizophrenia signature that manifested as aberrant matching of age. We observed an intersection between our age gene list and the ‘schizophrenia genes’ which could indicate that genes affected by age are also affected by schizophrenia, but also raises the possibility of confounding effects. Thus, while these practices help reduce the variability introduced by extraneous factors, we cannot completely rule out the presence of their effects.

When dealing with such small effect sizes, larger sample sizes become a necessity in order to obtain the statistical power to identify that change.
The small sample sizes of current expression profiling studies of schizophrenia (N ≈ 20) are constrained by the collections available from current brain banks. Thus, there is a need to take advantage of integrative approaches. The findings of my thesis converge towards making a case for meta-analyses of postmortem human brain expression profiling studies. Meta-analysis improves the power to detect subtle changes in gene expression and identify consistent changes despite diversity of studies, as observed in my work. It is particularly useful for small datasets that show inconsistencies across studies, as is the case with gene expression studies in schizophrenia. Although I have contributed the largest meta-analyses of schizophrenia both for differential expression and coexpression, each could benefit from the inclusion of more datasets. Our coexpression networks could be substantially improved with more datasets of larger sample size. Aggregating across a larger number of datasets has been shown to result in networks more comparable to PPI networks [182, 246], and is likely to give more reliable interactions. As more schizophrenia expression profiling data become available, they can be incorporated into the meta-analytical frameworks developed in my thesis to contribute statistical power and help refine my results.

5.4 Interpretation of Findings

There are many hypotheses about the underlying etiology of schizophrenia. Past and present research in the field demonstrates disturbances at different levels of brain functioning which translate into putative pathologies of schizophrenia. The research conducted in my thesis exploits different meta-analytical approaches for interrogating expression alterations in schizophrenia. The results demonstrate deficits at different levels of cell functioning which are suggestive of neuropathological alterations in the prefrontal cortex of individuals with schizophrenia.
In this section, I attempt to weave together speculative thoughts with evidence from the literature to provide a plausible biological interpretation of my findings.

Schizophrenia has been described as a disease of the synapse in which the fundamental pathology involves a convergence of factors leading to dysfunction of synaptic transmission [288]. In support of this theory, our meta-signature of differentially expressed ‘schizophrenia genes’ included several genes related to synapse function. One might therefore assume that a loss of communication between neurons results from a decrease in the total number of neurons. However, morphometric studies of postmortem brain report that this is not the case, and rather there is a consensus for increased neuronal densities [289, 290]. The increased neuronal density results from a reduction in the overall volume of tissue, with no loss of neurons but a reduction in the neuropil (i.e. dendrites, axon terminals, synapses, glial cell processes and microvasculature).

In addition to synaptic functioning, the genes identified by differential expression in Chapter 3 involve a number of other biological processes. The relationship among these genes is still not clear, but with the use of coexpression analysis as a complementary approach, a possible explanation emerged. Clustering of the coexpression networks revealed a neuron marker-enriched ‘oxidative phosphorylation’ module that was highly conserved between the control and schizophrenia brain. High conservation of the module between networks implies that the expression of genes in this module must remain highly correlated. More than half of the down-regulated genes in our schizophrenia meta-signature were in this module in both the control and schizophrenia networks, suggesting down-regulation of the ‘oxidative phosphorylation’ module.
This is further supported by enrichment of ‘oxidative phosphorylation’ GO terms in the down-regulated meta-signature, indicating that related genes are down-regulated although the changes are not significant. Oxidative stress has been suggested to contribute to the pathophysiology of schizophrenia through mechanisms that likely involve aberrant inflammatory responses, mitochondrial dysfunction, hypoactive NMDA receptors and oligodendrocyte abnormalities [291]. Oxidative stress occurs when cellular antioxidant defense mechanisms fail to counterbalance endogenous reactive oxygen species (ROS) and reactive nitrogen species (RNS) generated from normal oxidative metabolism. Excessive amounts of ROS and RNS can induce reactions that have detrimental effects on brain function. For instance, peroxide and hydroxyl radicals are thought to react with the polyunsaturated fatty acids that are present in myelin sheaths, directly triggering demyelination [292]. In vitro studies using purified myelin have also demonstrated the vulnerability of myelin to oxidative stress [293].

There are numerous lines of evidence for myelin-related dysfunction in schizophrenia [294], including imaging and neurocytochemical evidence [81, 295], similarities with demyelinating diseases (e.g. metachromatic leukodystrophy [296]), myelin-related gene abnormalities [106] and morphologic abnormalities in the oligodendrocytes [297]. Cluster analysis of coexpression networks in Chapter 4 of my thesis also supports this idea. A ‘myelination-related’ module was identified in both the control and schizophrenia brain; while this module was mostly conserved, there was a substantial loss of coexpression with the oligodendrocyte marker genes in schizophrenia. Thus, expression levels of oligodendrocyte marker genes are altered in schizophrenia such that they are no longer coexpressed with other genes in the module.
A possible interpretation of these results is that the oligodendroglial dysfunction is a secondary event that results from an insult imposed by oxidative stress (demonstrated by down-regulation of the ‘oxidative phosphorylation’ module). The strength of the synapses is reinforced by the conductance of the axon provided by the myelin sheath, in keeping with the notion of schizophrenia as a disease of the synapse.

The loss of oligodendrocyte marker coexpression from the ‘myelination-related’ module was coupled with increased coexpression of astrocyte markers in schizophrenia. Astrocytes have a major influence on remyelination, as demonstrated by the observations that oligodendrocytes preferentially remyelinate axons in areas containing astrocytes [281] and that transplantation of astrocytes into demyelinated lesions enhanced endogenous remyelination [282]. A possible explanation for this is that, in response to oligodendrocyte dysfunction in schizophrenia, the astrocytes exhibit expression patterns in concert with genes of the myelination module because they are being recruited in a partial attempt to remyelinate and restore synaptic function.

Taken together, the findings from my dissertation suggest a cascade of dysfunction at the molecular level which might lead to aberrant synaptic transmission in schizophrenia. Again, we are faced with the cause and effect dilemma. The cascade begins with oxidative stress, but we are unable to determine whether this is a cause of schizophrenia or a downstream effect. If we accept the latter, we must also consider that oxidative stress may be triggered by extraneous factors rather than the disease itself. For example, several studies of rat brain have demonstrated that chronic administration of antipsychotics (i.e. haloperidol and clozapine) can induce oxidative damage [298, 299]. It is generally the case that for the populations used in these studies, most or all of the schizophrenia subjects have received antipsychotics.
In our study we could not control for the effects of antipsychotics because this information was not available for all subjects. Thus, it remains to be determined whether the hypothesized oxidative damage is exclusively a disease effect, exclusively a medication effect, or a combination of both whereby the use of antipsychotics imposes an additional burden on cells.

An important aspect of my proposed interpretation is that it is based on differential coexpression patterns observed in the brains of controls and individuals with schizophrenia. A complex illness such as schizophrenia requires a systems level perspective, and as such I have provided an interpretation by which at least some aspects of the pathophysiology of schizophrenia can be explained.

5.5 Potential Applications and Future Directions

In this thesis I have presented results obtained from the integration of data across large-scale gene expression datasets. I demonstrated the potential of combining data across studies to provide a ‘big picture’ perspective on molecular abnormalities associated with schizophrenia. Notably, incorporating additional data from independent cohorts will be a good source of validation for the changes identified. The meta-analytical approaches described in my thesis are primarily focused on microarray data. Thus, another area for future research will be the modification of these approaches to facilitate their application to sequence-based data.

The biological interpretation of my findings is limited to the information that can be harnessed from gene expression data. I believe that the integration of other large-scale data types will be an important means of bridging these knowledge gaps. By coupling genome sequence information with gene expression profiling it may be possible to correlate changes in expression with nearby or distant polymorphisms and mutations that reside in regulatory regions.
Expression quantitative trait locus (eQTL) analysis is a major advance in integrating large-scale genomic and genetic datasets to evaluate the consequences of genetic variation on gene expression [300]. The rationale for this approach is that expression levels are viewed as quantitative traits and these gene expression phenotypes can be mapped to particular genomic loci. In the context of schizophrenia, there are groups that have undertaken the task of investigating whether schizophrenia risk is mediated in part by common variants that influence gene expression [301-303]. If we were able to obtain SNP data for the cohorts used in our analyses, it would be particularly interesting to investigate possible disease-associated polymorphic loci that modulate expression changes of the genes identified from my analysis.

The integration of my results with animal model studies will be instrumental in directly linking gene expression changes to observable phenotypes in brain tissue and behaviour. The meta-signature of ‘schizophrenia genes’ identified in this work represents expression changes that are consistent across individuals with the disease and reflect an underlying pathophysiology. To assess the impact of individual gene expression changes, one could mimic the change by creating knock-down or knock-in mouse models and evaluate the subsequent behavioural and neuropathological abnormalities. A recent study demonstrating the utility of this approach evaluated reduced expression levels of GAD1 in interneurons expressing NPY in mice [304].

Another promising avenue in psychiatric research is that it is now possible to directly reprogram fibroblasts from affected patients into human induced pluripotent stem cells (hiPSCs) and subsequently differentiate these disorder-specific hiPSCs into neurons [305].
These neurons have been shown to exhibit schizophrenia-specific cellular phenotypes such as diminished neuronal connectivity, decreased neurite number, PSD95-protein levels and glutamate receptor expression [306]. As schizophrenia is thought to be a neurodevelopmental disorder, the use of hiPSCs will be particularly useful in observing the abnormal development of neurons in vitro. One could also use these cells to examine specific synaptic defects that are believed to contribute to the illness. With genetic backgrounds that are known to result in schizophrenia, this group of neurons can be considered a close representation of neurons in the brain. Expression changes observed with these cells are therefore free of the limitations of postmortem studies, but the cells are also exposed to an environment which is not entirely representative of in vivo conditions. A comparison of expression signatures obtained from hiPSC neurons against our meta-signatures obtained from postmortem brain will illustrate similarities and differences between the two approaches.

Many genes have been identified in schizophrenia, along with many biological pathways that link these genes. There is a need for concerted integration across the genetic and genomic studies in human and animal models to provide a more complete understanding of the underlying pathophysiology of schizophrenia. Future research in schizophrenia should strive towards identifying convergence at the level of molecular mechanisms. While the results of the meta-analyses described in this thesis stand alone, they are also a key component in helping accomplish this goal.

References

1.  Luo, Z. and D.H. Geschwind, Microarray applications in neuroscience. Neurobiol Dis, 2001. 8(2): p. 183-93.  2.  Mirnics, K., et al., Analysis of complex brain disorders with gene expression microarrays: schizophrenia as a disease of the synapse. Trends Neurosci, 2001. 24(8): p. 479-86.  3.  
Colantuoni, C., et al., Temporal dynamics and genetic control of transcription in the human prefrontal cortex. Nature, 2011. 478(7370): p. 519-23.  4.  Kang, H.J., et al., Spatio-temporal transcriptome of the human brain. Nature, 2011. 478(7370): p. 483-9.  5.  Owen, M.J., N. Craddock, and M.C. O'Donovan, Schizophrenia: genes at last? Trends Genet, 2005. 21(9): p. 518-25.  6.  Plomin, R., M.J. Owen, and P. McGuffin, The genetic basis of complex human behaviors. Science, 1994. 264(5166): p. 1733-9.  7.  Choi, K.H., et al., Putative psychosis genes in the prefrontal cortex: combined analysis of gene expression microarrays. BMC Psychiatry, 2008. 8: p. 87.  8.  de Magalhaes, J.P., J. Curado, and G.M. Church, Meta-analysis of age-related gene expression profiles identifies common signatures of aging. Bioinformatics, 2009. 25(7): p. 875-81.  9.  Elashoff, M., et al., Meta-analysis of 12 genomic studies in bipolar disorder. J Mol Neurosci, 2007. 31(3): p. 221-43.  10.  Gaiteri, C. and E. Sibille, Differentially expressed genes in major depression reside on the periphery of resilient gene coexpression networks. Front Neurosci, 2011. 5: p. 95.  11.  Miller, J.A., S. Horvath, and D.H. Geschwind, Divergence of human and mouse brain transcriptome highlights Alzheimer disease pathways. Proc Natl Acad Sci U S A, 2010. 107(28): p. 12698-703.  12.  Oldham, M.C., et al., Functional organization of the transcriptome in human brain. Nat Neurosci, 2008. 11(11): p. 1271-82.  13.  Torkamani, A., et al., Coexpression network analysis of neural tissue reveals perturbations in developmental processes in schizophrenia. Genome Res, 2010. 20(4): p. 403-12.  14.  Gould, T.D. and H.K. Manji, The molecular medicine revolution and psychiatry: bridging the gap between basic neuroscience research and clinical psychiatry. J Clin Psychiatry, 2004. 65(5): p. 598-604.  15.  Uhl, G.R. and R.W. Grow, The burden of complex genetics in brain disorders. Arch Gen Psychiatry, 2004. 61(3): p. 223-9.  16.  
Jablensky, A., Epidemiology of schizophrenia: the global burden of disease and disability. Eur Arch Psychiatry Clin Neurosci, 2000. 250(6): p. 274-85.  17.  American Psychiatric Association, Diagnostic and statistical manual of mental disorders (4th ed., text rev.). 2000, Washington, DC.  18.  Green, M.F., et al., Approaching a consensus cognitive battery for clinical trials in schizophrenia: the NIMH-MATRICS conference to select cognitive domains and test criteria. Biol Psychiatry, 2004. 56(5): p. 301-7.  19.  Kuperberg, G. and S. Heckers, Schizophrenia and cognitive function. Curr Opin Neurobiol, 2000. 10(2): p. 205-10.  20.  Hafner, H., Gender differences in schizophrenia. Psychoneuroendocrinology, 2003. 28 Suppl 2: p. 17-54.  21.  Goldner, E.M., et al., Prevalence and incidence studies of schizophrenic disorders: a systematic review of the literature. Can J Psychiatry, 2002. 47(9): p. 833-43.  22.  Carlsson, A., N. Waters, and M.L. Carlsson, Neurotransmitter interactions in schizophrenia-therapeutic implications. Biol Psychiatry, 1999. 46(10): p. 1388-95.  23.  Palermo-Neto, J., Dopaminergic systems. Dopamine receptors. Psychiatr Clin North Am, 1997. 20(4): p. 705-21.  24.  Carlsson, A., The current status of the dopamine hypothesis of schizophrenia. Neuropsychopharmacology, 1988. 1(3): p. 179-86.  25.  Castner, S.A. and P.S. Goldman-Rakic, Enhancement of working memory in aged monkeys by a sensitizing regimen of dopamine D1 receptor stimulation. J Neurosci, 2004. 24(6): p. 1446-50.  26.  Goldman-Rakic, P.S., E.C. Muly, 3rd, and G.V. Williams, D(1) receptors in prefrontal cells and circuits. Brain Res Brain Res Rev, 2000. 31(2-3): p. 295-301.  27.  Abi-Dargham, A., et al., Prefrontal dopamine D1 receptors and working memory in schizophrenia. J Neurosci, 2002. 22(9): p. 3708-19.  28.  Karlsson, P., et al., PET study of D(1) dopamine receptor binding in neuroleptic-naive patients with schizophrenia. Am J Psychiatry, 2002. 159(5): p. 761-7.  29.  
Weinberger, D.R., Implications of normal brain development for the pathogenesis of schizophrenia. Arch Gen Psychiatry, 1987. 44(7): p. 660-9.  30.  Tsai, G. and J.T. Coyle, Glutamatergic mechanisms in schizophrenia. Annu Rev Pharmacol Toxicol, 2002. 42: p. 165-79.  31.  Krystal, J.H., et al., Subanesthetic effects of the noncompetitive NMDA antagonist, ketamine, in humans. Psychotomimetic, perceptual, cognitive, and neuroendocrine responses. Arch Gen Psychiatry, 1994. 51(3): p. 199-214.  32.  Luby, E.D., et al., Model psychoses and schizophrenia. Am J Psychiatry, 1962. 119: p. 61-7.  33.  Javitt, D.C. and S.R. Zukin, Recent advances in the phencyclidine model of schizophrenia. Am J Psychiatry, 1991. 148(10): p. 1301-8.  34.  Lane, H.Y., et al., Sarcosine or D-serine add-on treatment for acute exacerbation of schizophrenia: a randomized, double-blind, placebo-controlled study. Arch Gen Psychiatry, 2005. 62(11): p. 1196-204.  35.  Patil, S.T., et al., Activation of mGlu2/3 receptors as a new approach to treat schizophrenia: a randomized Phase 2 clinical trial. Nat Med, 2007. 13(9): p. 1102-7.  36.  Benes, F.M., et al., Deficits in small interneurons in prefrontal and cingulate cortices of schizophrenic and schizoaffective patients. Arch Gen Psychiatry, 1991. 48(11): p. 996-1001.  37.  Benes, F.M. and S. Berretta, GABAergic interneurons: implications for understanding schizophrenia and bipolar disorder. Neuropsychopharmacology, 2001. 25(1): p. 1-27.  38.  Akbarian, S., et al., Gene expression for glutamic acid decarboxylase is reduced without loss of neurons in prefrontal cortex of schizophrenics. Arch Gen Psychiatry, 1995. 52(4): p. 258-66.  39.  Volk, D.W., et al., Decreased glutamic acid decarboxylase67 messenger RNA expression in a subset of prefrontal cortical gamma-aminobutyric acid neurons in subjects with schizophrenia. Arch Gen Psychiatry, 2000. 57(3): p. 237-45.  40.  
Kondziella, D., et al., How do glial-neuronal interactions fit into current neurotransmitter hypotheses of schizophrenia? Neurochem Int, 2007. 50(2): p. 291-301.  136  41.  Jentsch, J.D. and R.H. Roth, The neuropsychopharmacology of phencyclidine: from NMDA receptor hypofunction to the dopamine hypothesis of schizophrenia. Neuropsychopharmacology, 1999. 20(3): p. 201-25.  42.  Jafari, S., F. Fernandez-Enright, and X.F. Huang, Structural contributions of antipsychotic drugs to their therapeutic profiles and metabolic side effects. J Neurochem, 2012. 120(3): p. 371-84.  43.  Cardno, A.G. and Gottesman, II, Twin studies of schizophrenia: from bow-and-arrow concordances to star wars Mx and functional genomics. Am J Med Genet, 2000. 97(1): p. 12-7.  44.  Cardno, A.G., et al., Heritability estimates for psychotic disorders: the Maudsley twin psychosis series. Arch Gen Psychiatry, 1999. 56(2): p. 162-8.  45.  Kety, S.S., The significance of genetic factors in the etiology of schizophrenia: results from the national study of adoptees in Denmark. J Psychiatr Res, 1987. 21(4): p. 423-9.  46.  Chakravarti, A., Population genetics--making sense out of sequence. Nat Genet, 1999. 21(1 Suppl): p. 56-60.  47.  McClellan, J.M., E. Susser, and M.C. King, Schizophrenia: a common disease caused by multiple rare alleles. Br J Psychiatry, 2007. 190: p. 194-9.  48.  Straub, R.E., et al., Genetic variation in the 6p22.3 gene DTNBP1, the human ortholog of the mouse dysbindin gene, is associated with schizophrenia. Am J Hum Genet, 2002. 71(2): p. 33748.  49.  Funke, B., et al., Association of the DTNBP1 locus with schizophrenia in a U.S. population. Am J Hum Genet, 2004. 75(5): p. 891-8.  50.  Schwab, S.G., et al., Support for association of schizophrenia with genetic variation in the 6p22.3 gene, dysbindin, in sib-pair families with linkage and in an additional sample of triad families. Am J Hum Genet, 2003. 72(1): p. 185-90.  51.  
Williams, N.M., et al., Identification in 2 independent samples of a novel schizophrenia risk haplotype of the dystrobrevin binding protein gene (DTNBP1). Arch Gen Psychiatry, 2004. 61(4): p. 336-44.  52.  Numakawa, T., et al., Evidence of novel neuronal functions of dysbindin, a susceptibility gene for schizophrenia. Hum Mol Genet, 2004. 13(21): p. 2699-708.  53.  Stefansson, H., et al., Neuregulin 1 and susceptibility to schizophrenia. Am J Hum Genet, 2002. 71(4): p. 877-92.  54.  Harrison, P.J. and A.J. Law, Neuregulin 1 and schizophrenia: genetics, gene expression, and neurobiology. Biol Psychiatry, 2006. 60(2): p. 132-40.  55.  Blackwood, D.H., et al., Schizophrenia and affective disorders--cosegregation with a translocation at chromosome 1q42 that directly disrupts brain-expressed genes: clinical and P300 findings in a family. Am J Hum Genet, 2001. 69(2): p. 428-33.  56.  Millar, J.K., et al., Disruption of two novel genes by a translocation co-segregating with schizophrenia. Hum Mol Genet, 2000. 9(9): p. 1415-23.  57.  Hennah, W., et al., Haplotype transmission analysis provides evidence of association for DISC1 to schizophrenia and suggests sex-dependent effects. Hum Mol Genet, 2003. 12(23): p. 3151-9.  58.  Hodgkinson, C.A., et al., Disrupted in schizophrenia 1 (DISC1): association with schizophrenia, schizoaffective disorder, and bipolar disorder. Am J Hum Genet, 2004. 75(5): p. 862-72.  59.  Bassett, A.S., et al., 22q11 deletion syndrome in adults with schizophrenia. Am J Med Genet, 1998. 81(4): p. 328-37.  60.  Murphy, K.C., L.A. Jones, and M.J. Owen, High rates of schizophrenia in adults with velo-cardiofacial syndrome. Arch Gen Psychiatry, 1999. 56(10): p. 940-5.  137  61.  Williams, N.M., et al., Strong evidence that GNB1L is associated with schizophrenia. Hum Mol Genet, 2008. 17(4): p. 555-66.  62.  
Glaser, B., et al., No association between the putative functional ZDHHC8 single nucleotide polymorphism rs175174 and schizophrenia in large European samples. Biol Psychiatry, 2005. 58(1): p. 78-80.  63.  Coon, H., et al., Genomic scan for genes predisposing to schizophrenia. Am J Med Genet, 1994. 54(1): p. 59-71.  64.  St Clair, D., Copy number variation and schizophrenia. Schizophr Bull, 2009. 35(1): p. 9-12.  65.  Stefansson, H., et al., Common variants conferring risk of schizophrenia. Nature, 2009. 460(7256): p. 744-7.  66.  Steinberg, S., et al., Common variants at VRK2 and TCF4 conferring risk of schizophrenia. Hum Mol Genet, 2011. 20(20): p. 4076-81.  67.  Li, T., et al., Common variants in major histocompatibility complex region and TCF4 gene are significantly associated with schizophrenia in Han Chinese. Biol Psychiatry, 2010. 68(7): p. 671-3.  68.  Austin, J., Schizophrenia: an update and review. J Genet Couns, 2005. 14(5): p. 329-40.  69.  Carter, C.J., Schizophrenia susceptibility genes directly implicated in the life cycles of pathogens: cytomegalovirus, influenza, herpes simplex, rubella, and Toxoplasma gondii. Schizophr Bull, 2009. 35(6): p. 1163-82.  70.  Harrison, P.J., Schizophrenia susceptibility genes and neurodevelopment. Biol Psychiatry, 2007. 61(10): p. 1119-20.  71.  Walsh, T., et al., Rare structural variants disrupt multiple genes in neurodevelopmental pathways in schizophrenia. Science, 2008. 320(5875): p. 539-43.  72.  Harrison, P.J., The neuropathology of schizophrenia. A critical review of the data and their interpretation. Brain, 1999. 122 ( Pt 4): p. 593-624.  73.  Miller, E.K. and J.D. Cohen, An integrative theory of prefrontal cortex function. Annu Rev Neurosci, 2001. 24: p. 167-202.  74.  Paus, T., M. Keshavan, and J.N. Giedd, Why do many psychiatric disorders emerge during adolescence? Nat Rev Neurosci, 2008. 9(12): p. 947-57.  75.  
Feinberg, I., Schizophrenia: caused by a fault in programmed synaptic elimination during adolescence? J Psychiatr Res, 1982. 17(4): p. 319-34.  76.  Keshavan, M.S., S. Anderson, and J.W. Pettegrew, Is schizophrenia due to excessive synaptic pruning in the prefrontal cortex? The Feinberg hypothesis revisited. J Psychiatr Res, 1994. 28(3): p. 239-65.  77.  Lawrie, S.M. and S.S. Abukmeil, Brain abnormality in schizophrenia. A systematic and quantitative review of volumetric magnetic resonance imaging studies. Br J Psychiatry, 1998. 172: p. 110-20.  78.  Shenton, M.E., et al., A review of MRI findings in schizophrenia. Schizophr Res, 2001. 49(1-2): p. 1-52.  79.  Pantelis, C., et al., Structural brain imaging evidence for multiple pathological processes at different stages of brain development in schizophrenia. Schizophr Bull, 2005. 31(3): p. 672-96.  80.  Mathalon, D.H., et al., Progressive brain volume changes and the clinical course of schizophrenia in men: a longitudinal magnetic resonance imaging study. Arch Gen Psychiatry, 2001. 58(2): p. 148-57.  138  81.  White, T., M. Nelson, and K.O. Lim, Diffusion tensor imaging in psychiatric disorders. Top Magn Reson Imaging, 2008. 19(2): p. 97-109.  82.  Kubicki, M., et al., Cingulate fasciculus integrity disruption in schizophrenia: a magnetic resonance diffusion tensor imaging study. Biol Psychiatry, 2003. 54(11): p. 1171-80.  83.  Minzenberg, M.J., et al., Meta-analysis of 41 functional neuroimaging studies of executive function in schizophrenia. Arch Gen Psychiatry, 2009. 66(8): p. 811-22.  84.  Plum, F., Prospects for research on schizophrenia. 3. Neurophysiology. Neuropathological findings. Neurosci Res Program Bull, 1972. 10(4): p. 384-8.  85.  Garey, L.J., et al., Reduced dendritic spine density on cerebral cortical pyramidal neurons in schizophrenia. J Neurol Neurosurg Psychiatry, 1998. 65(4): p. 446-53.  86.  Glantz, L.A. and D.A. 
Lewis, Decreased dendritic spine density on prefrontal cortical pyramidal neurons in schizophrenia. Arch Gen Psychiatry, 2000. 57(1): p. 65-73.  87.  Rajkowska, G., L.D. Selemon, and P.S. Goldman-Rakic, Neuronal and glial somal size in the prefrontal cortex: a postmortem morphometric study of schizophrenia and Huntington disease. Arch Gen Psychiatry, 1998. 55(3): p. 215-24.  88.  Pakkenberg, B., Post-mortem study of chronic schizophrenic brains. Br J Psychiatry, 1987. 151: p. 744-52.  89.  Selemon, L.D. and P.S. Goldman-Rakic, The reduced neuropil hypothesis: a circuit based model of schizophrenia. Biol Psychiatry, 1999. 45(1): p. 17-25.  90.  English, J.A., et al., The neuroproteomics of schizophrenia. Biol Psychiatry, 2011. 69(2): p. 16372.  91.  Akbarian, S., The molecular pathology of schizophrenia--focus on histone and DNA modifications. Brain Res Bull, 2010. 83(3-4): p. 103-7.  92.  Roth, T.L., et al., Epigenetic mechanisms in schizophrenia. Biochim Biophys Acta, 2009. 1790(9): p. 869-77.  93.  Huang, H.S. and S. Akbarian, GAD1 mRNA expression and DNA methylation in prefrontal cortex of subjects with schizophrenia. PLoS ONE, 2007. 2(8): p. e809.  94.  Abdolmaleky, H.M., et al., Hypermethylation of the reelin (RELN) promoter in the brain of schizophrenic patients: a preliminary report. Am J Med Genet B Neuropsychiatr Genet, 2005. 134B(1): p. 60-6.  95.  Grayson, D.R., et al., Reelin promoter hypermethylation in schizophrenia. Proc Natl Acad Sci U S A, 2005. 102(26): p. 9341-6.  96.  Abdolmaleky, H.M., et al., Hypomethylation of MB-COMT promoter is a major risk factor for schizophrenia and bipolar disorder. Hum Mol Genet, 2006. 15(21): p. 3132-45.  97.  Iwamoto, K., et al., DNA methylation status of SOX10 correlates with its downregulation and oligodendrocyte dysfunction in schizophrenia. J Neurosci, 2005. 25(22): p. 5376-81.  98.  
Mirnics, K., et al., Molecular characterization of schizophrenia viewed by microarray analysis of gene expression in prefrontal cortex. Neuron, 2000. 28(1): p. 53-67.  99.  Sequeira, P.A., M.V. Martin, and M.P. Vawter, The first decade and beyond of transcriptional profiling in schizophrenia. Neurobiol Dis, 2012. 45(1): p. 23-36.  100.  Vawter, M.P., et al., Reduction of synapsin in the hippocampus of patients with bipolar disorder and schizophrenia. Mol Psychiatry, 2002. 7(6): p. 571-8.  101.  Hemby, S.E., et al., Gene expression profile for schizophrenia: discrete neuron transcription patterns in the entorhinal cortex. Arch Gen Psychiatry, 2002. 59(7): p. 631-40.  139  102.  Maycox, P.R., et al., Analysis of gene expression in two large schizophrenia cohorts identifies multiple changes associated with nerve terminal function. Mol Psychiatry, 2009. 14(12): p. 108394.  103.  Hashimoto, T., et al., Alterations in GABA-related transcriptome in the dorsolateral prefrontal cortex of subjects with schizophrenia. Mol Psychiatry, 2008. 13(2): p. 147-61.  104.  Duncan, C.E., et al., Prefrontal GABA(A) receptor alpha-subunit expression in normal postnatal human development and schizophrenia. J Psychiatr Res, 2010. 44(10): p. 673-81.  105.  Straub, R.E., et al., Allelic variation in GAD1 (GAD67) is associated with schizophrenia and influences cortical function and gene expression. Mol Psychiatry, 2007. 12(9): p. 854-69.  106.  Hakak, Y., et al., Genome-wide expression analysis reveals dysregulation of myelination-related genes in chronic schizophrenia. Proc Natl Acad Sci U S A, 2001. 98(8): p. 4746-51.  107.  Katsel, P., et al., Variations in differential gene expression patterns across multiple brain regions in schizophrenia. Schizophr Res, 2005. 77(2-3): p. 241-52.  108.  Arion, D., et al., Molecular evidence for increased expression of genes related to immune and chaperone function in the prefrontal cortex in schizophrenia. Biol Psychiatry, 2007. 62(7): p. 71121.  109.  
Saetre, P., et al., Inflammation-related genes up-regulated in schizophrenia brains. BMC Psychiatry, 2007. 7: p. 46.  110.  Harrison, P.J., Using our brains: the findings, flaws, and future of postmortem studies of psychiatric disorders. Biol Psychiatry, 2011. 69(2): p. 102-3.  111.  Bezzi, P. and A. Volterra, A neuron-glia signalling network in the active brain. Curr Opin Neurobiol, 2001. 11(3): p. 387-94.  112.  Arion, D., et al., Infragranular gene expression disturbances in the prefrontal cortex in schizophrenia: signature of altered neural development? Neurobiol Dis, 2010. 37(3): p. 738-46.  113.  Benes, F.M., et al., Regulation of the GABA cell phenotype in hippocampus of schizophrenics and bipolars. Proc Natl Acad Sci U S A, 2007. 104(24): p. 10164-9.  114.  Harris, L.W., et al., The cerebral microvasculature in schizophrenia: a laser capture microdissection study. PLoS ONE, 2008. 3(12): p. e3964.  115.  Emmert-Buck, M.R., et al., Laser capture microdissection. Science, 1996. 274(5289): p. 9981001.  116.  Harrison, P.J., et al., The relative importance of premortem acidosis and postmortem interval for human brain gene expression studies: selective mRNA vulnerability and comparison with their encoded proteins. Neurosci Lett, 1995. 200(3): p. 151-4.  117.  Kingsbury, A.E., et al., Tissue pH as an indicator of mRNA preservation in human post-mortem brain. Brain Res Mol Brain Res, 1995. 28(2): p. 311-8.  118.  Lipska, B.K., et al., Critical factors in gene expression in postmortem human brain: Focus on studies in schizophrenia. Biol Psychiatry, 2006. 60(6): p. 650-8.  119.  Bahn, S., et al., Gene expression profiling in the post-mortem human brain--no cause for dismay. J Chem Neuroanat, 2001. 22(1-2): p. 79-94.  120.  Atz, M., et al., Methodological considerations for gene expression profiling of human brain. J Neurosci Methods, 2007. 163(2): p. 295-309.  121.  
Deep-Soboslay, A., et al., Psychiatric brain banking: three perspectives on current trends and future directions. Biol Psychiatry, 2011. 69(2): p. 104-12.  140  122.  Haroutunian, V., et al., The human homolog of the QKI gene affected in the severe dysmyelination "quaking" mouse phenotype: downregulated in multiple brain regions in schizophrenia. Am J Psychiatry, 2006. 163(10): p. 1834-7.  123.  Oni-Orisan, A., et al., Altered vesicular glutamate transporter expression in the anterior cingulate cortex in schizophrenia. Biol Psychiatry, 2008. 63(8): p. 766-75.  124.  Li, J.Z., et al., Systematic changes in gene expression in postmortem human brains associated with tissue pH and terminal medical conditions. Hum Mol Genet, 2004. 13(6): p. 609-16.  125.  Tomita, H., et al., Effect of agonal and postmortem factors on gene expression profile: quality control in microarray analyses of postmortem human brain. Biol Psychiatry, 2004. 55(4): p. 34652.  126.  Li, J.Z., et al., Sample matching by inferred agonal stress in gene expression analyses of the brain. BMC Genomics, 2007. 8: p. 336.  127.  Lu, T., et al., Gene regulation and DNA damage in the ageing human brain. Nature, 2004. 429(6994): p. 883-91.  128.  Vawter, M.P., et al., Gender-specific gene expression in post-mortem human brain: localization to sex chromosomes. Neuropsychopharmacology, 2004. 29(2): p. 373-84.  129.  Mirnics, K. and J. Pevsner, Progress in the use of microarray technology to study the neurobiology of disease. Nat Neurosci, 2004. 7(5): p. 434-9.  130.  Velculescu, V.E., et al., Serial analysis of gene expression. Science, 1995. 270(5235): p. 484-7.  131.  Brenner, S., et al., Gene expression analysis by massively parallel signature sequencing (MPSS) on microbead arrays. Nat Biotechnol, 2000. 18(6): p. 630-4.  132.  Lee, J., et al., Effects of RNA degradation on gene expression analysis of human postmortem tissues. FASEB J, 2005. 19(10): p. 1356-8.  133.  
Popova, T., et al., Effect of RNA quality on transcript intensity levels in microarray analysis of human post-mortem brain tissues. BMC Genomics, 2008. 9: p. 91.  134.  Schroeder, A., et al., The RIN: an RNA integrity number for assigning integrity values to RNA measurements. BMC Mol Biol, 2006. 7: p. 3.  135.  Mei, R., et al., Probe selection for high-density oligonucleotide arrays. Proc Natl Acad Sci U S A, 2003. 100(20): p. 11237-42.  136.  Flikka, K., et al., XHM: a system for detection of potential cross hybridizations in DNA microarrays. BMC Bioinformatics, 2004. 5: p. 117.  137.  Shi, L., et al., The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements. Nat Biotechnol, 2006. 24(9): p. 1151-61.  138.  Hubbell, E., W.M. Liu, and R. Mei, Robust estimators for expression analysis. Bioinformatics, 2002. 18(12): p. 1585-92.  139.  Irizarry, R.A., et al., Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Res, 2003. 31(4): p. e15.  140.  Wu Z, I.R., Gentleman R, Martinez-Murillo F, Spencer F, A Model-Based Background Adjustment for Oligonucleotide Expression Arrays. Journal of the American Statistical Association, 2004. 99(9): p. 909-917.  141.  Leek, J.T., et al., Tackling the widespread and critical impact of batch effects in high-throughput data. Nat Rev Genet, 2010. 11(10): p. 733-9.  142.  Benjamini, Y.H., Y., Controlling the False Discovery Rate: a Practical and Powerful Approach to Multiple Testing. J. R. Statist. Soc. B, 1995. 57(1): p. 11.  141  143.  Storey, J.D. and R. Tibshirani, Statistical significance for genomewide studies. Proc Natl Acad Sci U S A, 2003. 100(16): p. 9440-5.  144.  Borate, B.R., et al., Comparison of threshold selection methods for microarray gene coexpression matrices. BMC Res Notes, 2009. 2: p. 240.  145.  Ruan, J., A.K. Dean, and W. Zhang, A general co-expression network-based approach to gene expression analysis: comparison and applications. 
BMC Syst Biol, 2010. 4: p. 8.  146.  Elo, L.L., et al., Systematic construction of gene coexpression networks with applications to human T helper cell differentiation process. Bioinformatics, 2007. 23(16): p. 2096-103.  147.  Lee, H.K., et al., Coexpression analysis of human genes across many microarray data sets. Genome Res, 2004. 14(6): p. 1085-94.  148.  Zhang, B. and S. Horvath, A general framework for weighted gene co-expression network analysis. Stat Appl Genet Mol Biol, 2005. 4: p. Article17.  149.  Oliver, S., Guilt-by-association goes global. Nature, 2000. 403(6770): p. 601-3.  150.  A.L., A.R.J.H.B., Internet: Diameter of the World-Wide Web. Nature, 1999. 401: p. 2.  151.  Watts, D.J. and S.H. Strogatz, Collective dynamics of 'small-world' networks. Nature, 1998. 393(6684): p. 440-2.  152.  Dijkstra, E.W., A note on two problems in connexion with graphs. Numerische Mathematik, 1959. 1: p. 269-271.  153.  Jordan, I.K., et al., Conservation and coevolution in the scale-free human gene coexpression network. Mol Biol Evol, 2004. 21(11): p. 2058-70.  154.  Tsaparas, P., et al., Global similarity and local divergence in human and mouse gene coexpression networks. BMC Evol Biol, 2006. 6: p. 70.  155.  van Noort, V., B. Snel, and M.A. Huynen, The yeast coexpression network has a small-world, scale-free architecture and can be explained by a simple model. EMBO Rep, 2004. 5(3): p. 2804.  156.  Stumpf, M.P. and M.A. Porter, Mathematics. Critical truths about power laws. Science, 2012. 335(6069): p. 665-6.  157.  Hanisch, D., et al., Co-clustering of biological networks and gene expression data. Bioinformatics, 2002. 18 Suppl 1: p. S145-54.  158.  Ideker, T., et al., Discovering regulatory and signalling circuits in molecular interaction networks. Bioinformatics, 2002. 18 Suppl 1: p. S233-40.  159.  Bader, G.D. and C.W. Hogue, An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics, 2003. 4: p. 2.  160.  
Zhou, X., M.C. Kao, and W.H. Wong, Transitive functional annotation by shortest-path analysis of gene expression data. Proc Natl Acad Sci U S A, 2002. 99(20): p. 12783-8.  161.  Barrett, T., et al., NCBI GEO: mining tens of millions of expression profiles--database and tools update. Nucleic Acids Res, 2007. 35(Database issue): p. D760-5.  162.  Brazma, A., et al., ArrayExpress--a public repository for microarray gene expression data at the EBI. Nucleic Acids Res, 2003. 31(1): p. 68-71.  163.  Edgar, R., M. Domrachev, and A.E. Lash, Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res, 2002. 30(1): p. 207-10.  164.  Tseng, G.C., D. Ghosh, and E. Feingold, Comprehensive literature review and statistical considerations for microarray meta-analysis. Nucleic Acids Res, 2012.  142  165.  Campain, A. and Y.H. Yang, Comparison study of microarray meta-analysis methods. BMC Bioinformatics, 2010. 11: p. 408.  166.  Fisher, R.A., Statistical Methods for Research Workers1925, Edinburgh: Oliver and Boyd.  167.  Hwang, D., et al., A data integration methodology for systems biology. Proc Natl Acad Sci U S A, 2005. 102(48): p. 17296-301.  168.  Parmigiani, G., et al., A cross-study comparison of gene expression studies for the molecular classification of lung cancer. Clin Cancer Res, 2004. 10(9): p. 2922-7.  169.  Breitling, R., et al., Rank products: a simple, yet powerful, new method to detect differentially regulated genes in replicated microarray experiments. FEBS Lett, 2004. 573(1-3): p. 83-92.  170.  Choi, J.K., et al., Combining multiple microarray studies and modeling interstudy variation. Bioinformatics, 2003. 19 Suppl 1: p. i84-90.  171.  Dawany, N.B. and A. Tozeren, Asymmetric microarray data produces gene lists highly predictive of research literature on multiple cancer types. BMC Bioinformatics, 2010. 11: p. 483.  172.  Tusher, V.G., R. Tibshirani, and G. 
Chu, Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci U S A, 2001. 98(9): p. 5116-21.  173.  Park, T., et al., Combining multiple microarrays in the presence of controlling variables. Bioinformatics, 2006. 22(14): p. 1682-9.  174.  Yi, S.G. and T. Park, Integrated analysis of the heterogeneous microarray data. BMC Bioinformatics, 2011. 12 Suppl 5: p. S3.  175.  Yu, T., et al., Dimension reduction and mixed-effects model for microarray meta-analysis of cancer. Front Biosci, 2008. 13: p. 2714-20.  176.  Varrault, A., et al., Zac1 regulates an imprinted gene network critically involved in the control of embryonic growth. Dev Cell, 2006. 11(5): p. 711-22.  177.  Srivastava, G.P., et al., Identification of transcription factor's targets using tissue-specific transcriptomic data in Arabidopsis thaliana. BMC Syst Biol, 2010. 4 Suppl 2: p. S2.  178.  Choi, J.K., et al., Differential coexpression analysis using microarray data and its application to human cancer. Bioinformatics, 2005. 21(24): p. 4348-55.  179.  Segal, E., et al., A module map showing conditional activity of expression modules in cancer. Nat Genet, 2004. 36(10): p. 1090-8.  180.  Mabbott, N.A., et al., Meta-analysis of lineage-specific gene expression signatures in mouse leukocyte populations. Immunobiology, 2010. 215(9-10): p. 724-36.  181.  Ucar, D., et al., Construction of a reference gene association network from multiple profiling data: application to data analysis. Bioinformatics, 2007. 23(20): p. 2716-24.  182.  Gillis, J. and P. Pavlidis, The role of indirect connections in gene networks in predicting function. Bioinformatics, 2011. 27(13): p. 1860-6.  183.  Voineagu, I., et al., Transcriptomic analysis of autistic brain reveals convergent molecular pathology. Nature, 2011. 474(7351): p. 380-4.  184.  Piomelli, D., Cannabinoid activity curtails cocaine craving. Nat Med, 2001. 7(10): p. 1099-100.  185.  Mirnics, K., P. Levitt, and D.A. 
Lewis, Critical appraisal of DNA microarrays in psychiatric genomics. Biol Psychiatry, 2006. 60(2): p. 163-76.  186.  Erraji-Benchekroun, L., et al., Molecular aging in human prefrontal cortex is selective and continuous throughout adult life. Biol Psychiatry, 2005. 57(5): p. 549-58.  143  187.  Galfalvy, H.C., et al., Sex genes for genomic analysis in human brain: internal controls for comparison of probe level data extraction. BMC Bioinformatics, 2003. 4: p. 37.  188.  Reinius, B., et al., An evolutionarily conserved sexual signature in the primate brain. PLoS Genet, 2008. 4(6): p. e1000100.  189.  Mexal, S., et al., Brain pH has a significant impact on human postmortem hippocampal gene expression profiles. Brain Res, 2006. 1106(1): p. 1-11.  190.  Borozan, I., et al., MAID : an effect size based model for microarray data integration across laboratories and platforms. BMC Bioinformatics, 2008. 9: p. 305.  191.  Cahan, P., et al., Meta-analysis of microarray results: challenges, opportunities, and recommendations for standardization. Gene, 2007. 401(1-2): p. 12-8.  192.  Rhodes, D.R., et al., Large-scale meta-analysis of cancer microarray data identifies common transcriptional profiles of neoplastic transformation and progression. Proc Natl Acad Sci U S A, 2004. 101(25): p. 9309-14.  193.  RDevelopmentCoreTeam, R: A language and environment for statistical computing, 2005, R Foundation for Statistical Computing: Vienna, Austria.  194.  Fisher, R.A., Combining independent tests of significance. American Statistician, 1948. 2(3): p. 30.  195.  Rhodes, D.R., et al., Meta-analysis of microarrays: interstudy validation of gene expression profiles reveals pathway dysregulation in prostate cancer. Cancer Res, 2002. 62(15): p. 4427-33.  196.  Hess, A. and H. Iyer, Fisher's combined p-value for detecting differentially expressed genes using Affymetrix expression arrays. BMC Genomics, 2007. 8: p. 96.  197.  
Ashburner, M., et al., Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet, 2000. 25(1): p. 25-9.  198.  Lee, H.K., et al., ErmineJ: tool for functional analysis of gene expression data sets. BMC Bioinformatics, 2005. 6: p. 269.  199.  Vawter, M.P., et al., Mitochondrial-related gene expression changes are sensitive to agonal-pH state: implications for brain disorders. Mol Psychiatry, 2006. 11(7): p. 615, 663-79.  200.  Colantuoni, C., et al., Age-related changes in the expression of schizophrenia susceptibility genes in the human prefrontal cortex. Brain Struct Funct, 2008.  201.  Hong, M.G., et al., Transcriptome-wide assessment of human brain and lymphocyte senescence. PLoS ONE, 2008. 3(8): p. e3024.  202.  Bartke, A., Impact of reduced insulin-like growth factor-1/insulin signaling on aging in mammals: novel findings. Aging Cell, 2008. 7(3): p. 285-90.  203.  Oh, S., G.C. Tseng, and E. Sibille, Reciprocal phylogenetic conservation of molecular aging in mouse and human brain. Neurobiol Aging, 2009.  204.  Canales, R.D., et al., Evaluation of DNA microarray results with quantitative gene expression platforms. Nat Biotechnol, 2006. 24(9): p. 1115-22.  205.  Shippy, R., et al., Using RNA sample titrations to assess microarray platform performance and normalization techniques. Nat Biotechnol, 2006. 24(9): p. 1123-31.  206.  Dobbin, K.K., et al., Interlaboratory comparability study of cancer gene expression analysis using oligonucleotide microarrays. Clin Cancer Res, 2005. 11(2 Pt 1): p. 565-72.  207.  Irizarry, R.A., et al., Multiple-laboratory comparison of microarray platforms. Nat Methods, 2005. 2(5): p. 345-50.  144  208.  Petersen, D., et al., Three microarray platforms: an analysis of their concordance in profiling gene expression. BMC Genomics, 2005. 6(1): p. 63.  209.  Pedotti, P., et al., Can subtle changes in gene expression be consistently detected with different microarray platforms? BMC Genomics, 2008. 9: p. 124. 
210.  Ernst, C., et al., Confirmation of region-specific patterns of gene expression in the human brain. Neurogenetics, 2007. 8(3): p. 219-24.
211.  Khaitovich, P., et al., Regional patterns of gene expression in human and chimpanzee brains. Genome Res, 2004. 14(8): p. 1462-73.
212.  Roth, R.B., et al., Gene expression analyses reveal molecular relationships among 20 regions of the human CNS. Neurogenetics, 2006. 7(2): p. 67-80.
213.  Berchtold, N.C., et al., Gene expression changes in the course of normal brain aging are sexually dimorphic. Proc Natl Acad Sci U S A, 2008. 105(40): p. 15605-10.
214.  Johnson, W.E., C. Li, and A. Rabinovic, Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics, 2007. 8(1): p. 118-27.
215.  El Idrissi, A., Taurine improves learning and retention in aged mice. Neurosci Lett, 2008. 436(1): p. 19-22.
216.  Jiang, C.H., et al., The effects of aging on gene expression in the hypothalamus and cortex of mice. Proc Natl Acad Sci U S A, 2001. 98(4): p. 1930-4.
217.  Loerch, P.M., et al., Evolution of the aging brain transcriptome and synaptic regulation. PLoS ONE, 2008. 3(10): p. e3329.
218.  Terzi, D., et al., Regulators of G protein signaling in neuropsychiatric disorders. Prog Mol Biol Transl Sci, 2009. 86: p. 299-333.
219.  Levitt, P., et al., Making the case for a candidate vulnerability gene in schizophrenia: Convergent evidence for regulator of G-protein signaling 4 (RGS4). Biol Psychiatry, 2006. 60(6): p. 534-7.
220.  Falls, D.L., Neuregulins: functions, forms, and signaling strategies. Exp Cell Res, 2003. 284(1): p. 14-30.
221.  Corfas, G., K. Roy, and J.D. Buxbaum, Neuregulin 1-erbB signaling and the molecular/cellular basis of schizophrenia. Nat Neurosci, 2004. 7(6): p. 575-80.
222.  Cashion, A.B., M.J. Smith, and P.M. Wise, Glutamic acid decarboxylase 67 (GAD67) gene expression in discrete regions of the rostral preoptic area change during the oestrous cycle and with age. J Neuroendocrinol, 2004. 16(8): p. 711-6.
223.  Siegmund, K.D., et al., DNA methylation in the human cerebral cortex is dynamically regulated throughout the life span and involves differentiated neurons. PLoS ONE, 2007. 2(9): p. e895.
224.  Iwamoto, K., M. Bundo, and T. Kato, Altered expression of mitochondria-related genes in postmortem brains of patients with bipolar disorder or schizophrenia, as revealed by large-scale DNA microarray analysis. Hum Mol Genet, 2005. 14(2): p. 241-53.
225.  Pongrac, J., et al., Gene expression profiling with DNA microarrays: advancing our understanding of psychiatric disorders. Neurochemical research, 2002. 27(10): p. 1049-63.
226.  Vawter, M.P., et al., Microarray analysis of gene expression in the prefrontal cortex in schizophrenia: a preliminary study. Schizophr Res, 2002. 58(1): p. 11-20.
227.  Iwamoto, K. and T. Kato, Gene expression profiling in schizophrenia and related mental disorders. The Neuroscientist, 2006. 12(4): p. 349-61.
228.  Altar, C.A., et al., Deficient hippocampal neuron expression of proteasome, ubiquitin, and mitochondrial genes in multiple schizophrenia cohorts. Biol Psychiatry, 2005. 58(2): p. 85-96.
229.  Middleton, F.A., et al., Gene expression profiling reveals alterations of specific metabolic pathways in schizophrenia. The Journal of neuroscience, 2002. 22(7): p. 2718-29.
230.  Aston, C., L. Jiang, and B.P. Sokolov, Microarray analysis of postmortem temporal cortex from patients with schizophrenia. Journal of neuroscience research, 2004. 77(6): p. 858-66.
231.  Dracheva, S., et al., Myelin-associated mRNA and protein expression deficits in the anterior cingulate cortex and hippocampus in elderly schizophrenia patients. Neurobiol Dis, 2006. 21(3): p. 531-40.
232.  Allen, N.C., et al., Systematic meta-analyses and field synopsis of genetic association studies in schizophrenia: the SzGene database. Nat Genet, 2008. 40(7): p. 827-34.
233.  Mathieson, I., M.R. Munafo, and J. Flint, Meta-analysis indicates that common variants at the DISC1 locus are not associated with schizophrenia. Mol Psychiatry, 2011.
234.  O'Donovan, M.C., et al., Identification of loci associated with schizophrenia by genome-wide association and follow-up. Nat Genet, 2008. 40(9): p. 1053-5.
235.  Mistry, M. and P. Pavlidis, A cross-laboratory comparison of expression profiling data from normal human postmortem brain. Neuroscience, 2010. 167(2): p. 384-95.
236.  Choi, K.H., et al., Gene expression and genetic variation data implicate PCLO in bipolar disorder. Biol Psychiatry, 2011. 69(4): p. 353-9.
237.  Liu, C., et al., Whole-genome association mapping of gene expression in the human prefrontal cortex. Mol Psychiatry, 2010. 15(8): p. 779-84.
238.  Barnes, M., et al., Experimental comparison and cross-validation of the Affymetrix and Illumina gene expression analysis platforms. Nucleic Acids Res, 2005. 33(18): p. 5914-23.
239.  Baum, A.E., et al., Meta-analysis of two genome-wide association studies of bipolar disorder reveals important points of agreement. Mol Psychiatry, 2008. 13(5): p. 466-7.
240.  Liu, Y., et al., Meta-analysis of genome-wide association data of bipolar disorder and major depressive disorder. Mol Psychiatry, 2011. 16(1): p. 2-4.
241.  Leek, J.T. and J.D. Storey, Capturing heterogeneity in gene expression studies by surrogate variable analysis. PLoS Genet, 2007. 3(9): p. 1724-35.
242.  Garbett, K., et al., Transcriptome alterations in the prefrontal cortex of subjects with schizophrenia who committed suicide. Neuropsychopharmacologia Hungarica, 2008. 10(1): p. 9-14.
243.  R Development Core Team, R: A Language and Environment for Statistical Computing. 2011, R Foundation for Statistical Computing: Vienna, Austria.
244.  Bolstad, B.M., et al., A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics, 2003. 19(2): p. 185-93.
245.  Sibille, E., et al., A molecular signature of depression in the amygdala. Am J Psychiatry, 2009. 166(9): p. 1011-24.
246.  Gillis, J., M. Mistry, and P. Pavlidis, Gene function analysis in complex data sets using ErmineJ. Nature protocols, 2010. 5(6): p. 1148-59.
247.  Cahoy, J.D., et al., A transcriptome database for astrocytes, neurons, and oligodendrocytes: a new resource for understanding brain development and function. The Journal of neuroscience, 2008. 28(1): p. 264-78.
248.  Chatr-aryamontri, A., et al., MINT: the Molecular INTeraction database. Nucleic Acids Res, 2007. 35(Database issue): p. D572-4.
249.  Chua, H.N., W.K. Sung, and L. Wong, Exploiting indirect neighbours and topological weight to predict protein function from protein-protein interactions. Bioinformatics, 2006. 22(13): p. 1623-30.
250.  Gilbert, D., Biomolecular interaction network database. Briefings in bioinformatics, 2005. 6(2): p. 194-8.
251.  Lynn, D.J., et al., InnateDB: facilitating systems-level analyses of the mammalian innate immune response. Molecular systems biology, 2008. 4: p. 218.
252.  Prasad, T.S., K. Kandasamy, and A. Pandey, Human Protein Reference Database and Human Proteinpedia as discovery tools for systems biology. Methods in molecular biology, 2009. 577: p. 67-79.
253.  Razick, S., G. Magklaras, and I.M. Donaldson, iRefIndex: a consolidated protein interaction database with provenance. BMC Bioinformatics, 2008. 9: p. 405.
254.  Glatt, S.J., et al., Comparative gene expression analysis of blood and brain provides concurrent validation of SELENBP1 up-regulation in schizophrenia. Proc Natl Acad Sci U S A, 2005. 102(43): p. 15533-8.
255.  Narayan, S., et al., Molecular profiles of schizophrenia in the CNS at different stages of illness. Brain Res, 2008. 1239: p. 235-48.
256.  Prabakaran, S., et al., Mitochondrial dysfunction in schizophrenia: evidence for compromised brain metabolism and oxidative stress. Mol Psychiatry, 2004. 9(7): p. 684-97, 643.
257.  Park, E., et al., Regulatory roles of hnRNP M and Nova-1 in the alternative splicing of the dopamine D2 receptor pre-mRNA. The Journal of biological chemistry, 2011.
258.  Eyles, D.W., J.J. McGrath, and G.P. Reynolds, Neuronal calcium-binding proteins and schizophrenia. Schizophr Res, 2002. 57(1): p. 27-34.
259.  Manji, H.K., G proteins: implications for psychiatry. Am J Psychiatry, 1992. 149(6): p. 746-60.
260.  Schwab, S.G., et al., Support for a chromosome 18p locus conferring susceptibility to functional psychoses in families with schizophrenia, by association and linkage analysis. Am J Hum Genet, 1998. 63(4): p. 1139-52.
261.  Halim, N.D., et al., Increased lactate levels and reduced pH in postmortem brains of schizophrenics: medication confounds. J Neurosci Methods, 2008. 169(1): p. 208-13.
262.  Thomas, E.A., Molecular profiling of antipsychotic drug function: convergent mechanisms in the pathology and treatment of psychiatric disorders. Molecular neurobiology, 2006. 34(2): p. 109-28.
263.  Mistry, M., J. Gillis, and P. Pavlidis, Genome-wide expression profiling of schizophrenia using a large combined cohort. Mol Psychiatry, 2012.
264.  Spirin, V. and L.A. Mirny, Protein complexes and functional modules in molecular networks. Proc Natl Acad Sci U S A, 2003. 100(21): p. 12123-8.
265.  Newman, M.E., Modularity and community structure in networks. Proc Natl Acad Sci U S A, 2006. 103(23): p. 8577-82.
266.  Gillis, J. and P. Pavlidis, A methodology for the analysis of differential coexpression across the human lifespan. BMC Bioinformatics, 2009. 10: p. 306.
267.  Miller, J.A., M.C. Oldham, and D.H. Geschwind, A systems level analysis of transcriptional changes in Alzheimer's disease and normal aging. J Neurosci, 2008. 28(6): p. 1410-20.
268.  Albert, R., H. Jeong, and A.L. Barabasi, Internet: Diameter of the World-Wide Web. Nature, 1999. 401: p. 130-1.
269.  Xulvi-Brunet, R. and I.M. Sokolov, Reshuffling scale-free networks: From random to assortative. Physical Review E, 2004. 70(6).
270.  Toro, R., et al., Key role for gene dosage and synaptic homeostasis in autism spectrum disorders. Trends Genet, 2010. 26(8): p. 363-72.
271.  Yip, A.M. and S. Horvath, Gene network interconnectedness and the generalized topological overlap measure. BMC Bioinformatics, 2007. 8: p. 22.
272.  Smoot, M.E., et al., Cytoscape 2.8: new features for data integration and network visualization. Bioinformatics, 2011. 27(3): p. 431-2.
273.  Breslin, T., P. Eden, and M. Krogh, Comparing functional annotation analyses with Catmap. BMC Bioinformatics, 2004. 5: p. 193.
274.  Gillis, J. and P. Pavlidis, The impact of multifunctional genes on "guilt by association" analysis. PLoS One, 2011. 6(2): p. e17258.
275.  Walterfang, M., et al., Neuropathological, neurogenetic and neuroimaging evidence for white matter pathology in schizophrenia. Neurosci Biobehav Rev, 2006. 30(7): p. 918-48.
276.  Hansen, T., et al., Apolipoprotein D is associated with long-term outcome in patients with schizophrenia. Pharmacogenomics J, 2006. 6(2): p. 120-5.
277.  Qin, W., et al., A family-based association study of PLP1 and schizophrenia. Neurosci Lett, 2005. 375(3): p. 207-10.
278.  Yang, Y.F., et al., Possible association of the MAG locus with schizophrenia in a Chinese Han cohort of family trios. Schizophr Res, 2005. 75(1): p. 11-9.
279.  Haroutunian, V., et al., Variations in oligodendrocyte-related gene expression across multiple cortical regions: implications for the pathophysiology of schizophrenia. Int J Neuropsychopharmacol, 2007. 10(4): p. 565-73.
280.  Watkins, T.A., et al., Distinct stages of myelination regulated by gamma-secretase and astrocytes in a rapidly myelinating CNS coculture system. Neuron, 2008. 60(4): p. 555-69.
281.  Talbott, J.F., et al., Endogenous Nkx2.2+/Olig2+ oligodendrocyte precursor cells fail to remyelinate the demyelinated adult rat spinal cord in the absence of astrocytes. Exp Neurol, 2005. 192(1): p. 11-24.
282.  Franklin, R.J., A.J. Crang, and W.F. Blakemore, Transplanted type-1 astrocytes facilitate repair of demyelinating lesions by host oligodendrocytes in adult rat spinal cord. J Neurocytol, 1991. 20(5): p. 420-30.
283.  Schnieder, T.P. and A.J. Dwork, Searching for neuropathology: gliosis in schizophrenia. Biol Psychiatry, 2011. 69(2): p. 134-9.
284.  Brockmann, K., et al., Succinate in dystrophic white matter: a proton magnetic resonance spectroscopy finding characteristic for complex II deficiency. Ann Neurol, 2002. 52(1): p. 38-46.
285.  Xulvi-Brunet, R. and H. Li, Co-expression networks: graph properties and topological comparisons. Bioinformatics, 2010. 26(2): p. 205-14.
286.  Marioni, J.C., et al., RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays. Genome Res, 2008. 18(9): p. 1509-17.
287.  Arion, D., et al., Molecular markers distinguishing supragranular and infragranular layers in the human prefrontal cortex. Eur J Neurosci, 2007. 25(6): p. 1843-54.
288.  Frankle, W.G., J. Lerma, and M. Laruelle, The synaptic hypothesis of schizophrenia. Neuron, 2003. 39(2): p. 205-16.
289.  Chana, G., et al., Two-dimensional assessment of cytoarchitecture in the anterior cingulate cortex in major depressive disorder, bipolar disorder, and schizophrenia: evidence for decreased neuronal somal size and increased neuronal density. Biol Psychiatry, 2003. 53(12): p. 1086-98.
290.  Selemon, L.D., G. Rajkowska, and P.S. Goldman-Rakic, Abnormally high neuronal density in the schizophrenic cortex. A morphometric analysis of prefrontal area 9 and occipital area 17. Arch Gen Psychiatry, 1995. 52(10): p. 805-18; discussion 819-20.
291.  Bitanihirwe, B.K. and T.U. Woo, Oxidative stress in schizophrenia: an integrated approach. Neurosci Biobehav Rev, 2011. 35(3): p. 878-93.
292.  Halliwell, B., Reactive oxygen species and the central nervous system. J Neurochem, 1992. 59(5): p. 1609-23.
293.  Bongarzone, E.R., J.M. Pasquini, and E.F. Soto, Oxidative damage to proteins and lipids of CNS myelin produced by in vitro generated reactive oxygen species. Journal of neuroscience research, 1995. 41(2): p. 213-21.
294.  Davis, K.L., et al., White matter changes in schizophrenia: evidence for myelin-related dysfunction. Arch Gen Psychiatry, 2003. 60(5): p. 443-56.
295.  Cannon, T.D., et al., Regional gray matter, white matter, and cerebrospinal fluid distributions in schizophrenic patients, their siblings, and controls. Arch Gen Psychiatry, 1998. 55(12): p. 1084-91.
296.  Hyde, T.M., J.C. Ziegler, and D.R. Weinberger, Psychiatric disturbances in metachromatic leukodystrophy. Insights into the neurobiology of psychosis. Arch Neurol, 1992. 49(4): p. 401-6.
297.  Uranova, N., et al., Electron microscopy of oligodendroglia in severe mental illness. Brain Res Bull, 2001. 55(5): p. 597-610.
298.  Agostinho, F.R., et al., Effects of chronic haloperidol and/or clozapine on oxidative stress parameters in rat brain. Neurochemical research, 2007. 32(8): p. 1343-50.
299.  Martins, M.R., et al., Antipsychotic-induced oxidative stress in rat brain. Neurotox Res, 2008. 13(1): p. 63-9.
300.  Gilad, Y., S.A. Rifkin, and J.K. Pritchard, Revealing the architecture of gene regulation: the promise of eQTL studies. Trends Genet, 2008. 24(8): p. 408-15.
301.  de Jong, S., et al., Expression QTL analysis of top loci from GWAS meta-analysis highlights additional schizophrenia candidate genes. Eur J Hum Genet, 2012.
302.  Richards, A.L., et al., Schizophrenia susceptibility alleles are enriched for alleles that affect gene expression in adult human brain. Mol Psychiatry, 2012. 17(2): p. 193-201.
303.  Vawter, M.P., F. Mamdani, and F. Macciardi, An integrative functional genomics approach for discovering biomarkers in schizophrenia. Brief Funct Genomics, 2011. 10(6): p. 387-99.
304.  Garbett, K.A., et al., Novel animal models for studying complex brain disorders: BAC-driven miRNA-mediated in vivo silencing of gene expression. Mol Psychiatry, 2010. 15(10): p. 987-95.
305.  Brennand, K.J., et al., Modeling psychiatric disorders at the cellular and network levels. Mol Psychiatry, 2012.
306.  Brennand, K.J., et al., Modelling schizophrenia using human induced pluripotent stem cells. Nature, 2011. 473(7346): p. 221-5.

Appendix

Appendix A: ‘Core’ meta-signature lists for age, brain pH, PMI and sex

Age Down-regulated

GeneSymbol | GeneName | Meta Q-value
AAK1 | AP2 associated kinase 1 | 3.48E-04
ACOT8 | acyl-CoA thioesterase 8 | 5.20E-05
ACP1 | acid phosphatase 1, soluble | 1.09E-04
ACTN2 | actinin, alpha 2 | 7.53E-06
ACTR10 | actin-related protein 10 homolog (S. cerevisiae) | 5.81E-06
ACTR3B | ARP3 actin-related protein 3 homolog B (yeast) | 1.74E-03
ACVR1B | activin A receptor, type IB | 2.35E-06
ADAM23 | ADAM metallopeptidase domain 23 | 1.99E-05
ADAMTSL1 | ADAMTS-like 1 | 4.41E-05
ADCY1 | adenylate cyclase 1 (brain) | 1.20E-07
ADCY2 | adenylate cyclase 2 (brain) | 1.18E-09
ADD2 | adducin 2 (beta) | 6.33E-07
ADRA2A | adrenergic, alpha-2A-, receptor | 1.28E-04
AGMAT | agmatine ureohydrolase (agmatinase) | 8.44E-05
AKT3 | v-akt murine thymoma viral oncogene homolog 3 (protein kinase B, gamma) | 1.44E-08
AL390170 | unknown | 4.98E-08
ALAS1 | aminolevulinate, delta-, synthase 1 | 3.43E-05
AMACR | alpha-methylacyl-CoA racemase | 5.43E-05
ANK2 | ankyrin 2, neuronal | 1.64E-04
ANKRD13C | ankyrin repeat domain 13C | 5.51E-09
AP1M1 | adaptor-related protein complex 1, mu 1 subunit | 2.78E-09
AP1S1 | adaptor-related protein complex 1, sigma 1 subunit | 2.35E-05
AP2M1 | adaptor-related protein complex 2, mu 1 subunit | 1.39E-05
AP2S1 | adaptor-related protein complex 2, sigma 1 subunit | 1.65E-06
APC | adenomatous polyposis coli | 7.15E-07
ARF1 | ADP-ribosylation factor 1 | 1.94E-05
ARF3 | ADP-ribosylation factor 3 | 4.70E-05
ARFIP2 | ADP-ribosylation factor interacting protein 2 | 1.60E-04
ARHGAP20 | Rho GTPase activating protein 20 | 6.85E-05
ARHGEF12 | Rho guanine nucleotide exchange factor (GEF) 12 | 3.41E-05
ARHGEF2 | rho/rac guanine nucleotide exchange factor (GEF) 2 | 1.29E-05
ARHGEF9 | Cdc42 guanine nucleotide exchange factor (GEF) 9 | 1.62E-07
ARL4C | ADP-ribosylation factor-like 4C | 2.05E-04
ARL6 | ADP-ribosylation factor-like 6 | 2.20E-05
ARPC2 | actin related protein 2/3 complex, subunit 2, 34kDa | 1.12E-04
ARPC5 | actin related protein 2/3 complex, subunit 5, 16kDa | 3.45E-04
ARPP-21 | cyclic AMP-regulated phosphoprotein, 21 kD | 2.79E-05
ARRB2 | arrestin, beta 2 | 9.86E-07
ASTN1 | astrotactin 1 | 2.18E-04
ATP2B1 | ATPase, Ca++ transporting, plasma membrane 1 | 3.24E-06
ATP2B2 | ATPase, Ca++ transporting, plasma membrane 2 | 1.04E-04
ATP2C1 | ATPase, Ca++ transporting, type 2C, member 1 | 9.69E-05
ATP5F1 | ATP synthase, H+ transporting, mitochondrial F0 complex, subunit B1 | 8.29E-05
ATP5G3 | ATP synthase, H+ transporting, mitochondrial F0 complex, subunit C3 (subunit 9) | 1.28E-04
ATP6AP2 | ATPase, H+ transporting, lysosomal accessory protein 2 | 4.26E-07
ATP6V0A1 | ATPase, H+ transporting, lysosomal V0 subunit a1 | 7.96E-06
ATP6V0B | ATPase, H+ transporting, lysosomal 21kDa, V0 subunit b | 4.15E-05
ATP6V0D1 | ATPase, H+ transporting, lysosomal 38kDa, V0 subunit d1 | 3.38E-04
ATP6V1A | ATPase, H+ transporting, lysosomal 70kDa, V1 subunit A | 4.24E-05
ATP6V1C1 | ATPase, H+ transporting, lysosomal 42kDa, V1 subunit C1 | 7.44E-10
ATP6V1D | ATPase, H+ transporting, lysosomal 34kDa, V1 subunit D | 2.40E-04
ATP8A2 | ATPase, aminophospholipid transporter-like, class I, type 8A, member 2 | 1.55E-05
ATRNL1 | attractin-like 1 | 2.37E-08
ATXN1 | ataxin 1 | 1.46E-07
ATXN10 | ataxin 10 | 1.77E-05
AZIN1 | antizyme inhibitor 1 | 1.16E-05
B3GALT6 | UDP-Gal:betaGal beta 1,3-galactosyltransferase polypeptide 6 | 4.86E-06
B3GAT1 | beta-1,3-glucuronyltransferase 1 (glucuronosyltransferase P) | 5.23E-04
B3GNT1 | UDP-GlcNAc:betaGal beta-1,3-N-acetylglucosaminyltransferase 1 | 5.43E-04
B3GNT2 | UDP-GlcNAc:betaGal beta-1,3-N-acetylglucosaminyltransferase 2 | 5.53E-05
B4GALT6 | UDP-Gal:betaGlcNAc beta 1,4-galactosyltransferase, polypeptide 6 | 8.34E-05
BAD | BCL2-associated agonist of cell death | 3.55E-04
BASP1 | brain abundant, membrane attached signal protein 1 | 2.30E-07
BBS7 | Bardet-Biedl syndrome 7 | 8.38E-05
BCAP29 | B-cell receptor-associated protein 29 | 1.88E-06
BCKDK | branched chain ketoacid dehydrogenase kinase | 2.24E-06
BCL11A | B-cell CLL/lymphoma 11A (zinc finger protein) | 1.29E-05
BDNF | brain-derived neurotrophic factor | 5.50E-04
BECN1 | beclin 1, autophagy related | 2.14E-06
BEX1 | brain expressed, X-linked 1 | 2.97E-07
BRAP | BRCA1 associated protein | 1.07E-04
BRF1 | BRF1 homolog, subunit of RNA polymerase III transcription initiation factor IIIB (S. cerevisiae) | 5.43E-06
BRF2 | BRF2, subunit of RNA polymerase III transcription initiation factor, BRF1-like | 4.37E-06
BRP44 | brain protein 44 | 2.77E-04
C10orf46 | chromosome 10 open reading frame 46 | 4.59E-04
C11orf41 | chromosome 11 open reading frame 41 | 2.04E-07
C12orf44 | chromosome 12 open reading frame 44 | 8.89E-05
C14orf138 | chromosome 14 open reading frame 138 | 1.48E-05
C16orf42 | chromosome 16 open reading frame 42 | 5.58E-06
C17orf76 | chromosome 17 open reading frame 76 | 3.21E-05
C18orf1 | chromosome 18 open reading frame 1 | 1.59E-05
C18orf10 | chromosome 18 open reading frame 10 | 1.32E-05
C19orf10 | chromosome 19 open reading frame 10 | 2.18E-04
C19orf42 | chromosome 19 open reading frame 42 | 1.49E-05
C1orf31 | chromosome 1 open reading frame 31 | 1.60E-10
C1orf59 | chromosome 1 open reading frame 59 | 7.17E-09
C1orf95 | chromosome 1 open reading frame 95 | 3.93E-04
C1orf96 | chromosome 1 open reading frame 96 | 1.71E-06
C1QTNF4 | C1q and tumor necrosis factor related protein 4 | 3.21E-07
C20orf112 | chromosome 20 open reading frame 112 | 5.20E-05
C5orf13 | chromosome 5 open reading frame 13 | 4.29E-05
C5orf30 | chromosome 5 open reading frame 30 | 2.78E-09
C6orf106 | chromosome 6 open reading frame 106 | 1.00E-05
C6orf153 | chromosome 6 open reading frame 153 | 1.29E-05
C6orf154 | chromosome 6 open reading frame 154 | 4.43E-05
C6orf206 | chromosome 6 open reading frame 206 | 1.10E-04
C7orf44 | chromosome 7 open reading frame 44 | 2.88E-05
C8orf46 | chromosome 8 open reading frame 46 | 2.44E-06
C9orf127 | chromosome 9 open reading frame 127 | 1.08E-03
C9orf16 | chromosome 9 open reading frame 16 | 7.24E-10
CA10 | carbonic anhydrase X | 8.96E-09
CA11 | carbonic anhydrase XI | 3.15E-06
CABP1 | calcium binding protein 1 | 3.54E-04
CACNB1 | calcium channel, voltage-dependent, beta 1 subunit | 4.71E-06
CACNB2 | calcium channel, voltage-dependent, beta 2 subunit | 2.06E-05
CACNB3 | calcium channel, voltage-dependent, beta 3 subunit | 1.29E-06
CACNG3 | calcium channel, voltage-dependent, gamma subunit 3 | 4.18E-04
CALB1 | calbindin 1, 28kDa | 1.02E-06
CALB2 | calbindin 2 | 1.40E-05
CALM1 | calmodulin 1 (phosphorylase kinase, delta) | 3.78E-04
CALM3 | calmodulin 3 (phosphorylase kinase, delta) | 4.30E-05
CALU | calumenin | 2.35E-06
CAMK1 | calcium/calmodulin-dependent protein kinase I | 1.31E-07
CAMK2B | calcium/calmodulin-dependent protein kinase II beta | 1.08E-04
CAMK4 | calcium/calmodulin-dependent protein kinase IV | 1.47E-04
CAMKK2 | calcium/calmodulin-dependent protein kinase kinase 2, beta | 5.47E-08
CAMSAP1 | calmodulin regulated spectrin-associated protein 1 | 2.27E-04
CAP2 | CAP, adenylate cyclase-associated protein, 2 (yeast) | 1.07E-03
CAPRIN1 | cell cycle associated protein 1 | 1.36E-04
CAPZA2 | capping protein (actin filament) muscle Z-line, alpha 2 | 3.27E-05
CARS | cysteinyl-tRNA synthetase | 3.02E-04
CASK | calcium/calmodulin-dependent serine protein kinase (MAGUK family) | 9.59E-08
CBLN4 | cerebellin 4 precursor | 4.51E-06
CBX6 | chromobox homolog 6 | 1.14E-03
CCDC85A | coiled-coil domain containing 85A | 3.07E-06
CCDC85B | coiled-coil domain containing 85B | 1.57E-04
CCK | cholecystokinin | 1.63E-04
CCKBR | cholecystokinin B receptor | 1.44E-04
CCM2 | cerebral cavernous malformation 2 | 2.05E-05
CCNA1 | cyclin A1 | 2.20E-06
CCNC | cyclin C | 1.69E-04
CCND2 | cyclin D2 | 4.15E-05
CCNG2 | cyclin G2 | 3.27E-05
CCNY | cyclin Y | 2.75E-06
CCRK | cell cycle related kinase | 1.09E-04
CD200 | CD200 molecule | 5.43E-06
CD47 | CD47 molecule | 1.65E-04
CDC34 | cell division cycle 34 homolog (S. cerevisiae) | 7.42E-06
CDC40 | cell division cycle 40 homolog (S. cerevisiae) | 1.46E-07
CDC42 | cell division cycle 42 (GTP binding protein, 25kDa) | 6.87E-05
CDC42EP3 | CDC42 effector protein (Rho GTPase binding) 3 | 2.09E-04
CDH12 | cadherin 12, type 2 (N-cadherin 2) | 6.24E-05
CDH13 | cadherin 13, H-cadherin (heart) | 1.08E-04
CDH8 | cadherin 8, type 2 | 1.22E-07
CDK5 | cyclin-dependent kinase 5 | 2.59E-04
CDK5R1 | cyclin-dependent kinase 5, regulatory subunit 1 (p35) | 7.58E-06
CDK5R2 | cyclin-dependent kinase 5, regulatory subunit 2 (p39) | 8.07E-07
CDKL5 | cyclin-dependent kinase-like 5 | 1.93E-04
CDKN2D | cyclin-dependent kinase inhibitor 2D (p19, inhibits CDK4) | 4.87E-06
CDKN3 | cyclin-dependent kinase inhibitor 3 | 1.09E-06
CDV3 | CDV3 homolog (mouse) | 4.98E-08
CECR6 | cat eye syndrome chromosome region, candidate 6 | 3.96E-05
CHGB | chromogranin B (secretogranin 1) | 5.99E-06
CHRM3 | cholinergic receptor, muscarinic 3 | 1.61E-04
CHST2 | carbohydrate (N-acetylglucosamine-6-O) sulfotransferase 2 | 6.75E-06
CINP | cyclin-dependent kinase 2-interacting protein | 3.41E-09
CKAP4 | cytoskeleton-associated protein 4 | 4.49E-05
CLCN4 | chloride channel 4 | 4.04E-07
CLTA | clathrin, light chain (Lca) | 8.29E-05
CLTB | clathrin, light chain (Lcb) | 1.28E-03
CLYBL | citrate lyase beta like | 3.20E-05
CNIH2 | cornichon homolog 2 (Drosophila) | 4.04E-06
CNKSR2 | connector enhancer of kinase suppressor of Ras 2 | 2.62E-05
CNR1 | cannabinoid receptor 1 (brain) | 6.04E-04
COMMD7 | COMM domain containing 7 | 2.08E-05
COPS4 | COP9 constitutive photomorphogenic homolog subunit 4 (Arabidopsis) | 8.85E-06
COPS7A | COP9 constitutive photomorphogenic homolog subunit 7A (Arabidopsis) | 4.36E-05
COPS8 | COP9 constitutive photomorphogenic homolog subunit 8 (Arabidopsis) | 4.31E-05
CPLX3 | complexin 3 | 9.84E-05
CRH | corticotropin releasing hormone | 3.76E-04
CRHBP | corticotropin releasing hormone binding protein | 2.14E-06
CRIP2 | cysteine-rich protein 2 | 1.17E-04
CRK | v-crk sarcoma virus CT10 oncogene homolog (avian) | 3.33E-04
CRMP1 | collapsin response mediator protein 1 | 3.35E-05
CRYM | crystallin, mu | 4.71E-05
CSNK2A1 | casein kinase 2, alpha 1 polypeptide | 3.63E-04
CTSB | cathepsin B | 5.95E-08
CUGBP1 | CUG triplet repeat, RNA binding protein 1 | 3.22E-04
CUGBP2 | CUG triplet repeat, RNA binding protein 2 | 1.99E-04
CUL2 | cullin 2 | 2.23E-04
CX3CL1 | chemokine (C-X3-C motif) ligand 1 | 1.45E-05
CXADR | coxsackie virus and adenovirus receptor | 6.13E-05
CYB561D1 | cytochrome b-561 domain containing 1 | 2.34E-04
CYC1 | cytochrome c-1 | 3.87E-04
CYCS | cytochrome c, somatic | 1.61E-04
CYP26B1 | cytochrome P450, family 26, subfamily B, polypeptide 1 | 2.14E-04
DACT3 | dapper, antagonist of beta-catenin, homolog 3 (Xenopus laevis) | 4.51E-06
DBC1 | deleted in bladder cancer 1 | 6.31E-09
DCBLD1 | discoidin, CUB and LCCL domain containing 1 | 1.55E-04
DCTN3 | dynactin 3 (p22) | 1.30E-05
DCUN1D5 | DCN1, defective in cullin neddylation 1, domain containing 5 (S. cerevisiae) | 2.45E-05
DCX | doublecortin | 1.56E-04
DDN | dendrin | 8.43E-05
DDOST | dolichyl-diphosphooligosaccharide-protein glycosyltransferase | 1.20E-05
DDT | D-dopachrome tautomerase | 5.47E-05
DENR | density-regulated protein | 2.64E-05
DGKB | diacylglycerol kinase, beta 90kDa | 2.97E-07
DGKI | diacylglycerol kinase, iota | 6.89E-05
DGUOK | deoxyguanosine kinase | 1.79E-04
DHPS | deoxyhypusine synthase | 2.29E-06
DIRAS1 | DIRAS family, GTP-binding RAS-like 1 | 2.96E-04
DKFZP564O0823 | DKFZP564O0823 protein | 3.90E-04
DLAT | dihydrolipoamide S-acetyltransferase | 2.20E-06
DLG1 | discs, large homolog 1 (Drosophila) | 8.56E-07
DLG2 | discs, large homolog 2 (Drosophila) | 1.81E-05
DLG3 | discs, large homolog 3 (Drosophila) | 7.65E-08
DLG4 | discs, large homolog 4 (Drosophila) | 1.36E-06
DLGAP1 | discs, large (Drosophila) homolog-associated protein 1 | 6.65E-05
DLGAP2 | discs, large (Drosophila) homolog-associated protein 2 | 5.95E-08
DLX1 | distal-less homeobox 1 | 4.04E-06
DLX2 | distal-less homeobox 2 | 1.98E-06
DMXL2 | Dmx-like 2 | 1.05E-05
DNAJA1 | DnaJ (Hsp40) homolog, subfamily A, member 1 | 9.45E-06
DNM1L | dynamin 1-like | 2.93E-05
DOK5 | docking protein 5 | 2.58E-06
DOK6 | docking protein 6 | 4.28E-06
DPF1 | D4, zinc and double PHD fingers family 1 | 2.21E-05
DPP6 | dipeptidyl-peptidase 6 | 3.92E-04
DPYSL4 | dihydropyrimidinase-like 4 | 5.19E-04
DR1 | down-regulator of transcription 1, TBP-binding (negative cofactor 2) | 1.90E-04
DUSP14 | dual specificity phosphatase 14 | 7.56E-07
DUSP3 | dual specificity phosphatase 3 | 7.65E-08
DYNC1I1 | dynein, cytoplasmic 1, intermediate chain 1 | 1.51E-05
DYNC1LI1 | dynein, cytoplasmic 1, light intermediate chain 1 | 1.13E-04
DYRK2 | dual-specificity tyrosine-(Y)-phosphorylation regulated kinase 2 | 2.62E-04
ECHDC1 | enoyl Coenzyme A hydratase domain containing 1 | 7.05E-07
EDF1 | endothelial differentiation-related factor 1 | 2.64E-08
EEF1A2 | eukaryotic translation elongation factor 1 alpha 2 | 1.52E-07
EFNB3 | ephrin-B3 | 1.98E-04
EGR3 | early growth response 3 | 1.12E-05
EHBP1 | EH domain binding protein 1 | 9.61E-05
EIF2S1 | eukaryotic translation initiation factor 2, subunit 1 alpha, 35kDa | 3.81E-05
EIF4E2 | eukaryotic translation initiation factor 4E family member 2 | 2.57E-04
ELAVL1 | ELAV (embryonic lethal, abnormal vision, Drosophila)-like 1 (Hu antigen R) | 1.68E-05
ELOVL6 | ELOVL family member 6, elongation of long chain fatty acids (FEN1/Elo2, SUR4/Elo3-like, yeast) | 4.49E-05
ENC1 | ectodermal-neural cortex (with BTB-like domain) | 6.81E-05
ENSA | endosulfine alpha | 3.56E-05
EPHA4 | EPH receptor A4 | 1.70E-06
EPHA5 | EPH receptor A5 | 4.78E-05
EPHB2 | EPH receptor B2 | 3.83E-04
EPS15 | epidermal growth factor receptor pathway substrate 15 | 8.89E-06
ERC2 | ELKS/RAB6-interacting/CAST family member 2 | 2.04E-05
ERCC1 | excision repair cross-complementing rodent repair deficiency, complementation group 1 (includes overlapping antisense sequence) | 4.19E-05
ETV1 | ets variant 1 | 2.41E-06
EXOC6 | exocyst complex component 6 | 4.57E-04
EXOSC2 | exosome component 2 | 4.29E-05
EXOSC4 | exosome component 4 | 2.37E-04
EXPH5 | exophilin 5 | 2.64E-08
EXT1 | exostoses (multiple) 1 | 3.41E-04
FABP3 | fatty acid binding protein 3, muscle and heart (mammary-derived growth inhibitor) | 5.96E-05
FAM110B | family with sequence similarity 110, member B | 1.69E-04
FAM131A | family with sequence similarity 131, member A | 1.75E-04
FAM32A | family with sequence similarity 32, member A | 1.67E-04
FAM49A | family with sequence similarity 49, member A | 6.22E-07
FAM5B | family with sequence similarity 5, member B | 5.95E-05
FAM5C | family with sequence similarity 5, member C | 5.40E-04
FAM83H | family with sequence similarity 83, member H | 9.19E-06
FARSA | phenylalanyl-tRNA synthetase, alpha subunit | 4.37E-05
FBXL16 | F-box and leucine-rich repeat protein 16 | 2.75E-06
FBXL2 | F-box and leucine-rich repeat protein 2 | 2.07E-07
FBXO31 | F-box protein 31 | 4.89E-06
FBXW11 | F-box and WD repeat domain containing 11 | 1.03E-04
FBXW2 | F-box and WD repeat domain containing 2 | 1.12E-04
FBXW5 | F-box and WD repeat domain containing 5 | 5.60E-07
FBXW7 | F-box and WD repeat domain containing 7 | 2.35E-04
FGF12 | fibroblast growth factor 12 | 1.01E-05
FGF13 | fibroblast growth factor 13 | 1.33E-04
FJX1 | four jointed box 1 (Drosophila) | 2.41E-06
FKBP11 | FK506 binding protein 11, 19 kDa | 9.33E-05
FKBP1B | FK506 binding protein 1B, 12.6 kDa | 1.40E-03
FLJ11506 | alpha- and gamma-adaptin-binding protein p34 | 5.64E-04
FLJ22536 | hypothetical locus LOC401237 | 1.17E-06
FLJ25076 | similar to CG4502-PA | 1.70E-06
FRMPD4 | FERM and PDZ domain containing 4 | 2.13E-04
GABBR2 | gamma-aminobutyric acid (GABA) B receptor, 2 | 2.79E-05
GABRA1 | gamma-aminobutyric acid (GABA) A receptor, alpha 1 | 2.48E-06
GAD1 | glutamate decarboxylase 1 (brain, 67kDa) | 7.83E-07
GAD2 | glutamate decarboxylase 2 (pancreatic islets and brain, 65kDa) | 3.48E-08
GALNT2 | UDP-N-acetyl-alpha-D-galactosamine:polypeptide N-acetylgalactosaminyltransferase 2 (GalNAc-T2) | 2.67E-04
GAP43 | growth associated protein 43 | 1.46E-04
GARS | glycyl-tRNA synthetase | 7.51E-05
GFPT1 | glutamine-fructose-6-phosphate transaminase 1 | 1.65E-08
GFRA2 | GDNF family receptor alpha 2 | 1.27E-06
GHITM | growth hormone inducible transmembrane protein | 9.65E-06
GLRB | glycine receptor, beta | 4.14E-07
GNAO1 | guanine nucleotide binding protein (G protein), alpha activating activity polypeptide O | 4.14E-07
GNAZ | guanine nucleotide binding protein (G protein), alpha z polypeptide | 5.94E-07
GNB1 | guanine nucleotide binding protein (G protein), beta polypeptide 1 | 5.04E-05
GNB5 | guanine nucleotide binding protein (G protein), beta 5 | 2.32E-06
GNG3 | guanine nucleotide binding protein (G protein), gamma 3 | 1.77E-04
GOSR2 | golgi SNAP receptor complex member 2 | 7.52E-09
GPHN | gephyrin | 5.67E-07
GPI | glucose phosphate isomerase | 3.58E-04
GPM6A | glycoprotein M6A | 6.49E-05
GPR26 | G protein-coupled receptor 26 | 6.98E-05
GPR6 | G protein-coupled receptor 6 | 3.42E-08
GRB2 | growth factor receptor-bound protein 2 | 4.32E-06
GRIA1 | glutamate receptor, ionotropic, AMPA 1 | 4.94E-05
GRIK2 | glutamate receptor, ionotropic, kainate 2 | 3.80E-04
GRIK5 | glutamate receptor, ionotropic, kainate 5 | 1.20E-05
GRLF1 | glucocorticoid receptor DNA binding factor 1 | 9.01E-05
GRM7 | glutamate receptor, metabotropic 7 | 2.83E-09
GRPEL1 | GrpE-like 1, mitochondrial (E. coli) | 6.72E-07
GSPT1 | G1 to S phase transition 1 | 6.02E-06
GTF2F1 | general transcription factor IIF, polypeptide 1, 74kDa | 9.98E-05
GUCY1B3 | guanylate cyclase 1, soluble, beta 3 | 3.00E-05
GUF1 | GUF1 GTPase homolog (S. cerevisiae) | 2.71E-05
GULP1 | GULP, engulfment adaptor PTB domain containing 1 | 2.42E-05
HAGH | hydroxyacylglutathione hydrolase | 2.41E-04
HCCS | holocytochrome c synthase (cytochrome c heme-lyase) | 1.52E-06
HDGFRP3 | hepatoma-derived growth factor, related protein 3 | 1.21E-04
HIST1H2BK | histone cluster 1, H2bk | 2.50E-04
HIVEP2 | human immunodeficiency virus type I enhancer binding protein 2 | 3.05E-06
HLF | hepatic leukemia factor | 3.48E-08
HMGCLL1 | 3-hydroxymethyl-3-methylglutaryl-Coenzyme A lyase-like 1 | 9.17E-08
HMGCR | 3-hydroxy-3-methylglutaryl-Coenzyme A reductase | 3.57E-08
HMOX2 | heme oxygenase (decycling) 2 | 7.20E-04
HMP19 | HMP19 protein | 4.08E-04
HPCA | hippocalcin | 8.14E-08
HPCAL4 | hippocalcin like 4 | 1.42E-04
HRAS | v-Ha-ras Harvey rat sarcoma viral oncogene homolog | 1.27E-07
HRASLS | HRAS-like suppressor | 1.98E-06
HS2ST1 | heparan sulfate 2-O-sulfotransferase 1 | 8.87E-08
HS6ST3 | heparan sulfate 6-O-sulfotransferase 3 | 1.30E-04
HSBP1 | heat shock factor binding protein 1 | 3.15E-06
HSD17B12 | hydroxysteroid (17-beta) dehydrogenase 12 | 1.25E-09
HSPA12A | heat shock 70kDa protein 12A | 5.19E-07
HSPA4 | heat shock 70kDa protein 4 | 1.23E-06
HTR2A | 5-hydroxytryptamine (serotonin) receptor 2A | 2.60E-08
HTR2C | 5-hydroxytryptamine (serotonin) receptor 2C | 1.70E-06
IDH3B | isocitrate dehydrogenase 3 (NAD+) beta | 1.85E-04
IDS | iduronate 2-sulfatase | 2.98E-06
IGF1 | insulin-like growth factor 1 (somatomedin C) | 2.50E-04
ILF3 | interleukin enhancer binding factor 3, 90kDa | 1.53E-04
IMP4 | IMP4, U3 small nucleolar ribonucleoprotein, homolog (yeast) | 9.61E-09
INPP4A | inositol polyphosphate-4-phosphatase, type I, 107kDa | 8.99E-08
IQSEC1 | IQ motif and Sec7 domain 1 | 1.52E-04
ITPKA | inositol 1,4,5-trisphosphate 3-kinase A | 5.47E-05
JTB | jumping translocation breakpoint | 1.67E-05
KALRN | kalirin, RhoGEF kinase | 9.05E-05
KATNB1 | katanin p80 (WD repeat containing) subunit B 1 | 5.00E-06
KCNAB1 | potassium voltage-gated channel, shaker-related subfamily, beta member 1 | 4.38E-07
KCNF1 | potassium voltage-gated channel, subfamily F, member 1 | 6.87E-05
KCNIP1 | Kv channel interacting protein 1 | 7.41E-07
KCNIP3 | Kv channel interacting protein 3, calsenilin | 1.97E-04
KCNJ3 | potassium inwardly-rectifying channel, subfamily J, member 3 | 1.96E-04
KCNJ6 | potassium inwardly-rectifying channel, subfamily J, member 6 | 2.18E-04
KCNJ9 | potassium inwardly-rectifying channel, subfamily J, member 9 | 5.67E-06
KCNK1 | potassium channel, subfamily K, member 1 | 3.40E-06
KCNK3 | potassium channel, subfamily K, member 3 | 1.66E-05
KCNQ2 | potassium voltage-gated channel, KQT-like subfamily, member 2 | 1.77E-04
KIAA0090 | KIAA0090 | 3.15E-06
KIAA0317 | KIAA0317 | 2.45E-05
KIAA1045 | KIAA1045 | 1.01E-06
KIAA1468 | KIAA1468 | 3.65E-04
KIAA1549 | KIAA1549 | 5.64E-06
KIF2A | kinesin heavy chain member 2A | 8.41E-07
KIF3B | kinesin family member 3B | 3.23E-05
KIF3C | kinesin family member 3C | 9.82E-06
KIFAP3 | kinesin-associated protein 3 | 8.99E-08
KITLG | KIT ligand | 1.02E-04
KLF16 | Kruppel-like factor 16 | 3.56E-04
KLHDC5 | kelch domain containing 5 | 1.98E-04
KPNA1 | karyopherin alpha 1 (importin alpha 5) | 1.39E-05
KPNA6 | karyopherin alpha 6 (importin alpha 7) | 1.12E-03
KRAS | v-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog | 2.59E-06
LAGE3 | L antigen family, member 3 | 3.23E-04
LANCL1 | LanC lantibiotic synthetase component C-like 1 (bacterial) | 7.67E-05
LANCL2 | LanC lantibiotic synthetase component C-like 2 (bacterial) | 1.38E-05
LARGE | like-glycosyltransferase | 1.59E-06
LARP5 | La ribonucleoprotein domain family, member 5 | 5.03E-06
LDOC1 | leucine zipper, down-regulated in cancer 1 | 4.01E-06
LDOC1L | leucine zipper, down-regulated in cancer 1-like | 9.98E-05
LINGO1 | leucine rich repeat and Ig domain containing 1 | 7.66E-06
LINGO2 | leucine rich repeat and Ig domain containing 2 | 8.11E-06
LMO4 | LIM domain only 4 | 1.53E-06
LOC150568 | hypothetical LOC150568 | 1.07E-04
LOC283951 | hypothetical protein LOC283951 | 2.19E-04
LOC552889 | hypothetical protein LOC552889 | 9.32E-06
LPPR4 | plasticity related gene 1 | 8.44E-08
LRPAP1 | low density lipoprotein receptor-related protein associated protein 1 | 1.70E-06
LRRC20 | leucine rich repeat containing 20 | 2.55E-07
LRRC7 | leucine rich repeat containing 7 | 1.16E-06
LRRC8B | leucine rich repeat containing 8 family, member B | 6.63E-04
LRRTM1 | leucine rich repeat transmembrane neuronal 1 | 2.20E-04
LSM4 | LSM4 homolog, U6 small nuclear RNA associated (S. cerevisiae) | 4.39E-07
LY6E | lymphocyte antigen 6 complex, locus E | 1.24E-04
LZTS1 | leucine zipper, putative tumor suppressor 1 | 8.06E-05
MAD2L1BP | MAD2L1 binding protein | 2.42E-06
MAGED1 | melanoma antigen family D, 1 | 4.52E-05
MAL2 | mal, T-cell differentiation protein 2 | 3.27E-04
MAN1A1 | mannosidase, alpha, class 1A, member 1 | 1.65E-09
MANEAL | mannosidase, endo-alpha-like | 2.93E-05
MAP1LC3A | microtubule-associated protein 1 light chain 3 alpha | 5.12E-04
MAP2 | microtubule-associated protein 2 | 3.04E-04
MAP3K13 | mitogen-activated protein kinase kinase kinase 13 | 5.23E-06
MAPK1 | mitogen-activated protein kinase 1 | 1.66E-05
MAPK10 | mitogen-activated protein kinase 10 | 6.65E-07
MAPK11 | mitogen-activated protein kinase 11 | 5.53E-06
MAPK14 | mitogen-activated protein kinase 14 | 2.27E-05
MAPK8 | mitogen-activated protein kinase 8 | 2.48E-04
MAPK9 | mitogen-activated protein kinase 9 | 2.59E-06
MAPKAP1 | mitogen-activated protein kinase associated protein 1 | 1.47E-05

GeneSymbol  GeneName  MAPT MARCH4 MAST3 MCAT MCHR1 MCOLN3 MDH2 MED6 MEF2A MEF2C MEF2D MFSD3 MFSD4  microtubule-associated protein tau membrane-associated ring finger (C3HC4) 4 microtubule associated serine/threonine kinase 3 malonyl CoA:ACP acyltransferase (mitochondrial) melanin-concentrating hormone receptor 1 mucolipin 3 malate dehydrogenase 2, NAD (mitochondrial) mediator complex subunit 6 myocyte enhancer factor 2A myocyte enhancer factor 2C myocyte enhancer factor 2D major facilitator superfamily domain containing 3 major facilitator superfamily domain containing 4 myeloid/lymphoid or mixed-lineage leukemia (trithorax homolog, Drosophila); translocated to, 11 monocyte to macrophage differentiation-associated modulator of 
apoptosis 1 metallophosphoesterase domain containing 1 mitochondrial ribosomal protein L14 mitochondrial ribosomal protein L28 mitochondrial ribosomal protein L33 mitochondrial ribosomal protein L9 mitochondrial ribosomal protein S12 mutS homolog 2, colon cancer, nonpolyposis type 1 (E. coli) myotubularin related protein 9 metaxin 2 myosin, heavy chain 10, non-muscle myosin XVI myosin VA (heavy chain 12, myoxin) N-ethylmaleimide-sensitive factor attachment protein, alpha N-acetyltransferase 14 (GCN5-related, putative) neuron navigator 3 neuroblastoma, suppression of tumorigenicity 1 neurocalcin delta neural cell adhesion molecule 1 Nedd4 family interacting protein 1 necdin homolog (mouse) NDRG family member 3 NADH dehydrogenase (ubiquinone) Fe-S protein 2, 49kDa (NADH-coenzyme Q reductase) NADH dehydrogenase (ubiquinone) Fe-S protein 4, 18kDa (NADH-coenzyme Q reductase) NADH dehydrogenase (ubiquinone) flavoprotein 1, 51kDa neurofilament, light polypeptide neuronal growth regulator 1 NEL-like 2 (chicken) neuropilin (NRP) and tolloid (TLL)-like 2 nuclear factor of kappa light polypeptide gene enhancer in Bcells inhibitor, epsilon NHP2 non-histone chromosome protein 2-like 1 (S. 
cerevisiae)  MLLT11 MMD MOAP1 MPPED1 MRPL14 MRPL28 MRPL33 MRPL9 MRPS12 MSH2 MTMR9 MTX2 MYH10 MYO16 MYO5A NAPA NAT14 NAV3 NBL1 NCALD NCAM1 NDFIP1 NDN NDRG3 NDUFS2 NDUFS4 NDUFV1 NEFL NEGR1 NELL2 NETO2 NFKBIE NHP2L1  Meta Q-value 6.70E-04 3.43E-04 3.84E-04 6.35E-06 2.55E-05 8.69E-05 2.05E-05 4.31E-05 1.75E-04 1.63E-08 1.02E-07 4.09E-07 1.71E-06 1.62E-04 1.78E-04 8.12E-04 3.85E-05 7.44E-10 1.06E-05 7.81E-05 1.80E-04 1.02E-07 2.38E-04 1.15E-08 6.81E-05 4.29E-07 1.80E-04 5.10E-04 1.75E-04 3.13E-06 6.61E-05 3.23E-04 1.93E-05 2.62E-04 2.52E-04 4.65E-05 3.49E-05 2.64E-08 4.19E-04 3.96E-05 3.92E-08 8.34E-05 6.86E-11 6.94E-04 2.61E-07 2.02E-07  159  GeneSymbol  GeneName  NIPSNAP1 NLGN4X NLK NMU NNAT NOVA1 NPM3 NPTX2 NPTXR NRCAM NRG1 NRGN NRN1 NRSN1 NSF NUDCD1 NUDT21 NXPH1 OCIAD1 OLFM1 OPCML OPTN ORC5L OSBPL1A OTUB1 OTUB2 OXCT1 P2RX5 PACSIN1 PAK1 PAK6 PANK2 PAP2D PARK2 PARP2 PART1 PCDH7 PCLO PCMT1 PCTK1 PCTK2 PDCD2 PDE2A PDIA6 PDK3 PENK PFN2 PGBD5  nipsnap homolog 1 (C. elegans) neuroligin 4, X-linked nemo-like kinase neuromedin U neuronatin neuro-oncological ventral antigen 1 nucleophosmin/nucleoplasmin, 3 neuronal pentraxin II neuronal pentraxin receptor neuronal cell adhesion molecule neuregulin 1 neurogranin (protein kinase C substrate, RC3) neuritin 1 neurensin 1 N-ethylmaleimide-sensitive factor NudC domain containing 1 nudix (nucleoside diphosphate linked moiety X)-type motif 21 neurexophilin 1 OCIA domain containing 1 olfactomedin 1 opioid binding protein/cell adhesion molecule-like optineurin origin recognition complex, subunit 5-like (yeast) oxysterol binding protein-like 1A OTU domain, ubiquitin aldehyde binding 1 OTU domain, ubiquitin aldehyde binding 2 3-oxoacid CoA transferase 1 purinergic receptor P2X, ligand-gated ion channel, 5 protein kinase C and casein kinase substrate in neurons 1 p21 protein (Cdc42/Rac)-activated kinase 1 p21 protein (Cdc42/Rac)-activated kinase 6 pantothenate kinase 2 phosphatidic acid phosphatase type 2 Parkinson disease (autosomal 
recessive, juvenile) 2, parkin poly (ADP-ribose) polymerase 2 prostate androgen-regulated transcript 1 protocadherin 7 piccolo (presynaptic cytomatrix protein) protein-L-isoaspartate (D-aspartate) O-methyltransferase PCTAIRE protein kinase 1 PCTAIRE protein kinase 2 programmed cell death 2 phosphodiesterase 2A, cGMP-stimulated protein disulfide isomerase family A, member 6 pyruvate dehydrogenase kinase, isozyme 3 proenkephalin profilin 2 piggyBac transposable element derived 5  Meta Q-value 5.29E-04 1.64E-04 3.74E-07 1.15E-08 5.99E-06 2.04E-05 2.46E-04 1.13E-07 3.42E-06 2.14E-06 3.24E-06 4.18E-04 1.00E-04 2.38E-04 5.19E-05 4.37E-08 8.01E-06 3.93E-04 3.69E-05 7.98E-06 8.68E-07 1.94E-05 1.82E-06 2.91E-05 2.38E-04 1.92E-04 8.44E-05 1.02E-07 1.02E-07 2.66E-06 3.42E-04 5.03E-06 1.15E-07 1.75E-04 3.80E-05 6.56E-05 4.96E-09 6.05E-05 1.37E-03 1.98E-04 6.43E-05 2.12E-04 3.05E-06 5.33E-04 1.02E-04 9.06E-05 6.19E-05 5.33E-04  160  GeneSymbol  GeneName  PGM2L1 PHF14 PHF20L1 PHTF2 PIAS2 PIN1 PINK1 PIP5K1B PITPNA PITPNB PKIG PKP4 PLCB1 PLCL2 PLD3  phosphoglucomutase 2-like 1 PHD finger protein 14 PHD finger protein 20-like 1 putative homeodomain transcription factor 2 protein inhibitor of activated STAT, 2 peptidylprolyl cis/trans isomerase, NIMA-interacting 1 PTEN induced putative kinase 1 phosphatidylinositol-4-phosphate 5-kinase, type I, beta phosphatidylinositol transfer protein, alpha phosphatidylinositol transfer protein, beta protein kinase (cAMP-dependent, catalytic) inhibitor gamma plakophilin 4 phospholipase C, beta 1 (phosphoinositide-specific) phospholipase C-like 2 phospholipase D family, member 3 pleckstrin homology domain containing, family B (evectins) member 2 polo-like kinase 2 (Drosophila) phosphomannomutase 1 paraneoplastic antigen MA1 prepronociceptin polymerase (DNA directed), beta polymerase (RNA) I polypeptide D, 16kDa processing of precursor 4, ribonuclease P/MRP subunit (S. 
cerevisiae) protein phosphatase, EF-hand calcium binding domain 1 protein tyrosine phosphatase, receptor type, f polypeptide (PTPRF), interacting protein (liprin), alpha 2 peptidylprolyl isomerase E (cyclophilin E) protein phosphatase 1E (PP2C domain containing) protein phosphatase 1, regulatory (inhibitor) subunit 14C protein phosphatase 1, regulatory (inhibitor) subunit 1A protein phosphatase 1, regulatory (inhibitor) subunit 7 protein phosphatase 2 (formerly 2A), regulatory subunit A, alpha isoform protein phosphatase 2 (formerly 2A), regulatory subunit B, gamma isoform protein phosphatase 2, regulatory subunit B', gamma isoform protein phosphatase 3 (formerly 2B), catalytic subunit, beta isoform protein phosphatase 5, catalytic subunit PRA1 domain family, member 2 prolyl endopeptidase prickle homolog 2 (Drosophila) protein kinase, cAMP-dependent, regulatory, type I, alpha (tissue specific extinguisher 1) protein kinase C, delta protein kinase C, gamma protein kinase C, zeta protein arginine methyltransferase 6 phosphoribosyl pyrophosphate synthetase 2  PLEKHB2 PLK2 PMM1 PNMA1 PNOC POLB POLR1D POP4 PPEF1 PPFIA2 PPIE PPM1E PPP1R14C PPP1R1A PPP1R7 PPP2R1A PPP2R2C PPP2R5C PPP3CB PPP5C PRAF2 PREP PRICKLE2 PRKAR1A PRKCD PRKCG PRKCZ PRMT6 PRPS2  Meta Q-value 8.85E-04 2.29E-05 9.57E-06 2.51E-06 1.53E-05 3.86E-04 7.59E-05 3.41E-09 1.46E-04 4.61E-05 1.17E-06 3.31E-05 1.95E-05 1.82E-06 6.55E-06 2.50E-04 2.15E-09 5.37E-11 4.31E-05 3.61E-06 8.80E-06 3.91E-09 2.78E-04 8.62E-06 7.04E-04 2.16E-08 1.27E-04 6.24E-06 5.96E-05 7.96E-06 7.85E-09 2.29E-04 1.72E-06 3.74E-10 7.47E-06 2.59E-04 1.37E-05 1.73E-07 4.22E-07 2.91E-05 4.70E-04 3.24E-09 4.86E-06 6.27E-06  161  GeneSymbol  GeneName  PRR7 PSMA1 PSMB2 PSMB5 PSMB6 PSMB7  proline rich 7 (synaptic) proteasome (prosome, macropain) subunit, alpha type, 1 proteasome (prosome, macropain) subunit, beta type, 2 proteasome (prosome, macropain) subunit, beta type, 5 proteasome (prosome, macropain) subunit, beta type, 6 proteasome 
(prosome, macropain) subunit, beta type, 7 proteasome (prosome, macropain) 26S subunit, non-ATPase, 1 proteasome (prosome, macropain) 26S subunit, non-ATPase, 13 proteasome (prosome, macropain) 26S subunit, non-ATPase, 7 proteasome (prosome, macropain) 26S subunit, non-ATPase, 8 proteasome (prosome, macropain) activator subunit 3 (PA28 gamma; Ki) parathyroid hormone-like hormone protein tyrosine phosphatase type IVA, member 1 protein tyrosine phosphatase, non-receptor type 3 protein tyrosine phosphatase, receptor type, D protein tyrosine phosphatase, receptor type, F protein tyrosine phosphatase, receptor type, N protein tyrosine phosphatase, receptor type, N polypeptide 2 protein tyrosine phosphatase, receptor type, O protein tyrosine phosphatase, receptor type, R 6-pyruvoyltetrahydropterin synthase pumilio homolog 2 (Drosophila) R3H domain containing 1 RAB11 family interacting protein 2 (class I) RAB11 family interacting protein 4 (class II) RAB15, member RAS onocogene family RAB2A, member RAS oncogene family RAB33A, member RAS oncogene family RAB3A, member RAS oncogene family RAB3B, member RAS oncogene family RAB40B, member RAS oncogene family RAB40C, member RAS oncogene family RAB4A, member RAS oncogene family RAB6A, member RAS oncogene family RAB6B, member RAS oncogene family RAB interacting factor ras-related C3 botulinum toxin substrate 3 (rho family, small GTP binding protein Rac3) RAD51 homolog C (S. 
cerevisiae) v-ral simian leukemia viral oncogene homolog A (ras related) RAN binding protein 9 RAP1 GTPase activating protein RAP1, GTP-GDP dissociation stimulator 1 RAS protein activator like 1 (GAP1 like) Ras protein-specific guanine nucleotide-releasing factor 1  PSMD1 PSMD13 PSMD7 PSMD8 PSME3 PTHLH PTP4A1 PTPN3 PTPRD PTPRF PTPRN PTPRN2 PTPRO PTPRR PTS PUM2 R3HDM1 RAB11FIP2 RAB11FIP4 RAB15 RAB2A RAB33A RAB3A RAB3B RAB40B RAB40C RAB4A RAB6A RAB6B RABIF RAC3 RAD51C RALA RANBP9 RAP1GAP RAP1GDS1 RASAL1 RASGRF1  Meta Q-value 1.39E-04 6.97E-06 4.81E-04 1.60E-10 5.13E-04 1.65E-06 5.62E-06 1.17E-05 3.10E-06 7.66E-06 2.60E-08 1.44E-05 1.89E-06 1.35E-05 7.42E-06 5.44E-06 1.36E-06 1.12E-05 7.65E-08 4.08E-04 4.28E-04 3.58E-05 3.74E-10 1.84E-04 2.01E-07 6.88E-06 4.04E-06 3.57E-08 1.84E-13 3.74E-10 2.15E-09 1.92E-04 1.46E-07 8.38E-07 1.84E-04 7.60E-07 2.14E-06 3.40E-04 4.59E-05 1.22E-04 8.44E-08 2.39E-06 6.31E-09 1.33E-04  162  GeneSymbol  GeneName  RASGRP1 RASL10A RASL10B RBP4 RCN2 REEP1 REEP5  RAS guanyl releasing protein 1 (calcium and DAG-regulated) RAS-like, family 10, member A RAS-like, family 10, member B retinol binding protein 4, plasma reticulocalbin 2, EF-hand calcium binding domain receptor accessory protein 1 receptor accessory protein 5 RER1 retention in endoplasmic reticulum 1 homolog (S. cerevisiae) replication factor C (activator 1) 3, 38kDa raftlin, lipid raft linker 1 regulator of G-protein signaling 12 regulator of G-protein signaling 4 regulator of G-protein signaling 6 regulator of G-protein signaling 7 rhomboid domain containing 2 RIMS binding protein 2 Ras-like without CAAX 2 ribonuclease H1 ring finger protein 150 ring finger protein 187 reprimo, TP53 dependent G2 arrest mediator candidate ribosomal protein S6 kinase, 90kDa, polypeptide 3 RNA pseudouridylate synthase domain containing 3 Rtf1, Paf1/RNA polymerase II complex component, homolog (S. 
cerevisiae) reticulon 1 reticulon 2 reticulon 4 runt-related transcription factor 1; translocated to, 1 (cyclin Drelated) SUMO1 activating enzyme subunit 1 seryl-tRNA synthetase secretory carrier membrane protein 1 secretory carrier membrane protein 5 secretogranin II (chromogranin C) secretogranin V (7B2 protein) sodium channel, voltage-gated, type II, alpha subunit sodium channel, voltage-gated, type II, beta sodium channel, voltage-gated, type III, beta sodium channel, voltage gated, type VIII, alpha subunit SEC23 interacting protein septin 11 septin 6 serine incorporator 3 serpin peptidase inhibitor, clade F (alpha-2 antiplasmin, pigment epithelium derived factor), member 1 serpin peptidase inhibitor, clade I (neuroserpin), member 1 seizure related 6 homolog (mouse)-like splicing factor, arginine/serine-rich 2  RER1 RFC3 RFTN1 RGS12 RGS4 RGS6 RGS7 RHBDD2 RIMBP2 RIT2 RNASEH1 RNF150 RNF187 RPRM RPS6KA3 RPUSD3 RTF1 RTN1 RTN2 RTN4 RUNX1T1 SAE1 SARS SCAMP1 SCAMP5 SCG2 SCG5 SCN2A SCN2B SCN3B SCN8A SEC23IP SEPT11 SEPT6 SERINC3 SERPINF1 SERPINI1 SEZ6L SFRS2  Meta Q-value 9.24E-08 1.08E-05 2.32E-06 3.25E-05 3.58E-07 2.10E-04 1.19E-04 2.05E-05 3.25E-06 3.26E-09 1.15E-05 1.20E-07 3.25E-06 1.17E-05 9.74E-07 4.67E-04 1.03E-04 2.27E-06 2.26E-05 1.24E-07 1.84E-08 4.29E-07 1.97E-05 1.25E-05 1.12E-04 2.25E-04 1.34E-04 9.03E-07 2.41E-07 2.97E-05 6.37E-08 1.28E-09 2.39E-04 9.34E-07 4.78E-04 3.76E-04 1.55E-07 1.09E-04 3.28E-09 3.23E-05 6.04E-06 1.94E-05 1.13E-07 2.85E-04 6.90E-05 1.29E-07  163  GeneSymbol  GeneName  SFRS2B SGIP1 SKAP2  splicing factor, arginine/serine-rich 2B SH3-domain GRB2-like (endophilin) interacting protein 1 src kinase associated phosphoprotein 2 solute carrier family 1 (neuronal/epithelial high affinity glutamate transporter, system Xag), member 1 solute carrier family 24 (sodium/potassium/calcium exchanger), member 3 solute carrier family 25 (mitochondrial carrier; oxoglutarate carrier), member 11 solute carrier family 25 (mitochondrial carrier; adenine 
nucleotide translocator), member 4 solute carrier family 25, member 44 solute carrier family 25, member 46 solute carrier family 27 (fatty acid transporter), member 2 solute carrier family 30 (zinc transporter), member 3 solute carrier family 30 (zinc transporter), member 5 solute carrier family 30 (zinc transporter), member 9 solute carrier family 39 (zinc transporter), member 3 SLIT and NTRK-like family, member 1 SLIT and NTRK-like family, member 5 SMAD family member 3 small ArfGAP 1 SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 2 SMAD specific E3 ubiquitin protein ligase 1 synaptosomal-associated protein, 25kDa sorting nexin 3 sorting nexin 4 suppressor of cytokine signaling 2 SRY (sex determining region Y)-box 11 sperm autoantigenic protein 17 squalene epoxidase steroid-5-alpha-reductase, alpha polypeptide 1 (3-oxo-5 alphasteroid delta 4-dehydrogenase alpha 1) signal recognition particle receptor, B subunit somatostatin somatostatin receptor 1 signal transducer and activator of transcription 1, 91kDa serine/threonine kinase 24 (STE20 homolog, yeast) serine/threonine kinase 25 (STE20 homolog, yeast) serine/threonine kinase 32C stathmin 1/oncoprotein 18 stathmin-like 3 stomatin (EPB72)-like 1 serine/threonine/tyrosine kinase 1 SUB1 homolog (S. cerevisiae) succinate-CoA ligase, ADP-forming, beta subunit SMT3 suppressor of mif two 3 homolog 3 (S. 
cerevisiae) SV2 related protein homolog (rat) synapsin II  SLC1A1 SLC24A3 SLC25A11 SLC25A4 SLC25A44 SLC25A46 SLC27A2 SLC30A3 SLC30A5 SLC30A9 SLC39A3 SLITRK1 SLITRK5 SMAD3 SMAP1 SMARCA2 SMURF1 SNAP25 SNX3 SNX4 SOCS2 SOX11 SPA17 SQLE SRD5A1 SRPRB SST SSTR1 STAT1 STK24 STK25 STK32C STMN1 STMN3 STOML1 STYK1 SUB1 SUCLA2 SUMO3 SVOP SYN2  Meta Q-value 2.32E-05 2.41E-07 2.87E-04 3.64E-04 5.39E-09 5.79E-05 9.61E-09 5.70E-09 3.35E-06 7.24E-10 1.08E-03 1.10E-07 8.64E-07 8.81E-04 6.23E-04 2.02E-05 6.87E-05 5.31E-04 6.37E-06 1.67E-04 4.94E-05 9.01E-05 1.08E-04 3.00E-08 2.41E-06 1.17E-06 8.89E-04 1.38E-06 1.85E-04 4.88E-04 3.14E-05 1.58E-04 1.11E-07 6.59E-04 6.66E-09 1.46E-06 7.05E-07 1.43E-05 1.59E-05 9.41E-09 2.52E-06 1.26E-05 1.38E-06 7.70E-07

GeneSymbol | GeneName | Meta Q-value
SYNGR3 | synaptogyrin 3 | 1.28E-05
SYP | synaptophysin | 5.21E-04
SYT1 | synaptotagmin I | 1.29E-06
SYT11 | synaptotagmin XI | 1.62E-07
SYT5 | synaptotagmin V | 1.01E-03
TAC1 | tachykinin, precursor 1 | 8.89E-06
TAC3 | tachykinin 3 | 4.97E-06
TANC2 | tetratricopeptide repeat, ankyrin repeat and coiled-coil containing 2 | 2.78E-07
TBC1D9 | TBC1 domain family, member 9 (with GRAM domain) | 5.63E-07
TCP11L1 | t-complex 11 (mouse)-like 1 | 4.70E-06
TFCP2 | transcription factor CP2 | 1.38E-05
TFRC | transferrin receptor (p90, CD71) | 8.30E-13
THOC7 | THO complex 7 homolog (Drosophila) | 1.63E-08
TIMM13 | translocase of inner mitochondrial membrane 13 homolog (yeast) | 4.45E-07
TIMM17A | translocase of inner mitochondrial membrane 17 homolog A (yeast) | 2.16E-08
TLN2 | talin 2 | 1.65E-06
TM2D2 | TM2 domain containing 2 | 1.89E-06
TM2D3 | TM2 domain containing 3 | 1.93E-06
TMEM121 | transmembrane protein 121 | 9.86E-07
TMEM132B | transmembrane protein 132B | 5.98E-05
TMEM132D | transmembrane protein 132D | 2.94E-07
TMEM155 | transmembrane protein 155 | 2.30E-05
TMEM158 | transmembrane protein 158 | 6.72E-05
TMEM160 | transmembrane protein 160 | 1.93E-07
TMEM169 | transmembrane protein 169 | 8.37E-09
TMEM59L | transmembrane protein 59-like | 2.53E-06
TMEM65 | transmembrane protein 65 | 1.52E-06
TMEM9 | transmembrane protein 9 | 3.18E-05
TNFAIP1 | tumor necrosis factor, alpha-induced protein 1 (endothelial) | 6.47E-06
TOM1L2 | target of myb1-like 2 (chicken) | 2.67E-04
TOR1A | torsin family 1, member A (torsin A) | 9.19E-06
TOX3 | TOX high mobility group box family member 3 | 2.39E-06
TPBG | trophoblast glycoprotein | 8.69E-05
TPM1 | tropomyosin 1 (alpha) | 3.18E-05
TRAF5 | TNF receptor-associated factor 5 | 6.13E-05
TRAPPC2L | trafficking protein particle complex 2-like | 1.67E-04
TRIO | triple functional domain (PTPRF interacting) | 4.80E-05
TRPC1 | transient receptor potential cation channel, subfamily C, member 1 | 7.19E-08
TSFM | Ts translation elongation factor, mitochondrial | 2.68E-05
TSPAN17 | tetraspanin 17 | 1.10E-05
TSSC1 | tumor suppressing subtransferable candidate 1 | 5.87E-07
TTC9B | tetratricopeptide repeat domain 9B | 3.65E-09
TUB | tubby homolog (mouse) | 2.90E-04
TUBA4A | tubulin, alpha 4a | 1.51E-05
TUBB2A | tubulin, beta 2A | 6.28E-05
TUSC3 | tumor suppressor candidate 3 | 6.62E-07
UBE2A | ubiquitin-conjugating enzyme E2A (RAD6 homolog) | 8.42E-07
UBE2B | ubiquitin-conjugating enzyme E2B (RAD6 homolog) | 7.54E-07
UBE2D2 | ubiquitin-conjugating enzyme E2D 2 (UBC4/5 homolog, yeast) | 1.16E-05
UBE2G1 | ubiquitin-conjugating enzyme E2G 1 (UBC7 homolog, yeast) | 2.84E-06
UBE2J1 | ubiquitin-conjugating enzyme E2, J1 (UBC6 homolog, yeast) | 1.09E-04
UBE2N | ubiquitin-conjugating enzyme E2N (UBC13 homolog, yeast) | 5.62E-06
UBE2T | ubiquitin-conjugating enzyme E2T (putative) | 5.04E-05
UBE2V2 | ubiquitin-conjugating enzyme E2 variant 2 | 1.05E-10
UBE3A | ubiquitin protein ligase E3A | 1.29E-05
UCHL1 | ubiquitin carboxyl-terminal esterase L1 (ubiquitin thiolesterase) | 5.16E-07
ULK2 | unc-51-like kinase 2 (C. elegans) | 8.89E-06
UROS | uroporphyrinogen III synthase | 4.79E-05
VAMP2 | vesicle-associated membrane protein 2 (synaptobrevin 2) | 7.44E-08
VIP | vasoactive intestinal peptide | 2.59E-04
VKORC1L1 | vitamin K epoxide reductase complex, subunit 1-like 1 | 6.55E-06
VLDLR | very low density lipoprotein receptor | 5.95E-05
VSNL1 | visinin-like 1 | 2.72E-09
WARS | tryptophanyl-tRNA synthetase | 9.35E-06
WDR13 | WD repeat domain 13 | 1.33E-05
WDR23 | WD repeat domain 23 | 2.85E-06
WDR86 | WD repeat domain 86 | 5.45E-08
WIPI2 | WD repeat domain, phosphoinositide interacting 2 | 2.20E-06
WSB2 | WD repeat and SOCS box-containing 2 | 9.85E-06
XKR4 | XK, Kell blood group complex subunit-related family, member 4 | 3.95E-05
YAF2 | YY1 associated factor 2 | 2.24E-07
YARS | tyrosyl-tRNA synthetase | 2.00E-04
YKT6 | YKT6 v-SNARE homolog (S. cerevisiae) | 1.03E-04
YWHAH | tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, eta polypeptide | 6.81E-05
YWHAZ.bApr07 | unknown | 5.94E-05
ZCCHC17 | zinc finger, CCHC domain containing 17 | 6.57E-06
ZDHHC3 | zinc finger, DHHC-type containing 3 | 1.86E-05
ZMAT2 | zinc finger, matrin type 2 | 1.44E-04
ZNF263 | zinc finger protein 263 | 1.62E-04
ZNF689 | zinc finger protein 689 | 3.86E-04
ZNF711 | zinc finger protein 711 | 3.08E-05
ZNRF1 | zinc and ring finger 1 | 1.77E-04

Age Up-regulated

GeneSymbol | GeneName | Meta Q-value
SPEN | spen homolog, transcriptional regulator (Drosophila) | 2.86E-05
ZAK | sterile alpha motif and leucine zipper containing kinase AZK | 1.17E-05
ANTXR1 | anthrax toxin receptor 1 | 1.62E-06
ZBTB16 | zinc finger and BTB domain containing 16 | 1.97E-04
ZFP36L1 | zinc finger protein 36, C3H type-like 1 | 2.51E-05
SEPT9 | septin 9 | 3.26E-04
ZXDC | ZXD family zinc finger C | 2.41E-04
AAAS | achalasia, adrenocortical insufficiency, alacrimia (Allgrove, triple-A) | 1.07E-03
ZIC2 | Zic family member 2 (odd-paired homolog, Drosophila) | 5.54E-04
IL28RA | interleukin 28 receptor, alpha (interferon, lambda receptor) | 1.35E-04
ZNF302 | zinc finger protein 302 | 5.23E-05
ZBTB20 | zinc finger and BTB domain containing 20 | 6.86E-04
ZNF423 | zinc finger protein 423 | 1.31E-04
ZC3HAV1 | zinc finger CCCH-type, antiviral 1 | 1.34E-04
SRRM2 | serine/arginine repetitive matrix 2 | 1.38E-04
ZBTB7A | zinc finger and BTB domain containing 7A | 6.96E-04
CFH | complement factor H | 4.58E-06
KIAA0841 | KIAA0841 | 6.90E-04
SEPT6 | septin 6 | 2.00E-04
ZMYM6 | zinc finger, MYM-type 6 | 2.59E-04
SIGLEC8 | sialic acid binding Ig-like lectin 8 | 1.67E-04
DPF3 | D4, zinc and double PHD fingers, family 3 | 1.08E-04
ZNF609 | zinc finger protein 609 | 2.08E-04
ZFAND6 | zinc finger, AN1-type domain 6 | 1.04E-05
ZBTB33 | zinc finger and BTB domain containing 33 | 8.13E-04
IGFBP5 | insulin-like growth factor binding protein 5 | 3.60E-06
SCN7A | sodium channel, voltage-gated, type VII, alpha | 6.82E-05
PHF3 | PHD finger protein 3 | 2.17E-04
ARHGEF6 | Rac/Cdc42 guanine nucleotide exchange factor (GEF) 6 | 2.31E-04
HBP1 | HMG-box transcription factor 1 | 2.70E-04
AKAP1 | A kinase (PRKA) anchor protein 1 | 3.87E-03
SMARCD2 | SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily d, member 2 | 4.86E-04
ZNF235 | zinc finger protein 235 | 2.96E-04
ITGB4 | integrin, beta 4 | 8.21E-04
PAICS | phosphoribosylaminoimidazole carboxylase, phosphoribosylaminoimidazole succinocarboxamide synthetase | 3.21E-06
DTNA | dystrobrevin, alpha | 1.73E-03
EEF2K | eukaryotic elongation factor-2 kinase | 2.49E-03
SPON1 | spondin 1, extracellular matrix protein | 1.11E-03
PBXIP1 | pre-B-cell leukemia homeobox interacting protein 1 | 5.02E-04
MTSS1 | metastasis suppressor 1 | 4.98E-04
SAFB2 | scaffold attachment factor B2 | 3.51E-03
AHCYL1 | S-adenosylhomocysteine hydrolase-like 1 | 4.59E-04
NUP160 | nucleoporin 160kDa | 7.59E-04
NXT2 | nuclear transport factor 2-like export factor 2 | 4.13E-04
RHOBTB3 | Rho-related BTB domain containing 3 | 4.78E-04
RGN | regucalcin (senescence marker protein-30) | 5.89E-05
SMAD5 | SMAD family member 5 | 2.86E-05
GMPR | guanosine monophosphate reductase | 2.07E-04
CPNE3 | copine III | 2.35E-04
CD59 | CD59 molecule, complement regulatory protein | 1.08E-04
TRAF1 | TNF receptor-associated factor 1 | 2.53E-04
PPP2R1B | protein phosphatase 2 (formerly 2A), regulatory subunit A, beta isoform | 2.10E-05
ADORA3 | adenosine A3 receptor | 4.57E-05

GeneSymbol  GeneName  SMC1A TMEM63A CALCOCO2 LRP10  structural maintenance of chromosomes 1A transmembrane protein 63A calcium binding and coiled-coil domain 2 low density lipoprotein receptor-related protein 10 solute carrier family 13 (sodium-dependent dicarboxylate transporter), member 3 SCAN domain containing 2 pseudogene perilipin versican jumonji domain containing 2B lipoprotein lipase glycosylphosphatidylinositol specific phospholipase D1 Ras suppressor protein 1 calmodulin-like 4 carbamoyl-phosphate synthetase 1, mitochondrial stromal antigen 2 chromodomain helicase DNA binding protein 6 Gardner-Rasheed feline sarcoma viral (v-fgr) oncogene homolog FYN oncogene related to SRC, FGR, YES structural maintenance of chromosomes 5 protein phosphatase 1, regulatory (inhibitor) subunit 3C inhibitor of kappa light polypeptide gene enhancer in B-cells, kinase beta oxysterol binding protein-like 2 protein phosphatase 1, regulatory (inhibitor) subunit 3D laminin, alpha 4 neurotrophic tyrosine kinase, receptor, type 2 apolipoprotein D osteomodulin ras homolog gene family, member Q KIAA0323 sphingomyelin synthase 1 nephronophthisis 3 (adolescent) LSM14A, SCD6 homolog A (S. 
cerevisiae) DNA-damage-inducible transcript 4 protein phosphatase 1, regulatory (inhibitor) subunit 1B angiopoietin 1 mitogen-activated protein kinase kinase kinase 6 tyrosine kinase, non-receptor, 2 solute carrier family 7 (cationic amino acid transporter, y+ system), member 2 adducin 3 (gamma) son of sevenless homolog 2 (Drosophila) Rho guanine nucleotide exchange factor (GEF) 10 dihydrolipoamide branched chain transacylase E2 signal-induced proliferation-associated 1 like 3 alcohol dehydrogenase 1B (class I), beta polypeptide translocation associated membrane protein 1 bromodomain containing 8 AF4/FMR2 family, member 1  SLC13A3 SCAND2 PLIN VCAN JMJD2B LPL GPLD1 RSU1 CALML4 CPS1 STAG2 CHD6 FGR FYN SMC5 PPP1R3C IKBKB OSBPL2 PPP1R3D LAMA4 NTRK2 APOD OMD RHOQ KIAA0323 SGMS1 NPHP3 LSM14A DDIT4 PPP1R1B ANGPT1 MAP3K6 TNK2 SLC7A2 ADD3 SOS2 ARHGEF10 DBT SIPA1L3 ADH1B TRAM1 BRD8 AFF1  Meta Q-value 5.88E-05 8.82E-04 4.95E-06 3.21E-04 1.66E-03 1.01E-05 1.05E-06 1.20E-04 1.96E-05 1.93E-04 3.71E-07 2.00E-03 2.43E-03 7.80E-06 1.41E-04 2.44E-04 5.43E-06 9.37E-05 5.89E-05 1.67E-04 2.18E-05 2.52E-06 2.06E-05 2.01E-04 7.68E-05 1.59E-04 5.23E-04 1.35E-04 4.89E-05 3.53E-04 9.58E-05 2.82E-03 1.92E-04 2.50E-04 1.75E-04 2.15E-04 3.70E-04 1.52E-05 1.57E-03 1.27E-07 1.21E-03 4.73E-05 3.71E-07 4.02E-05 3.94E-04 7.76E-04 2.04E-04  168  GeneSymbol  GeneName  RELA CD58 PCGF2 C19orf36 BCL2 CALD1 GFAP TIMP3 CADM1 MTCP1 DAB2 PDLIM3 CD22 TNS1 EPS8L1 AEBP2 ASPH NKTR KTN1 SORBS1 CXCR4 EMP3 LEPR BAIAP3 WWOX SNAP23 WNK1 RYR1 COL6A1 STK3 LATS2 CNOT2 LAMA2 RBL2 HEBP2 KIAA0240 TBL1X AKAP13 BCL6 CD40 ITPR2 MXI1 ARAF WHSC2 CCDC69 CDKN2A HLA-DPB1 RNASE4  v-rel reticuloendotheliosis viral oncogene homolog A (avian) CD58 molecule polycomb group ring finger 2 chromosome 19 open reading frame 36 B-cell CLL/lymphoma 2 caldesmon 1 glial fibrillary acidic protein TIMP metallopeptidase inhibitor 3 cell adhesion molecule 1 mature T-cell proliferation 1 disabled homolog 2, mitogen-responsive phosphoprotein 
(Drosophila) PDZ and LIM domain 3 CD22 molecule tensin 1 EPS8-like 1 AE binding protein 2 aspartate beta-hydroxylase natural killer-tumor recognition sequence kinectin 1 (kinesin receptor) sorbin and SH3 domain containing 1 chemokine (C-X-C motif) receptor 4 epithelial membrane protein 3 leptin receptor BAI1-associated protein 3 WW domain containing oxidoreductase synaptosomal-associated protein, 23kDa WNK lysine deficient protein kinase 1 ryanodine receptor 1 (skeletal) collagen, type VI, alpha 1 serine/threonine kinase 3 (STE20 homolog, yeast) LATS, large tumor suppressor, homolog 2 (Drosophila) CCR4-NOT transcription complex, subunit 2 laminin, alpha 2 retinoblastoma-like 2 (p130) heme binding protein 2 KIAA0240 transducin (beta)-like 1X-linked A kinase (PRKA) anchor protein 13 B-cell CLL/lymphoma 6 CD40 molecule, TNF receptor superfamily member 5 inositol 1,4,5-triphosphate receptor, type 2 MAX interactor 1 v-raf murine sarcoma 3611 viral oncogene homolog Wolf-Hirschhorn syndrome candidate 2 coiled-coil domain containing 69 cyclin-dependent kinase inhibitor 2A (melanoma, p16, inhibits CDK4) major histocompatibility complex, class II, DP beta 1 ribonuclease, RNase A family, 4  Meta Q-value 4.00E-04 2.71E-04 4.37E-05 3.87E-05 1.12E-04 1.21E-03 1.43E-03 5.23E-06 3.02E-04 2.54E-04 9.45E-05 4.55E-04 1.72E-05 5.20E-05 1.84E-04 2.64E-04 6.94E-04 2.14E-07 2.98E-04 1.28E-04 6.22E-05 4.28E-04 9.30E-06 4.53E-04 1.33E-06 3.50E-06 1.08E-04 8.68E-05 2.20E-03 1.03E-04 2.56E-04 3.74E-06 7.53E-04 7.45E-04 6.29E-04 2.10E-05 2.66E-07 2.08E-04 3.94E-06 1.89E-04 1.29E-06 1.84E-04 5.43E-04 2.50E-04 1.44E-04 8.38E-05 1.18E-04 3.65E-04  169  GeneSymbol  GeneName  IQCK CEP350 AHNAK CSF1 BBS2 ITGAV CAPN2 EPOR MBOAT5 NR2C1 FMO3 NEBL INSR OFD1 ITPKB LGALS3 SSPN TRIM4 SPN C9orf150 HSPBAP1 MYST3 GGA2 ACACB CBFB TREX1 RBBP6 EMP2 ANXA4 LPP MBD2 SWAP70 PTGER3 SORBS3 CAST GYPC FER1L3 FMNL2 AXL TPP1 VIM FZD7 FAM107A ECHDC2 PRKCH SMOX MTM1 EIF2AK2  IQ motif containing K centrosomal protein 
350kDa AHNAK nucleoprotein colony stimulating factor 1 (macrophage) Bardet-Biedl syndrome 2 integrin, alpha V (vitronectin receptor, alpha polypeptide, antigen CD51) calpain 2, (m/II) large subunit erythropoietin receptor unknown nuclear receptor subfamily 2, group C, member 1 flavin containing monooxygenase 3 nebulette insulin receptor oral-facial-digital syndrome 1 inositol 1,4,5-trisphosphate 3-kinase B lectin, galactoside-binding, soluble, 3 sarcospan (Kras oncogene-associated gene) tripartite motif-containing 4 sialophorin chromosome 9 open reading frame 150 HSPB (heat shock 27kDa) associated protein 1 MYST histone acetyltransferase (monocytic leukemia) 3 golgi associated, gamma adaptin ear containing, ARF binding protein 2 acetyl-Coenzyme A carboxylase beta core-binding factor, beta subunit three prime repair exonuclease 1 retinoblastoma binding protein 6 epithelial membrane protein 2 annexin A4 LIM domain containing preferred translocation partner in lipoma methyl-CpG binding domain protein 2 SWAP-70 protein prostaglandin E receptor 3 (subtype EP3) sorbin and SH3 domain containing 3 calpastatin glycophorin C (Gerbich blood group) unknown formin-like 2 AXL receptor tyrosine kinase tripeptidyl peptidase I vimentin frizzled homolog 7 (Drosophila) family with sequence similarity 107, member A enoyl Coenzyme A hydratase domain containing 2 protein kinase C, eta spermine oxidase myotubularin 1 eukaryotic translation initiation factor 2-alpha kinase 2

Meta Q-value 9.86E-04 5.68E-05 1.44E-04 7.05E-04 5.63E-04 1.89E-04 8.68E-05 1.58E-05 1.04E-05 9.73E-06 6.11E-05 4.02E-04 2.00E-03 1.03E-05 1.17E-05 2.94E-07 9.29E-06 1.50E-06 1.87E-06 1.33E-06 1.42E-03 1.97E-04 3.05E-06 2.94E-07 1.07E-03 3.96E-05 6.14E-06 2.90E-03 2.94E-06 1.23E-04 1.97E-04 1.41E-04 6.82E-05 5.88E-04 3.76E-06 1.07E-03 1.92E-05 1.35E-04 1.34E-04 1.45E-05 1.27E-04 1.35E-04 1.17E-05 3.55E-04 6.17E-04 4.06E-05 6.95E-05 1.72E-04

GeneSymbol  GeneName
BCAR3       breast cancer anti-estrogen resistance 3
DGKG        diacylglycerol kinase, gamma 90kDa
PTRF        polymerase I and transcript release factor
TNRC6A      trinucleotide repeat containing 6A
EHD1        EH-domain containing 1
ERG         v-ets erythroblastosis virus E26 oncogene homolog (avian)
MYST4       MYST histone acetyltransferase (monocytic leukemia) 4
PGCP        plasma glutamate carboxypeptidase
RAB31       RAB31, member RAS oncogene family
KIAA1627    KIAA1627 protein
PHKA2       phosphorylase kinase, alpha 2 (liver)
GPR125      G protein-coupled receptor 125
MGST2       microsomal glutathione S-transferase 2
TAZ         tafazzin
RBPMS       RNA binding protein with multiple splicing
BCL9        B-cell CLL/lymphoma 9
USP54       ubiquitin specific peptidase 54
EMCN        endomucin
N4BP1       NEDD4 binding protein 1
TAF4        TAF4 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 135kDa
BRD1        bromodomain containing 1
C10orf104   chromosome 10 open reading frame 104
KIAA0494    KIAA0494
HLA-DPA1    major histocompatibility complex, class II, DP alpha 1
ANP32E      acidic (leucine-rich) nuclear phosphoprotein 32 family, member E
RRBP1       ribosome binding protein 1 homolog 180kDa (dog)
NFIA        nuclear factor I/A
STOM        stomatin
SLC22A5     solute carrier family 22 (organic cation/carnitine transporter), member 5
IRF2        interferon regulatory factor 2
TGM2        transglutaminase 2 (C polypeptide, protein-glutamine-gamma-glutamyltransferase)
ASAHL       unknown
THOC2       THO complex 2
PTGDS       prostaglandin D2 synthase 21kDa (brain)
EZH1        enhancer of zeste homolog 1 (Drosophila)
PLEC1       plectin 1, intermediate filament binding protein 500kDa
CHD1        chromodomain helicase DNA binding protein 1
KCNN3       potassium intermediate/small conductance calcium-activated channel, subfamily N, member 3
FGFR1       fibroblast growth factor receptor 1
RASSF8      Ras association (RalGDS/AF-6) domain family (N-terminal) member 8
CHST3       carbohydrate (chondroitin 6) sulfotransferase 3
LOC441108   hypothetical gene supported by AK128882
SPTBN1      spectrin, beta, non-erythrocytic 1
C2orf24     chromosome 2 open reading frame 24
TXNIP       thioredoxin interacting protein
NCOA3       nuclear receptor coactivator 3
DDR2        discoidin domain receptor tyrosine kinase 2
KIAA0913    KIAA0913
ECM2        extracellular matrix protein 2, female organ and adipocyte specific
GSTM5       glutathione S-transferase mu 5
MED12       mediator complex subunit 12
ACIN1       apoptotic chromatin condensation inducer 1
SLC14A1     solute carrier family 14 (urea transporter), member 1 (Kidd blood group)
NFATC3      nuclear factor of activated T-cells, cytoplasmic, calcineurin-dependent 3
SOX13       SRY (sex determining region Y)-box 13
ST7L        suppression of tumorigenicity 7 like
TFPI        tissue factor pathway inhibitor (lipoprotein-associated coagulation inhibitor)
CDC14A      CDC14 cell division cycle 14 homolog A (S. cerevisiae)
PPFIA1      protein tyrosine phosphatase, receptor type, f polypeptide (PTPRF), interacting protein (liprin), alpha 1
CDC42BPA    CDC42 binding protein kinase alpha (DMPK-like)
GSN         gelsolin (amyloidosis, Finnish type)
ID4         inhibitor of DNA binding 4, dominant negative helix-loop-helix protein
GPC4        glypican 4
SDC2        syndecan 2
MAX         MYC associated factor X
METTL7A     methyltransferase like 7A
SMAD4       SMAD family member 4
STK10       serine/threonine kinase 10
FGF2        fibroblast growth factor 2 (basic)
LPIN1       lipin 1
ALDH1L1     aldehyde dehydrogenase 1 family, member L1
CCDC101     coiled-coil domain containing 101
ASCL1       achaete-scute complex homolog 1 (Drosophila)
ALDH6A1     aldehyde dehydrogenase 6 family, member A1
GALM        galactose mutarotase (aldose 1-epimerase)
DVL2        dishevelled, dsh homolog 2 (Drosophila)
C16orf35    chromosome 16 open reading frame 35
PTBP1       polypyrimidine tract binding protein 1
MASP1       mannan-binding lectin serine peptidase 1 (C4/C2 activating component of Ra-reactive factor)
AEBP1       AE binding protein 1
ALDH4A1     aldehyde dehydrogenase 4 family, member A1
ALOX5       arachidonate 5-lipoxygenase
BCAT2       branched chain aminotransferase 2, mitochondrial
C1orf162    chromosome 1 open reading frame 162
CANX        calnexin
CC2D1A      coiled-coil and C2 domain containing 1A
CCDC66      coiled-coil domain containing 66
CDKN2C      cyclin-dependent kinase inhibitor 2C (p18, inhibits CDK4)
CFLAR       CASP8 and FADD-like apoptosis regulator
CFTR        cystic fibrosis transmembrane conductance regulator (ATP-binding cassette sub-family C, member 7)
CLN5        ceroid-lipofuscinosis, neuronal 5
CNOT4       CCR4-NOT transcription complex, subunit 4
FAM105B     family with sequence similarity 105, member B
GATM        glycine amidinotransferase (L-arginine:glycine amidinotransferase)
HFE         hemochromatosis
IFT140      intraflagellar transport 140 homolog (Chlamydomonas)
IGF1R       insulin-like growth factor 1 receptor
IL13RA1     interleukin 13 receptor, alpha 1
IQGAP1      IQ motif containing GTPase activating protein 1
LIMK2       LIM domain kinase 2
LMNA        lamin A/C
LSS         lanosterol synthase (2,3-oxidosqualene-lanosterol cyclase)
MGST1       microsomal glutathione S-transferase 1
MYO1C       myosin IC
NEK3        NIMA (never in mitosis gene a)-related kinase 3
NRP1        neuropilin 1
NUPR1       nuclear protein 1
PCBP2       poly(rC) binding protein 2
PELI2       pellino homolog 2 (Drosophila)
PHF1        PHD finger protein 1
PLXNA2      plexin A2
PTPN13      protein tyrosine phosphatase, non-receptor type 13 (APO-1/CD95 (Fas)-associated phosphatase)
QKI         quaking homolog, KH domain RNA binding (mouse)
RIOK3       RIO kinase 3 (yeast)
SLC15A1     solute carrier family 15 (oligopeptide transporter), member 1
SLC16A9     solute carrier family 16, member 9 (monocarboxylic acid transporter 9)
SMARCC1     SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily c, member 1
SP100       SP100 nuclear antigen
TCF3        transcription factor 3 (E2A immunoglobulin enhancer binding factors E12/E47)
TIMP4       TIMP metallopeptidase inhibitor 4
TLE1        transducin-like enhancer of split 1 (E(sp1) homolog, Drosophila)
TOR1AIP1    torsin A interacting protein 1

Meta Q-value 3.35E-04 7.69E-04 1.00E-03 1.37E-04 5.08E-04 5.50E-06 2.08E-05 4.96E-04 3.45E-05 1.52E-08 8.70E-08 1.11E-04 1.67E-04 4.58E-05 1.52E-08 2.51E-06 1.44E-04 1.89E-04 6.30E-05 1.94E-09 5.97E-04 2.81E-04 2.44E-04 4.24E-04 5.23E-05 9.45E-05 1.44E-03 1.62E-06 7.60E-08 4.85E-04 3.38E-04 6.95E-04 7.28E-04 2.78E-04 1.92E-04 2.04E-04 1.53E-04 5.00E-04 2.66E-07 2.75E-05 8.31E-06 1.71E-05 2.94E-07 4.06E-05 1.03E-04 4.53E-04 3.07E-05 2.14E-04 1.84E-04 4.55E-04 5.42E-05 2.26E-04 3.50E-06 4.93E-04 1.89E-04 1.71E-05 3.51E-05 6.95E-05 4.08E-04 1.59E-03 1.71E-05 3.05E-06 2.32E-04 3.49E-04 1.69E-04 2.94E-05 4.77E-04 6.11E-05 1.11E-03 6.70E-04 1.72E-04 1.90E-06 1.04E-05 1.49E-03 3.34E-05 3.32E-05 5.30E-04 2.28E-03 8.82E-04 1.22E-04 1.20E-04 2.29E-04 9.83E-04 3.38E-04 1.11E-03 1.29E-05 3.07E-05 2.74E-06 3.35E-04 9.37E-05 8.68E-05 8.56E-04 8.43E-04 6.05E-05 3.02E-04 8.31E-06 2.66E-05 2.54E-04 8.21E-04 9.73E-06 4.65E-05 8.82E-04 2.41E-03 1.37E-04 1.04E-05 1.86E-04 7.80E-06 1.46E-05 1.74E-04 6.17E-04 4.65E-05 7.97E-05 2.52E-04 2.35E-05 5.20E-04 9.46E-05 1.18E-04 2.41E-04 9.58E-05 2.56E-04 3.55E-04 1.22E-04

pH Down-regulated

GeneSymbol  GeneName
MAPKAPK2    mitogen-activated protein kinase-activated protein kinase 2

Meta Q-value 2.45E-005

pH Up-regulated

GeneSymbol  GeneName
LARGE       like-glycosyltransferase
HISPPD2A    histidine acid phosphatase domain containing 2A
KCNAB1      potassium voltage-gated channel, shaker-related subfamily, beta member 1
RPS6KA3     ribosomal protein S6 kinase, 90kDa, polypeptide 3
SYN2        synapsin II
TUSC3       tumor suppressor candidate 3
NNAT        neuronatin
IDH3B       isocitrate dehydrogenase 3 (NAD+) beta
SARS        seryl-tRNA synthetase
KIF3C       kinesin family member 3C
NPTX2       neuronal pentraxin II
ENSA        endosulfine alpha
KATNB1      katanin p80 (WD repeat containing) subunit B 1

Meta Q-value 7.04E-04 2.96E-04 7.04E-04 9.71E-04 7.04E-04 1.39E-03 1.67E-03 1.03E-03 7.04E-04 1.94E-04 5.98E-04 7.04E-04 1.03E-03

PMI Down-regulated

GeneSymbol  GeneName
BRD8        bromodomain containing 8
ARHGEF7     Rho guanine nucleotide exchange factor (GEF) 7
DAPK1       death-associated protein kinase 1
CAMK2G      calcium/calmodulin-dependent protein kinase II gamma

Meta Q-value 1.34E-04 8.64E-04 1.89E-03 3.86E-03

PMI Up-regulated

GeneSymbol  GeneName
GOSR2       golgi SNAP receptor complex member 2
CYB5B       cytochrome b5 type B (outer mitochondrial membrane)
MAX         MYC associated factor X
MGMT        O-6-methylguanine-DNA methyltransferase
GRLF1       glucocorticoid receptor DNA binding factor 1
PTGER3      prostaglandin E receptor 3 (subtype EP3)
EXT1        exostoses (multiple) 1
TPM1        tropomyosin 1 (alpha)
SNTB1       syntrophin, beta 1 (dystrophin-associated protein A1, 59kDa, basic component 1)
EPHB2       EPH receptor B2
CD44        CD44 molecule (Indian blood group)
MAP2        microtubule-associated protein 2
GRM7        glutamate receptor, metabotropic 7
IDS         iduronate 2-sulfatase
SMAD3       SMAD family member 3
NRG1        neuregulin 1
SPTBN1      spectrin, beta, non-erythrocytic 1
ATP2B4      ATPase, Ca++ transporting, plasma membrane 4
RGS12       regulator of G-protein signaling 12
C9orf116    chromosome 9 open reading frame 116
FCAR        Fc fragment of IgA, receptor for
STAT1       signal transducer and activator of transcription 1, 91kDa
PPP1R1A     protein phosphatase 1, regulatory (inhibitor) subunit 1A
UBE2D2      ubiquitin-conjugating enzyme E2D 2 (UBC4/5 homolog, yeast)
GLP1R       glucagon-like peptide 1 receptor
POU6F1      POU class 6 homeobox 1
CPSF6       cleavage and polyadenylation specific factor 6, 68kDa
ARID4A      AT rich interactive domain 4A (RBP1-like)
RARRES1     retinoic acid receptor responder (tazarotene induced) 1
CDK5R1      cyclin-dependent kinase 5, regulatory subunit 1 (p35)
SFRP4       secreted frizzled-related protein 4
DCT         dopachrome tautomerase (dopachrome delta-isomerase, tyrosine-related protein 2)
FZR1        fizzy/cell division cycle 20 related 1 (Drosophila)
ABCC9       ATP-binding cassette, sub-family C (CFTR/MRP), member 9
TRIM3       tripartite motif-containing 3
CLSTN2      calsyntenin 2
GNAO1       guanine nucleotide binding protein (G protein), alpha activating activity polypeptide O
NUDCD3      NudC domain containing 3
GPR161      G protein-coupled receptor 161
MBNL3       muscleblind-like 3 (Drosophila)
CCDC28B     coiled-coil domain containing 28B
RBPMS       RNA binding protein with multiple splicing
TNRC4       trinucleotide repeat containing 4
NRF1        nuclear respiratory factor 1
MCM4        minichromosome maintenance complex component 4
BRAP        BRCA1 associated protein
DDX54       DEAD (Asp-Glu-Ala-Asp) box polypeptide 54
GTSE1       G-2 and S-phase expressed 1
ATP8A2      ATPase, aminophospholipid transporter-like, class I, type 8A, member 2
ARTN        artemin
ADRA1A      adrenergic, alpha-1A-, receptor
MEF2B       myocyte enhancer factor 2B
CDH6        cadherin 6, type 2, K-cadherin (fetal kidney)
NR4A2       nuclear receptor subfamily 4, group A, member 2
MYB         v-myb myeloblastosis viral oncogene homolog (avian)
DIAPH2      diaphanous homolog 2 (Drosophila)
IKZF1       IKAROS family zinc finger 1 (Ikaros)
ACRV1       acrosomal vesicle protein 1
ERG         v-ets erythroblastosis virus E26 oncogene homolog (avian)
ABAT        4-aminobutyrate aminotransferase
WWOX        WW domain containing oxidoreductase
ITGB3       integrin, beta 3 (platelet glycoprotein IIIa, antigen CD61)
ESR2        estrogen receptor 2 (ER beta)
GRP         gastrin-releasing peptide
HES2        hairy and enhancer of split 2 (Drosophila)
KLK11       kallikrein-related peptidase 11
DYRK1A      dual-specificity tyrosine-(Y)-phosphorylation regulated kinase 1A
TOP3A       topoisomerase (DNA) III alpha
SCAND2      SCAN domain containing 2 pseudogene
HPGD        hydroxyprostaglandin dehydrogenase 15-(NAD)
PIK3CG      phosphoinositide-3-kinase, catalytic, gamma polypeptide
BCL2        B-cell CLL/lymphoma 2

Meta Q-value 5.96E-04 8.63E-04 1.56E-03 1.17E-03 4.77E-04 1.18E-03 1.01E-03 1.79E-03 5.22E-03 2.16E-03 1.01E-03 2.24E-04 1.92E-03 1.60E-03 4.21E-04 2.23E-04 2.22E-03 1.17E-03 9.45E-04 2.24E-04 1.45E-04 3.79E-03 1.10E-04 2.00E-03 1.92E-03 6.06E-05 2.33E-03 1.16E-03 4.03E-04 5.10E-04 1.01E-03 6.38E-04 3.65E-05 7.67E-04 6.67E-03 4.47E-03 8.64E-04 1.85E-03 1.01E-03 6.12E-05 9.91E-04 3.83E-03 8.64E-04 8.58E-04 8.78E-04 2.26E-04 4.12E-04 5.89E-04 1.77E-03 7.67E-04 4.36E-03 2.45E-04 2.51E-03 8.64E-04 1.17E-03 1.01E-03 2.24E-04 1.55E-03 9.24E-05 7.58E-04 6.12E-05 1.44E-04 8.78E-04 1.92E-04 1.43E-03 3.14E-05 2.24E-04 1.54E-03 1.16E-03 3.34E-03 9.99E-04 1.56E-03

Male Up-regulated

GeneSymbol  GeneName
JARID1D     jumonji, AT rich interactive domain 1D
USP9Y       ubiquitin specific peptidase 9, Y-linked (fat facets-like, Drosophila)
EIF1AY      eukaryotic translation initiation factor 1A, Y-linked
CYorf15B    chromosome Y open reading frame 15B
DDX3Y       DEAD (Asp-Glu-Ala-Asp) box polypeptide 3, Y-linked
UTY         ubiquitously transcribed tetratricopeptide repeat gene, Y-linked
RPS4Y1      ribosomal protein S4, Y-linked 1
TTTY15      testis-specific transcript, Y-linked 15
CYorf15A    chromosome Y open reading frame 15A
ZFY         zinc finger protein, Y-linked
NBL1        neuroblastoma, suppression of tumorigenicity 1
PRMT2       protein arginine methyltransferase 2

Meta Q-value 5.05E-58 5.04E-35 1.14E-96 1.61E-138 8.84E-38 3.49E-81 7.87E-05 4.25E-04 9.58E-107 4.80E-18 2.75E-71 4.95E-83

Female Up-regulated

GeneSymbol  GeneName
XIST        X (inactive)-specific transcript (non-protein coding)
HDHD1A      haloacid dehalogenase-like hydrolase domain containing 1A
UTX         ubiquitously transcribed tetratricopeptide repeat, X chromosome
JARID1C     jumonji, AT rich interactive domain 1C
USP9X       ubiquitin specific peptidase 9, X-linked
STS         steroid sulfatase (microsomal), isozyme S
ZFX         zinc finger protein, X-linked
PNPLA4      patatin-like phospholipase domain containing 4
MAP2K3      mitogen-activated protein kinase kinase 3
LYST        lysosomal trafficking regulator
SRRM2       serine/arginine repetitive matrix 2
EMCN        endomucin
NKTR        natural killer-tumor recognition sequence
PCM1        pericentriolar material 1
EPS8L1      EPS8-like 1

Meta Q-value 1.30E-03 9.75E-04 1.25E-14 9.41E-95 5.06E-04 9.75E-04 1.12E-04 3.52E-09 1.99E-03 1.64E-10 9.31E-08 1.72E-12 3.10E-05 1.38E-07 1.45E-05




Related Items