UBC Theses and Dissertations


Meta-analysis of gene expression in individuals with Autism Spectrum Disorders Ch'ng, Carolyn Lin Wei 2013



Meta-analysis of Gene Expression in Individuals with Autism Spectrum Disorders

by
Carolyn Lin Wei Ch'ng
BSc., University of Michigan Ann Arbor, 2011

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF
Master of Science
in
THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES
(Bioinformatics)

The University of British Columbia
(Vancouver)
August 2013
© Carolyn Lin Wei Ch'ng, 2013

Abstract

Autism spectrum disorders (ASD) are clinically heterogeneous and biologically complex. State of the art genetics research has unveiled a large number of variants linked to ASD. But in general it remains unclear what biological factors lead to changes in the brains of autistic individuals. We build on the premise that these heterogeneous genetic or genomic aberrations will converge towards a common impact downstream, which might be reflected in the transcriptomes of individuals with ASD. A considerable number of transcriptome analyses have been performed in attempts to address this question, but their findings lack a clear consensus. As a result, none of these individual studies has led to any significant advance in understanding the autistic phenotype as a whole. The goal of this research is to comprehensively re-evaluate these expression profiling studies by conducting a systematic meta-analysis. Here, we report a meta-analysis of over 1000 microarrays across twelve independent studies on expression changes in ASD compared to unaffected individuals, in blood and brain. We identified a number of genes that are consistently differentially expressed across studies of the brain, suggestive of effects on mitochondrial function. In blood, consistent changes were more difficult to identify, despite individual studies tending to exhibit larger effects than the brain studies. Our results are the strongest evidence to date of a common transcriptome signature in the brains of individuals with ASD.

Preface

Under the supervision of Dr.
Paul Pavlidis, I conducted and authored the work presented henceforth. Willie Kwok performed preliminary research under the mentorship of Dr. Sanja Rogic, who, together with Dr. Paul Pavlidis, contributed to the development of this project.

A version of this work will be submitted to a peer reviewed journal for publication. Carolyn Ch'ng, Willie Kwok, Sanja Rogic, Paul Pavlidis. Meta-analysis of expression profiles in the blood and brains of individuals with autism spectrum disorders (in preparation).

Eloi Mercier provided all the aggregated networks for the network analysis in Chapter 2.

Portions of Chapter 2 are used with permission from Portales-Casamar et al., of which I am a second author. Elodie Portales-Casamar, Carolyn Ch'ng, Frances Lui, Nicolas St-Georges, Anton Zoubarev, Artemis Y. Lai, Mark Lee, Cathy Kwok, Willie Kwok, Luchia Tseng, and Paul Pavlidis. Neurocarta: aggregating and sharing disease-gene relations for the neurosciences. BMC Genomics, 14(1):129, February 2013. ISSN 1471-2164. doi:10.1186/1471-2164-14-129. URL http://www.biomedcentral.com/1471-2164/14/129/abstract. PMID: 23442263.

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
Glossary
Acknowledgments
Dedication
1 Introduction
  1.1 History and early theories in autism
  1.2 Emerging theories in autism genetics
  1.3 The search for convergence in the autism spectrum
    1.3.1 Transcriptomic analysis in ASD
  1.4 Meta-analysis in neuropsychiatry
2 Meta-analysis of gene expression profiles in the blood and brain tissues of individuals with autism spectrum disorders
  2.1 Methods
    2.1.1 Data retrieval, preprocessing and quality control
    2.1.2 Re-analysis of differential expression in existing autism data sets
    2.1.3 Meta-analysis of differentially expressed genes
    2.1.4 Functional enrichment analysis
    2.1.5 Literature derived ASD candidates
    2.1.6 Copy number variation enrichment analysis and prediction classifier
    2.1.7 Network analysis
  2.2 Results
    2.2.1 Systematic review shows technical differences and heterogeneity in independent Autism Spectrum Disorders (ASD) transcriptome studies
    2.2.2 Re-analysis for differential expression
    2.2.3 Meta-analysis of differential expression
    2.2.4 Robust molecular commonalities are more evident in brain samples compared to blood
    2.2.5 Functional analyses reveal perturbations in metabolic processes
    2.2.6 Shared signatures between autism and other neurodevelopmental syndromes
    2.2.7 Meta-signature genes in rare structural variants associated with ASD
    2.2.8 Network analysis and candidate gene characterization
3 Discussion and conclusion
  3.1 Similarities and differences between key findings and previous results
  3.2 Biological interpretations of meta-analyzed ASD expression profiles
  3.3 Limitations and future directions
  3.4 Conclusion
Bibliography
A Appendix

List of Tables

Table 1.1 Data sets from transcriptomic analysis in ASD.
Table 2.1 Summary of platform annotations from Gemma. Number of probes and unique genes for each platform were obtained from the Gemma platform database.
Table 2.2 Summary of tissue sources.
Table 2.3 Samples excluded in each study.
Table 2.4 Summary of diagnosis criteria and ASD phenotypes in the original studies. Refer to Table 1.1 for study citations.
Table 2.5 Demographics I - Gender. Gender imbalance is seen in some data sets, such as GSE37772. OR: Odds ratio.
Table 2.6 Demographics II - Age, PMI and race of subjects in each study. C: Caucasian or white; AA: African American; A: Asian; M: Mixed or multiracial; U: Unknown.
Table 2.7 Differentially expressed genes in each data set after re-analysis.
DE: Differentially expressed genes at FDR threshold of 0.05; Up: Up-regulated genes; Down: Down-regulated genes; Number of genes: Number of genes after applying filters.
Table 2.8 Overlap between results reported in the literature and individual re-analysis of differential expression. Significant probes: Per data set significant probes from re-analysis, reported at a false discovery rate (FDR) threshold of 0.05. Probes reported: Differentially expressed probes published in original papers of each study. Gene symbols are used as a proxy for probes in GSE18123.1; GenBank accessions are used in GSE15451 and GSE15402; Spot IDs are used for GSE7329. GSE25507 computed differences in expression variance instead of differential expression; GSE37772 reported outlier genes instead of differentially expressed genes; GSE32136 is not published.
Table 2.9 Overlap (overlap/total up or down-regulated in data set) between meta-signature (FDR < 0.05) and significantly differentially expressed genes per data set (FDR < 0.05), as well as enrichment of meta-signatures in the results of individual differential expression analysis. One sided p-values were used to compute FDR here. AU-ROC: area under receiver operating characteristic curve; AP: average precision.
Table 2.10 Comparisons of blood and brain signatures. AU-ROC reported for signature of tissue A on ranked gene list from meta-analysis of tissue B (A-B).
Table 2.11 Top genes in the "cellular respiration" Gene Ontology (GO) category at a meta-analysis raw p-value threshold of 0.0001. There are a total of 116 genes in this functional group.
Table 2.12 Top genes in the Simons Foundation Autism Research Initiative (SFARI) "syndromic"
category at a meta-analysis raw p-value threshold of 0.05. There are a total of 19 genes in this gene set.
Table 2.13 Dysregulated genes (FDR < 0.05, meta-signature) within ASD-associated CNV. Fisher's exact test was used to compute significance. NS: Not significant.
Table 2.14 Dysregulated genes in the brain that are found in known ASD CNVs. CNVs that span the same gene or set of genes are grouped together.
Table 2.15 Dysregulated genes in the blood that are found in known ASD CNVs.
Table 2.16 Predictions on GSE37772 samples using preliminary copy number variation (CNV) classifier. CV: cross validated; SV: support vectors; AU-ROC: AU-ROC computed for other 15q samples (originally predicted but not confirmed).
Table 2.17 Categorization of our candidate genes based on Neurocarta. Ndev.: neurodevelopment.
Table 2.18 Meta-signature genes that are also dysregulated in schizophrenia. Meta-analysis FDR < 0.1.
Table 3.1 Comparisons between core signature genes in blood and differentially expressed genes reported in original studies. Total hits: Total hits reported in original study (Genes); Total genes analyzed: Estimated total number of genes analyzed in each study based on Gemma platform annotations.
Table 3.2 Similar to Table 3.1, for core signature genes in the brain.
Table A.1 Up-regulated brain meta-signature. FDR computed before removal of sex-biased genes. A: Known Candidate; B: Gender Biased; C: Known CNV. Y: Yes; N: No.
Table A.2 Down-regulated brain meta-signature. FDR computed before removal of sex-biased genes.
A: Known Candidate; B: Gender Biased; C: Known CNV. Y: Yes; N: No.
Table A.3 Up-regulated blood meta-signature. FDR computed before removal of sex-biased genes. A: Known Candidate; B: Gender Biased; C: Known CNV. Y: Yes; N: No.
Table A.4 Down-regulated blood meta-signature. FDR computed before removal of sex-biased genes. A: Known Candidate; B: Gender Biased; C: Known CNV. Y: Yes; N: No.
Table A.5 Genes that have been shown to exhibit sexual dimorphism in blood and brain. Asterisks denote known ASD candidates.

List of Figures

Figure 1.1 Two possible models leading to similar behavioral characteristics in ASD.
Figure 2.1 Overview of analysis pipeline.
Figure 2.2 Expression profiles of samples from the cerebellum differ from those of the cortex, as seen in the sample correlation matrix of GSE28521.
Figure 2.3 Samples obtained from the temporal cortex and frontal cortex of the same individual exhibit highly correlated expression values. Data from GSE28521 shown here.
Figure 2.4 Batch effects: a) Clustering of datapoints into distinct batches with respective percentage variances, suggesting the presence of batch effects. b) Batch effects were removed after batch correction (includes robust probes only). Each point marks a sample. Colours represent different batches; shapes indicate ASD or control.
Figure 2.5 Comparison between results obtained from the Fisher's method and the Meta-Rank method. The peaks on the left suggest that genes ranked at the top are similar for both methods.
Figure 2.6 pi0 values for each study against sample size. Error bars denote standard errors for 100 bootstrap iterations.
PB: peripheral blood; PBL: peripheral blood lymphocytes; LCL: lymphoblastoid cell lines; WB: whole blood.
Figure 2.7 Profiles of meta-signatures from the blood and brain: raw p-values for each individual data set are plotted against corrected p-values (FDR) of the meta-signatures. Local Polynomial Regression (LOESS) is used to obtain a smooth fit. The shaded areas represent 95% confidence intervals of the prediction using the t-based approximation (see "stat_smooth" in the ggplot2 R package).
Figure 2.8 Heat map visualizations of core-signatures expression values in each of the brain data sets. Batch corrected expression values were scaled across samples within each data set. Relative expression levels: yellow - high; blue - low. A different visualization for each core signature gene can be seen in Fig. 2.11.
Figure 2.9 Gene expression levels of core blood signature. Relative expression levels: yellow - high; blue - low; grey - missing values.
Figure 2.10 P-values of core blood signature in individual studies. Deviation from the diagonal for quantile-quantile plots a) Up-regulated genes. b) Down-regulated genes.
Figure 2.11 P-values of core brain signature in individual studies. a) Up-regulated genes. b) Down-regulated genes.
Figure 2.12 Raw p-values of genes located in 15q11-13 (UBE3A, CYFIP1), Xp22 (CDKL5), and 7q11.23 (RFC2). Top (Q-Q plots): The lack of an overall deviation from the uniform diagonal suggests that the signals are skewed. Bottom: Per-data set p-value with a p-value threshold of 0.05 (dashed grey); genes that meet an FDR threshold of 0.05 in the data set are marked with a triangle. Compare with core-blood signatures in Figure 2.10.
Figure 2.13 PPIN network properties of core candidate genes in the blood and brain compared to that of respective random gene sets.
Figure 2.14 Brain co-expression network properties of core candidate genes in the blood and brain compared to that of respective random gene sets.
Figure 2.15 Liver co-expression network properties of core candidate genes in the blood and brain compared to that of respective random gene sets.
Figure 2.16 Common brain meta-signatures between the autism (current study) and schizophrenia meta-analyses by Mistry et al. a) Up-regulated genes; b) Down-regulated genes.

Glossary

ACRD: Autism Chromosomal Rearrangement Database
ADI-R: Autism Diagnostic Interview, Revised
ADOS: Autism Diagnostic Observation Schedule
ASD: Autism Spectrum Disorders
AP: average precision, equivalent to the area under the precision-recall curve.
AU-ROC: area under receiver operating characteristic curve, equivalent to the Wilcoxon rank-sum (Mann-Whitney U) statistic scaled by the product of the two group sizes.
CNV: copy number variation
DSM-IV: Diagnostic and Statistical Manual of Mental Disorders, 4th edition
DSM-5: Diagnostic and Statistical Manual of Mental Disorders, 5th edition
FDR: false discovery rate
GO: Gene Ontology
MD: mitochondrial dysfunction
PPIN: protein-protein interaction network
PDD-NOS: pervasive developmental disorder not otherwise specified
SFARI: Simons Foundation Autism Research Initiative

Acknowledgments

First and foremost I would like to thank my supervisor, Dr. Paul Pavlidis, whose brilliance and tenacity brought me this far.

I would like to express my deepest appreciation for my thesis committee, comprising Dr. Jennifer Bryan and Dr. Suzanne Lewis, for their time and effort in reviewing my work. Special thanks to Dr. Steven Jones, the program director and examination chair.

To research associates, Dr. Sanja Rogic and Dr. Elodie Portales-Casamar, thank you for providing valuable advice throughout my research.
I would also like to thank past and present Pavlidis lab members for their support.

I am grateful to all investigators and institutions who made their data publicly available, as well as Dr. Christian Marshall (The Centre for Applied Genomics, The Hospital for Sick Children, Toronto), who shared data from the Autism Chromosomal Rearrangement Database (ACRD).

Finally, many thanks to faculty members, staff members and funding agencies of the Canadian Institutes of Health Research (CIHR) Strategic Training Program in Bioinformatics for making this program a fulfilling one.

Dedication

For my family.

Chapter 1
Introduction

Neuropsychiatry has come a long way in the last century. Thanks to technological ingenuity, we saw a shift from Freudian psychoanalysis to the rigorous analyses of high-throughput biological information we have today. Apart from its magnitude, high-throughput biology offers systems level insights which complement targeted approaches in molecular neurobiology.

Biological information is encoded in various molecular forms, such as DNA, RNA and amino acids. The fundamental relationship between these components, as described in the central dogma, is that the most basic encoding, the DNA, is transcribed to RNA and subsequently translated to proteins. New technologies quickly emerged after the completion of the human genome project. These high-throughput technologies yielded a massive amount of molecular data on human illnesses, which were digitized and deposited into bio-repositories. These repositories continue to grow. But for many disorders, particularly complex ones like heart diseases or mental illnesses, the underlying biology remains cryptic. Knowledge accumulated over the years has shown that the mechanisms are far more complicated than anticipated, even more so when the biological system in question is the human brain.

Discussions on the cause of mental illnesses have spanned many disciplines.
But ultimately the symptoms are manifestations of biological processes that occur in a life form, governed by the genome. To better understand the biological basis of a common and complex neuropsychiatric disorder, I conducted a comprehensive investigation of the gene expression profiles (transcriptomes) of individuals with autism. Autism Spectrum Disorders (ASD) encompass a range of neuropsychiatric disorders that together manifest a highly heritable neurodevelopmental disease [2]. ASD is currently characterized by a set of behavioral characteristics including social communication deficits, as well as restrictive and repetitive behaviors (DSM-5 299.00) [3]. In this introduction, I will first review the history of ASD and the current state of research. I will then describe analytical approaches previously applied in neuropsychiatry, in pursuit of biological commonalities that might explain the autistic phenotype.

1.1 History and early theories in autism

According to the Oxford Dictionary, the word autism stems from the Greek word autos, meaning "self". In 1911, Eugen Bleuler used the term "autism" to describe one of the fundamental symptoms in schizophrenia [4]. He likened the autistic behavior to that of monks in monasteries or scientists absorbed with their studies. But schizophrenia has an adult onset. The "autism" that occurs in early childhood, as it is known today, was first described by Leo Kanner in 1943. He presented eleven cases with what he called "infantile autism", representing symptoms that distinguished these children from those with childhood schizophrenia [5]. Some of the symptoms Kanner documented, such as communication impairment (echolalia) and obsessive repetitiousness, are still used in diagnosis to this day. Remarkably, Hans Asperger independently reported similar findings around the same time (1944). Asperger described an "autistic psychopathy" in four boys, whose symptoms were similar to cases reported by Kanner [6].
However, these boys did not have communication impairments. Based on DSM-IV, individuals with this condition, later termed Asperger's syndrome, would receive a separate diagnosis from those with classical autism. The other autism "subtypes" in DSM-IV are autistic disorder and pervasive developmental disorder not otherwise specified (PDD-NOS). But these subtypes were removed in DSM-5 for several non-biological reasons that are still debated.

The implications of Kanner's original report were threefold. First, following its publication in the 1940s, several psychogenic theories emerged, putting the blame on parents or "refrigerator mothers" for their children's autism [7]. This was partially influenced by Kanner's ending remarks, that "there are very few really warmhearted fathers and mothers" in the families of the eleven children studied. Secondly, along with the advent of neuroimaging techniques, Kanner's documentation of enlarged head size in five of his eleven cases shifted some of the focus to neuroanatomical abnormalities in individuals with autism [8]. The third major implication, which arose later in the 1970s, is the possibility of a heritable component or inborn defect in this disorder, given that it occurs in early infancy. Because initial single family or twin pair studies were inconclusive, it was not until 1977, when Folstein and Rutter reported a study of 21 twin pairs, that genetic influences in autism came to light [9]. Folstein and Rutter also acknowledged the potential impact of environmental factors. The exact mode of inheritance was unclear. But as evidence accrued, more studies explored the neurobiology and genetics of autism, which in effect invalidated claims of "bad parenting".

1.2 Emerging theories in autism genetics

ASD is strongly (perhaps primarily) influenced by genetics, but variations in single genes account for only a small fraction of cases. A relatively common ASD-associated single gene aberration, in FMR1, accounts for merely 1-3% of the cases [10].
To understand the genetic basis of idiopathic autism, large collaborations and consortia such as the Autism Genome Project (AGP) and Autism Genetic Resource Exchange (AGRE) were formed. Like their predecessors in linkage mapping, genome wide association studies (GWAS) on different cohorts yielded a few variants or loci that confer risk of ASD. But the effects were weak, in that associations were only established with combined cohorts or "mega-analysis" [11]. Also, the results were not entirely reproducible across studies [12].

Other than monogenic forms of autism, rare copy number variations (CNVs), both transmitted and de novo, are perhaps the next "well-established" genetic variation category contributing to ASD (in terms of the fraction of cases accounted for). Among the structural variants that show compelling evidence of associations with ASD are 7q11.23, 15q11.2-13.1 and 16p11.2 [13, 14]. These findings were replicated in multiple individuals. While some of these structural variants have been implicated in other neurodevelopmental disorders, their etiological role in the brain remains poorly understood.

More variants were identified with the availability of affordable next generation sequencing technology. In a single issue of Nature (Vol. 485, 2012), three high resolution exome sequencing studies were published [15-17], reporting many point mutations in coding regions that presumably originated from germ line mutations (though there could be exceptions where mutations occur in the embryonic stages of development). I emphasize that the "genetics" I discuss here do not necessarily imply some form of inheritance, as the term de novo suggests that pathological changes in risk genes occur in autistic children of healthy or unaffected parents. This is also the case in simplex families where only a single child is diagnosed with autism or where siblings are discordant for autism.
So findings of sporadic cases might explain the incomplete penetrance and variable expressivity. On the other hand, studies that focused on the heritable components have indeed identified deleterious variants in multiplex or consanguineous families [18, 19]. But because these are isolated cases and, like known monogenic forms, usually come in the form of a syndrome, it is difficult to validate how they lead to ASD.

Figure 1.1: Two possible models leading to similar behavioral characteristics in ASD.

The underlying etiology of ASD is still unknown despite the recognition of a common set of behavioral traits and intensive research. At present, there are several hundred ASD candidate genes. What molecular autism genetics has revealed is the genetic heterogeneity of the neurodevelopmental disorder. There is substantial biological variability among individuals with autism, such that genetic variants identified in small fractions of individuals may not be sufficient for solving the "big picture" or understanding the functional architecture of an autistic brain. Given this complexity, there appear to be two models for how ASD arises (Fig. 1.1). One is that many different genetic lesions lead to a common set of changes in the brain, which gives rise to a common range of behavioral traits. Alternatively, the behavioral manifestations may be due to widely varying underlying pathologies. The truth may lie between these two extremes, further complicated by the phenotypic heterogeneity of ASD. In recognizing the different manifestations of the disorder, Uta Frith first coined the term Autism Spectrum Disorder [20].
But there is a feeling that there must, at some level, be aspects in common beyond the behavior so diagnosed or documented. As the search for genetic biomarkers has not been tremendously fruitful, systems level approaches have begun to emerge in search of a unifying theory in autism neurobiology.

1.3 The search for convergence in the autism spectrum

Autism was thought to be an uncommon disorder back in the 1970s when genetic influences were first explored. But with the widening of diagnostic criteria and increasing awareness, the number of autism cases has been rising, with a currently reported average prevalence of approximately 1% (1 in 100 children) worldwide according to the Centers for Disease Control and Prevention (CDC; footnote 1). Therefore, while quests for a unitary genetic element have diminished, there is growing interest in exploring other biological dimensions to find a "common ground" for autism.

Footnote 1: http://www.cdc.gov/ncbddd/autism/data.html, retrieved in 2013.

Besides GWAS, the search for converging endophenotypes or biomarkers has spanned many modalities, including neuroanatomy, proteomics and transcriptomics. Markers highlighted in imaging studies include facial features [21] and neural responses to facial expressions [22]. In a recent protein interactome study, Sakai et al. revealed new interactions among ASD-associated genes [23]. Comparing transcriptomes of groups of individuals with ASD to individuals without ASD has been another approach in the search for biological convergences among ASD cases (Table 1.1).

Of the different biological strata, I am particularly interested in the transcriptome because it marks the initial expression of the genome. An oversimplified concept is that a common transcriptome profile gives rise to higher level similarities observed in the proteome, as well as physiological and anatomical properties of the brain.
Unfortunately, even though some ASD transcriptome studies individually reported striking results, there seems to be little agreement when research findings are compared across related studies. I will provide some background on transcriptome analysis in the section below.

Data set     Platform   Reference         Tissue type                                    Number of samples (ASD : Control)
Brain
GSE28475     GPL6883    Chow et al.       DLPFC, middle frontal gyrus                    13 : 21
GSE28521     GPL6883    Voineagu et al.   Frontal (BA9) / Temporal Cortex (BA41,42,22)   12 : 15
GSE38322     GPL10558   Ginsberg et al.   Occipital Cortex (BA19)                        4 : 6
Brain total                                                                              29 : 42 = 71
Blood
GSE6575      GPL570     Gregg et al.      Whole Blood                                    33 : 11
GSE7329      GPL1708    Nishimura et al.  LCL                                            7 : 6
GSE15402     GPL3427    Hu et al.         LCL                                            77 : 29
GSE15451     GPL3427    Hu et al.         LCL                                            15 : 12
GSE18123.1   GPL570     Kong et al.       Whole Blood                                    64 : 28
GSE18123.2   GPL6244    Kong et al.       Whole Blood                                    93 : 63
GSE25507     GPL570     Alter et al.      PBL                                            80 : 63
GSE32136     GPL3427    Unpublished       LCL                                            9 : 7
GSE37772     GPL6883    Luo et al.        LCL                                            232 : 199
Blood total                                                                              610 : 418 = 1028

DLPFC: dorsolateral prefrontal cortex; LCL: lymphoblastoid cell line; PBL: peripheral blood lymphocytes

Table 1.1: Data sets from transcriptomic analysis in ASD.

1.3.1 Transcriptomic analysis in ASD

I now take up the idea that commonalities among ASD cases might be discerned in the transcriptome. One of the potential benefits of transcriptome analysis is that the transcriptome is removed from what are likely to be diverse genetic influences, though there are exceptions, such as copy number sensitive genes. Furthermore, it has been relatively easy and practical to analyze transcriptomes, compared to analyzing proteomes, which involves more molecular dynamics and kinetics. To provide some perspective on the popularity of transcriptome analysis, the Gene Expression Omnibus (GEO; footnote 2) currently holds 732,789 RNA samples, 173,588 genomic samples and 6,421 protein samples.
The hypothesis is that molecular commonalities might be revealed across transcriptomes of individuals, helping to explain the autistic phenotype regardless of their genetic background or the specific causal variants underlying their autism.

In agreement with this, two previous studies reported some convergence in the transcriptomes of independent ASD cohorts [25, 28]. Nishimura et al. (2007) studied ASD individuals with either maternally derived 15q11-13 duplications (15q for brevity), or fragile-X mutations (FMR1-FM). They reported similarities in the molecular pathways affected between the two syndromes. Voineagu et al. (2011) found evidence for convergent molecular abnormalities between gene expression in post mortem brain samples and an independent cohort from a GWAS. However, while these reports described some agreement within studies, it is not clear how much agreement there is across different studies. For example, Nishimura et al.'s gene list was most enriched for "cell communication"; Voineagu et al. reported enrichment of genes involved in "synaptic function", "vesicular transport" and "neuronal projection". Other transcriptome studies have implicated an even more diverse array of biological functions, ranging from circadian rhythms [29] to metabolism [26].

Discrepancies in research findings can be partially attributed to methodological differences. Among the differences seen in previous transcriptome studies are tissue type and the expression profiling platform used. In autism, many researchers have turned to examining biological samples that are more accessible, namely blood samples. Although analyses of brain samples may be more relevant to the disorder, the limited availability of such samples makes it difficult to achieve adequate statistical power. Unlike the genome, which is, in theory, the same throughout an organism, the transcriptome comprises different compositions of RNA transcripts in different cell types and developmental stages.
Because the blood and brain are fundamentally different tissues with different biological roles, one wonders what the tradeoff is in using blood RNA samples to boost statistical power. A direct comparison between gene expression in the blood and brain was inconclusive [34]. A more recent study suggests that transcriptome profiles of different tissue types are distinguishable [35], so cross tissue comparisons should be done with caution. However, cross tissue comparisons are often omitted in blood transcriptome studies, raising questions about whether accurate functional inferences can be made from the results. I decided to take a more conservative approach, analyzing blood and brain samples separately.

Footnote 2: http://www.ncbi.nlm.nih.gov/geo/summary/?type=samples, retrieved on July 30, 2013.

Autism research is evolving rapidly with next generation sequencing technology, but currently the larger fraction of transcriptome studies has been performed on microarray platforms. Therefore, we will focus on microarray expression profiling data. There is a variety of microarray platforms, but the fundamental process in microarray expression profiling is the hybridization of labeled RNA samples onto the DNA of known genes, also known as probes. The specific details depend on the manufacturer of the microarrays or the platform used. The readouts are similar, in that the abundance of each RNA transcript probed is quantified (within a certain range). However, differences among these platforms have been documented. The inconsistencies have been found to originate from varying experimental protocols, including preprocessing, quality control, and gene annotations [36]. These incompatibilities become a problem when we make comparisons across studies, potentially contributing to the discrepancies in current research findings. Thankfully, such issues have been extensively addressed due to the widespread use of microarray technology [37].
It is possible to reconcile these differences by conducting a systematic re-analysis to obtain comparable expression profiles that are platform or laboratory independent. As far as we know, a detailed comparison or meta-analysis has not been conducted for ASD expression profiling studies. It remains unclear whether there might be more subtle similarities among expression profiles of independent cohorts.

1.4 Meta-analysis in neuropsychiatry

The explosion of genomic data is a double-edged sword. The information we have has propelled biomedical research forward in an unprecedented manner, but with it comes unrelenting chaos in research findings. There are many possible reasons why previous studies report different genes and pathways as being affected in ASD, even if there are commonalities present. One is the difference in tissues or cell types analyzed. Another likely source of the lack of consensus is clinical heterogeneity [38, 39], which might lead to some differences in research populations among studies. Also contributing are methodological differences in the design and implementation of analyses, as mentioned earlier. Finally, the small sample sizes of individual studies might not provide sufficient statistical power to uncover subtle perturbations. These issues may mask reproducible aspects of the transcriptome in ASD, which might be revealed by re-examining the original data and performing a meta-analysis. A systematic meta-analysis can overcome sample size limitations and reduce the effects of methodological differences.

The term "meta-analysis" can be interpreted as the "analysis of analyses" [40]. Like review articles, meta-analyses aim to summarize the findings of multiple independent studies in a field. However, meta-analyses can be more thorough, in that the primary data of each study are integrated and combined quantitatively. The summaries provided in literature reviews are sometimes inconclusive and uninformative due to divergent findings [41].
On the other hand, a meta-analysis can yield new insights that were not previously discovered in individual studies, leading to advances in the research area.

Meta-analysis techniques have been successfully applied in diverse fields, including the social sciences and pharmaceutical studies. They have also been gaining traction in neuropsychiatry in recent years [42-44]. In designing a meta-analysis, one has to account for a number of factors, including the information available for re-analysis and whether it is compatible across studies. Partly driven by the data, different methods were used in previous meta-analyses of expression profiles in neuropsychiatry: Mistry et al. (2013) first combined expression profiles across studies and subsequently tested for differential expression with a fixed effects linear model; Rogic and Pavlidis (2009) reanalyzed individual studies with a fixed effects model and then combined the p-values; Choi et al. (2008) computed a consensus fold change. To our knowledge, cross-cohort gene expression analyses have been done in at most two independent ASD cohorts, primarily for cross validation purposes [25, 31]. Other ASD related meta-analyses are geared towards examining pathogenic variations in whole exomes of individuals [46, 47], not transcriptomes. A systematic integration of expression data across multiple independent ASD cohorts will add value to the existing data, and may yield novel insights. In the next chapters I will present the research methods used, followed by the findings and a discussion of the subject matter.

Chapter 2

Meta-analysis of gene expression profiles in the blood and brain tissues of individuals with autism spectrum disorders

We report the meta-analysis of expression data from twelve ASD transcriptome studies (Table 1.1). Together, they comprise over 1000 human sample microarrays, 639 of which are from ASD individuals (Fig. 2.1).
Our analysis reveals a small number of genes with consistently altered expression levels in the brains or blood of individuals with ASD. The blood and brain profiles are dissimilar, which has profound implications for the interpretation of our findings. Functional analysis performed on the results of the meta-analysis suggests pathological convergence towards neurological and metabolic co-morbidities, both of which have been previously associated with the disorder.

2.1 Methods

2.1.1 Data retrieval, preprocessing and quality control

We retrieved gene expression data sets matching the keywords "autism" or "autistic" from the GEO (footnote 1) [48]. There were no additional unique data sets found in ArrayExpress (footnote 2).

Footnote 1: http://www.ncbi.nlm.nih.gov/geo/, retrieved on September 10, 2012.
Footnote 2: http://www.ebi.ac.uk/arrayexpress/

Figure 2.1: Overview of analysis pipeline.

Short-listed data sets include human blood and brain expression profiling studies with case-control experiment designs only. A preliminary analysis was conducted on these data sets. Two studies in the initial pool, GSE4187 and GSE26415, were disqualified from analysis. GSE4187 was removed to avoid biasing the analysis, because all of its autistic subjects (case-control channels) are also included in GSE15402 (after removing outlying samples). GSE26415 was disqualified as an outlier, based on preliminary analysis. This data set exhibits an implausibly large number of differentially expressed genes, to the extent that it provided some evidence of differential expression for nearly all genes (the estimated overall fraction of differentially expressed genes is 70% based on q-value analysis; applying a false discovery rate (FDR) threshold of 0.05 yields 4857 differentially expressed genes, nearly all being up-regulated in the ASD cases). The reason for this unusual finding is unclear but is in agreement with the original report by Kuwano et al.
[49], who found that 9784 probes were differentially expressed at an FDR of 0.05.

The final set of twelve studies (Table 1.1) consists of data collected on a variety of platforms, including one channel intensity data from Affymetrix and Illumina platforms, and two channel intensity data from Agilent and TIGR platforms (Table 2.1).

| Platform | Platform name | Gemma probes | Unique genes |
|---|---|---|---|
| GPL10558 | Illumina HumanHT-12 V4.0 expression beadchip | 47323 | 22020 |
| GPL1708 | Agilent-012391 Whole Human Genome Oligo Microarray G4112A | 44347 | 19123 |
| GPL3427 | TIGR 40k Human Array | 41472 | 15422 |
| GPL570 | Affymetrix Human Genome U133 Plus 2.0 Array | 54681 | 19460 |
| GPL6244 | Affymetrix Human Gene 1.0 ST Array | 33297 | 20404 |
| GPL6883 | Illumina HumanRef-8 v3.0 expression beadchip | 24526 | 17562 |

Table 2.1: Summary of platform annotations from Gemma. The number of probes and unique genes for each platform were obtained from the Gemma platform database.

Raw data (e.g., .CEL, .MEV) are often processed with various methods that differ across studies. Whenever possible, we downloaded raw data files for data sets on Affymetrix and Illumina platforms and uniformly preprocessed them locally. Affymetrix data sets were subjected to Robust Multi-array Average (RMA) processing from the affy [50] package in Bioconductor. Illumina data sets were quantile normalized and log2 transformed using the lumi [51] package. Data sets on the TIGR platform were not locally preprocessed, as the submitters' preprocessing methods are similar across the studies. There is only one data set that uses Agilent arrays, so standard preprocessing was not necessary. Sample sources and tissue types for each data set are specified in Table 2.2.

The processed data were then subjected to an additional set of quality controls. We identified and excluded 17 samples that were used in more than one study, retaining data for the samples in the study with the smaller sample size.
We removed eight samples from subjects with syndromic disorders of known genetic etiology (Fragile-X syndrome), nine non-ASD cases with mental retardation, as well as samples which were prepared differently from the rest of the samples (e.g. formalin fixed). Cerebellar expression profiles differ from those of the cortex (Fig. 2.2). Ideally we would analyze the cerebellum separately; however, only two of the three brain data sets had samples from the cerebellum, so we excluded them entirely from the analysis. Further details of exclusion are in Table 2.3.

| Data set | Tissue type | Tissue source |
|---|---|---|
| Brain | | |
| GSE28475 | Dorsal lateral prefrontal cortex, middle frontal gyrus | HBTRC, NICHD |
| GSE28521 | Frontal cortex (BA9), temporal cortex (BA41/42 or BA22) | ATP, HBB |
| GSE38322 | BA19 | ATP, HBTRC, NICHD |
| Blood | | |
| GSE6575 | Whole blood | CHARGE |
| GSE7329 | Lymphoblastoid cell lines | AGRE |
| GSE15402 | Lymphoblastoid cell lines | AGRE |
| GSE15451 | Lymphoblastoid cell lines | AGRE |
| GSE18123.1 | Whole blood | CHB, ACB |
| GSE18123.2 | Whole blood | CHB, ACB |
| GSE25507 | Peripheral blood lymphocytes | Phoenix |
| GSE32136 | Lymphoblastoid cell lines | AGRE |
| GSE37772 | Lymphoblastoid cell lines | Simons Simplex Collection |

HBTRC: Harvard Brain Tissue Resource Centre. NICHD: National Institute for Child Health and Human Development Brain and Tissue Bank. ATP: Autism Tissue Program. HBB: Harvard Brain Bank. CHARGE: Childhood Autism Risks from Genetics and the Environment. AGRE: Autism Genetic Resource Exchange. CHB: Children's Hospital Boston. ACB: Autism Consortium Boston.

Table 2.2: Summary of tissue sources.

We removed sample outliers in each study using a sample correlation analysis. Outlying samples were identified as those with a correlation more than two standard deviations from the mean sample-to-sample expression profile correlation, and were removed iteratively until no samples met the threshold for removal. This resulted in the removal of a total of 54 samples, affecting seven studies.
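The iterative outlier filter described above can be illustrated with a minimal Python sketch. It is not the code used for this thesis. It rests on several assumptions that the text does not specify: each sample is scored by its mean Pearson correlation with the other retained samples, only samples that fall below the mean by more than two sample standard deviations are flagged, and one sample (the lowest scoring) is dropped per iteration. The function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def remove_outlier_samples(profiles, n_sd=2.0):
    """Iteratively drop poorly correlated samples (assumed scoring rule).

    profiles: dict mapping sample ID -> list of expression values, with the
    same genes in the same order for every sample.
    Returns (kept_ids, removed_ids).
    """
    kept = list(profiles)
    removed = []
    while len(kept) > 3:
        # Score = mean correlation with every other retained sample.
        score = {}
        for s in kept:
            others = [pearson(profiles[s], profiles[t]) for t in kept if t != s]
            score[s] = sum(others) / len(others)
        values = list(score.values())
        mean = sum(values) / len(values)
        sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))
        worst = min(kept, key=score.get)
        # Stop once the lowest-scoring sample is within n_sd SDs of the mean.
        if sd == 0 or score[worst] >= mean - n_sd * sd:
            break
        kept.remove(worst)
        removed.append(worst)
    return kept, removed
```

Because the mean and standard deviation are recomputed after each removal, a sample that looked acceptable next to an extreme outlier can still be flagged in a later pass, which is the reason for iterating.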
The remaining samples in the data set were renormalized using quantile normalization.

Figure 2.2: Expression profiles of samples from the cerebellum differ from those of the cortex, as seen in the sample correlation matrix of GSE28521.

| Accession | Samples excluded |
|---|---|
| GSE6575 | Removed non-ASD subjects with mental retardation or developmental delay. |
| GSE7329 | Removed samples with a Fragile X (FMR1-FM) mutation. Removed samples in the 2005 batch (scan date). |
| GSE15402 | Removed samples in batches not performed by user "KyungS". |
| GSE15451 | Removed tissue samples that overlap with GSE32136 [Blood ID: HI0779, HI2022, HI2772, HI3143, HI3914, HI2044, HI2769, HI3144, HI4360, HI0777], as well as a subject that overlaps with GSE15402 [Subject ID: AU0325]. |
| GSE18123.2 | Excluded samples without an assigned batch. |
| GSE28475 | Excluded seizure samples, in-vitro transcription (IVT) assays. Excluded formalin-fixed samples. Removed samples that are also present in GSE38322 [Subject ID: UMB4670, UMB1860]. |
| GSE28521 | Excluded samples from the cerebellum. Removed subjects that overlap with subjects in GSE38322 [Subject ID: AN19511, AN06420, AN08873, AN10833]. |
| GSE32136 | Excluded propionic acid (PPA) treated samples and samples in batch "bcmmes" (user ID). |
| GSE38322 | Excluded samples from the cerebellum. |
| GSE37772 | Excluded samples from mothers. |

Table 2.3: Samples excluded in each study.

The identification of independent units is crucial for statistical analysis, because hidden correlations (Figure 2.3) can lead to biases and inflate statistical significance [52]. As the question of interest in our study concerns a biological comparison between whole organisms, that is, individuals with ASD and individuals without ASD, an independent biological unit is equivalent to a single sample from a unique subject. Multiple samples obtained from the same subject were regarded as technical replicates.
Two of the studies (GSE28521, GSE28475) included such technical replicates for some specimens, in which case we computed the mean of the expression values to get a single expression profile for each subject.

Figure 2.3: Samples obtained from the temporal cortex and frontal cortex of the same individual exhibit highly correlated expression values. Data from GSE28521 are shown here.

We also looked for possible batch effects whenever batch information was available. Batch information was obtained by automated extraction of "scan dates" or "users" from raw data files, as well as from supplementary texts and metadata provided by submitters. To detect possible batch effects, we compared the first two principal components to the batch data, and visually identified groups that are separated by the principal components. The amount of variation explained by each principal component is also reported (Figure 2.4). In this analysis, we discovered 34 samples in which batch effects were confounded with the case grouping. These samples were removed. In other data sets, we corrected for possible batch effects using ComBat [53] after discarding probes that are missing in more than 20% of the samples. Batch effects could not be corrected in GSE28521 because the batches were too small for priors to be estimated in ComBat. Illumina slide numbers were used as batch information here.

Figure 2.4: Batch effects. (a) Clustering of data points into distinct batches with respective percentage variances, suggesting the presence of batch effects. (b) Batch effects were removed after batch correction (includes robust probes only). Each point marks a sample. Colours represent different batches; shapes indicate ASD or control.

2.1.2 Re-analysis of differential expression in existing autism datasets

Differential expression analysis was conducted using analysis of variance (ANOVA) based on an empirical Bayes approach provided in the limma R package.
For each data set, we conducted a two-group disease-control comparison for all probes. Phenotypic subgroups such as Asperger's disorder and PDD-NOS were pooled into one generic autism disease group. To consider the direction of expression change in the meta-analyses, we computed one-tailed p-values from the resulting two-tailed p-values and t-statistics. Probes were annotated with platform specific annotations in Gemma (footnote 3), where gene assignments are made based on current genome annotations obtained via sequence analysis [54]. Each data set was then collapsed to the gene level to allow cross-platform integration. When multiple probes map to a single gene, we assign the Bonferroni corrected minimum p-value (the p-value of the best scoring probe, n × min(p) < α) to the gene, as the smallest p-value is least likely to occur by chance [37]. We excluded from the analysis probes that map to multiple genes or do not map to a gene at all. The proportion of differentially expressed genes (π1 = 1 - π0) was estimated using the qvalue package in R. As the internal "bootstrap" method in the package does not return a standard error, we computed π0 standard errors over one hundred bootstrap iterations locally. The π0 values from the "bootstrap" and "smoother" methods were similar.

We compared the results from our re-analysis to those of previous publications, and evaluated the outcomes using the area under the receiver operating characteristic curve (AU-ROC), as well as average precision (AP). The AU-ROC gives the probability that a true positive is ranked higher than a false positive. AP gives the proportion of correct hits among the top N ranked genes, averaged across N = {1...n}, where n is the total number of genes in the rankings. While AU-ROC is an evaluation of where the true positives lie relative to the false positives, AP is sensitive to genes with higher rankings (top hits).
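The two threshold free metrics can be sketched in a few lines of Python. This is an illustration only, not the code used in our analysis. The function names and gene IDs are hypothetical, the ranking is assumed to contain no ties, and average precision follows the common information-retrieval definition (precision at the rank of each true positive, averaged over the true positives), which may differ in detail from the computation used here.

```python
def auroc(ranked_genes, positives):
    """Probability that a random positive outranks a random negative.

    ranked_genes: list of gene IDs ordered from best to worst score.
    positives: iterable of gene IDs regarded as true positives.
    """
    positives = set(positives) & set(ranked_genes)
    n_pos = len(positives)
    n_neg = len(ranked_genes) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative")
    negatives_seen = 0
    pairs_won = 0  # (positive, negative) pairs where the positive ranks higher
    for gene in reversed(ranked_genes):  # walk from worst to best
        if gene in positives:
            pairs_won += negatives_seen
        else:
            negatives_seen += 1
    return pairs_won / (n_pos * n_neg)


def average_precision(ranked_genes, positives):
    """Mean of the precision values measured at the rank of each positive."""
    positives = set(positives) & set(ranked_genes)
    if not positives:
        raise ValueError("need at least one positive")
    hits = 0
    total = 0.0
    for rank, gene in enumerate(ranked_genes, start=1):
        if gene in positives:
            hits += 1
            total += hits / rank
    return total / len(positives)
```

Moving a single positive from rank 2 to rank 1 changes average precision far more than moving one from rank 1002 to rank 1001, whereas the AU-ROC changes by the same amount in both cases. This is the sense in which AP emphasizes the top hits.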
The same threshold free methods are also used in later sections.

2.1.3 Meta-analysis of differentially expressed genes

Several meta-analysis techniques have been implemented in the previous studies mentioned in the introduction. We abandoned the idea of combining raw data because the studies we surveyed were conducted on very different platforms. One example of incompatibility is that data from two channel arrays are reported as expression ratios, whereas one channel arrays provide expression levels. This leaves us with the option of analyzing individual studies separately, and subsequently combining the results. As we have seen in the examples, one can combine either the significance levels (p-values) or the effect sizes (fold changes) of the genes. These two approaches are related, but because the size of the study is accounted for in the test of significance [45], we chose to combine p-values, as implemented by Rogic and Pavlidis (2009).

Footnote 3: http://www.chibi.ubc.ca/Gemma/arrays/showAllArrayDesigns.html, retrieved on October 10, 2012.

Fisher's combined probability test [55] was applied independently to the blood and brain data sets. Genes were only analyzed if they were represented in at least three data sets in each of the meta-analyses. In total, 18994 and 16591 genes were included in the blood and brain meta-analyses respectively. The resulting p-values were corrected for multiple testing using the Benjamini-Hochberg false discovery rate approach [56]. A second meta-analysis method, "Meta-Rank analysis", gave similar results. Details of both methods are provided in the following subsections. Because of the gender imbalance in some of the data sets, we excluded from downstream analysis genes which were known or strongly suspected to show changes in expression between genders.
This set of genes includes Y-linked genes, X-linked genes that escape X-inactivation (genes with strong evidence only; n = 61) [57], as well as autosomal genes previously shown to exhibit sexual dimorphism [58, 59] (total number of genes excluded: Brain = 202; Blood = 116; see A.5). We note that some of the genes so filtered (e.g., USP9Y and KDM5C) have been previously associated with ASD, but we were not confident that we could discriminate gender effects from disease effects for them in our analysis.

The combined probability method is sensitive to outliers; that is, a single study with a very low p-value can result in statistical significance even when the other studies provide little evidence for rejection of the null. To control for this, we used a jackknife approach to further select for genes that are robust to statistical outliers (a similar approach was used in Mistry et al. (2013)). The jackknife procedure involves repeating the meta-analysis k times, where k is the number of data sets. For each trial j (j = 1, ..., k), the jth data set is left out. The agreement among these k jackknife meta-analyses was used as a basis for identifying a "core" signature that excludes genes appearing due to the influence of a single data set.

Fisher's combined probability test

Fisher's combined probability test [55] takes the raw p-values calculated from the individual differential expression analyses for a gene across all data sets and combines them to generate a summary statistic (S) using the equation:

$$S_i = -2 \sum_{j=1}^{k} \log(p_j)$$

where k = number of studies and p_j = p-value of gene i in study j.

Using this method, we combined our results from multiple independent tests, all having the same null hypothesis (no difference between autism and control groups). Under the null hypothesis, the resulting test statistic has a χ² distribution. P-values for the meta-analysis can then be obtained from the test statistic using the χ² distribution with 2k degrees of freedom.
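As a hedged illustration of the statistic above and of the jackknife described earlier in this section, the following Python sketch computes S and its p-value for a single gene, then repeats the test with each study left out. It is not the implementation used in this thesis. The function names and the example study labels are hypothetical. The p-value uses the closed form of the χ² upper tail that holds when the degrees of freedom are even (here 2k), so no statistics library is needed.

```python
import math


def fisher_combined(pvalues):
    """Return (S, combined p-value) for one gene from its per-study p-values."""
    k = len(pvalues)
    if k == 0 or any(not 0.0 < p <= 1.0 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    s = -2.0 * sum(math.log(p) for p in pvalues)
    # Chi-square upper tail with 2k degrees of freedom:
    # P(X >= s) = exp(-s/2) * sum_{i=0}^{k-1} (s/2)^i / i!
    half = s / 2.0
    term = 1.0
    tail = 0.0
    for i in range(k):
        if i > 0:
            term *= half / i
        tail += term
    return s, min(1.0, math.exp(-half) * tail)


def jackknife_combined(pvalues_by_study):
    """Leave each study out in turn.

    pvalues_by_study: dict mapping study label -> p-value for one gene.
    Returns a dict mapping the left-out study -> combined p-value of the rest.
    """
    result = {}
    for left_out in pvalues_by_study:
        rest = [p for study, p in pvalues_by_study.items() if study != left_out]
        result[left_out] = fisher_combined(rest)[1]
    return result


# Hypothetical gene whose signal comes from study "A" alone.
example = {"A": 1e-8, "B": 0.6, "C": 0.7, "D": 0.5}
full_p = fisher_combined(list(example.values()))[1]
leave_one_out = jackknife_combined(example)
```

In this hypothetical example the combined p-value over all four studies is very small, but it rises to roughly 0.8 when study A is left out, so such a gene would not be considered part of a jackknife-robust signature.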
We corrected the p-values for multiple testing using the Benjamini-Hochberg [56] correction method. The resulting FDR represents the expected proportion of false positives among all the positive results returned at a given threshold.

Meta-rank analysis

The meta-rank analysis is a rank aggregation strategy that uses the average rank of a gene instead of a combination of p-values. For each gene, its rank within a data set was determined by the order of the p-values. The smallest p-value in an experiment would have the highest rank. The meta-rank of gene i is computed by averaging the ranks across data sets 1 to k,

$$R_i = \frac{1}{k} \sum_{j=1}^{k} R_{ij}$$

The ranks of these averages were then computed for all genes. Unlike Fisher's method, where individual data sets that produce deviant significance values can be detected, this method is less sensitive to the influence of a single data set. Another difference is that the distribution of ranks cannot be represented by a known statistical distribution.

In order to obtain a test statistic, we computed the permutation null distribution. We randomly permuted the p-values in each study, and recalculated the metric R. Repeating the process 10000 times gives an M × 10000 matrix of permuted values, where M is the number of genes. The permutation null is then the empirical distribution of all values in that matrix. We can assess significance by testing against the permutation null.

F(x) is the empirical cumulative distribution function of the permutation null, where x is a random variable, which is, in this case, the meta-rank of gene i. This was computed for the number of studies k = 3, 4, 5, ..., 9 for the null distribution of blood data and k = 3 for brain data. To ensure that the meta-signature genes are not sensitive to the choice of the meta-analysis method used, we reanalyzed both blood and brain data sets using the method described.
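The meta-rank procedure and its permutation null can be sketched as follows. This is a simplified Python illustration rather than the code used for this thesis. It assumes that every gene is measured in every study (the actual analysis handled genes present in different numbers of data sets), breaks ties by input order, takes rank 1 to be the smallest p-value, and defaults to fewer permutations than the 10000 used here. All function names are hypothetical.

```python
import bisect
import random


def rank_pvalues(pvalues):
    """Rank p-values within one study; rank 1 is the smallest p-value."""
    order = sorted(range(len(pvalues)), key=lambda g: pvalues[g])
    ranks = [0] * len(pvalues)
    for rank, gene_index in enumerate(order, start=1):
        ranks[gene_index] = rank
    return ranks


def meta_rank(pvalue_matrix):
    """pvalue_matrix[j][i] is the p-value of gene i in study j.

    Returns the mean rank R_i of each gene across the k studies.
    """
    k = len(pvalue_matrix)
    per_study = [rank_pvalues(study) for study in pvalue_matrix]
    n_genes = len(pvalue_matrix[0])
    return [sum(ranks[i] for ranks in per_study) / k for i in range(n_genes)]


def permutation_pvalues(pvalue_matrix, n_perm=1000, seed=0):
    """Empirical p-value F(R_i) of each gene against the permutation null."""
    rng = random.Random(seed)
    observed = meta_rank(pvalue_matrix)
    null = []
    for _ in range(n_perm):
        # Shuffle p-values independently within each study.
        shuffled = [rng.sample(study, len(study)) for study in pvalue_matrix]
        null.extend(meta_rank(shuffled))
    null.sort()
    # F(x): fraction of null meta-ranks that are <= the observed meta-rank.
    return [bisect.bisect_right(null, r) / len(null) for r in observed]
```

Pooling the permuted meta-ranks of all genes into one null distribution mirrors the description above, in which the null is the empirical distribution of all values in the M × 10000 matrix.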
The rank correlations between these two methods are 0.80 and 0.59 for brain and blood data sets respectively (averaged over up-regulated and down-regulated lists), suggesting that there are some discrepancies between these methods. However, by quantifying the predictive power of this method with respect to meta-signatures from Fisher's method, we observed that the choice of method does not have a substantial effect on our selection of the top hits in the signatures (Figure 2.5). Subsequent functional analyses were thus based on results from Fisher's method.

Figure 2.5: Comparison between results obtained from Fisher's method and the Meta-Rank method. The peaks on the left suggest that genes ranked at the top are similar for both methods.

2.1.4 Functional enrichment analysis

Gene set enrichment analysis was performed using ErmineJ Version 3.0 (footnote 4) [60], a software package for determining enrichment of Gene Ontology (GO) [61] terms for a given gene list. GO terms represent a controlled vocabulary that links a certain molecular function, biological process, or cellular component to genes. We focused on terms under the "biological process" tree for our analysis. Significant enrichment of a specific GO term could suggest that the gene list is enriched for genes involved in a particular biological process. We used the Precision-Recall method. Precision-Recall uses average precision as a scoring function, thus it is sensitive to genes at the top of the rankings, without having to set a threshold.

Footnote 4: http://erminej.chibi.ubc.ca

For each run, the negative log of Fisher's corrected p-values was used as the input, testing against gene sets within the 5-200 size range over 500000 iterations [62]. ErmineJ also accounts for the "multifunctionality" bias of the gene sets (refer to ErmineJ's manual (footnote 5) for more details on "multifunctionality").
Gene sets that are less affected by this bias are prioritized.

We also downloaded candidate gene categories from the Simons Foundation Autism Research Initiative (SFARI) (footnote 6) database. Seven categories were established by SFARI based on their gene scoring system: 1. High Confidence; 2. Strong Candidate; 3. Suggestive Evidence; 4. Minimal Evidence; 5. Hypothesized; 6. Not Supported; S. Syndromic. Categories 1 and 6 were excluded from the analysis. Category 1 was excluded because none of the genes met SFARI's requirements for the "High Confidence" category (zero genes). Category 6 is irrelevant because these genes show no association with ASD.

2.1.5 Literature derived ASD candidates

Known ASD candidate genes were downloaded from Neurocarta (footnote 7) [1], a knowledge base of gene-phenotype associations encompassing 664 unique genes linked to ASD. This includes candidate genes from model organisms (mouse = 11, zebrafish = 1), which were mapped to their human homologs using HomoloGene (footnote 8) [63]. In addition to the studies in Table 1.1, differentially expressed genes reported in two additional expression profiling studies [64, 65], for which data were not publicly available, were also obtained and compared to our results. Because autism is historically associated with schizophrenia, we were also interested to see if there were similarities between the autism meta-signatures and schizophrenia meta-signatures. We obtained differentially expressed genes reported in a meta-analysis of schizophrenia expression profiles [42] and compared them with our brain meta-signatures.

2.1.6 Copy number variation enrichment analysis and prediction classifier

We collated CNV data from the Autism Chromosomal Rearrangement Database (ACRD) (footnote 9) [14], Sanders et al. [13] (Table S4 in original study) as well as Pinto et al. [66] (Table S8 in original study), obtaining a total of 1023 CNVs.
These variants are thought to be pathological, but to ensure uniformity, we computed their frequencies in the Database of Genomic Variants (DGV) (footnote 10) [67] as described in Sanders et al. We identified seven common CNVs. After lifting the genomic coordinates of genes over to hg18 (to match the CNV data) using the UCSC liftOver tool (footnote 11) [68], Fisher's exact test was used to compute global enrichment of dysregulated genes in ASD-associated rare CNVs (n = 1016). To reduce the amount of overlap for the exact test, we merged individual CNVs into CNV regions using a 90% reciprocal overlap. We also merged small CNVs that are completely nested within larger CNVs by taking the breakpoints of the larger CNV (union). The total number of merged CNV regions is 732 (Gain = 385, Loss = 340, Unknown = 1). We included all classes of CNV transmissions (inherited, de novo, unknown). Restricting our analysis to de novo CNVs did not substantially affect the results.

Footnote 5: http://erminej.chibi.ubc.ca/help/tutorials/multifunctionality/
Footnote 6: www.sfari.org, retrieved in December 2012.
Footnote 7: www.neurocarta.chibi.ubc.ca, retrieved in February 2013.
Footnote 8: ftp://ftp.ncbi.nih.gov/pub/HomoloGene, build 67.
Footnote 9: http://projects.tcag.ca/autism/

Using expression profiling data, we also sought to predict the CNV status of an individual. We used Gist (footnote 12) [69], a support vector machine and kernel principal components analysis software toolkit, to build a preliminary CNV classifier. The performance of the resulting classifier was evaluated using leave-one-out cross validation. We used expression profiles from GSE7329 for training and GSE37772 for testing. Because the training and test data differ in several respects, we attempted to make the data comparable by scaling the expression values or computing values relative to the expression level of a housekeeping gene, GAPDH.
Because this is exploratory, we used default parameters, focusing on the predicted gene ranks (based on discriminants) rather than the predicted class labels.

2.1.7 Network analysis

We conducted the network analysis on a human protein-protein interaction network (PPIN). The PPIN comprises data from the Human Protein Reference Database (HPRD) [70], Molecular Interaction Database (MINT) [71], Database of Interacting Proteins (DIP) [72], innateDB [73] and irefIndex [74]. With the aggregated network, we computed local network properties for the core candidate gene sets. To construct permutation distributions of the average shortest path length (Dijkstra's algorithm) and local clustering coefficient [42], 10000 random gene sets of similar size and node degree (±50 window) were sampled from the network. The same analysis was repeated on aggregated co-expression networks of brain or liver tissues, except that it was performed separately for up-regulated and down-regulated gene sets.

Footnote 10: http://projects.tcag.ca/variation/, retrieved in March 2013.
Footnote 11: http://genome.ucsc.edu/cgi-bin/hgLiftOver
Footnote 12: http://www.chibi.ubc.ca/gist/

2.2 Results

2.2.1 Systematic review shows technical differences and heterogeneity in independent ASD transcriptome studies

We analyzed twelve independent ASD expression profiling studies and identified differences in microarray preprocessing and data quality control. To ensure comparability among data sets from different laboratories, we identified and corrected for technical variation where possible (Figure 2.1). From the original total of 1371 samples, the resulting data after quality control comprise 639 ASD microarray samples and 460 controls, which sum to 1099 samples from both blood-derived and brain tissues. As summarized in Table 2.4, there are differences in the criteria used to select the pool of ASD individuals.
Some individuals were diagnosed based on DSM-IV [75]; others were assessed using alternative forms of evaluation such as the Autism Diagnostic Interview, Revised (ADI-R) [76] and the Autism Diagnostic Observation Schedule (ADOS) [77]. More importantly, the range of autistic phenotypes included in each cohort differs, particularly among the blood studies. While some focused on classical autism, others included milder forms such as Asperger's syndrome and PDD-NOS.

We then compared the original lists of significantly differentially expressed genes to see if there is any consistency in previous results. None of the reported genes overlapped across all brain data sets or all blood data sets. Though this is partially due to different significance thresholds or filters (so there could be some overlaps in smaller subsets of studies that are more similar in some aspects), there is generally no evidence of concordance. We expect the heterogeneity to influence the results of our study too. In later sections, we will demonstrate our approach to circumventing the problem and the results achieved.

Because ASD is generally more prevalent in males than females [29, 30], we also investigated whether gender imbalance was a factor affecting study designs. Indeed, a few studies showed evidence of gender imbalance (Table 2.5).
There were no striking differences in the age, race and post-mortem interval (the latter being relevant to brain studies only) between cases and controls of each study (Table 2.6).

| Data set | Diagnosis criteria | Phenotypic descriptions |
|---|---|---|
| Brain | | |
| GSE28475 | ADI-R, ADOS, TARF, medical records | Autism |
| GSE28521 | Available upon request, includes ADI-R diagnostic scores, AN-Brain Bank Case Number | Autism |
| GSE38322 | ADI-R | Autism |
| Blood | | |
| GSE6575 | DSM-IV, ADI-R, ADOS | Autism no regression, autism with regression |
| GSE7329 | ADI-R, ADOS, Raven-IQ | ASD with dup(15q) |
| GSE15402 | ADI-R, Raven's score, Peabody Picture Vocabulary Test | ASD (a) |
| GSE15451 | ADI-R | ASD (b) |
| GSE18123.1 | DSM-IV-TR, ADOS, ADI-R, comprehensive clinical testing | Autism, Asperger's Disorder, PDD-NOS |
| GSE18123.2 | DSM-IV-TR, ADOS, ADI-R, comprehensive clinical testing | Autism, Asperger's Disorder, PDD-NOS |
| GSE25507 | DSM-IV, ADOS, ADI-R | classical autism |
| GSE32136 | - | ASD |
| GSE37772 | Refer to the SFARI database for phenotype information | Autism |

(a) Language, Mild, Savant (cluster analysis of ADI-R scores). (b) Severe language impairment (cluster analysis of ADI-R scores).
ADI-R: Autism Diagnostic Interview-Revised. ADOS: Autism Diagnostic Observation Schedule. TARF: The Autism Research Foundation. DSM-IV: Diagnostic and Statistical Manual of Mental Disorders IV (TR: text revision). SFARI: Simons Foundation Autism Research Initiative.

Table 2.4: Summary of diagnosis criteria and ASD phenotypes in the original studies. Refer to Table 1.1 for study citations.

| Data set | ASD male | ASD female | Control male | Control female | OR | Total |
|---|---|---|---|---|---|---|
| Brain | | | | | | |
| GSE28475 | 11 | 2 | 18 | 3 | 0.92 | 34 |
| GSE28521 | 8 | 4 | 14 | 1 | 0.14 | 27 |
| GSE38322 | 4 | 0 | 6 | 0 | - | 10 |
| Blood | | | | | | |
| GSE6575 | 28 | 5 | 8 | 3 | 2.10 | 44 |
| GSE7329 | 7 | 0 | 6 | 0 | - | 13 |
| GSE15402 | 77 | 0 | 29 | 0 | - | 106 |
| GSE15451 | 15 | 0 | 12 | 0 | - | 27 |
| GSE18123.1 | 64 | 0 | 28 | 0 | - | 92 |
| GSE18123.2 | 72 | 21 | 30 | 33 | 3.80 | 156 |
| GSE25507 | 80 | 0 | 63 | 0 | - | 143 |
| GSE32136 | 9 | 0 | 7 | 0 | - | 16 |
| GSE37772 | 198 | 34 | 105 | 94 | 5.20 | 431 |

Table 2.5: Demographics I - Gender. Gender imbalance is seen in some data sets, such as GSE37772.
OR: Odds ratio.Age range (years) PMI RaceASD Control ASD ControlBrainGSE28475 2-56 3-56 4-43.25 5-36 C, AA, U, MGSE28521 5-51 6.75-43.25 4.75-32.92 16-56 Predominantly C, AGSE38322 2-39 4-22.5 13-24.2 1-60 C, UBloodGSE6575 matched matched - - -GSE7329 - - - - -GSE15402 5-28 3-34 - - Predominantly C, A, MGSE15451 4-12 2-12 - - Predominantly C, UGSE18123.1 3.4-17.5 2.8-16 - - Predominantly C, A, U, MGSE18123.2 2-21 2.5-22 - - Predominantly C, A, U, MGSE25507 2-14 3-11 - - Primarily CGSE32136 - - - - -GSE37772 4-17.7 3-23.8 - - C and non-CTable 2.6: Demographics II - Age, PMI and race of subjects in each study. C: Caucasianor white; AA: African American; A: Asian; M: Mixed or multiracial; U: Unknown262.2.2 Re-analysis for differential expressionThe first stage of our meta-analysis was to analyze each data set individually for differentialexpression. The results are summarized in Table 2.7. Most data sets had low levels of dif-ferential expression, but a few range up to hundreds of significantly differentially expressedgenes at a FDR of 0.05.DE Up Down 1?pi0 Number of genes Number of samplesBrainGSE28475 0 0 0 0.20 16598 34GSE28521 4 1 3 0.25 16598 27GSE38322 0 0 0 0.15 19558 10BloodGSE6575 0 0 0 0.00 18305 44GSE7329 314 160 154 0.41 17159 13GSE15402 5 1 4 0.11 9821 106GSE15451 0 0 0 0.04 12066 27GSE18123.1 333 103 230 0.27 18305 92GSE18123.2 57 35 22 0.47 18617 156GSE25507 2 2 0 0.28 18305 143GSE32136 0 0 0 0.10 8100 16GSE37772 0 0 0 0.00 16598 431Table 2.7: Differentially expressed genes in each data set after re-analysis. DE: Dif-ferentially expressed genes at FDR threshold of 0.05; Up: Up-regulated genes;Down: Down-regulated genes; Number of genes: Number of genes after apply-ing filters.We checked if sample size or FDR threshold could explain the variable amount of dif-ferential expression. If one assumes the effect size of ASD on expression is similar acrossstudies, the amount of differential expression should be consistent. 
A comparison between the estimated proportion of differentially expressed genes (1 - pi0) and sample size shows that this is clearly not the case for these data (Fig. 2.6). We also grouped pi0 values by tissue type, cell type and platform type, but there were no obvious trends. Other factors, such as phenotype heterogeneity or comorbidities, might explain this phenomenon, but we were unable to quantify or directly address them with the information available.

Figure 2.6: pi0 values for each study against sample size. Error bars denote standard errors for 100 bootstrap iterations. PB: peripheral blood; PBL: peripheral blood lymphocytes; LCL: lymphoblastoid cell lines; WB: whole blood.

We next compared the result of each analysis to that previously published for the same data set, where available. This was done by examining where the differentially expressed genes from the original studies rank in our results (using the AU-ROC). Despite the extensive additional data cleanup we performed and differences in the statistical analysis methods, our re-analyses were generally concordant with the original reports (Table 2.8).

| Data set | Significant probes | Probes reported | Overlap | AUC | Precision (%) |
|---|---|---|---|---|---|
| GSE15402 | 73 | 530 | 65 | 0.98 | 58.70 |
| GSE15451 | 0 | 45 | 0 | 0.86 | 1.58 |
| GSE18123.1 | 284 | 489 | 43 | 0.79 | 8.34 |
| GSE18123.2 | 69 | 610 | 43 | 0.92 | 31.00 |
| GSE25507 | - | - | - | - | - |
| GSE28475 | 0 | 200 | 0 | 0.89 | 6.94 |
| GSE28521 | 4 | 588 | 2 | 0.89 | 21.50 |
| GSE32136 | - | - | - | - | - |
| GSE37772 | - | - | - | - | - |
| GSE38322 | 0 | 41 | 0 | 0.98 | 6.84 |
| GSE6575 | 0 | 55 | 0 | 0.96 | 3.34 |
| GSE7329 | 596 | 1281 | 339 | 0.95 | 44.10 |

Table 2.8: Overlap between results reported in the literature and individual re-analysis of differential expression. Significant probes: per data set significant probes from re-analysis, reported at an FDR threshold of 0.05. Probes reported: differentially expressed probes published in the original paper of each study. Gene symbols are used as a proxy for probes in GSE18123.1; GenBank accessions are used in GSE15451 and GSE15402; Spot IDs are used for GSE7329. GSE25507 computed differences in expression variance instead of differential expression; GSE37772 reported outlier genes instead of differentially expressed genes; GSE32136 is not published.

2.2.3 Meta-analysis of differential expression

A key observation at this point is that most of the data sets showed clear evidence for differential expression (pi0 < 1), but were largely underpowered to separate differentially expressed genes from the background. In the re-analyses, there was also no overlap across any of the studies among the genes selected at an FDR of 0.05. We hypothesized that there might still be similarities among the studies that would emerge in a combined or meta-analysis. We therefore applied a p-value combination strategy, choosing to analyze the blood and brain data sets separately. This approach combines the results for all the data sets without applying any statistical threshold, and thus provides a p-value for all the genes analyzed. The meta-analysis yields four ranked gene lists: one pair each for blood and brain, with separate lists for up- and down-regulation. At this stage the lists contain all the genes considered, with no threshold applied.

We then compared the results of individual study re-analyses to the ranked gene lists. If each data set contributes some signal to the meta-analysis, its results should individually resemble the ranked gene lists. Generally, the trends we observed concur with the amount of differential expression estimated (1 - pi0) for each data set. Data sets with more differential expression displayed stronger associations with the results of the meta-analyses. As shown in Fig. 2.7A, there is a clear similarity among the three brain data sets in their contributions towards the final gene rankings, as evidenced by the similar trend lines for all three data sets.
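As an illustration of the p-value combination step, the Python sketch below combines one-sided p-values across studies and ranks genes by the combined value. It assumes Fisher's combined probability test, which this section does not name, so it should be read as one plausible instance of a p-value combination strategy rather than the exact procedure used here. The function names and the toy gene identifiers are hypothetical.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine independent p-values for one gene (Fisher's method, assumed).

    The statistic X = -2 * sum(ln p) follows a chi-square distribution
    with 2k degrees of freedom under the null; for even degrees of
    freedom its upper tail has the closed form used below.
    """
    k = len(pvalues)
    if k == 0:
        raise ValueError("need at least one p-value")
    # half_stat is X / 2; clamp p to avoid log(0).
    half_stat = -sum(math.log(max(p, 1e-300)) for p in pvalues)
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)


def rank_genes(per_study_pvalues):
    """Rank genes by combined p-value, with no threshold applied.

    per_study_pvalues maps gene -> list of one-sided p-values, one per
    study in which the gene was measured (studies may differ per gene).
    """
    combined = {
        gene: fisher_combined_pvalue(ps)
        for gene, ps in per_study_pvalues.items()
        if ps
    }
    return sorted(combined.items(), key=lambda item: item[1])


# Toy one-sided (up-regulation) p-values for three hypothetical genes.
toy = {"geneA": [0.01, 0.02, 0.2], "geneB": [0.6, 0.4, 0.9], "geneC": [0.04]}
for gene, p in rank_genes(toy):
    print(gene, p)
```

Running the same ranking on the complementary one-sided p-values would give the down-regulation list, which matches the idea of keeping separate up and down rankings per tissue.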
In contrast, the blood data sets were in lower agreement: a fraction showed a stronger relationship to the meta-analysis results, while others showed weak associations (Fig. 2.7B).

Figure 2.7: Profiles of meta-signatures from the blood and brain: raw p-values for each individual data set are plotted against corrected p-values (FDR) of the meta-signatures. Local polynomial regression (LOESS) is used to obtain a smooth fit. The shaded areas represent 95% confidence intervals of the prediction using the t-based approximation (see "stat_smooth" in the ggplot2 R package).

Applying a threshold to these rankings yielded blood and brain "meta-signatures". At an FDR threshold of 0.05, 30 up-regulated genes and 49 down-regulated genes were found in the brain. The blood meta-analysis yielded 111 up-regulated and 87 down-regulated hits (see appendix A.1-A.4). As the number of studies in the brain and blood differ, we cannot tell whether the smaller number of brain hits is due to an underpowered analysis, or simply due to a weaker biological effect in the brain. While the studies were balanced for covariates such as age and post-mortem interval (for the brain data), we checked the lists for genes previously reported to be influenced by these factors [78]. There were minimal overlaps, indicating that our results were not influenced by them. Genes affected by sex differences were removed from the results reported here, though they can be found in the appendix for reference.

We further characterized the relative contributions of each data set towards the hits we obtained, to more directly identify any single study that "drives" genes towards significance in the meta-analyses. By assessing the amount of overlap between meta-signatures and differential expression in each data set, we quantified the contribution of each data set to the meta-analysis (Table 2.9).
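The enrichment measures reported alongside the overlaps, AU-ROC and average precision (AP), can be sketched as follows. This Python example is a minimal illustration under stated assumptions: the ranked list has no ties, the signature is a plain set of gene identifiers, and the function name and toy genes are hypothetical rather than taken from the analysis code.

```python
def auroc_and_average_precision(ranked_genes, signature):
    """Score how well a ranked gene list recovers a signature gene set.

    ranked_genes: genes ordered from most to least significant.
    signature: set of genes treated as positives (e.g. a meta-signature).
    Returns (AU-ROC, average precision).
    """
    is_positive = [gene in signature for gene in ranked_genes]
    n_pos = sum(is_positive)
    n_neg = len(is_positive) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative gene")

    negatives_seen = 0
    pairs_won = 0        # (positive, negative) pairs with the positive ranked first
    hits = 0
    precision_sum = 0.0  # precision accumulated at the rank of each positive
    for rank, positive in enumerate(is_positive, start=1):
        if positive:
            hits += 1
            precision_sum += hits / rank
            pairs_won += n_neg - negatives_seen
        else:
            negatives_seen += 1
    return pairs_won / (n_pos * n_neg), precision_sum / n_pos


# Toy example: two signature genes in a five-gene ranking.
ranking = ["g1", "g2", "g3", "g4", "g5"]
print(auroc_and_average_precision(ranking, {"g1", "g3"}))
```

An AU-ROC near 0.5 means the signature genes are scattered through the ranking as if at random, while values near 1 mean they are concentrated at the top.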
For instance, the results of GSE28521 analyzed alone can identify, with relatively high precision (29.47%), down-regulated genes in the brain meta-signature. Overall, GSE28521, along with GSE18123.1 and GSE7329, are the studies that appeared to have a strong impact on the meta-analysis. As described in the next section, we implemented procedures to find genes robust to the selection of data sets.

| Tissue | Data set | Up-regulated (overlap/total) | AUC | AP (%) | Down-regulated (overlap/total) | AUC | AP (%) |
|---|---|---|---|---|---|---|---|
| Brain | GSE28475 | 2/3 | 0.92 | 10.77 | 0/0 | 0.90 | 5.60 |
| Brain | GSE28521 | 0/0 | 0.96 | 15.33 | 5/5 | 0.94 | 29.47 |
| Brain | GSE38322 | 0/0 | 0.90 | 5.70 | 0/0 | 0.84 | 10.80 |
| Blood | GSE15402 | 0/1 | 0.78 | 3.31 | 0/29 | 0.71 | 1.35 |
| Blood | GSE15451 | 0/0 | 0.54 | 0.80 | 0/0 | 0.55 | 1.22 |
| Blood | GSE18123.1 | 16/92 | 0.82 | 8.36 | 28/235 | 0.84 | 15.01 |
| Blood | GSE18123.2 | 13/38 | 0.86 | 10.73 | 2/9 | 0.74 | 5.41 |
| Blood | GSE25507 | 0/3 | 0.67 | 2.83 | 0/0 | 0.59 | 2.03 |
| Blood | GSE32136 | 0/0 | 0.76 | 5.95 | 0/0 | 0.69 | 2.18 |
| Blood | GSE37772 | 0/2 | 0.65 | 0.97 | 0/0 | 0.57 | 0.80 |
| Blood | GSE6575 | 0/0 | 0.67 | 3.05 | 0/0 | 0.67 | 1.69 |
| Blood | GSE7329 | 25/183 | 0.84 | 13.21 | 20/234 | 0.78 | 5.84 |

Table 2.9: Overlap (overlap/total up- or down-regulated in data set) between meta-signature (FDR < 0.05) and significantly differentially expressed genes per data set (FDR < 0.05), as well as enrichment of meta-signatures in the results of individual differential expression analysis. One-sided p-values were used to compute FDR here. AU-ROC: area under receiver operating characteristic curve; AP: average precision.

To see if the meta-signatures in blood and brain are similar, we quantified the reciprocal predictive value of meta-signatures from both tissue types using AU-ROC. The blood meta-signatures were placed essentially at random in the brain ranked gene list (AU-ROC close to 0.5), and vice versa (Table 2.10). Thus there was no indication of a common signature between the blood and brain, supporting our choice to conduct separate analyses.

| Comparison | Upregulation | Downregulation |
|---|---|---|
| Blood-Brain | 0.51 | 0.52 |
| Brain-Blood | 0.51 | 0.59 |

Table 2.10: Comparisons of blood and brain signatures.
AU-ROC reported for signature of tissue A on the ranked gene list from the meta-analysis of tissue B (A-B).

2.2.4 Robust molecular commonalities are more evident in brain samples compared to blood

To focus our attention on the genes that show the strongest concordance across studies, we employed a jackknife procedure. Jackknifing yields as many ranked gene lists as there are data sets in the meta-analysis, where each list is the result with one data set left out. We initially performed this at the same stringency as the initial meta-analysis, applying an FDR threshold of 0.05 to every jackknife result. With this conservative approach, we identified 10 genes from the blood data for which significance is not dominated by any single data set, but none from the brain. Because removing data sets reduces power, we established a less stringent criterion for identifying robust patterns: we define our core signatures as the intersection of the top 200 genes (an arbitrary cut-off) retrieved from each leave-one-out iteration [42]. From this analysis, the core blood signature consists of 15 up-regulated genes and 8 down-regulated genes (corresponding FDRup = 0.14, FDRdown = 0.15). The core brain signature consists of 15 up-regulated genes and 10 down-regulated genes (corresponding FDRup = 0.29, FDRdown = 0.24). We visualized these core signatures using heat maps of the gene expression levels for each sample in the twelve data sets meta-analyzed. While there is a relative lack of a clear pattern in the blood data sets, the heat maps for the core brain hits showed good concordance across all three brain data sets (Fig. 2.8).

Figure 2.8: Heat map visualizations of core signature expression values in each of the brain data sets. Batch-corrected expression values were scaled across samples within each data set. Relative expression levels: yellow - high; blue - low. A different visualization for each core signature gene can be seen in Fig. 2.11.

The relatively noisy expression profiles of the core blood signature (Fig. 2.9) expose general difficulties in detecting genes with heterogeneous expression levels. Very few genes exhibited robust concordance when visualized individually (Fig. 2.10).

Figure 2.9: Gene expression levels of core blood signature. Relative expression levels: yellow - high; blue - low; grey - missing values.

Figure 2.10: P-values of core blood signature in individual studies. Quantile-quantile plots show deviation from the diagonal. a) Up-regulated genes. b) Down-regulated genes.

Figure 2.11: P-values of core brain signature in individual studies. a) Up-regulated genes. b) Down-regulated genes.

2.2.5 Functional analyses reveal perturbations in metabolic processes

To explore gene functional themes in our data, we conducted a threshold-free enrichment analysis (using the full list of ranked genes). None of the functions tested were significantly enriched in the blood. The brain was enriched for genes involved in "cellular respiration" (GO:0045333, FDR = 0.11), suggesting differences at a functional level between individuals with and without autism. An analysis using the three jackknifed gene lists from the brain data (that is, a meta-analysis of each pair of data sets) showed that the result is robust. Dysregulated genes in this functional group are shown in Table 2.11. Other top enriched functions were also related to respiration, including GO:0022904 ("respiratory electron transport chain") and GO:0022900 ("electron transport chain").

| Gene Symbol | Gene Name | p-value |
|---|---|---|
| ATP5O | ATP synthase, H+ transporting, mitochondrial F1 complex, O subunit | 1.83E-05 |
| UQCRQ | ubiquinol-cytochrome c reductase, complex III subunit VII, 9.5kDa | 5.45E-05 |
| UQCRC1 | ubiquinol-cytochrome c reductase core protein I | 1.86E-04 |
| CYC1 | cytochrome c-1 | 2.90E-04 |
| COX5B | cytochrome c oxidase subunit Vb | 2.98E-04 |
| NDUFA11 | NADH dehydrogenase (ubiquinone) 1 alpha subcomplex, 11, 14.7kDa | 4.38E-04 |
| ATP5L | ATP synthase, H+ transporting, mitochondrial Fo complex, subunit G | 4.53E-04 |
| UQCR10 | ubiquinol-cytochrome c reductase, complex III subunit X | 4.53E-04 |
| UQCRC2 | ubiquinol-cytochrome c reductase core protein II | 5.25E-04 |
| NDUFA13 | NADH dehydrogenase (ubiquinone) 1 alpha subcomplex, 13 | 5.35E-04 |
| SLC25A12 | solute carrier family 25 (aspartate/glutamate carrier), member 12 | 5.37E-04 |
| FH | fumarate hydratase | 7.55E-04 |
| UQCR11 | ubiquinol-cytochrome c reductase, complex III subunit XI | 7.74E-04 |
| NDUFS4 | NADH dehydrogenase (ubiquinone) Fe-S protein 4, 18kDa (NADH-coenzyme Q reductase) | 8.29E-04 |
| IDH3A | isocitrate dehydrogenase 3 (NAD+) alpha | 9.06E-04 |

Table 2.11: Top genes in the "cellular respiration" GO category at a meta-analysis raw p-value threshold of 0.0001. There are a total of 116 genes in this functional group.

2.2.6 Shared signatures between autism and other neurodevelopmental syndromes

A natural question is whether any of the signature genes are known ASD candidates reported in previous genetic or functional studies. We first checked for overall patterns of enrichment based on the ranked gene lists from the blood and brain. We observed enrichment of genes in the SFARI "syndromic" category (FDR = 0.15; Table 2.12) in the blood signature.

| Gene Symbol | Gene Name | p-value |
|---|---|---|
| UBE3A | ubiquitin protein ligase E3A | 1.76E-06 |
| CDKL5 | cyclin-dependent kinase-like 5 | 1.43E-03 |
| DMD | dystrophin | 2.36E-03 |
| SHANK3 | SH3 and multiple ankyrin repeat domains 3 | 6.58E-03 |
| HOXA1 | homeobox A1 | 2.02E-02 |
| PTEN | phosphatase and tensin homolog | 2.09E-02 |
| TSC1 | tuberous sclerosis 1 | 3.09E-02 |

Table 2.12: Top genes in the SFARI "syndromic" category at a meta-analysis raw p-value threshold of 0.05. There are a total of 19 genes in this gene set.

Analysis of the jackknifed gene lists indicated this was primarily due to the influence of the 15q duplication cohort (GSE7329). We can directly observe the skew in the top two syndromic genes: UBE3A (FDR = 0.003) and CDKL5 (FDR = 0.095) (Fig. 2.12). While UBE3A resides in the 15q11-13 region, CDKL5 (Xp22) does not. The link between 15q duplication and CDKL5 dysregulation is unclear.

We repeated this analysis using a more inclusive list of 664 ASD candidates from Neurocarta [1], but found no significant enrichment. Among the few Neurocarta candidate genes identified in our meta-signatures are 11 genes in the blood signature (CAMSAP2, UBE3A, CYFIP1, JARID2, PAFAH1B1, FAN1, BRAF, CXCR3, PRDX4, GAP43, GABRA4) and one gene in the brain signature (GAS2). We also looked for known candidates in the brain using a relaxed FDR threshold of 0.1. Additional genes found in the brain include ADM, CADM1, STAT3, CD44, CYP19A1, PTCHD1, SLC30A5, SLC25A12, APBA2 and DLX1. Note that only CAMSAP2 and BRAF are core hits. None of the existing candidates are common to the meta-signatures of both tissue types.

Figure 2.12: Raw p-values of genes located in 15q11-13 (UBE3A, CYFIP1), Xp22 (CDKL5) and 7q11.23 (RFC2). Top (Q-Q plots): The lack of an overall deviation from the uniform diagonal suggests that the signals are skewed. Bottom: Per-data set p-values with a p-value threshold of 0.05 (dashed grey); genes that meet an FDR threshold of 0.05 in the data set are marked with a triangle.
Compare with the corresponding figure for the core blood signatures.

2.2.7 Meta-signature genes in rare structural variants associated with ASD

The candidate gene lists used in the last section do not, for the most part, include genes covered by rare structural variants associated with ASD, because the precise gene or genes involved are often not known and are thus not documented by SFARI or Neurocarta. To explore the potential links between gene expression and rare structural variations, we assembled ASD-associated CNVs from several sources. We first observed that genes in the meta-signatures are distributed widely across the genome. There were no obvious hot spots, and none of the CNVs analyzed were significantly enriched for dysregulated genes (at a corrected p-value threshold of 0.05). Globally, 6.3% of the brain meta-signature (total = 79) and 10.6% of the blood meta-signature (total = 198) are located in known CNV regions, which is not a significant enrichment (Table 2.13).

This computation was constrained to genes showing positive associations between expression levels and copy number changes (up-regulated genes within a duplicated region and down-regulated genes within a deleted region). All dysregulated CNV genes are shown in Tables 2.14 and 2.15.

| Tissue | Direction | Observed n | Observed % | Expected n | Expected % | Total | p-value |
|---|---|---|---|---|---|---|---|
| Brain | Up | 1 | 3.33 | 3 | 10.00 | 30 | NS |
| Brain | Down | 4 | 8.16 | 4 | 8.16 | 49 | NS |
| Brain | Total | 5 | 6.3 | 7 | 8.86 | 79 | |
| Blood | Up | 18 | 16.22 | 12 | 10.81 | 111 | 0.04 |
| Blood | Down | 3 | 3.45 | 7 | 8.05 | 87 | NS |
| Blood | Total | 21 | 10.6 | 19 | 9.60 | 198 | |

Table 2.13: Dysregulated genes (FDR < 0.05, meta-signature) within ASD-associated CNVs. Fisher's exact test was used to compute significance. NS: not significant.

| Genes | Gain/Loss | Chromosome | CNV Start | CNV End | Reference |
|---|---|---|---|---|---|
| SCIN | Gain | 7 | 12186385 | 17527285 | AGP Consortium (2007) |
| ABCG2 | Loss | 4 | 86507718 | 101626937 | Jaquemont et al. (2006) |
| GRK6 | Loss | 5 | 175492445 | 177359136 | Sanders et al. (2011) |
| PANX2 | Loss | 22 | 46277400 | 49509100 | Sanders et al. (2011) |
| | | | 46335545 | 49565822 | Marshall et al. (2008) |
| | | | 45202172 | 49522605 | Sebat et al. (2007) |
| | | | 45144027 | 49465883 | Sanders et al. (2011) |
| SNRNP25 | Loss | 16 | 835 | 1253638 | Sanders et al. (2011) |

Table 2.14: Dysregulated genes in the brain that are found in known ASD CNVs. CNVs that span the same gene or set of genes are grouped together.

| Genes | Gain/Loss | Chromosome | CNV Start | CNV End | Reference |
|---|---|---|---|---|---|
| CSTF2T | Gain | 10 | 52699516 | 54408816 | Sanders et al. (2011) |
| | | | 51672210 | 61490637 | Sanders et al. (2011) |
| | | | 50562149 | 61478511 | Sebat et al. (2007) |
| CTDSP2 | Gain | 12 | 54218922 | 58779615 | Pinto et al. (2010) |
| | | | 54218922 | 58779615 | Sanders et al. (2011) |
| CYFIP1 | Gain | 15 | 20235613 | 20807351 | Pinto et al. (2010) |
| | | | 20303106 | 20800564 | Pinto et al. (2010) |
| | | | 20090262 | 21038099 | Pinto et al. (2010) |
| FAN1 | Gain | 15 | 28723577 | 30231488 | Sanders et al. (2011) |
| | | | 28723577 | 30238780 | Sanders et al. (2011) |
| FUT8-AS1 | Gain | 14 | 61897100 | 65075600 | AGP Consortium (2007) |
| HCK, C20orf112 | Gain | 20 | 28251057 | 35143867 | Sanders et al. (2011) |
| IRF2BPL | Gain | 14 | 76007842 | 76924400 | Marshall et al. (2008) |
| P2RX7, GPR133 | Gain | 12 | 114191663 | 132287723 | Marshall et al. (2008) |
| | | | 114170000 | 132388000 | Sanders et al. (2011) |
| RFC2 | Gain | 7 | 72411506 | 73811186 | Sanders et al. (2011) |
| | | | 72300351 | 73782113 | Sanders et al. (2011) |
| | | | 72344426 | 73782113 | Sanders et al. (2011) |
| | | | 72355583 | 73782113 | Sanders et al. (2011) |
| SCCPDH | Gain | 1 | 244912594 | 245041638 | Marshall et al. (2008) |
| SH2D1B | Gain | 1 | 160435966 | 161133966 | AGP Consortium (2007) |
| SMARCA2 | Gain | 9 | 175632 | 3373495 | Sanders et al. (2011) |
| UBE3A | Gain | 15 | 22736034 | 25689610 | Jaquemont et al. (2006) |
| | | | 21490300 | 25698400 | AGP Consortium (2007) |
| | | | 21190624 | 26203954 | Pinto et al. (2010) |
| | | | 21190624 | 26203954 | Sanders et al. (2011) |
| | | | 21240037 | 26095621 | Sanders et al. (2011) |
| UBE3A, CYFIP1 | Gain | 15 | 20428583 | 26069606 | Christian et al. (2008) |
| | | | 19925826 | 26069606 | Christian et al. (2008) |
| | | | 20197683 | 26069606 | Christian et al. (2008) |
| | | | 19767013 | 26134114 | Sanders et al. (2011) |
| UBE3A, CYFIP1, FAN1 | Gain | 15 | 18376200 | 30298800 | Marshall et al. (2008) |
| | | | 18427100 | 30298847 | Marshall et al. (2008) |
| | | | 18376200 | 30298800 | Sanders et al. (2011) |
| | | | 18427100 | 30298847 | Sanders et al. (2011) |
| | | | 18526971 | 30756771 | Sebat et al. (2007) |
| | | | 18526971 | 30756771 | Sanders et al. (2011) |
| ZNF611 | Gain | 19 | 57836600 | 58246200 | Marshall et al. (2008) |
| ZNF721 | Gain | 4 | 328851 | 542862 | Marshall et al. (2008) |
| ZNF721, SPON2 | Gain | 4 | 35410 | 3511385 | Sanders et al. (2011) |
| | | | 398952 | 6722859 | AGP Consortium (2007) |
| CCDC50 | Loss | 3 | 187295051 | 193862987 | Jaquemont et al. (2006) |
| SLC17A9 | Loss | 20 | 61056624 | 61076763 | Sanders et al. (2011) |
| TSPAN12 | Loss | 7 | 113335000 | 128821721 | Sanders et al. (2011) |
| | | | 113528285 | 129015006 | Marshall et al. (2008) |

Table 2.15: Dysregulated genes in the blood that are found in known ASD CNVs.

Because 15q11-13 duplication is one of the most common CNV aberrations in ASD [10], it was unsurprising that we detected dysregulated genes in this region. A closer look at these genes (UBE3A, CYFIP1; Fig. 2.12) again reveals the sensitivity of the results to the data set that comprises only autistic subjects with maternally derived 15q duplications (GSE7329). In other ASD-associated CNVs, we detected genes from the core signatures that are dysregulated in the same direction as the change in copy number: ZNF721 (4p16) in the blood; SCIN (7p21.1), SNRNP25 and ABCG2 (4q21) in the brain. However, we conclude that while some of the genes in our signatures are ASD candidate genes or fall in known rare CNV regions, there is no striking overall relationship between the expression patterns and the current state of knowledge of ASD genetics.

Previous work has shown that gene expression profiles can be used to predict cytogenetic abnormalities [33, 79]. We attempted to build a classifier using expression levels of dysregulated genes as features. Because class labels for 15q duplications were available (among other CNVs), we focused on predicting the presence of 15q duplications using preselected features (i.e., CYFIP1, UBE3A and FAN1 expression levels). Our training data, GSE7329, comprise seven subjects with 15q duplication and six subjects without.
We tested the classifier on GSE37772 (the only other data set with CNV labels). Out of a total of 431 samples in GSE37772, there was only one sample with a confirmed 15q duplication status, as reported by Luo et al. Eleven other samples were predicted to have 15q duplications. The ideal classifier would be able to predict the presence of 15q duplications in individuals whose CNV statuses are unknown, but the lack of samples with confirmed CNV statuses makes it a challenge to evaluate the performance of the classifier. We report our predictions, given the information we have, in Table 2.16. While the sample with a confirmed 15q duplication, GSM927674, had relatively high rankings, the other samples originally predicted were placed essentially at random. Besides the lack of CNV labels, technical differences between the two data sets used for training and testing might pose problems in predicting outcomes. A classifier that generalizes across different data sets might require more complex machine learning methods such as transfer learning.

| Standardization method | Training AU-ROC | CV AU-ROC | SV (total = 13) | Rank of GSM927674 | AU-ROC (other 15q samples) |
|---|---|---|---|---|---|
| None | 1.00 | 1.00 | 12 | 1 | 0.58 |
| Scaled | 1.00 | 1.00 | 9 | 18 | 0.53 |
| GAPDH normalized | 0.86 | 0.79 | 13 | 4 | 0.36 |

Table 2.16: Predictions on GSE37772 samples using preliminary CNV classifier. CV: cross-validated; SV: support vectors; AU-ROC: AU-ROC computed for other 15q samples (originally predicted but not confirmed).

2.2.8 Network analysis and candidate gene characterization

We were also interested to see whether these core signature genes possess distinctive properties at the systems level. To do so, we compared PPIN properties of core signature genes to those of random gene sets with similar size and node degree. Our analysis shows no evidence of significant changes in the functional connectivity of core dysregulated genes in autistic individuals (Fig. 2.13). Analyses of the co-expression networks of brain (Fig. 2.14) and liver (Fig. 2.15) showed similar results.
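The comparison against random gene sets of similar size and node degree can be sketched as below. This Python example is a hedged illustration, not the procedure used here: matching on fixed-width degree bins, the bin width, the number of random sets, and the function names are all assumptions, and the network property being compared is left to the caller.

```python
import random


def degree_matched_sets(degree, core_genes, n_sets=1000, bin_width=5, seed=0):
    """Draw random gene sets matched to the core genes in size and node degree.

    degree: dict mapping every gene in the network to its node degree.
    core_genes: list of core signature genes present in the network.
    Each core gene is replaced by a random non-core gene whose degree
    falls in the same bin (bin_width is an assumed tolerance).
    """
    rng = random.Random(seed)
    core = set(core_genes)
    bins = {}
    for gene in sorted(degree):  # sorted so results depend only on the seed
        if gene not in core:
            bins.setdefault(degree[gene] // bin_width, []).append(gene)

    random_sets = []
    for _ in range(n_sets):
        chosen = set()
        for gene in core_genes:
            candidates = bins.get(degree[gene] // bin_width, [])
            pool = [g for g in candidates if g not in chosen]
            if not pool:
                raise ValueError(f"no degree-matched background gene left for {gene}")
            chosen.add(rng.choice(pool))
        random_sets.append(chosen)
    return random_sets


def empirical_pvalue(observed, null_values):
    """Upper-tail empirical p-value with the usual +1 correction."""
    at_least = sum(1 for value in null_values if value >= observed)
    return (at_least + 1) / (len(null_values) + 1)
```

In use, one would compute the network property of interest for the core set and for each random set, then pass the observed value and the list of null values to `empirical_pvalue`.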
Because the biological mechanism of autism is unknown, it is conceivable that different subsets or combinations of genes might share a common functional topology. In other words, the core signatures alone might be insufficient to cause global alterations in the brain.

We have shown earlier that there is little or no change in global functional connectivity between autistic individuals and controls. We further explored local PPIN neighbourhoods of the same core signatures and found several known candidates in the vicinity (direct or first-degree connections). A few known candidates were directly linked to our core signatures (blood = 15, brain = 37). But since the number of known candidates observed did not differ significantly from what is expected given random gene sets of similar size and node degree, the links observed are likely to arise by chance due to the large node degrees of some core signature genes.

Figure 2.13: PPIN network properties of core candidate genes in the blood and brain compared to those of respective random gene sets. (a) Blood. 15 out of 23 core candidate genes included. (b) Brain. 20 out of 25 core candidate genes included.

Figure 2.14: Brain co-expression network properties of core candidate genes in the blood and brain compared to those of respective random gene sets. (a) Blood; 13/15 up-regulated genes included. (b) Blood; 4/8 down-regulated genes included. (c) Brain; 11/15 up-regulated genes included. (d) Brain; 9/10 down-regulated genes included.

Figure 2.15: Liver co-expression network properties of core candidate genes in the blood and brain compared to those of respective random gene sets. (a) Blood; 12/15 up-regulated genes included. (b) Blood; 7/8 down-regulated genes included. (c) Brain; 11/15 up-regulated genes included. (d) Brain; all down-regulated genes included.

To further characterize the candidate genes, we extracted their phenotype associations from Neurocarta [1].
Neurocarta is an in-house knowledge base of gene-phenotype associations, so it provides a global view of what is currently known about the genes. We categorized the candidates based on their phenotype associations, defining genes that are associated only with autism as "ASD specific" genes. Those that are linked only to a list of manually curated neurodevelopmental disorders are considered "neurodevelopment specific". Results show that a large fraction of genes was not catalogued in Neurocarta. This could be because Neurocarta mainly focuses on the genetic basis of neurodevelopmental disorders [1], thus capturing only a subset of genes. There were very few "specific" candidate genes. Because the operational definition of "specificity" is only valid to the extent of our prior knowledge, it is not clear whether these genes are actually biologically specific to the disorder, or whether they are simply not well studied.

| Tissue | Signature | Total | In Neurocarta | ASD Candidates | Ndev. Specific | ASD Specific |
|---|---|---|---|---|---|---|
| Blood | Core signature | 23 | 10 | 2 | 2 | 1 |
| Blood | Meta-signature | 198 | 63 | 11 | 11 | 3 |
| Brain | Core signature | 25 | 12 | 0 | 0 | 0 |
| Brain | Meta-signature | 79 | 29 | 1 | 1 | 0 |

Table 2.17: Categorization of our candidate genes based on Neurocarta. Ndev.: neurodevelopment.

The genetic basis of neurodevelopmental or neuropsychiatric disorders might not be reflected in the transcriptome. To look for potential biological similarities with other neurodevelopmental disorders, we compared our signatures to candidates from previous gene expression studies. We were interested in schizophrenia because a methodologically similar study was done in our lab. Results suggest that there are some overlaps between autism and schizophrenia expression profiles in postmortem brain (Figure 2.16). Further investigations are required to associate these similarities with the overlapping symptoms seen between the disorders.

Figure 2.16: Common brain meta-signatures between the autism (current study) and schizophrenia meta-analyses by Mistry et al. a) Up-regulated genes; b) Down-regulated genes.

| Direction | Gene | p-value | FDR |
|---|---|---|---|
| Up-regulated | ABCA1 | 7.19e-05 | 4.16e-02 |
| Up-regulated | P4HA1 | 6.83e-04 | 9.19e-02 |
| Down-regulated | CCDC25 | 2.59e-05 | 3.27e-02 |
| Down-regulated | LRRC17 | 3.06e-04 | 5.77e-02 |
| Down-regulated | RMND5B | 4.30e-04 | 6.09e-02 |
| Down-regulated | SLC25A12 | 5.37e-04 | 6.57e-02 |
| Down-regulated | FARSA | 8.45e-04 | 7.78e-02 |
| Down-regulated | PPA2 | 9.61e-04 | 8.26e-02 |
| Down-regulated | APBA2 | 1.11e-03 | 9.01e-02 |

Table 2.18: Meta-signature genes that are also dysregulated in schizophrenia. Meta-analysis FDR < 0.1.

Chapter 3
Discussion and conclusion

I presented a meta-analysis of autism gene expression profiling studies, providing the most comprehensive survey of gene expression in autism available to date. The main finding is that there are molecular commonalities across multiple independent groups of individuals with ASD. These similarities have, to our knowledge, been overlooked in individual gene expression studies. The genes I identified as most robustly changed across cohorts were not previously highlighted in the ASD literature. In this final section, I will discuss these findings in the context of other autism research, noting some limitations of the current study and avenues for future work.

3.1 Similarities and differences between key findings and previous results

The question of whether one should expect some homogeneous molecular aspects across individuals with ASD is an open one. The studies included in this analysis used a range of criteria to select subjects, but they were by and large made up of idiopathic cases (the exception being GSE7329). Since they are not of monogenic etiology, we anticipated variability within and among individual ASD data sets. Besides methodological differences (which we minimized by handling the data sets uniformly), there are other sources of heterogeneity that are difficult to address, and that raise some questions about the interpretation of ASD expression studies. For example, one of the smallest data sets we used (GSE7329, 13 samples) showed substantially more differential expression than the largest data set (GSE37772).
GSE7329 comprises individuals with 15q duplications, while GSE37772 comprises idiopathic probands from the Simons Simplex Collection (SSC) (Table 2.4), who have moderate to severe symptoms [80].

Given the high degree of variability among and within studies, it is striking that we found some genes showing differences that are relatively consistent across cohorts. At this time the full biological significance of the genes we identified is unclear. Several of the concordant genes (core hits) we found are linked to genetic disorders with neurological implications. Among the genes in the core brain signature are PDYN (prodynorphin) and ABCA1 (ATP-binding cassette, sub-family A). Mutations in PDYN, a gene that codes for a neuropeptide precursor, have causal links to spinocerebellar ataxia (MIM 610245) [81]. Mutations in ABCA1 are an established cause of Tangier disease (MIM 205400), a disorder whose features include neuropathies [82]. There were fewer clear hits in the blood data, but several genes stand out (Fig. 2.10). Two known ASD candidates, CAMSAP2 (calmodulin regulated spectrin-associated protein family, member 2) and BRAF (v-raf murine sarcoma viral oncogene homolog B1), showed consistent dysregulation in at least three cohorts. Other novel candidates in blood are PRKCH (protein kinase C eta, a member of the protein kinase C family) and ABLIM1 (actin binding LIM protein 1), which have been widely studied in cellular signaling and axon guidance [83], respectively.

As discussed, results from previously published transcriptome analyses have, on the surface, shown little agreement.
We have also described some reasons why this might occur, including differences in clinical properties or technical aspects of the expression analysis. However, we note that some of our candidates were hits reported in the original studies, as well as in other transcriptome studies not included in our analysis (Tables 3.1 and 3.2). In fact, a few were validated with a second independent cohort in the original studies: ZNF322 (zinc finger protein 322) in Kong et al. (2012); HSPA1A (heat shock 70kDa protein 1A) and PDYN in Voineagu et al. (2011), further suggesting bona fide associations with ASD. However, these genes were not discussed in previous publications, perhaps because of their relatively low rankings in the results or the lack of known functional implications. In addition, most existing studies have not dwelt upon the findings of other related studies, either choosing to ignore them or attributing differences to experimental procedures. Our results suggest that many of the molecular or functional differences observed in individual studies are in fact likely to be specific to that study, and thus of questionable interpretation when the entire autism spectrum is considered.
While inferences made based on our findings are preliminary, the fact that some changes show a tendency to be reproducible opens promising avenues for further research.

Table 3.1: Comparisons between core signature genes in blood and differentially expressed genes reported in original studies. Total hits: total hits reported in the original study (genes). Total genes analyzed: estimated total number of genes analyzed in each study, based on Gemma platform annotations.

Study columns, in order:
  Brain studies: GSE28475, GSE28521, GSE38322, Purcell et al., Garbett et al.
  Blood studies: GSE15402, GSE15451, GSE18123.1, GSE18123.2, GSE37772, GSE6575, GSE7329

Core blood signature    Overlap marks (in study-column order)    Total studies
Up-regulated
  SCARNA17              X                                        1
  ZNF594                X                                        1
  GIMAP8                X                                        1
  PRKCH                 X                                        1
  CXCR7 (CMKOR1)        XX  XX  X                                3
  MALAT1                X                                        1
  ZNF322                X*  X*  X                                3
  CAMSAP2               X                                        1
  ENO3                  X                                        1
  MAN2A2                X  X                                     2
  ZNF721                                                         0
  APBB1                 X                                        1
  ABLIM1                                                         0
  ZFP62                 X                                        1
  CYBRD1                X                                        1
Down-regulated
  BRAF                  X                                        1
  STRA13                                                         0
  SERPINB9              X                                        1
  MED25                                                          0
  ZNF784                                                         0
  SNX22                                                          0
  FAM46C                X                                        1
  BLVRB                 X                                        1

Study           Tissue    Total hits    Total genes analyzed
GSE28475        Brain     184           17979
GSE28521        Brain     537           17979
GSE38322        Brain     32            21348
Purcell et al.  Brain     29            -
Garbett et al.  Brain     130           19763
GSE15402        Blood     361           14753
GSE15451        Blood     35            14753
GSE18123.1      Blood     487           19763
GSE18123.2      Blood     457           20353
GSE37772        Blood     332           17979
GSE6575         Blood     24            19763
GSE7329         Blood     747           19326

Legend:
  XX  Overlap with concordant direction.
  X-  Overlap with discordant direction.
  X   Overlap with unknown direction.
  *   Validated in replication cohort.
  Not tested.

Table 3.2: Similar to Table 3.1, for core signature genes in the brain.

Core brain signature    Overlap marks (in study-column order)    Total studies
Up-regulated
  HSPA1A                XX*  XX  X-                              3
  IGFBP5                XX                                       1
  PDYN                  XX*                                      1
  ZC3HAV1               XX                                       1
  PTPN1                 X-                                       1
  C2CD4A                                                         0
  DNAJB1                XX                                       1
  C5AR1                                                          0
  C1orf106                                                       0
  TAGLN2                XX  XX                                   2
  ABCA1                 XX                                       1
  LILRB3                XX  dup.                                 2
  SCIN                  XX  X                                    2
  CD93                                                           0
  HMOX1                                                          0
Down-regulated
  ABCG2                                                          0
  KLHDC2                                                         0
  COA1                  X                                        1
  TTC1                                                           0
  FBXL15                                                         0
  FAM58A                                                         0
  SNRNP25 (C16orf33)    del.                                     1
  C12orf57                                                       0
  FIS1                                                           0
  PIH1D1                                                         0

The study columns, per-study totals (Total hits, Total genes analyzed) and legend are the same as in Table 3.1.

3.2 Biological interpretations of meta-analyzed ASD expression profiles

Taken as a whole, the expression patterns we observed in brain point to the possibility of effects relating to cellular respiration. Within the cellular respiration group, SLC25A12 (not a hit at an FDR of 0.05, but within a relaxed FDR threshold of 0.1), a mitochondrial aspartate/glutamate carrier, was previously reported as a susceptibility gene as it harbors SNPs strongly associated with autism [84]. In addition to the genes directly annotated with this function, further examination reveals other highly ranked genes in our data that are known to play regulatory roles in cellular metabolism or mitochondria-related functions, though they are not directly annotated in the GO functional groups. For instance, P2RX7 (purinergic receptor P2X, ligand-gated ion channel, 7; CNV gene) is involved in purinergic signaling, a pathway that might play a role in mitochondrial dysfunction-associated ASD [85]. Mitochondrial dysfunction (MD) has been a topic of study in some neuropsychiatric disorders (notably bipolar disorder [86, 87]). Some have conjectured a 4-5% prevalence of MD in individuals on the autism spectrum [10, 88], but there is little direct evidence in the literature. Investigations of mitochondrial DNA mutations in ASD yielded mixed conclusions [89, 90]. In part supported by the enrichment of "cellular respiration" (consisting of only nuclear-encoded genes), current research seems to indicate a role for nuclear genes in the co-occurrence of MD and ASD [91, 92]. 
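To illustrate how a gene such as SLC25A12 can miss an FDR threshold of 0.05 yet fall within a relaxed threshold of 0.1, the following minimal Python sketch combines per-study p-values and then adjusts for multiple testing. It assumes a Fisher-style combination [55] followed by Benjamini-Hochberg adjustment [56]; it is not the pipeline used in this thesis, and the function names, gene labels and p-values are all hypothetical.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine independent p-values with Fisher's method.

    Under the null, -2 * sum(ln p) is chi-square distributed with 2k degrees
    of freedom. For even degrees of freedom the survival function has the
    closed form exp(-x/2) * sum_{i<k} (x/2)^i / i!, which is used here.
    """
    k = len(pvalues)
    half_stat = -sum(math.log(p) for p in pvalues)  # statistic divided by 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values (q-values) in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


def hits_at(qvalues, threshold):
    """Indices of entries whose q-value is at or below the FDR threshold."""
    return [i for i, q in enumerate(qvalues) if q <= threshold]


# Hypothetical per-study p-values for four made-up genes (three studies each).
per_study = {
    "geneA": [0.001, 0.020, 0.010],
    "geneB": [0.040, 0.200, 0.030],
    "geneC": [0.300, 0.050, 0.100],
    "geneD": [0.700, 0.900, 0.400],
}
genes = list(per_study)
combined = [fisher_combined_pvalue(per_study[g]) for g in genes]
qvalues = benjamini_hochberg(combined)
strict = set(hits_at(qvalues, 0.05))
relaxed = set(hits_at(qvalues, 0.10))
for i, gene in enumerate(genes):
    status = "FDR 0.05" if i in strict else "FDR 0.1 only" if i in relaxed else "not a hit"
    print(f"{gene}: combined p = {combined[i]:.4g}, q = {qvalues[i]:.4g} ({status})")
```

A gene reported as "FDR 0.1 only" in this sketch corresponds to the situation described for SLC25A12: suggestive across studies, but short of the stricter cutoff.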
The genetics of this co-occurrence might not be as simple as in monogenic metabolic disorders with a high prevalence of ASD, such as Smith-Lemli-Opitz syndrome (MIM 270400). However, as our analysis of brain transcriptomes has shown, there are convergent functional consequences of what could be heterogeneous genetic or genomic aberrations underlying the disorders.

Potentially causative rare CNVs are found in up to 20% of ASD cases [10]. While several genes we identified are within regions implicated in CNV studies of ASD, there was no overall significant enrichment. It is still possible that the changes in RNA levels we observed are linked indirectly to CNVs or other types of rare genetic variants, which we are not able to determine because the genomic backgrounds for most of the cases in our data sets were unknown. Genes suggestive of direct correlations include PANX2, RFC2 and 15q genes, which reside in regions that have recurrent (previously reported in several ASD cases) rare CNVs. RFC2 lies in the 7q11.23 region, deletions of which are associated with Williams-Beuren syndrome (MIM 194050). Duplications of this region, concordant with the up-regulation of RFC2 we found, have been strongly linked to autism previously [13]. It is likely, based on the data at hand, that expression changes in an individual reflect underlying chromosome abnormalities. However, it should be noted that some genes with large effects, such as 15q genes, are driven by a subset of cohorts. There might also be other complex links between copy number variation and RNA expression that are not obvious because of varying dosage sensitivity [93], potentially explaining observations of genes with common expression changes in rare or uncommon CNVs. For genes that are actually copy number-sensitive, previous work showed that their expression profiles can be used to predict chromosome abnormalities in blood [79]. Efforts like these, if successful, will improve prognosis of the disorder. 
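The question of whether signature genes fall within CNV regions more often than expected by chance can be framed as an over-representation test. The sketch below uses a hypergeometric upper tail, which is one common choice and not necessarily the test applied in this analysis; the function name and all counts are hypothetical and serve only to show the shape of the calculation.

```python
from math import comb


def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) for a hypergeometric draw without replacement.

    population: number of genes tested
    successes:  genes among them that lie in CNV regions of interest
    draws:      number of genes in the signature
    observed:   signature genes that lie in those CNV regions
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return tail / total


# Hypothetical counts, not taken from the thesis data.
n_genes_tested = 18000      # genes on the platform
n_cnv_genes = 900           # genes within ASD-implicated CNV regions
n_signature = 25            # genes in a core signature
n_signature_in_cnv = 3      # signature genes that lie in CNV regions

p_value = hypergeom_upper_tail(
    n_genes_tested, n_cnv_genes, n_signature, n_signature_in_cnv
)
print(f"P(overlap >= {n_signature_in_cnv}) = {p_value:.3f}")
```

With a short signature list, a few genes in CNV regions can be individually interesting without the overlap as a whole being statistically significant, which is consistent with the distinction drawn above.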
Our own attempt at such prediction was impeded partly by the lack of data or "labels" needed to train an accurate classifier. Another question raised is whether the blood "markers" are relevant to the brain, because the two tissue types exhibit different profiles.

3.3 Limitations and future directions

Most of the literature on ASD expression profiling focuses on analysis of blood samples rather than brain. Because brain samples are hard to obtain, the hope is that blood cells can serve as a surrogate for brain [30, 31]. But this was not supported by our results. We showed that the biological profiles of brain and blood differ at the molecular and functional levels. Although we note that external factors such as medication, age range (while blood samples are often obtained from a younger population, the brain samples include a wider age range), and cause of death may affect gene expression in post mortem brain tissue, it may be infeasible to address these issues at present due to the lack of brain tissue resources. In addition, current data do not provide information on developmental trajectories, so we are limited to "snapshots" of expression levels in both tissue types. Longitudinal studies in neuroimaging [94] or embryonic brain cells [95] may yield further insights.

Another important caveat for our interpretation is the difficulty of attributing any causal role to the changes we observe. They could be sequelae of ASD, or due to comorbid conditions. Most of the studies we used did not provide any details about comorbidities, making this difficult to address in our analysis. Future studies should endeavor to provide such details to allow further dissection of real effects from potential confounds. As current leading hypotheses on the etiology of ASD focus on brain connectivity or synaptic function [96], it is a challenge to determine where mitochondrial function fits into the picture. 
The multitude of genes involved in this function also makes it hard to determine its specificity to the disorder. Indeed, one could refute specificity by providing evidence of similar trends in other unrelated expression studies.

In previous work from the lab, gene annotation biases were shown to impact gene function prediction [62, 97]. Briefly, genes or gene groups that are deeply annotated (multifunctional) are more likely to appear enriched in a functional enrichment analysis, whereas genes that are not well characterized tend to fall out of favour. This bias is ameliorated in our functional enrichment analysis, where we recovered a set of genes (with relatively low multifunctionality) representing "cellular respiration". When evaluating individual genes, however, one has to be aware that although they could be relevant to autism, they might not be specific. This is evident in their associations with schizophrenia, as well as numerous other non-neuropsychiatric phenotypes or disorders. Retrospectively, this is also reflected in the lack of enrichment in GO terms associated with neurological functions, although several hits in the meta-signatures are associated with neuropsychiatric or neurodevelopmental syndromes. This is not to say that "non-specific" genes are uninteresting. Rather, I emphasize the importance of making cautious assessments by accounting for potential biases.

3.4 Conclusion

In conclusion, this meta-analysis reveals subtle but consistent changes in expression in the brains of individuals with ASD. Future work could explore whether these changes are replicable in additional cohorts. In blood, the signals were much weaker and more heterogeneous, with the clearest effects being associated with duplications in 15q. 
The tentative interpretation of this is that blood may not be a good tissue for identification of commonalities in the transcriptome in ASD, but it might be useful in probing the biological effects of specific chromosome abnormalities.

Bibliography

[1] Elodie Portales-Casamar, Carolyn Ch'ng, Frances Lui, Nicolas St-Georges, Anton Zoubarev, Artemis Y. Lai, Mark Lee, Cathy Kwok, Willie Kwok, Luchia Tseng, and Paul Pavlidis. Neurocarta: aggregating and sharing disease-gene relations for the neurosciences. BMC Genomics, 14(1):129, February 2013. ISSN 1471-2164. doi:10.1186/1471-2164-14-129. URL http://www.biomedcentral.com/1471-2164/14/129/abstract. PMID: 23442263. → pages iii, 22, 40, 48
[2] Jamee M. Berg and Daniel H. Geschwind. Autism genetics: searching for specificity and convergence. Genome Biology, 13(7):247, July 2012. ISSN 1465-6906. doi:10.1186/gb4034. URL http://genomebiology.com/2012/13/7/247/abstract. → pages 1
[3] American Psychiatric Association. Diagnostic and statistical manual of mental disorders DSM-5. Arlington, VA, 2013. ISBN 9780890425596 0890425590 9780890425541 089042554X 9780890425558 0890425558. URL http://dsm.psychiatryonline.org/book.aspx?bookid=556. → pages 2
[4] E. Bleuler and N.S. Kline. Synopsis of Eugen Bleuler's Dementia Praecox, Or The Group of Schizophrenias. International Universities Press, 1952. URL http://books.google.ca/books?id=Jg54OwAACAAJ. → pages 2
[5] L Kanner. Autistic disturbances of affective contact. Acta Paedopsychiatr, 35(4):100–136, 1968. ISSN 0001-6586. PMID: 4880460. → pages 2
[6] Dennis S Charney and Eric J Nestler. Neurobiology of mental illness. Oxford University Press, Oxford; New York, 2004. ISBN 9780199725250 019972525X 9780195149623 0195149629. URL http://site.ebrary.com/id/10317706. → pages 2
[7] Stuart Murray. Autism. The Routledge series integrating science and culture. Routledge, New York, 2012. ISBN 9780415884983. → pages 2
[8] Joseph D Buxbaum and Patrick R Hof. The neuroscience of autism spectrum disorders. 
Academic Press, Oxford; Waltham, MA, 2013. ISBN 9780123919243 012391924X. URL http://www.sciencedirect.com/science/book/9780123919243. → pages 2
[9] S Folstein and M Rutter. Infantile autism: a genetic study of 21 twin pairs. Journal Of Child Psychology And Psychiatry, And Allied Disciplines, 18(4):297–321, 1977. ISSN 0021-9630. URL http://search.ebscohost.com/login.aspx?direct=true&db=mnh&AN=562353&site=ehost-live&scope=site. → pages 2
[10] Judith H. Miles. Autism spectrum disorders - A genetics review. Genetics in Medicine, 13(4):278–294, 2011. ISSN 1098-3600. doi:10.1097/GIM.0b013e3181ff67ba. → pages 3, 44, 54
[11] Kai Wang, Haitao Zhang, Deqiong Ma, Maja Bucan, Joseph T. Glessner, Brett S. Abrahams, Daria Salyakina, Marcin Imielinski, Jonathan P. Bradfield, Patrick M. A. Sleiman, Cecilia E. Kim, Cuiping Hou, Edward Frackelton, Rosetta Chiavacci, Nagahide Takahashi, Takeshi Sakurai, Eric Rappaport, Clara M. Lajonchere, Jeffrey Munson, Annette Estes, Olena Korvatska, Joseph Piven, Lisa I. Sonnenblick, Ana I. Alvarez Retuerto, Edward I. Herman, Hongmei Dong, Ted Hutman, Marian Sigman, Sally Ozonoff, Ami Klin, Thomas Owley, John A. Sweeney, Camille W. Brune, Rita M. Cantor, Raphael Bernier, John R. Gilbert, Michael L. Cuccaro, William M. McMahon, Judith Miller, Matthew W. State, Thomas H. Wassink, Hilary Coon, Susan E. Levy, Robert T. Schultz, John I. Nurnberger, Jonathan L. Haines, James S. Sutcliffe, Edwin H. Cook, Nancy J. Minshew, Joseph D. Buxbaum, Geraldine Dawson, Struan F. A. Grant, Daniel H. Geschwind, Margaret A. Pericak-Vance, Gerard D. Schellenberg, and Hakon Hakonarson. Common genetic variants on 5p14.1 associate with autism spectrum disorders. Nature, 459(7246):528–533, April 2009. ISSN 0028-0836. doi:10.1038/nature07999. URL http://www.nature.com/nature/journal/v459/n7246/full/nature07999.html. → pages 3
[12] Lauren A. Weiss et al. A genome-wide linkage and association scan reveals novel loci for autism. Nature, 461(7265):802–808, October 2009. 
ISSN 0028-0836. doi:10.1038/nature08490. URL http://www.nature.com/nature/journal/v461/n7265/full/nature08490.html. → pages 3
[13] Stephan J Sanders, A Gulhan Ercan-Sencicek, Vanessa Hus, Rui Luo, Michael T Murtha, Daniel Moreno-De-Luca, Su H Chu, Michael P Moreau, Abha R Gupta, Susanne A Thomson, Christopher E Mason, Kaya Bilguvar, Patricia B S Celestino-Soper, Murim Choi, Emily L Crawford, Lea Davis, Nicole R Davis Wright, Rahul M Dhodapkar, Michael DiCola, Nicholas M DiLullo, Thomas V Fernandez, Vikram Fielding-Singh, Daniel O Fishman, Stephanie Frahm, Rouben Garagaloyan, Gerald S Goh, Sindhuja Kammela, Lambertus Klei, Jennifer K Lowe, Sabata C Lund, Anna D McGrew, Kyle A Meyer, William J Moffat, John D Murdoch, Brian J O'Roak, Gordon T Ober, Rebecca S Pottenger, Melanie J Raubeson, Youeun Song, Qi Wang, Brian L Yaspan, Timothy W Yu, Ilana R Yurkiewicz, Arthur L Beaudet, Rita M Cantor, Martin Curland, Dorothy E Grice, Murat Günel, Richard P Lifton, Shrikant M Mane, Donna M Martin, Chad A Shaw, Michael Sheldon, Jay A Tischfield, Christopher A Walsh, Eric M Morrow, David H Ledbetter, Eric Fombonne, Catherine Lord, Christa Lese Martin, Andrew I Brooks, James S Sutcliffe, Edwin H Cook Jr, Daniel Geschwind, Kathryn Roeder, Bernie Devlin, and Matthew W State. Multiple recurrent de novo CNVs, including duplications of the 7q11.23 williams syndrome region, are strongly associated with autism. Neuron, 70(5):863–885, June 2011. ISSN 1097-4199. doi:10.1016/j.neuron.2011.05.002. PMID: 21658581. → 
pages 3, 22, 23, 54
[14] Christian R Marshall, Abdul Noor, John B Vincent, Anath C Lionel, Lars Feuk, Jennifer Skaug, Mary Shago, Rainald Moessner, Dalila Pinto, Yan Ren, Bhooma Thiruvahindrapduram, Andreas Fiebig, Stefan Schreiber, Jan Friedman, Cees E J Ketelaars, Yvonne J Vos, Can Ficicioglu, Susan Kirkpatrick, Rob Nicolson, Leon Sloman, Anne Summers, Clare A Gibbons, Ahmad Teebi, David Chitayat, Rosanna Weksberg, Ann Thompson, Cathy Vardy, Vicki Crosbie, Sandra Luscombe, Rebecca Baatjes, Lonnie Zwaigenbaum, Wendy Roberts, Bridget Fernandez, Peter Szatmari, and Stephen W Scherer. Structural variation of chromosomes in autism spectrum disorder. Am. J. Hum. Genet., 82(2):477–488, February 2008. ISSN 1537-6605. doi:10.1016/j.ajhg.2007.12.009. PMID: 18252227. → pages 3, 22
[15] Brian J. O'Roak, Laura Vives, Santhosh Girirajan, Emre Karakoc, Niklas Krumm, Bradley P. Coe, Roie Levy, Arthur Ko, Choli Lee, Joshua D. Smith, Emily H. Turner, Ian B. Stanaway, Benjamin Vernot, Maika Malig, Carl Baker, Beau Reilly, Joshua M. Akey, Elhanan Borenstein, Mark J. Rieder, Deborah A. Nickerson, Raphael Bernier, Jay Shendure, and Evan E. Eichler. Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations. Nature, April 2012. ISSN 0028-0836. doi:10.1038/nature10989. URL http://www.nature.com/nature/journal/vaop/ncurrent/full/nature10989.html?WT.ec_id=NATURE-20120405. → pages 3
[16] Benjamin M. Neale, Yan Kou, Li Liu, Avi Ma'ayan, Kaitlin E. Samocha, Aniko Sabo, Chiao-Feng Lin, Christine Stevens, Li-San Wang, Vladimir Makarov, Paz Polak, Seungtai Yoon, Jared Maguire, Emily L. Crawford, Nicholas G. Campbell, Evan T. Geller, Otto Valladares, Chad Schafer, Han Liu, Tuo Zhao, Guiqing Cai, Jayon Lihm, Ruth Dannenfelser, Omar Jabado, Zuleyma Peralta, Uma Nagaswamy, Donna Muzny, Jeffrey G. Reid, Irene Newsham, Yuanqing Wu, Lora Lewis, Yi Han, Benjamin F. 
Voight, Elaine Lim, Elizabeth Rossin, Andrew Kirby, Jason Flannick, Menachem Fromer, Khalid Shakir, Tim Fennell, Kiran Garimella, Eric Banks, Ryan Poplin, Stacey Gabriel, Mark DePristo, Jack R. Wimbish, Braden E. Boone, Shawn E. Levy, Catalina Betancur, Shamil Sunyaev, Eric Boerwinkle, Joseph D. Buxbaum, Edwin H. Cook, Bernie Devlin, Richard A. Gibbs, Kathryn Roeder, Gerard D. Schellenberg, James S. Sutcliffe, and Mark J. Daly. Patterns and rates of exonic de novo mutations in autism spectrum disorders. Nature, April 2012. ISSN 0028-0836. doi:10.1038/nature11011. URL http://www.nature.com/nature/journal/vaop/ncurrent/full/nature11011.html?WT.ec_id=NATURE-20120405.
[17] Stephan J. Sanders, Michael T. Murtha, Abha R. Gupta, John D. Murdoch, Melanie J. Raubeson, A. Jeremy Willsey, A. Gulhan Ercan-Sencicek, Nicholas M. DiLullo, Neelroop N. Parikshak, Jason L. Stein, Michael F. Walker, Gordon T. Ober, Nicole A. Teran, Youeun Song, Paul El-Fishawy, Ryan C. Murtha, Murim Choi, John D. Overton, Robert D. Bjornson, Nicholas J. Carriero, Kyle A. Meyer, Kaya Bilguvar, Shrikant M. Mane, Nenad Šestan, Richard P. Lifton, Murat Günel, Kathryn Roeder, Daniel H. Geschwind, Bernie Devlin, and Matthew W. State. De novo mutations revealed by whole-exome sequencing are strongly associated with autism. Nature, April 2012. ISSN 0028-0836. doi:10.1038/nature10945. URL http://www.nature.com/nature/journal/vaop/ncurrent/full/nature10945.html?WT.ec_id=NATURE-20120405. → pages 3
[18] Timothy W. Yu, Maria H. Chahrour, Michael E. Coulter, Sarn Jiralerspong, Kazuko Okamura-Ikeda, Bulent Ataman, Klaus Schmitz-Abe, David A. Harmin, Mazhar Adli, Athar N. Malik, Alissa M. D'Gama, Elaine T. Lim, Stephan J. Sanders, Ganesh H. Mochida, Jennifer N. Partlow, Christine M. Sunu, Jillian M. Felie, Jacqueline Rodriguez, Ramzi H. Nasir, Janice Ware, Robert M. Joseph, R. Sean Hill, Benjamin Y. Kwan, Muna Al-Saffar, Nahit M. Mukaddes, Asif Hashmi, Soher Balkhy, Generoso G. Gascon, Fuki M. 
Hisama, Elaine LeClair, Annapurna Poduri, Ozgur Oner, Samira Al-Saad, Sadika A. Al-Awadi, Laila Bastaki, Tawfeg Ben-Omran, Ahmad S. Teebi, Lihadh Al-Gazali, Valsamma Eapen, Christine R. Stevens, Leonard Rappaport, Stacey B. Gabriel, Kyriacos Markianos, Matthew W. State, Michael E. Greenberg, Hisaaki Taniguchi, Nancy E. Braverman, Eric M. Morrow, and Christopher A. Walsh. Using whole-exome sequencing to identify inherited causes of autism. Neuron, 77(2):259–273, January 2013. ISSN 0896-6273. doi:10.1016/j.neuron.2012.11.002. URL http://www.cell.com/neuron/abstract/S0896-6273(12)00993-2. → pages 3
[19] Gaia Novarino, Paul El-Fishawy, Hulya Kayserili, Nagwa A Meguid, Eric M Scott, Jana Schroth, Jennifer L Silhavy, Majdi Kara, Rehab O Khalil, Tawfeg Ben-Omran, A Gulhan Ercan-Sencicek, Adel F Hashish, Stephan J Sanders, Abha R Gupta, Hebatalla S Hashem, Dietrich Matern, Stacey Gabriel, Larry Sweetman, Yasmeen Rahimi, Robert A Harris, Matthew W State, and Joseph G Gleeson. Mutations in BCKD-kinase lead to a potentially treatable form of autism with epilepsy. Science, 338(6105):394–397, October 2012. ISSN 1095-9203. doi:10.1126/science.1224631. PMID: 22956686. → pages 3
[20] Uta Frith. Autism: explaining the enigma. Blackwell Pub, Malden, MA, 2nd edition, 2003. ISBN 0631229000. → pages 5
[21] P. Hammond, C. Forster-Gibson, A. E. Chudley, J. E. Allanson, T. J. Hutton, S. A. Farrell, J. McKenzie, J. J. A. Holden, and M. E. S. Lewis. Face–brain asymmetry in autism spectrum disorders. Mol Psychiatry, 13(6):614–623, March 2008. ISSN 1359-4184. doi:10.1038/mp.2008.18. URL http://www.nature.com/mp/journal/v13/n6/full/mp200818a.html. → pages 5
[22] M. D. Spencer, R. J. Holt, L. R. Chura, J. Suckling, A. J. Calder, E. T. Bullmore, and S. Baron-Cohen. A novel functional brain imaging endophenotype of autism: the neural response to facial expression of emotion. Transl Psychiatry, 1(7):e19, July 2011. doi:10.1038/tp.2011.18. URL http://www.nature.com/tp/journal/v1/n7/full/tp201118a.html. → 
pages 5
[23] Yasunari Sakai, Chad A Shaw, Brian C Dawson, Diana V Dugas, Zaina Al-Mohtaseb, David E Hill, and Huda Y Zoghbi. Protein interactome reveals converging molecular pathways among autism disorders. Sci Transl Med, 3(86):86ra49, June 2011. ISSN 1946-6242. doi:10.1126/scitranslmed.3002166. PMID: 21653829. → pages 5
[24] Maggie L. Chow, Tiziano Pramparo, Mary E. Winn, Cynthia Carter Barnes, Hai-Ri Li, Lauren Weiss, Jian-Bing Fan, Sarah Murray, Craig April, Haim Belinson, Xiang-Dong Fu, Anthony Wynshaw-Boris, Nicholas J. Schork, and Eric Courchesne. Age-dependent brain gene expression and copy number anomalies in autism suggest distinct pathological processes at young versus mature ages. PLoS Genet, 8(3):e1002592, March 2012. doi:10.1371/journal.pgen.1002592. URL http://dx.doi.org/10.1371/journal.pgen.1002592. → pages 6
[25] Irina Voineagu, Xinchen Wang, Patrick Johnston, Jennifer K. Lowe, Yuan Tian, Steve Horvath, Jonathan Mill, Rita M. Cantor, Benjamin J. Blencowe, and Daniel H. Geschwind. Transcriptomic analysis of autistic brain reveals convergent molecular pathology. Nature, advance online publication, May 2011. ISSN 1476-4687. doi:10.1038/nature10110. URL http://dx.doi.org/10.1038/nature10110. → pages 6, 7, 9, 51
[26] Matthew R. Ginsberg, Robert A. Rubin, Tatiana Falcone, Angela H. Ting, and Marvin R. Natowicz. Brain transcriptional and epigenetic associations with autism. PLoS ONE, 7(9):e44736, September 2012. doi:10.1371/journal.pone.0044736. URL http://dx.doi.org/10.1371/journal.pone.0044736. → pages 6, 7
[27] Jeffrey P. Gregg, Lisa Lit, Colin A. Baron, Irva Hertz-Picciotto, Wynn Walker, Ryan A. Davis, Lisa A. Croen, Sally Ozonoff, Robin Hansen, Isaac N. Pessah, and Frank R. Sharp. Gene expression changes in children with autism. Genomics, 91(1):22–29, January 2008. ISSN 0888-7543. doi:10.1016/j.ygeno.2007.09.003. URL http://www.sciencedirect.com/science/article/pii/S0888754307002327. → pages 6
[28] Yuhei Nishimura, Christa L. Martin, Araceli Vazquez-Lopez, Sarah J. 
Spence, Ana Isabel Alvarez-Retuerto, Marian Sigman, Corinna Steindler, Sandra Pellegrini, N. Carolyn Schanen, Stephen T. Warren, and Daniel H. Geschwind. Genome-wide expression profiling of lymphoblastoid cell lines distinguishes different forms of autism and reveals shared pathways. Hum. Mol. Genet., 16(14):1682–1698, July 2007. ISSN 0964-6906, 1460-2083. doi:10.1093/hmg/ddm116. URL http://hmg.oxfordjournals.org/content/16/14/1682. → pages 6, 7
[29] Valerie W Hu, Tewarit Sarachana, Kyung Soon Kim, AnhThu Nguyen, Shreya Kulkarni, Mara E Steinberg, Truong Luu, Yinglei Lai, and Norman H Lee. Gene expression profiling differentiates autism case-controls and phenotypic variants of autism spectrum disorders: evidence for circadian rhythm dysfunction in severe autism. Autism Res, 2(2):78–97, April 2009. ISSN 1939-3806. doi:10.1002/aur.73. URL http://www.ncbi.nlm.nih.gov/pubmed/19418574. PMID: 19418574. → pages 6, 7, 24
[30] Valerie W. Hu, AnhThu Nguyen, Kyung Soon Kim, Mara E. Steinberg, Tewarit Sarachana, Michele A. Scully, Steven J. Soldin, Truong Luu, and Norman H. Lee. Gene expression profiling of lymphoblasts from autistic and nonaffected sib pairs: Altered pathways in neuronal development and steroid biosynthesis. PLoS ONE, 4(6):e5775, June 2009. doi:10.1371/journal.pone.0005775. URL http://dx.doi.org/10.1371/journal.pone.0005775. → pages 6, 24, 55
[31] Sek Won Kong, Christin D. Collins, Yuko Shimizu-Motohashi, Ingrid A. Holm, Malcolm G. Campbell, In-Hee Lee, Stephanie J. Brewster, Ellen Hanson, Heather K. Harris, Kathryn R. Lowe, Adrianna Saada, Andrea Mora, Kimberly Madison, Rachel Hundley, Jessica Egan, Jillian McCarthy, Ally Eran, Michal Galdzicki, Leonard Rappaport, Louis M. Kunkel, and Isaac S. Kohane. Characteristics and predictive value of blood transcriptome signature in males with autism spectrum disorders. PLoS ONE, 7(12):e49475, December 2012. ISSN 1932-6203. doi:10.1371/journal.pone.0049475. → pages 6, 9, 51, 55
[32] Mark D. Alter, Rutwik Kharkar, Keri E. Ramsey, David W. 
Craig, Raun D. Melmed, Theresa A. Grebe, R. Curtis Bay, Sharman Ober-Reynolds, Janet Kirwan, Josh J. Jones, J. Blake Turner, Rene Hen, and Dietrich A. Stephan. Autism and increased paternal age related changes in global levels of gene expression regulation. PLoS ONE, 6(2):e16715, February 2011. doi:10.1371/journal.pone.0016715. URL http://dx.doi.org/10.1371/journal.pone.0016715. → pages 6
[33] Rui Luo, Stephan J Sanders, Yuan Tian, Irina Voineagu, Ni Huang, Su H Chu, Lambertus Klei, Chaochao Cai, Jing Ou, Jennifer K Lowe, Matthew E Hurles, Bernie Devlin, Matthew W State, and Daniel H Geschwind. Genome-wide transcriptome profiling reveals the functional impact of rare de novo and recurrent CNVs in autism spectrum disorders. Am J Hum Genet, June 2012. ISSN 1537-6605. doi:10.1016/j.ajhg.2012.05.011. URL http://www.ncbi.nlm.nih.gov/pubmed/22726847. PMID: 22726847. → pages 6, 44, 45
[34] Patrick F. Sullivan, Cheng Fan, and Charles M. Perou. Evaluating the comparability of gene expression in blood and brain. Am. J. Med. Genet. Part B, 141B(3):261–268, 2006. ISSN 1552-485X. doi:10.1002/ajmg.b.30272. URL http://onlinelibrary.wiley.com/doi/10.1002/ajmg.b.30272/abstract. → pages 8
[35] Margus Lukk, Misha Kapushesky, Janne Nikkilä, Helen Parkinson, Angela Goncalves, Wolfgang Huber, Esko Ukkonen, and Alvis Brazma. A global map of human gene expression. Nature biotechnology, 28(4):322–324, 2010. doi:10.1038/nbt0410-322. URL http://dx.doi.org/10.1038/nbt0410-322. → pages 8
[36] Jennie E Larkin, Bryan C Frank, Haralambos Gavras, Razvan Sultana, and John Quackenbush. Independence and reproducibility across microarray platforms. Nat. Methods, 2(5):337–344, April 2005. ISSN 1548-7091, 1548-7105. doi:10.1038/nmeth757. URL http://www.nature.com/doifinder/10.1038/nmeth757. → pages 8
[37] Adaikalavan Ramasamy, Adrian Mondry, Chris C Holmes, and Douglas G Altman. Key issues in conducting a meta-analysis of gene expression microarray datasets. PLoS Med, 5(9):e184, September 2008. doi:10.1371/journal.pmed.0050184. 
URL http://dx.doi.org/10.1371/journal.pmed.0050184. → pages 8, 18
[38] Daniel H Geschwind and Pat Levitt. Autism spectrum disorders: developmental disconnection syndromes. Curr. Opin. Neurobiol., 17(1):103–111, February 2007. ISSN 0959-4388. doi:10.1016/j.conb.2007.01.009. PMID: 17275283. → pages 8
[39] Jon McClellan and Mary-Claire King. Genetic heterogeneity in human disease. Cell, 141(2):210–217, April 2010. ISSN 1097-4172. doi:10.1016/j.cell.2010.03.032. URL http://www.ncbi.nlm.nih.gov/pubmed/20403315. PMID: 20403315. → pages 8
[40] Julia Feichtinger, Gerhard G. Thallinger, Ramsay J. McFarlane, and Lee D. Larcombe. Microarray meta-analysis: From data to expression to biological relationships. In Zlatko Trajanoski, editor, Computational Medicine, pages 59–77. Springer Vienna, Vienna, 2012. ISBN 978-3-7091-0946-5, 978-3-7091-0947-2. URL http://www.springerlink.com/content/u961373j411101w0/fulltext.html. → pages 9
[41] Morton M Hunt. How science takes stock: the story of meta-analysis. Russell Sage Foundation, New York, 1997. ISBN 0871543893 9780871543899 0871543982 9780871543981. → pages 9
[42] M. Mistry, J. Gillis, and P. Pavlidis. Genome-wide expression profiling of schizophrenia using a large combined cohort. Mol Psychiatry, 18(2):215–225, February 2013. ISSN 1359-4184. doi:10.1038/mp.2011.172. URL http://www.nature.com/mp/journal/v18/n2/full/mp2011172a.html. → pages 9, 19, 22, 23, 32
[43] S. Rogic and P. Pavlidis. Meta-analysis of kindling-induced gene expression changes in the rat hippocampus. Front Neurosci, 3:53, 2009. → pages 9, 18
[44] Kwang Ho Choi, Michael Elashoff, Brandon W Higgs, Jonathan Song, Sanghyeon Kim, Sarven Sabunciyan, Suad Diglisic, Robert H Yolken, Michael B Knable, E Fuller Torrey, and Maree J Webster. Putative psychosis genes in the prefrontal cortex: combined analysis of gene expression microarrays. BMC Psychiatry, 8:87, November 2008. ISSN 1471-244X. doi:10.1186/1471-244X-8-87. URL http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2585075/. 
PMID: 18992145 PMCID: PMC2585075. → pages 9
[45] Harris M. Cooper, Larry V. Hedges, and Jeff C. Valentine. The handbook of research synthesis and meta-analysis. Russell Sage Foundation, New York, 2nd edition, 2009. ISBN 9780871541635. → pages 18
[46] Li Liu, Aniko Sabo, Benjamin M. Neale, Uma Nagaswamy, Christine Stevens, Elaine Lim, Corneliu A. Bodea, Donna Muzny, Jeffrey G. Reid, Eric Banks, Hillary Coon, Mark DePristo, Huyen Dinh, Tim Fennel, Jason Flannick, Stacey Gabriel, Kiran Garimella, Shannon Gross, Alicia Hawes, Lora Lewis, Vladimir Makarov, Jared Maguire, Irene Newsham, Ryan Poplin, Stephan Ripke, Khalid Shakir, Kaitlin E. Samocha, Yuanqing Wu, Eric Boerwinkle, Joseph D. Buxbaum, Edwin H. Cook, Bernie Devlin, Gerard D. Schellenberg, James S. Sutcliffe, Mark J. Daly, Richard A. Gibbs, and Kathryn Roeder. Analysis of rare, exonic variation amongst subjects with autism spectrum disorders and population controls. PLoS Genet, 9(4):e1003443, April 2013. doi:10.1371/journal.pgen.1003443. URL http://dx.doi.org/10.1371/journal.pgen.1003443. → pages 9
[47] E Ben-David and S Shifman. Combined analysis of exome sequencing points toward a major role for transcription regulation during brain development in autism. Mol. Psychiatry, November 2012. ISSN 1476-5578. doi:10.1038/mp.2012.148. PMID: 23147383. → pages 9
[48] T. Barrett, S. E. Wilhite, P. Ledoux, C. Evangelista, I. F. Kim, M. Tomashevsky, K. A. Marshall, K. H. Phillippy, P. M. Sherman, M. Holko, A. Yefanov, H. Lee, N. Zhang, C. L. Robertson, N. Serova, S. Davis, and A. Soboleva. NCBI GEO: archive for functional genomics data sets - update. Nucleic Acids Res, 41(D1):D991–D995, November 2012. ISSN 0305-1048, 1362-4962. doi:10.1093/nar/gks1193. URL http://www.nar.oxfordjournals.org/cgi/doi/10.1093/nar/gks1193. → pages 10
[49] Yuki Kuwano, Yoko Kamio, Tomoko Kawai, Sakurako Katsuura, Naoko Inada, Akiko Takaki, and Kazuhito Rokutan. 
Autism-associated gene expression in peripheral leucocytes commonly observed between subjects with autism and healthy women having autistic children. PLoS ONE, 6(9):e24723, 2011. doi:10.1371/journal.pone.0024723. URL http://dx.doi.org/10.1371/journal.pone.0024723. → pages 11
[50] Laurent Gautier, Leslie Cope, Benjamin M. Bolstad, and Rafael A. Irizarry. affy - analysis of affymetrix genechip data at the probe level. Bioinformatics, 20(3):307–315, 2004. ISSN 1367-4803. doi:10.1093/bioinformatics/btg405. → pages 12
[51] Pan Du, Warren A. Kibbe, and Simon M. Lin. lumi: a pipeline for processing illumina microarray. Bioinformatics, 24(13):1547–1548, July 2008. ISSN 1367-4803, 1460-2059. doi:10.1093/bioinformatics/btn224. URL http://bioinformatics.oxfordjournals.org/content/24/13/1547. PMID: 18467348. → pages 12
[52] Gary A. Churchill. Fundamentals of experimental design for cDNA microarrays. Nat Genet, 32:490–495, December 2002. doi:10.1038/ng1031. URL http://www.nature.com/ng/journal/v32/n4s/full/ng1031.html. → pages 13
[53] W. Evan Johnson, Cheng Li, and Ariel Rabinovic. Adjusting batch effects in microarray expression data using empirical bayes methods. Biostat, 8(1):118–127, January 2007. ISSN 1465-4644, 1468-4357. doi:10.1093/biostatistics/kxj037. URL http://biostatistics.oxfordjournals.org/content/8/1/118. PMID: 16632515. → pages 16
[54] Anton Zoubarev, Kelsey M Hamer, Kiran D Keshav, E Luke McCarthy, Joseph Roy C Santos, Thea Van Rossum, Cameron McDonald, Adam Hall, Xiang Wan, Raymond Lim, Jesse Gillis, and Paul Pavlidis. Gemma: A resource for the re-use, sharing and meta-analysis of expression profiling data. Bioinformatics, 28(17):2272–3, July 2012. ISSN 1367-4811. doi:10.1093/bioinformatics/bts430. URL http://www.ncbi.nlm.nih.gov/pubmed/22782548. PMID: 22782548. → pages 18
[55] RA Fisher. Combining independent tests of significance. American Statistician, 2:30, 1948. → pages 18, 19
[56] Y Benjamini and Y Hochberg. 
Controlling the false discovery rate: a practical andpowerful approach to multiple testing. Journal of the Royal Statistical SocietySeries, B(57):289?300, 1995. ? pages 19[57] Laura Carrel and Huntington F. Willard. X-inactivation profile reveals extensivevariability in x-linked gene expression in females. Nature, 434(7031):400?404,March 2005. ISSN 0028-0836. doi:10.1038/nature03479. URLhttp://www.nature.com/nature/journal/v434/n7031/full/nature03479.html. ? pages19[58] Adeline R. Whitney, Maximilian Diehn, Stephen J. Popper, Ash A. Alizadeh,Jennifer C. Boldrick, David A. Relman, and Patrick O. Brown. Individuality andvariation in gene expression patterns in human blood. PNAS, 100(4):1896?1901,February 2003. ISSN 0027-8424, 1091-6490. doi:10.1073/pnas.252784499. URLhttp://www.pnas.org/content/100/4/1896. PMID: 12578971. ? pages 19[59] Hyo Jung Kang, Yuka Imamura Kawasawa, Feng Cheng, Ying Zhu, Xuming Xu,Mingfeng Li, Andre? M. M. Sousa, Mihovil Pletikos, Kyle A. Meyer, Goran Sedmak,Tobias Guennel, Yurae Shin, Matthew B. Johnson, Z?eljka Krsnik, Simone Mayer,Sofia Fertuzinhos, Sheila Umlauf, Steven N. Lisgo, Alexander Vortmeyer, Daniel R.Weinberger, Shrikant Mane, Thomas M. Hyde, Anita Huttner, Mark Reimers, Joel E.Kleinman, and Nenad estan. Spatio-temporal transcriptome of the human brain.Nature, 478(7370):483?489, October 2011. ISSN 0028-0836.doi:10.1038/nature10523. URLhttp://www.nature.com/nature/journal/v478/n7370/full/nature10523.html. ? pages19[60] Homin K Lee, William Braynen, Kiran Keshav, and Paul Pavlidis. ErmineJ: tool forfunctional analysis of gene expression data sets. BMC Bioinformatics, 6:269, 2005.ISSN 1471-2105. doi:10.1186/1471-2105-6-269. PMID: 16280084. ? pages 21[61] M Ashburner, C A Ball, J A Blake, D Botstein, H Butler, J M Cherry, A P Davis,K Dolinski, S S Dwight, J T Eppig, M A Harris, D P Hill, L Issel-Tarver,A Kasarskis, S Lewis, J C Matese, J E Richardson, M Ringwald, G M Rubin, and65G Sherlock. 
Gene ontology: tool for the unification of biology. the gene ontologyconsortium. Nat. Genet., 25(1):25?29, May 2000. ISSN 1061-4036.doi:10.1038/75556. PMID: 10802651. ? pages 21[62] Jesse Gillis, Meeta Mistry, and Paul Pavlidis. Gene function analysis in complexdata sets using ErmineJ. Nat. Protocols, 5(6):1148?1159, June 2010. ISSN1754-2189. doi:10.1038/nprot.2010.78. URLhttp://www.nature.com/nprot/journal/v5/n6/full/nprot.2010.78.html. ? pages 22, 55[63] D. L. Wheeler, T. Barrett, D. A. Benson, S. H. Bryant, K. Canese, V. Chetvernin,D. M. Church, M. DiCuccio, R. Edgar, S. Federhen, L. Y. Geer, Y. Kapustin,O. Khovayko, D. Landsman, D. J. Lipman, T. L. Madden, D. R. Maglott, J. Ostell,V. Miller, K. D. Pruitt, G. D. Schuler, E. Sequeira, S. T. Sherry, K. Sirotkin,A. Souvorov, G. Starchenko, R. L. Tatusov, T. A. Tatusova, L. Wagner, andE. Yaschenko. Database resources of the national center for biotechnologyinformation. Nucl. Acids Res., 35(Database):D5?D12, January 2007. ISSN0305-1048, 1362-4962. doi:10.1093/nar/gkl1031. URLhttp://www.nar.oxfordjournals.org/cgi/doi/10.1093/nar/gkl1031. ? pages 22[64] Krassimira Garbett, Philip J. Ebert, Amanda Mitchell, Carla Lintas, Barbara Manzi,Kroly Mirnics, and Antonio M. Persico. Immune transcriptome alterations in thetemporal cortex of subjects with autism. Neurobiol Dis, 30(3):303?311, June 2008.ISSN 0969-9961. doi:10.1016/j.nbd.2008.01.012. URLhttp://www.ncbi.nlm.nih.gov/pmc/articles/PMC2693090/. PMID: 18378158 PMCID:PMC2693090. ? pages 22, 52, 53[65] A E Purcell, O H Jeon, A W Zimmerman, M E Blue, and J Pevsner. Postmortembrain abnormalities of the glutamate neurotransmitter system in autism. Neurology,57(9):1618?1628, November 2001. ISSN 0028-3878. PMID: 11706102. ? pages22, 52, 53[66] Dalila Pinto, Alistair T. Pagnamenta, Lambertus Klei, Richard Anney, DanieleMerico, Regina Regan, Judith Conroy, Tiago R. Magalhaes, Catarina Correia,Brett S. Abrahams, Joana Almeida, Elena Bacchelli, Gary D. 
Bader, Anthony J.Bailey, Gillian Baird, Agatino Battaglia, Tom Berney, Nadia Bolshakova, SvenBo?lte, Patrick F. Bolton, Thomas Bourgeron, Sean Brennan, Jessica Brian, Susan E.Bryson, Andrew R. Carson, Guillermo Casallo, Jillian Casey, Brian H. Y. Chung,Lynne Cochrane, Christina Corsello, Emily L. Crawford, Andrew Crossett, CherylCytrynbaum, Geraldine Dawson, Maretha de Jonge, Richard Delorme, Irene Drmic,Eftichia Duketis, Frederico Duque, Annette Estes, Penny Farrar, Bridget A.Fernandez, Susan E. Folstein, Eric Fombonne, Christine M. Freitag, John Gilbert,Christopher Gillberg, Joseph T. Glessner, Jeremy Goldberg, Andrew Green,Jonathan Green, Stephen J. Guter, Hakon Hakonarson, Elizabeth A. Heron, MatthewHill, Richard Holt, Jennifer L. Howe, Gillian Hughes, Vanessa Hus, Roberta Igliozzi,Cecilia Kim, Sabine M. Klauck, Alexander Kolevzon, Olena Korvatska, VladKustanovich, Clara M. Lajonchere, Janine A. Lamb, Magdalena Laskawiec, Marion66Leboyer, Ann Le Couteur, Bennett L. Leventhal, Anath C. Lionel, Xiao-Qing Liu,Catherine Lord, Linda Lotspeich, Sabata C. Lund, Elena Maestrini, WilliamMahoney, Carine Mantoulan, Christian R. Marshall, Helen McConachie,Christopher J. McDougle, Jane McGrath, William M. McMahon, AlisonMerikangas, Ohsuke Migita, Nancy J. Minshew, Ghazala K. Mirza, Jeff Munson,Stanley F. Nelson, Carolyn Noakes, Abdul Noor, Gudrun Nygren, Guiomar Oliveira,Katerina Papanikolaou, Jeremy R. Parr, Barbara Parrini, Tara Paton, Andrew Pickles,Marion Pilorge, Joseph Piven, Chris P. Ponting, David J. Posey, Annemarie Poustka,Fritz Poustka, Aparna Prasad, Jiannis Ragoussis, Katy Renshaw, Jessica Rickaby,Wendy Roberts, Kathryn Roeder, Bernadette Roge, Michael L. Rutter, Laura J.Bierut, John P. Rice, Jeff Salt, Katherine Sansom, Daisuke Sato, Ricardo Segurado,Ana F. Sequeira, Lili Senman, Naisha Shah, Val C. 
Sheffield, Latha Soorya, Ine?sSousa, Olaf Stein, Nuala Sykes, Vera Stoppioni, Christina Strawbridge, RaffaellaTancredi, Katherine Tansey, Bhooma Thiruvahindrapduram, Ann P. Thompson,Susanne Thomson, Ana Tryfon, John Tsiantis, Herman Van Engeland, John B.Vincent, Fred Volkmar, Simon Wallace, Kai Wang, Zhouzhi Wang, Thomas H.Wassink, Caleb Webber, Rosanna Weksberg, Kirsty Wing, Kerstin Wittemeyer,Shawn Wood, Jing Wu, Brian L. Yaspan, Danielle Zurawiecki, LonnieZwaigenbaum, Joseph D. Buxbaum, Rita M. Cantor, Edwin H. Cook, Hilary Coon,Michael L. Cuccaro, Bernie Devlin, Sean Ennis, Louise Gallagher, Daniel H.Geschwind, Michael Gill, Jonathan L. Haines, Joachim Hallmayer, Judith Miller,Anthony P. Monaco, John I. Nurnberger Jr, Andrew D. Paterson, Margaret A.Pericak-Vance, Gerard D. Schellenberg, Peter Szatmari, Astrid M. Vicente,Veronica J. Vieland, Ellen M. Wijsman, Stephen W. Scherer, James S. Sutcliffe, andCatalina Betancur. Functional impact of global rare copy number variation in autismspectrum disorders. Nature, 466(7304):368?372, July 2010. ISSN 0028-0836.doi:10.1038/nature09146. URLhttp://www.nature.com/nature/journal/v466/n7304/full/nature09146.html. ? pages22[67] A. John Iafrate, Lars Feuk, Miguel N. Rivera, Marc L. Listewnik, Patricia K.Donahoe, Ying Qi, Stephen W. Scherer, and Charles Lee. Detection of large-scalevariation in the human genome. Nat Genet, 36(9):949?951, September 2004. ISSN1061-4036. doi:10.1038/ng1416. URLhttp://www.nature.com/ng/journal/v36/n9/abs/ng1416.html. ? pages 23[68] A. S. Hinrichs, D. Karolchik, R. Baertsch, G. P. Barber, G. Bejerano, H. Clawson,M. Diekhans, T. S. Furey, R. A. Harte, F. Hsu, J. Hillman-Jackson, R. M. Kuhn, J. S.Pedersen, A. Pohl, B. J. Raney, K. R. Rosenbloom, A. Siepel, K. E. Smith, C. W.Sugnet, A. Sultan-Qurraie, D. J. Thomas, H. Trumbower, R. J. Weber, M. Weirauch,A. S. Zweig, D. Haussler, and W. J. Kent. The UCSC genome browser database:update 2006. 
Nucleic Acids Res, 34(Database issue):D590?D598, January 2006.ISSN 0305-1048. doi:10.1093/nar/gkj144. URLhttp://www.ncbi.nlm.nih.gov/pmc/articles/PMC1347506/. PMID: 16381938 PMCID:PMC1347506. ? pages 2367[69] P. Pavlidis, I. Wapinski, and W. S. Noble. Support vector machine classification onthe web. Bioinformatics, 20:586?7, 2004. ISSN 1367-4803 (Print). URLhttp://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list uids=14990457. 4. ? pages 23[70] T S Keshava Prasad, Kumaran Kandasamy, and Akhilesh Pandey. Human proteinreference database and human proteinpedia as discovery tools for systems biology.Methods Mol. Biol., 577:67?79, 2009. ISSN 1940-6029.doi:10.1007/978-1-60761-232-2 6. PMID: 19718509. ? pages 23[71] Andrew Chatr-aryamontri, Arnaud Ceol, Luisa Montecchi Palazzi, GiulianoNardelli, Maria Victoria Schneider, Luisa Castagnoli, and Gianni Cesareni. MINT:the molecular INTeraction database. Nucl. Acids Res., 35(suppl 1):D572?D574,January 2007. ISSN 0305-1048, 1362-4962. doi:10.1093/nar/gkl950. URLhttp://nar.oxfordjournals.org/content/35/suppl 1/D572. PMID: 17135203. ? pages23[72] Ioannis Xenarios, Danny W. Rice, Lukasz Salwinski, Marisa K. Baron, Edward M.Marcotte, and David Eisenberg. DIP: the database of interacting proteins. NucleicAcids Res, 28(1):289?291, January 2000. ISSN 0305-1048. URLhttp://www.ncbi.nlm.nih.gov/pmc/articles/PMC102387/. PMID: 10592249 PMCID:PMC102387. ? pages 23[73] David J. Lynn, Geoffrey L. Winsor, Calvin Chan, Nicolas Richard, Matthew R.Laird, Aaron Barsky, Jennifer L. Gardy, Fiona M. Roche, Timothy H. W. Chan,Naisha Shah, Raymond Lo, Misbah Naseer, Jaimmie Que, Melissa Yau, MichaelAcab, Dan Tulpan, Matthew D. Whiteside, Avinash Chikatamarla, Bernadette Mah,Tamara Munzner, Karsten Hokamp, Robert E. W. Hancock, and Fiona S. L.Brinkman. InnateDB: facilitating systems-level analyses of the mammalian innateimmune response. Mol Syst Biol, 4(1), September 2008. 
doi:10.1038/msb.2008.55.URL http://www.nature.com/msb/journal/v4/n1/full/msb200855.html. ? pages 23[74] Sabry Razick, George Magklaras, and Ian M. Donaldson. iRefIndex: a consolidatedprotein interaction database with provenance. BMC Bioinformatics, 9(1):405,September 2008. ISSN 1471-2105. doi:10.1186/1471-2105-9-405. URLhttp://www.biomedcentral.com/1471-2105/9/405/abstract. PMID: 18823568. ?pages 23[75] American Psychiatric Association, American Psychiatric Association, and TaskForce on DSM-IV. Diagnostic and statistical manual of mental disordersDSM-IV-TR. American Psychiatric Association, Washington, DC, 2000. ISBN0890423342 9780890423349. URLhttp://dsm.psychiatryonline.org/book.aspx?bookid=22. ? pages 24[76] C Lord, M Rutter, and A Le Couteur. Autism diagnostic interview-revised: a revisedversion of a diagnostic interview for caregivers of individuals with possiblepervasive developmental disorders. J Autism Dev Disord, 24(5):659?685, October1994. ISSN 0162-3257. PMID: 7814313. ? pages 2468[77] C Lord, M Rutter, S Goode, J Heemsbergen, H Jordan, L Mawhood, and E Schopler.Autism diagnostic observation schedule: a standardized observation ofcommunicative and social behavior. J Autism Dev Disord, 19(2):185?212, June1989. ISSN 0162-3257. PMID: 2745388. ? pages 24[78] M. Mistry and P. Pavlidis. A cross-laboratory comparison of expression profilingdata from normal human postmortem brain. Neuroscience, 167:384?95, 2010. ISSN1873-7544 (Electronic) 0306-4522 (Linking).doi:S0306-4522(10)00017-5[pii]10.1016/j.neuroscience.2010.01.016. URLhttp://www.ncbi.nlm.nih.gov/pubmed/20138973. 2. ? pages 31[79] Yiming Zhou, Qing Zhang, Owen Stephens, Christoph J Heuck, Erming Tian,Jeffrey R Sawyer, Marie-Astrid Cartron-Mizeracki, Pingping Qu, Jason Keller,Joshua Epstein, Bart Barlogie, and Jr Shaughnessy, John D. Prediction ofcytogenetic abnormalities with gene expression profiles. Blood, 119(21):e148?150,May 2012. ISSN 1528-0020. doi:10.1182/blood-2011-10-388702. PMID:22496154. ? 
pages 44, 55[80] Gerald D. Fischbach and Catherine Lord. The simons simplex collection: A resourcefor identification of autism genetic risk factors. Neuron, 68(2):192?195, October2010. ISSN 0896-6273. doi:10.1016/j.neuron.2010.10.006. URLhttp://www.sciencedirect.com/science/article/pii/S0896627310008305. ? pages 50[81] Georgy Bakalkin, Hiroyuki Watanabe, Justyna Jezierska, Clo Depoorter, CorienVerschuuren-Bemelmans, Igor Bazov, Konstantin A Artemenko, Tatjana Yakovleva,Dennis Dooijes, Bart P C Van de Warrenburg, Roman A Zubarev, Berry Kremer,Pamela E Knapp, Kurt F Hauser, Cisca Wijmenga, Fred Nyberg, Richard J Sinke,and Dineke S Verbeek. Prodynorphin mutations cause the neurodegenerativedisorder spinocerebellar ataxia type 23. Am. J. Hum. Genet., 87(5):593?603,November 2010. ISSN 1537-6605. doi:10.1016/j.ajhg.2010.10.001. PMID:21035104. ? pages 51[82] J F Oram. Tangier disease and ABCA1. Biochim. Biophys. Acta, 1529(1-3):321?330, December 2000. ISSN 0006-3002. PMID: 11111099. ? pages 51[83] Linda Erkman, Paul A. Yates, Todd McLaughlin, Robert J. McEvilly, ThomasWhisenhunt, Shawn M. O?Connell, Anna I. Krones, Michael A. Kirby, David H.Rapaport, John R. Bermingham Jr., Dennis D.M. O?Leary, and Michael G.Rosenfeld. A POU domain transcription FactorDependent program regulates axonpathfinding in the vertebrate visual system. Neuron, 28(3):779?792, December 2000.ISSN 0896-6273. doi:10.1016/S0896-6273(00)00153-7. URLhttp://www.sciencedirect.com/science/article/pii/S0896627300001537. ? pages 51[84] Nicolas Ramoz, Jennifer G. Reichert, Christopher J. Smith, Jeremy M. Silverman,Irina N. Bespalova, Kenneth L. Davis, and Joseph D. Buxbaum. Linkage andassociation of the mitochondrial Aspartate/Glutamate carrier SLC25A12 gene withautism. Am J Psychiatry, 161(4):662?669, April 2004. ISSN 0002-953X.69doi:10.1176/appi.ajp.161.4.662. URL http://dx.doi.org/10.1176/appi.ajp.161.4.662.? pages 54[85] Maria P Abbracchio, Geoffrey Burnstock, Alexei Verkhratsky, and HerbertZimmermann. 
Purinergic signalling in the nervous system: an overview. TrendsNeurosci., 32(1):19?29, January 2009. ISSN 0166-2236.doi:10.1016/j.tins.2008.10.001. PMID: 19008000. ? pages 54[86] Ana C Andreazza, Li Shao, Jun-Feng Wang, and L Trevor Young. Mitochondrialcomplex i activity and oxidative damage to mitochondrial proteins in the prefrontalcortex of patients with bipolar disorder. Arch. Gen. Psychiatry, 67(4):360?368, April2010. ISSN 1538-3636. doi:10.1001/archgenpsychiatry.2010.22. PMID: 20368511.? pages 54[87] Xiujun Sun, Jun-Feng Wang, Michael Tseng, and L Trevor Young. Downregulationin components of the mitochondrial electron transport chain in the postmortemfrontal cortex of subjects with bipolar disorder. J Psychiatry Neurosci, 31(3):189?196, May 2006. ISSN 1180-4882. PMID: 16699605. ? pages 54[88] D. A. Rossignol and R. E. Frye. Mitochondrial dysfunction in autism spectrumdisorders: a systematic review and meta-analysis. Mol Psychiatry, 17(3):290?314,2012. ISSN 1359-4184. doi:10.1038/mp.2010.136. URLhttp://www.nature.com/mp/journal/v17/n3/full/mp2010136a.html. ? pages 54[89] Fahimeh Piryaei, Massoud Houshmand, Omid Aryani, Sepideh Dadgar, andZahra-Soheila Soheili. Investigation of the mitochondrial ATPase 6/8 andtRNA(Lys) genes mutations in autism. Cell J, 14(2):98?101, 2012. ISSN2228-5806. PMID: 23508290. ? pages 54[90] Vanesa lvarez Iglesias, Ana Mosquera-Miguel, Ivn Cusc, ngel Carracedo,Luis Alberto Prez-Jurado, and Antonio Salas. Reassessing the role of mitochondrialDNA mutations in autism spectrum disorder. BMC Med. Genet., 12:50, 2011. ISSN1471-2350. doi:10.1186/1471-2350-12-50. PMID: 21470425. ? pages 54[91] Sukhbir Dhillon, Jessica A Hellings, and Merlin G Butler. Genetics andmitochondrial abnormalities in autism spectrum disorders: A review. CurrGenomics, 12(5):322?332, August 2011. ISSN 1389-2029.doi:10.2174/138920211796429745. URLhttp://www.ncbi.nlm.nih.gov/pmc/articles/PMC3145262/. PMID: 22294875 PMCID:PMC3145262. ? 
pages 54[92] Ayyappan Anitha, Kazuhiko Nakamura, Ismail Thanseem, Kazuo Yamada, YoshimiIwayama, Tomoko Toyota, Hideo Matsuzaki, Taishi Miyachi, Satoru Yamada,Masatsugu Tsujii, Kenji J Tsuchiya, Kaori Matsumoto, Yasuhide Iwata, KatsuakiSuzuki, Hironobu Ichikawa, Toshiro Sugiyama, Takeo Yoshikawa, and Norio Mori.Brain region-specific altered expression and association of mitochondria-relatedgenes in autism. Mol Autism, 3(1):12, 2012. ISSN 2040-2392.doi:10.1186/2040-2392-3-12. PMID: 23116158. ? pages 5470[93] C. N. Henrichsen, E. Chaignat, and A. Reymond. Copy number variants, diseasesand gene expression. Hum Mol Gen, 18(R1):R1?R8, April 2009. ISSN 0964-6906,1460-2083. doi:10.1093/hmg/ddp011. URLhttp://www.hmg.oxfordjournals.org/cgi/doi/10.1093/hmg/ddp011. ? pages 55[94] Jason J. Wolff. Differences in white matter fiber tract development present from 6 to24 months in infants with autism. Am J Psychiatry, February 2012. ISSN0002-953X. doi:10.1176/appi.ajp.2011.11091447. URL http://neuro.psychiatryonline.org/article.aspx?articleid=668180&RelatedWidgetArticles=true. ?pages 55[95] G Konopka, E Wexler, E Rosen, Z Mukamel, G E Osborn, L Chen, D Lu, F Gao,K Gao, J K Lowe, and D H Geschwind. Modeling the functional genomics of autismusing human neurons. Mol Psychiatry, 17(2):202?214, February 2012. ISSN1359-4184. URL http://dx.doi.org/10.1038/mp.2011.60. ? pages 55[96] C. Ecker, W. Spooren, and D. G. M. Murphy. Translational approaches to thebiology of autism: false dawn or a new era? Mol Psychiatry, 18(4):435?442, April2013. ISSN 1359-4184. doi:10.1038/mp.2012.102. URL http://www.nature.com/mp/journal/v18/n4/full/mp2012102a.html?WT.ec id=MP-201304.? pages 55[97] Jesse Gillis and Paul Pavlidis. The impact of multifunctional genes on ?Guilt byassociation? analysis. PLoS ONE, 6(2):e17258, February 2011.doi:10.1371/journal.pone.0017258. URLhttp://dx.doi.org/10.1371/journal.pone.0017258. ? pages 5571Appendix AAppendixTable A.1: Up-regulated brain meta-signature. 
FDR computed before removal of sex-biased genes. A: Known Candidate; B: Gender Biased; C: Known CNV. Y: Yes; N: No.

Gene Symbol | Entrez ID | Number of Studies | p-value | FDR | A | B | C | Locus
PTPN1 | 5770 | 3 | 1.47E-06 | 1.37E-02 | N | N | N | 20q13.1-q13.2
HSPA1A | 3303 | 3 | 2.48E-06 | 1.37E-02 | N | N | N | 6p21.3
ZC3HAV1 | 56829 | 3 | 2.52E-06 | 1.37E-02 | N | N | N | 7q34
C5AR1 | 728 | 3 | 4.19E-06 | 1.37E-02 | N | N | N | 19q13.3-q13.4
STC1 | 6781 | 3 | 4.55E-06 | 1.37E-02 | N | N | N | 8p21-p11.2
KIF20A | 10112 | 3 | 5.55E-06 | 1.37E-02 | N | N | N | 5q31
PDYN | 5173 | 3 | 6.05E-06 | 1.37E-02 | N | N | N | 20p13
GANAB | 23193 | 3 | 6.60E-06 | 1.37E-02 | N | N | N | 11q12.3
TMED9 | 54732 | 3 | 1.16E-05 | 2.14E-02 | N | N | N | 5q35.3
CALR | 811 | 3 | 1.87E-05 | 2.98E-02 | N | N | N | 19p13.3-p13.2
FAM159A | 348378 | 3 | 2.00E-05 | 2.98E-02 | N | N | N | 1p32.3
SCIN | 85477 | 3 | 2.27E-05 | 2.98E-02 | N | N | Y | 7p21.3
HSPA5 | 3309 | 3 | 2.52E-05 | 2.98E-02 | N | N | N | 9q33.3
C1orf106 | 55765 | 3 | 2.60E-05 | 2.98E-02 | N | N | N | 1q32.1
BRPF1 | 7862 | 3 | 2.93E-05 | 2.98E-02 | N | N | N | 3p26-p25
IGFBP5 | 3488 | 3 | 3.11E-05 | 2.98E-02 | N | N | N | 2q33-q36
C2CD4A | 145741 | 3 | 3.31E-05 | 2.98E-02 | N | N | N | 15q22.2
LILRB3 | 11025 | 3 | 3.34E-05 | 2.98E-02 | N | N | N | 19q13.4
CALU | 813 | 3 | 3.56E-05 | 2.98E-02 | N | N | N | 7q32.1
TAGLN2 | 8407 | 3 | 3.59E-05 | 2.98E-02 | N | N | N | 1q21-q25
CD93 | 22918 | 3 | 4.37E-05 | 3.45E-02 | N | N | N | 20p11.21
DNAJB1 | 3337 | 3 | 4.91E-05 | 3.50E-02 | N | N | N | 19p13.2
CLDN23 | 137075 | 3 | 5.00E-05 | 3.50E-02 | N | N | N | 8p23.1
LMAN2L | 81562 | 3 | 5.07E-05 | 3.50E-02 | N | N | N | 2q11.2
ADAMTS9 | 56999 | 3 | 5.73E-05 | 3.80E-02 | N | N | N | 3p14.1
CRISPLD2 | 83716 | 3 | 6.19E-05 | 3.95E-02 | N | N | N | 16q24.1
PCOLCE2 | 26577 | 3 | 7.02E-05 | 4.18E-02 | N | N | N | 3q21-q24
ABCA1 | 19 | 3 | 7.19E-05 | 4.18E-02 | N | N | N | 9q31.1
JUN | 3725 | 3 | 7.36E-05 | 4.18E-02 | N | N | N | 1p32-p31
ITPRIP | 85450 | 3 | 7.65E-05 | 4.18E-02 | N | Y | N | 10q25.1
ALPK1 | 80216 | 3 | 7.81E-05 | 4.18E-02 | N | N | N | 4q25

Table A.2: Down-regulated brain meta-signature. FDR computed before removal of sex-biased genes. A: Known Candidate; B: Gender Biased; C: Known CNV.
Y: Yes; N: No.

Gene Symbol | Entrez ID | Number of Studies | p-value | FDR | A | B | C | Locus
ABCG2 | 9429 | 3 | 3.18E-07 | 5.27E-03 | N | N | Y | 4q22
SHD | 56961 | 3 | 1.09E-06 | 9.04E-03 | N | N | N | 19p13.3
FIS1 | 51024 | 3 | 4.39E-06 | 1.66E-02 | N | N | N | 7q22.1
RCAN2 | 10231 | 3 | 5.50E-06 | 1.66E-02 | N | N | N | 6p12.3
MRPL2 | 51069 | 3 | 6.51E-06 | 1.66E-02 | N | N | N | 6p21.3
COA1 | 55744 | 3 | 6.72E-06 | 1.66E-02 | N | N | N | 7p13
C12orf57 | 113246 | 3 | 8.19E-06 | 1.66E-02 | N | N | N | 12p13.31
SLC22A18AS | 5003 | 3 | 8.66E-06 | 1.66E-02 | N | N | N | 11p15.5
TTC1 | 7265 | 3 | 9.01E-06 | 1.66E-02 | N | N | N | 5q33.3
KLHDC2 | 23588 | 3 | 1.72E-05 | 2.64E-02 | N | N | N | 14q21.3
ATP5O | 539 | 3 | 1.83E-05 | 2.64E-02 | N | N | N | 21q22.11
ZFAND2B | 130617 | 3 | 1.91E-05 | 2.64E-02 | N | N | N | 2q35
CCDC25 | 55246 | 3 | 2.59E-05 | 3.31E-02 | N | N | N | 8p21.1
THOC5 | 8563 | 3 | 2.90E-05 | 3.43E-02 | N | N | N | 22q12.2
ANKRD29 | 147463 | 3 | 3.11E-05 | 3.44E-02 | N | N | N | 18q11.2
ZNF25 | 219749 | 3 | 3.64E-05 | 3.72E-02 | N | N | N | 10p11.1
HAPLN4 | 404037 | 3 | 3.81E-05 | 3.72E-02 | N | N | N | 19p13.1
ANO1 | 55107 | 3 | 4.19E-05 | 3.86E-02 | N | N | N | 11q13.3
FBXL15 | 79176 | 3 | 4.70E-05 | 3.98E-02 | N | N | N | 10q24.32
ABTB1 | 80325 | 3 | 4.79E-05 | 3.98E-02 | N | N | N | 3q21
UQCRQ | 27089 | 3 | 5.45E-05 | 4.20E-02 | N | N | N | 5q31.1
MRPL54 | 116541 | 3 | 5.57E-05 | 4.20E-02 | N | N | N | 19p13.3
ASGR2 | 433 | 3 | 6.85E-05 | 4.89E-02 | N | N | N | 17p
PRKAB1 | 5564 | 3 | 7.74E-05 | 4.89E-02 | N | N | N | 12q24.1-q24.3
VILL | 50853 | 3 | 7.97E-05 | 4.89E-02 | N | N | N | 3p21.3
PANX2 | 56666 | 3 | 8.07E-05 | 4.89E-02 | N | N | Y | 22q13.33
SNRNP25 | 79622 | 3 | 8.69E-05 | 4.89E-02 | N | N | Y | 16p13.3
CCBL2 | 56267 | 3 | 8.75E-05 | 4.89E-02 | N | N | N | 1p22.2
COQ3 | 51805 | 3 | 8.80E-05 | 4.89E-02 | N | N | N | 6q16.2
PIH1D1 | 55011 | 3 | 9.13E-05 | 4.89E-02 | N | N | N | 19q13.33
CHCHD6 | 84303 | 3 | 9.32E-05 | 4.89E-02 | N | N | N | 3q21.3
NAT6 | 24142 | 3 | 9.79E-05 | 4.89E-02 | N | N | N | 3p21.3
EIF3K | 27335 | 3 | 9.82E-05 | 4.89E-02 | N | N | N | 19q13.2
FAM58A | 92002 | 3 | 1.00E-04 | 4.89E-02 | N | N | N | Xq28
GRK6 | 2870 | 3 | 1.10E-04 | 4.97E-02 | N | N | Y | 5q35
GAS2 | 2620 | 3 | 1.15E-04 | 4.97E-02 | Y | N | N | 11p14.3
HINT2 | 84681 | 3 | 1.16E-04 | 4.97E-02 | N | N | N | 9p13.3
NEFH | 4744 | 3 | 1.18E-04 | 4.97E-02 | N | N | N | 22q12.2
PVALB | 5816 | 3 | 1.25E-04 | 4.97E-02 | N | N | N | 22q13.1
ZBTB8OS | 339487 | 3 | 1.29E-04 | 4.97E-02 | N | N | N | 1p35.1
TOX2 | 84969 | 3 | 1.33E-04 | 4.97E-02 | N | N | N | 20q13.12
NFU1 | 27247 | 3 | 1.33E-04 | 4.97E-02 | N | N | N | 2p15-p13
ACTR6 | 64431 | 3 | 1.34E-04 | 4.97E-02 | N | N | N | 12q23.1
NDUFAF2 | 91942 | 3 | 1.38E-04 | 4.97E-02 | N | N | N | 5q12.1
EDN3 | 1908 | 3 | 1.41E-04 | 4.97E-02 | N | N | N | 20q13.2-q13.3
ACTR1B | 10120 | 3 | 1.43E-04 | 4.97E-02 | N | N | N | 2q11.1-q11.2
GLI4 | 2738 | 3 | 1.47E-04 | 4.97E-02 | N | N | N | 8q24.3
FDX1L | 112812 | 3 | 1.49E-04 | 4.97E-02 | N | N | N | 19p13.2
ALOX12B | 242 | 3 | 1.49E-04 | 4.97E-02 | N | N | N | 17p13.1

Table A.3: Up-regulated blood meta-signature. FDR computed before removal of sex-biased genes. A: Known Candidate; B: Gender Biased; C: Known CNV. Y: Yes; N: No.

Gene Symbol | Entrez ID | Number of Studies | p-value | FDR | A | B | C | Locus
KDM5D | 8284 | 9 | 2.40E-16 | 4.56E-12 | N | Y | N | Yq11
ZFY | 7544 | 6 | 1.23E-14 | 1.17E-10 | N | Y | N | Yp11.3
RPS4Y1 | 6192 | 6 | 3.03E-13 | 1.92E-09 | N | Y | N | Yp11.3
USP9Y | 8287 | 9 | 2.81E-11 | 1.33E-07 | Y | Y | N | Yq11.2
EIF1AY | 9086 | 9 | 5.27E-11 | 2.00E-07 | N | Y | N | Yq11.223
PRKY | 5616 | 8 | 2.76E-10 | 8.23E-07 | N | Y | N | Yp11.2
UTY | 7404 | 9 | 3.03E-10 | 8.23E-07 | N | Y | N | Yq11
DDX3Y | 8653 | 9 | 2.13E-09 | 5.05E-06 | N | Y | N | Yq11
TXLNG2P | 246126 | 9 | 3.85E-08 | 7.32E-05 | N | Y | N | Yq11.222
SCARNA17 | 677769 | 5 | 3.55E-08 | 7.32E-05 | N | N | N | 18q21.1
ZNF322 | 79692 | 9 | 8.12E-08 | 1.40E-04 | N | N | N | 6p22.1
ZNF594 | 84622 | 6 | 1.94E-07 | 3.07E-04 | N | N | N | 17p13
CXCR7 | 57007 | 9 | 2.10E-07 | 3.07E-04 | N | N | N | 2q37.3
CAMSAP2 | 23271 | 9 | 3.84E-07 | 5.21E-04 | Y | N | N | 1q32.1
GIMAP8 | 155038 | 6 | 5.05E-07 | 6.40E-04 | N | N | N | 7q36.1
TMSB4Y | 9087 | 8 | 5.92E-07 | 6.63E-04 | N | Y | N | Yq11.221
ZNF721 | 170960 | 9 | 5.94E-07 | 6.63E-04 | N | N | Y | 4p16.3
NOTCH2 | 4853 | 9 | 8.69E-07 | 9.17E-04 | N | Y | N | 1p13-p11
ABLIM1 | 3983 | 9 | 9.89E-07 | 9.39E-04 | N | N | N | 10q25
PRKCH | 5583 | 8 | 9.63E-07 | 9.39E-04 | N | N | N | 14q23.1
UBE3A | 7337 | 9 | 1.76E-06 | 1.59E-03 | Y | N | Y | 15q11.2
MALAT1 | 378938 | 8 | 1.97E-06 | 1.70E-03 | N | N | N | 11q13.1
ZNF763 | 284390 | 4 | 2.71E-06 | 2.24E-03 | N | N | N | 19p13.2
PLAG1 | 5324 | 7 | 2.99E-06 | 2.37E-03 | N | N | N | 8q12
MAN2A2 | 4122 | 9 | 4.66E-06 | 3.54E-03 | N | N | N | 15q26.1
BAHD1 | 22893 | 6 | 6.38E-06 | 4.39E-03 | N | N | N | 15q15.1
HCK | 3055 | 8 | 6.40E-06 | 4.39E-03 | N | N | Y | 20q11-q12
APBB1 | 322 | 9 | 6.47E-06 | 4.39E-03 | N | N | N | 11p15
FAM82A1 | 151393 | 6 | 7.22E-06 | 4.57E-03 | N | N | N | 2p22.2
LOC728392 | 728392 | 7 | 7.06E-06 | 4.57E-03 | N | N | N | 17p13.2
ENO3 | 2027 | 6 | 7.67E-06 | 4.70E-03 | N | N | N | 17pter-p11
P2RX7 | 5027 | 6 | 8.50E-06 | 4.83E-03 | N | N | Y | 12q24
GFOD1 | 54438 | 9 | 8.62E-06 | 4.83E-03 | N | N | N | 6pter-p22.1
SEMA4C | 54910 | 6 | 8.65E-06 | 4.83E-03 | N | N | N | 2q11.2
LOC100287482 | 100287482 | 5 | 9.03E-06 | 4.87E-03 | N | N | N | 7q32.1
ZNF611 | 81856 | 5 | 9.22E-06 | 4.87E-03 | N | N | Y | 19q13.41
ZNF445 | 353274 | 7 | 1.05E-05 | 5.24E-03 | N | N | N | 3p21.32
LCP2 | 3937 | 6 | 1.03E-05 | 5.24E-03 | N | N | N | 5q35.1
ADCY1 | 107 | 9 | 1.78E-05 | 8.67E-03 | N | N | N | 7p13-p12
TRUB1 | 142940 | 9 | 1.98E-05 | 9.17E-03 | N | N | N | 10q25.3
ADCY10P1 | 221442 | 6 | 2.03E-05 | 9.17E-03 | N | N | N | 6p21.1
ZFP62 | 643836 | 7 | 2.02E-05 | 9.17E-03 | N | N | N | 5q35.3
CCDC88C | 440193 | 5 | 2.12E-05 | 9.34E-03 | N | N | N | 14q32.11
KIF26B | 55083 | 6 | 2.16E-05 | 9.34E-03 | N | N | N | 1q44
IFFO2 | 126917 | 8 | 2.60E-05 | 9.48E-03 | N | N | N | 1p36.13
MTERFD2 | 130916 | 9 | 2.49E-05 | 9.48E-03 | N | N | N | 2q37.3
ESF1 | 51575 | 9 | 2.58E-05 | 9.48E-03 | N | N | N | 20p12.1
TXK | 7294 | 7 | 2.52E-05 | 9.48E-03 | N | N | N | 4p12
AHNAK | 79026 | 9 | 2.56E-05 | 9.48E-03 | N | N | N | 11q12.2
ATRN | 8455 | 9 | 2.26E-05 | 9.48E-03 | N | N | N | 20p13
ORMDL1 | 94101 | 9 | 2.57E-05 | 9.48E-03 | N | N | N | 2q32
CYBRD1 | 79901 | 9 | 2.59E-05 | 9.48E-03 | N | N | N | 2q31.1
ZNF292 | 23036 | 8 | 2.65E-05 | 9.49E-03 | N | N | N | 6q14.3
LOC284757 | 284757 | 4 | 3.11E-05 | 1.07E-02 | N | N | N | 20q13.33
IL21R | 50615 | 9 | 3.11E-05 | 1.07E-02 | N | N | N | 16p11
GVINP1 | 387751 | 5 | 3.22E-05 | 1.09E-02 | N | N | N | 11p15.4
CYFIP1 | 23191 | 9 | 4.69E-05 | 1.53E-02 | Y | N | Y | 15q11
SMARCA2 | 6595 | 9 | 4.65E-05 | 1.53E-02 | N | N | Y | 9p22.3
SH2D1B | 117157 | 6 | 4.98E-05 | 1.59E-02 | N | N | Y | 1q23.3
CIB4 | 130106 | 6 | 5.08E-05 | 1.59E-02 | N | N | N | 2p23.3
BIN2 | 51411 | 6 | 5.11E-05 | 1.59E-02 | N | N | N | 12q13
SEPT8 | 23176 | 6 | 5.32E-05 | 1.61E-02 | N | N | N | 5q31
RPS24 | 6229 | 8 | 5.35E-05 | 1.61E-02 | N | N | N | 10q22
C1orf63 | 57035 | 8 | 5.92E-05 | 1.76E-02 | N | N | N | 1p36.13-p35.1
HIF1AN | 55662 | 9 | 6.92E-05 | 2.02E-02 | N | N | N | 10q24
AGPAT5 | 55326 | 9 | 7.22E-05 | 2.08E-02 | N | N | N | 8p23.1
ZNF37A | 7587 | 7 | 7.94E-05 | 2.25E-02 | N | N | N | 10p11.2
ITPKB | 3707 | 8 | 9.00E-05 | 2.48E-02 | N | N | N | 1q42.13
ENGASE | 64772 | 9 | 8.89E-05 | 2.48E-02 | N | N | N | 17q25.3
ZNF197 | 10168 | 9 | 9.69E-05 | 2.63E-02 | N | N | N | 3p21
JARID2 | 3720 | 9 | 9.91E-05 | 2.65E-02 | Y | N | N | 6p24-p23
CARD11 | 84433 | 9 | 1.02E-04 | 2.68E-02 | N | N | N | 7p22
SPON2 | 10417 | 6 | 1.06E-04 | 2.71E-02 | N | N | Y | 4p16.3
CSTF2T | 23283 | 7 | 1.07E-04 | 2.71E-02 | N | N | Y | 10q11
ARMCX3 | 51566 | 9 | 1.06E-04 | 2.71E-02 | N | N | N | Xq22.1
CX3CR1 | 1524 | 7 | 1.17E-04 | 2.76E-02 | N | N | N | 3p21.3
FOXK1 | 221937 | 9 | 1.19E-04 | 2.76E-02 | N | N | N | 7p22.1
HIPK2 | 28996 | 9 | 1.19E-04 | 2.76E-02 | N | N | N | 7q32-q34
HOXB2 | 3212 | 9 | 1.20E-04 | 2.76E-02 | N | N | N | 17q21.32
INSR | 3643 | 9 | 1.20E-04 | 2.76E-02 | N | N | N | 19p13.3-p13.2
CD244 | 51744 | 6 | 1.18E-04 | 2.76E-02 | N | N | N | 1q23.3
ZFP14 | 57677 | 7 | 1.11E-04 | 2.76E-02 | N | N | N | 19q13.12
FUT8-AS1 | 645431 | 4 | 1.19E-04 | 2.76E-02 | N | N | Y | 14q23.3
IFFO1 | 25900 | 6 | 1.26E-04 | 2.84E-02 | N | N | N | 12p13.3
ZNF514 | 84874 | 6 | 1.28E-04 | 2.85E-02 | N | N | N | 2q11.1
RAG1 | 5896 | 8 | 1.31E-04 | 2.88E-02 | N | N | N | 11p13
SON | 6651 | 9 | 1.39E-04 | 3.04E-02 | N | N | N | 21q22.11
RCN1 | 5954 | 9 | 1.43E-04 | 3.05E-02 | N | N | N | 11p13
RFC2 | 5982 | 6 | 1.42E-04 | 3.05E-02 | N | N | Y | 7q11.23
CARS2 | 79587 | 6 | 1.45E-04 | 3.06E-02 | N | N | N | 13q34
KIAA1107 | 23285 | 6 | 1.51E-04 | 3.16E-02 | N | N | N | 1p22.1
IRF2BPL | 64207 | 9 | 1.59E-04 | 3.26E-02 | N | N | Y | 14q24.3
DENND2D | 79961 | 6 | 1.60E-04 | 3.26E-02 | N | N | N | 1p13.3
FAM161A | 84140 | 9 | 1.62E-04 | 3.28E-02 | N | N | N | 2p15
TET3 | 200424 | 8 | 1.64E-04 | 3.29E-02 | N | N | N | 2p13.1
LMO7 | 4008 | 9 | 1.67E-04 | 3.30E-02 | N | N | N | 13q22.2
KLHL29 | 114818 | 8 | 1.78E-04 | 3.48E-02 | N | N | N | 2p24.1
CYTH3 | 9265 | 9 | 1.84E-04 | 3.58E-02 | N | N | N | 7p22.1
MKRN2 | 23609 | 9 | 1.92E-04 | 3.68E-02 | N | N | N | 3p25
BICD2 | 23299 | 9 | 1.95E-04 | 3.70E-02 | N | N | N | 9q22.31
ZNF337 | 26152 | 9 | 2.00E-04 | 3.70E-02 | N | N | N | 20p11.1
HK2 | 3099 | 9 | 1.99E-04 | 3.70E-02 | N | N | N | 2p13
ACYP1 | 97 | 9 | 2.01E-04 | 3.70E-02 | N | N | N | 14q24.3
NLGN4Y | 22829 | 6 | 2.07E-04 | 3.77E-02 | Y | Y | N | Yq11.221
LUC7L3 | 51747 | 9 | 2.09E-04 | 3.77E-02 | N | N | N | 17q21.33
GPR133 | 283383 | 6 | 2.20E-04 | 3.94E-02 | N | N | Y | 12q24.33
C20orf112 | 140688 | 9 | 2.25E-04 | 3.99E-02 | N | N | Y | 20q11.21
ACRC | 93953 | 8 | 2.27E-04 | 3.99E-02 | N | N | N | Xq13.1
PAFAH1B1 | 5048 | 9 | 2.31E-04 | 4.02E-02 | Y | N | N | 17p13.3
PPP1R12A | 4659 | 9 | 2.34E-04 | 4.03E-02 | N | N | N | 12q15-q21
ERCC5 | 2073 | 7 | 2.37E-04 | 4.05E-02 | N | N | N | 13q33
BOK | 666 | 8 | 2.43E-04 | 4.08E-02 | N | N | N | 2q37.3
ZC3H14 | 79882 | 9 | 2.43E-04 | 4.08E-02 | N | N | N | 14q31.3
SNRNP200 | 23020 | 9 | 2.57E-04 | 4.23E-02 | N | N | N | 2q11.2
ZC3H7B | 23264 | 9 | 2.55E-04 | 4.23E-02 | N | N | N | 22q13.2
SPOCK2 | 9806 | 7 | 2.59E-04 | 4.23E-02 | N | N | N | 10pter-q25.3
HNRNPA2B1 | 3181 | 9 | 2.71E-04 | 4.40E-02 | N | N | N | 7p15
SCCPDH | 51097 | 9 | 2.78E-04 | 4.48E-02 | N | N | Y | 1q44
CTDSP2 | 10106 | 9 | 2.89E-04 | 4.53E-02 | N | N | Y | 12q14.1
RNF168 | 165918 | 7 | 2.93E-04 | 4.53E-02 | N | N | N | 3q29
FAN1 | 22909 | 7 | 2.90E-04 | 4.53E-02 | Y | N | Y | 15q13.2-q13.3
MIR17HG | 407975 | 7 | 2.89E-04 | 4.53E-02 | N | N | N | 13q31.3
KLRF1 | 51348 | 6 | 2.94E-04 | 4.53E-02 | N | N | N | 12p13.31

Table A.4: Down-regulated blood meta-signature. FDR computed before removal of sex-biased genes. A: Known Candidate; B: Gender Biased; C: Known CNV.
Y: Yes; N: No.

Gene Symbol | Entrez ID | Number of Studies | p-value | FDR | A | B | C | Locus
HDHD1 | 8226 | 9 | 1.45E-10 | 1.51E-06 | N | Y | N | Xp22.32
KDM6A | 7403 | 9 | 1.59E-10 | 1.51E-06 | N | Y | N | Xp11.2
BRAF | 673 | 9 | 5.50E-07 | 3.02E-03 | Y | N | N | 7q34
SERPINB9 | 5272 | 8 | 7.02E-07 | 3.02E-03 | N | N | N | 6p25
HIST1H3F | 8968 | 4 | 7.95E-07 | 3.02E-03 | N | N | N | 6p22.2
PNOC | 5368 | 7 | 1.12E-06 | 3.55E-03 | N | N | N | 8p21
STRA13 | 201254 | 7 | 2.47E-06 | 6.24E-03 | N | N | N | 17q25.3
SLC17A9 | 63910 | 6 | 2.83E-06 | 6.24E-03 | N | N | Y | 20q13.33
AURKB | 9212 | 9 | 2.96E-06 | 6.24E-03 | N | N | N | 17p13.1
MPC2 | 25874 | 6 | 4.00E-06 | 7.03E-03 | N | N | N | 1q24
TRANK1 | 9881 | 8 | 4.07E-06 | 7.03E-03 | N | N | N | 3p22.2
MRPL10 | 124995 | 6 | 5.07E-06 | 8.02E-03 | N | N | N | 17q21.32
LOC338758 | 338758 | 6 | 6.74E-06 | 9.84E-03 | N | N | N | 12q21.33
EMC4 | 51234 | 6 | 9.10E-06 | 1.21E-02 | N | N | N | 15q14
KDM5C | 8242 | 9 | 9.59E-06 | 1.21E-02 | Y | Y | N | Xp11.22-p11.21
RNASE2 | 6036 | 5 | 1.07E-05 | 1.27E-02 | N | N | N | 14q24-q31
MED25 | 81857 | 6 | 1.33E-05 | 1.48E-02 | N | N | N | 19q13.3
SNX22 | 79856 | 9 | 1.75E-05 | 1.77E-02 | N | N | N | 15q22.31
BLVRB | 645 | 9 | 1.80E-05 | 1.77E-02 | N | N | N | 19q13.1-q13.2
CHRM3 | 1131 | 5 | 1.87E-05 | 1.77E-02 | N | N | N | 1q43
MCM6 | 4175 | 9 | 2.20E-05 | 1.99E-02 | N | N | N | 2q21
ZNF784 | 147808 | 8 | 2.35E-05 | 2.03E-02 | N | N | N | 19q13.42
ISG15 | 9636 | 9 | 2.47E-05 | 2.04E-02 | N | N | N | 1p36.33
CXCR3 | 2833 | 6 | 2.76E-05 | 2.11E-02 | Y | N | N | Xq13
CCDC50 | 152137 | 9 | 2.83E-05 | 2.11E-02 | N | N | Y | 3q28
C1orf170 | 84808 | 5 | 2.99E-05 | 2.11E-02 | N | N | N | 1p36.33
NINJ2 | 4815 | 8 | 2.99E-05 | 2.11E-02 | N | N | N | 12p13
TSPAN12 | 23554 | 8 | 3.23E-05 | 2.19E-02 | N | N | Y | 7q31.31
KLF1 | 10661 | 7 | 3.52E-05 | 2.27E-02 | N | N | N | 19p13.2
C12orf49 | 79794 | 7 | 3.69E-05 | 2.27E-02 | N | N | N | 12q24.22
TK1 | 7083 | 7 | 3.81E-05 | 2.27E-02 | N | N | N | 17q23.2-q25.3
GNPDA1 | 10007 | 8 | 3.83E-05 | 2.27E-02 | N | N | N | 5q21
TREML3P | 340206 | 4 | 4.30E-05 | 2.47E-02 | N | N | N | 6p21.1
KCND1 | 3750 | 8 | 4.82E-05 | 2.69E-02 | N | N | N | Xp11.23
FAM46C | 54855 | 9 | 5.12E-05 | 2.78E-02 | N | N | N | 1p12
DALRD3 | 55152 | 6 | 5.55E-05 | 2.93E-02 | N | N | N | 3p21.31
ZDHHC16 | 84287 | 9 | 5.92E-05 | 3.04E-02 | N | N | N | 10q24.1
PLK1 | 5347 | 6 | 6.43E-05 | 3.21E-02 | N | N | N | 16p12.2
POU2AF1 | 5450 | 9 | 6.75E-05 | 3.23E-02 | N | N | N | 11q23.1
MYBL2 | 4605 | 9 | 6.93E-05 | 3.23E-02 | N | N | N | 20q13.1
PSMA5 | 5686 | 9 | 6.97E-05 | 3.23E-02 | N | N | N | 1p13
RUVBL2 | 10856 | 7 | 7.40E-05 | 3.31E-02 | N | N | N | 19q13.3
SLC25A2 | 83884 | 9 | 7.49E-05 | 3.31E-02 | N | N | N | 5q31
WDR31 | 114987 | 9 | 7.70E-05 | 3.32E-02 | N | N | N | 9q32
MGC39372 | 221756 | 5 | 8.31E-05 | 3.51E-02 | N | N | N | 6p25.2
TESK2 | 10420 | 6 | 8.54E-05 | 3.53E-02 | N | N | N | 1p32
EGR1 | 1958 | 9 | 9.11E-05 | 3.68E-02 | N | N | N | 5q31.1
STAP1 | 26228 | 6 | 9.54E-05 | 3.70E-02 | N | N | N | 4q13.2
IFI27L1 | 122509 | 9 | 9.54E-05 | 3.70E-02 | N | N | N | 14q32.12
APOM | 55937 | 9 | 1.00E-04 | 3.73E-02 | N | N | N | 6p21.33
SLC17A8 | 246213 | 6 | 1.01E-04 | 3.73E-02 | N | N | N | 12q23.1
FTL | 2512 | 6 | 1.02E-04 | 3.73E-02 | N | N | N | 19q13.33
C3orf37 | 56941 | 6 | 1.13E-04 | 4.03E-02 | N | N | N | 3q21.3
OPTC | 26254 | 6 | 1.17E-04 | 4.10E-02 | N | N | N | 1q32.1
PRDX4 | 10549 | 5 | 1.26E-04 | 4.16E-02 | Y | N | N | Xp22.11
SLC16A6 | 9120 | 6 | 1.27E-04 | 4.16E-02 | N | N | N | 17q24.2
GAP43 | 2596 | 9 | 1.28E-04 | 4.16E-02 | Y | N | N | 3q13.1-q13.2
FAM98C | 147965 | 9 | 1.28E-04 | 4.16E-02 | N | N | N | 19q13.2
SNX7 | 51375 | 7 | 1.29E-04 | 4.16E-02 | N | N | N | 1p21.3
RBM4B | 83759 | 6 | 1.32E-04 | 4.17E-02 | N | N | N | 11q13
NXT1 | 29107 | 6 | 1.39E-04 | 4.32E-02 | N | N | N | 20p12-p11.2
MYL4 | 4635 | 9 | 1.42E-04 | 4.35E-02 | N | N | N | 17q21-qter
OR1J2 | 26740 | 5 | 1.49E-04 | 4.50E-02 | N | N | N | 9q34.11
LCT | 3938 | 6 | 1.56E-04 | 4.55E-02 | N | N | N | 2q21
TIMM17B | 10245 | 6 | 1.68E-04 | 4.55E-02 | N | N | N | Xp11.23
SF3B4 | 10262 | 9 | 1.69E-04 | 4.55E-02 | N | N | N | 1q21.2
RAPGEF2 | 9693 | 8 | 1.73E-04 | 4.55E-02 | N | N | N | 4q32.1
TXNDC12 | 51060 | 6 | 1.74E-04 | 4.55E-02 | N | N | N | 1p32.3
ARHGAP39 | 80728 | 9 | 1.74E-04 | 4.55E-02 | N | N | N | 8q24.3
RAB30 | 27314 | 6 | 1.74E-04 | 4.55E-02 | N | N | N | 11q12-q14
SH2B2 | 10603 | 7 | 1.74E-04 | 4.55E-02 | N | N | N | 7q22
CCER1 | 196477 | 5 | 1.75E-04 | 4.55E-02 | N | N | N | 12q21.33
DCTPP1 | 79077 | 9 | 1.76E-04 | 4.55E-02 | N | N | N | 16p11.2
PAFAH1B3 | 5050 | 9 | 1.78E-04 | 4.55E-02 | N | N | N | 19q13.1
DGUOK | 1716 | 7 | 1.87E-04 | 4.55E-02 | N | N | N | 2p13
MRPS30 | 10884 | 9 | 1.89E-04 | 4.55E-02 | N | N | N | 5q11
MRPS18A | 55168 | 7 | 1.89E-04 | 4.55E-02 | N | N | N | 6p21.3
SHISA4 | 149345 | 9 | 1.90E-04 | 4.55E-02 | N | N | N | 1q32.1
C18orf61 | 497259 | 4 | 1.90E-04 | 4.55E-02 | N | N | N | 18p11.22
HMBS | 3145 | 9 | 1.92E-04 | 4.55E-02 | N | N | N | 11q23.3
CLIC1 | 1192 | 9 | 1.95E-04 | 4.55E-02 | N | N | N | 6p21.3
DESI1 | 27351 | 9 | 1.97E-04 | 4.55E-02 | N | N | N | 22q13.2
ADCY6 | 112 | 6 | 2.01E-04 | 4.58E-02 | N | N | N | 12q12-q13
ASPM | 259266 | 6 | 2.03E-04 | 4.58E-02 | N | N | N | 1q31
C16orf59 | 80178 | 6 | 2.10E-04 | 4.67E-02 | N | N | N | 16p13.3
FAM59A | 64762 | 8 | 2.12E-04 | 4.67E-02 | N | N | N | 18q12.1
SEC13 | 6396 | 8 | 2.16E-04 | 4.72E-02 | N | N | N | 3p25-p24
LOC100130776 | 100130776 | 4 | 2.28E-04 | 4.85E-02 | N | N | N | 12q14.1
GABRA4 | 2557 | 6 | 2.28E-04 | 4.85E-02 | Y | N | N | 4p12
SPSB2 | 84727 | 6 | 2.30E-04 | 4.85E-02 | N | N | N | 12p13.31

Table A.5: Genes that have been shown to exhibit sexual dimorphism in blood and brain.
Asterisks denote known ASD candidates.

Blood: KDM5D, ZFY, RPS4Y1, USP9Y*, EIF1AY, PRKY, UTY, DDX3Y, TXLNG2P, TMSB4Y, NOTCH2, NLGN4Y*, BCORP1, AKAP17A, SLA, NCRNA00185, RPS4X, OFD1, CSF2RA, RAB9A, ADAM19, TTTY15, RXRA, P2RY8, SH3BGRL, TTTY7, KAL1, KPNB1, CD99, MYO5A, S100G, SHOX, MAL, RBMY2FP, AP1S2*, TNFRSF1B*, RAB27A, GYG2*, TTTY10, ASMT*, ARHGDIB, CDK16, RBBP7, GPM6B, ASMTL-AS1, TBL1Y, CTPS2, EIF2S3, MCL1, PCDH11Y, ZDHHC9, TTTY12, PNPLA4, LINC00685, ZBED1, INE1, ASMTL, TMEM27, SEPT6, DDX3X, TRAPPC2, PIR*, IFITM3, EIF1AX, VAMP7, DHRSX, CA5BP1, MAOA*, MED14, FUNDC1, SELL, TTTY5, RBMY3AP, LINC00102, SPRY3, BTG3, CA5B, CSPG4P1Y, TSC22D3, ALPL, AMELY, PPP2R3B, SLC25A6, HBZ, IFNGR2, IL3RA, IL9R, CD99P1, STS, ARSD, ARSE, NHLH2, OAZ1, PLXNB3, GEMIN8, PLCXD1, TXLNG, NLGN4X*, RPS9, DUSP21, SRY, UBA1, KDM6A, XG, XIST, ZFX, CA4, GTPBP6, HDHD1, USP9X, KDM5C*, TTTY11, TTTY13, TTTY14, SYAP1, ITM2B

Brain: ITPRIP, SERPINH1, S100A10, ANXA1, TIMP1, CKS2, MSN*, EHD2, SDC2*, CEBPD, RAB27A, BAMBI, EMP3, FBLN5, TM4SF1, PDK4, LAPTM5, COL1A1, KDM5C*, LY6E, DUSP21, MFAP2, CTSC, BGN, CEP135, HIST1H2BK, HIST1H4F, HIST1H4D, NPC2, VIM, SCN9A, TTTY19, IL9R, PLXNA4, PREX1, HIST1H1A, GYPC, TYROBP, PRRX1, TET2, LIFR, KIAA1009, LGALS3BP, H2AFX, KDM6A, P2RY8, USP9X, ALOX5AP*, GEMIN8, ARHGDIB, CD99, CSMD3*, TTTY12, NEUROG2, STOM, CD53, IQGAP2, DHRSX, MAOA*, SNCAIP, ZNF423, HIST1H4B, JAG1, CBR1, GSN*, TRAPPC2, EDNRA, NLGN4X*, PLTP, SPARC, C3, GTPBP6, OFD1, CFH, CSF2RA, DHRS3, UBA1, EMP2, COL3A1, TXLNG, SRY, ASCL1, ZNF804B, PIR*, FOLR2, RAB9A, CYBB, AKAP17A, GPM6B, HYDIN, TMEM27, CALD1, EZR, CSF1R, CTGF, ZFX, TTTY10, SPRY3, LUC7L, SLC9A3R1, HIST1H2BB, ASMT*, KDM5D, PRKY, KIAA1199, WIPF1, ZDHHC9, LYVE1, TXLNG2P, ZNF337, CTPS2, KAL1, NR3C2, ISLR, FZD8, PARVA, COLEC12, MEGF10, NLGN4Y*, COL1A2, LPAR6, C3orf62, ZBED1, ZNF793, GNG11, NKTR, ARGLU1, GLO1*, SHOX, GYG2*, ACTG2, MYL9, CA5B, CSPG4P1Y, COX7A2, CRABP1, FUNDC1, CX3CR1, RIBC1, RBMY2EP, RBMY2FP, DDX3X, EIF1AX, EIF2S3, AIF1, KHDRBS2*, TSPAN15, PPP2R3B, RCOR2, BCORP1, IL3RA, ITIH2, LGALS1, LUM, STS, ARSD, ARSE, MYO5A, OGN, PCSK2, CDK16, PLXNB3, PPP1R2, PLCXD1, KIF16B, OSGEP, OLFML3, CCDC146, ACTA2, RBBP7, RNASE1, RPS4Y1, SH3BGRL, CAPRIN2, VAMP7, TSPAN6, UTY, XG, ZFY, ZIC1, S100G, CYBRD1, KCNIP4, ITIH5*, EEPD1, HDHD1, PNPLA4, USP9Y*, PCDH11Y, HIST1H4A, HIST1H4C, TTTY5, TTTY11, TTTY13, TTTY14, LSMD1, ASMTL, DDX3Y, AP1S2*, TBL1Y, EIF1AY, TMSB4Y

