UBC Theses and Dissertations

Investigation of novel schizophrenia candidate genes through biochemical and computational methods Mead, Carri-Lyn Rebecca 2010

INVESTIGATION OF NOVEL SCHIZOPHRENIA CANDIDATE GENES THROUGH BIOCHEMICAL AND COMPUTATIONAL METHODS

by

CARRI-LYN REBECCA MEAD

Bachelor of Science, University of Waterloo, 1999 (Science and Business)

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE FACULTY OF GRADUATE STUDIES (Genetics)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

January 2010

© Carri-Lyn Rebecca Mead, 2010

Abstract

Schizophrenia is a complex, highly heritable psychiatric disorder affecting ~1% of the human population. Complex disease research must consider the wide variety of confounding factors that contribute to disease pathology. Underlying genetic contributions to disease are often heterogeneous among the disease population, and individual gene linkage and association signals may be weak and inconsistent within affected populations. The disease phenotype may actually result from multiple defects within one or more related functional pathways. Understanding the physical interactions that known susceptibility genes engage in provides insight into the functions and pathways contributing to disease, and also implicates the interacting genes and proteins as potential schizophrenia candidate genes. While many candidate schizophrenia genes have been proposed, findings for only a few genes, including neuregulin-1 and dysbindin, have been sufficiently replicated for them to be considered schizophrenia susceptibility genes.

The first aim of this thesis was to identify novel candidate schizophrenia genes through investigation of the interactions and pathways that the known susceptibility genes neuregulin-1 and dysbindin participate in. The second aim of this thesis was the generation of a novel method for whole genome linkage meta-analysis.
Numerous genome-wide linkage studies have been performed on a wide variety of schizophrenia cohorts; however, highly significant genome-wide linkage signals have not been prevalent and there has been little replication between studies. It is possible that individual studies contain weak linkage signals that are consistent across multiple studies but, because they lack significance in any one study, have not been identified. The Marker Footprint Linkage Meta-analysis method was developed to allow for refinement of candidate schizophrenia linkage regions from existing studies and identification of novel regions that show broadly consistent, but perhaps weak, linkage signals across multiple studies.

Through these analyses a common protein interaction network that encompasses three of the current best schizophrenia susceptibility genes (neuregulin-1, dysbindin, and disrupted-in-schizophrenia-1) was identified. These findings greatly expand current knowledge of interactions with these important schizophrenia susceptibility genes. A novel method for performing genome-wide linkage meta-analyses was developed that incorporates recombination to refine existing linkage regions and identify linkage regions that have not previously been reported.

Table of Contents

Abstract
Table of Contents
List of Tables
List of Figures
Abbreviations & Gene Definitions
Acknowledgements
Dedication
Co-Authorship Statement

1 Introduction
  1.1 Thesis Overview
  1.2 Complex Disease
  1.3 Schizophrenia
  1.4 Schizophrenia Susceptibility Genes
    1.4.1 Dysbindin (DTNBP1)
    1.4.2 Neuregulin-1 (NRG1)
    1.4.3 Disrupted-In-Schizophrenia-1 (DISC1)
  1.5 Traditional Disease Gene Finding Methods
  1.6 Recombination
  1.7 Non-Traditional Disease Gene Finding Tools
    1.7.1 Protein-protein interactions
      1.7.1.1 High-throughput protein-protein interaction detection methods
      1.7.1.2 Protein-protein interaction validation
    1.7.2 Protein-DNA interactions
    1.7.3 Gene function
    1.7.4 Visualization of biological information using networks
  1.8 Statistical Analyses
  1.9 Thesis Chapters Summary
  1.10 References

2 Cytosolic Protein Interactions of the Schizophrenia Susceptibility Gene Dysbindin
  2.1 Introduction
  2.2 Materials and Methods
    2.2.1 Cloning
    2.2.2 Antibodies
    2.2.3 Cell culture
    2.2.4 Immunoprecipitation, protein complex preparation, and mass spectrometry
    2.2.5 Identification of candidate interacting proteins
    2.2.6 Validation of protein interactions through immunoprecipitation-western analysis
    2.2.7 Linkage and association with schizophrenia analysis
  2.3 Results
    2.3.1 DTNBP1 protein interactions
    2.3.2 Exocyst and dynactin protein interactions
    2.3.3 Verification of DISC1 interactions with exocyst and dynactin complexes
    2.3.4 Gene ontology analysis of interacting proteins
    2.3.5 DTNBP1 and dynactin interacting proteins linked to schizophrenia
  2.4 Discussion
  2.5 Chapter 2 Tables
  2.6 Chapter 2 Figures
  2.7 References

3 Protein and DNA Interactions of the Intracellular Domain of Neuregulin-1
  3.1 Introduction
  3.2 Materials and Methods
    3.2.1 Cloning
    3.2.2 Cell culture
    3.2.3 Immunoprecipitation
    3.2.4 Protein separation and mass spectrometry
    3.2.5 Identification of candidate protein interacting proteins
    3.2.6 Antibodies
    3.2.7 Validation of protein interactions through immunoprecipitation-western analysis
    3.2.8 Immunofluorescence
    3.2.9 Chromatin immunoprecipitation-sequencing (ChIP-seq) analysis
    3.2.10 Linkage and association with schizophrenia analysis for the NRG1 ICD protein interacting proteins
    3.2.11 Schizophrenia association within ChIP-seq proximal genes
  3.3 Results
    3.3.1 NRG1 ICD interacting proteins
    3.3.2 NRG1 ICD protein-DNA interactions
    3.3.3 Investigation of known TFBSs within ChIP-seq peak sequences
    3.3.4 NRG1 ICD ChIP-seq regions and schizophrenia associated genes
  3.4 Discussion
  3.5 Chapter 3 Tables
  3.6 Chapter 3 Figures
  3.7 References

4 A Meta-Analysis of Schizophrenia Linkage Data that Utilizes Recombination Distance
  4.1 Introduction
  4.2 Materials and Methods
    4.2.1 Selection of genome scans
    4.2.2 Recombination rate footprint meta-analysis method
    4.2.3 Schizophrenia associated genes in significant regions
  4.3 Results
    4.3.1 Schizophrenia meta-analysis significant regions
    4.3.2 Schizophrenia associated genes in significant regions
  4.4 Discussion
  4.5 Chapter 4 Tables
  4.6 Chapter 4 Figures
  4.7 References

5 Conclusion
  5.1 Summary of Major Findings
  5.2 Providing a Protein Interaction Context for Schizophrenia
  5.3 In the Context of Recent GWAS Results
  5.4 Mechanisms
  5.5 Implications Beyond Schizophrenia
  5.6 Strengths, Weaknesses, and Contributions to the Field of Study
  5.7 Future Directions
  5.8 Conclusions
  5.9 Chapter 5 Figures
  5.10 References

6 Appendix 1 – Additional Tables
  6.1 ChIP-Seq Peak Regions and Associated Genes

List of Tables

Table 2.1 - Protein Interactions Identified Through IP-MS
Table 2.2 - DTNBP1 Immunoprecipitation-Mass Spectrometry Data
Table 2.3 - Dynactin Immunoprecipitation-Mass Spectrometry Data
Table 2.4 - Exocyst Immunoprecipitation-Mass Spectrometry Data
Table 2.5 - Results of GO Analysis for DTNBP1 Interacting Proteins
Table 2.6 - Results of GO Analysis for Dynactin Interacting Proteins
Table 2.7 - Results of GO Analysis for Exocyst Interacting Proteins
Table 2.8 - Schizophrenia Linkage and Association Studies for DTNBP1 Interacting Proteins
Table 2.9 - Schizophrenia Linkage and Association Studies for Dynactin Interacting Proteins
Table 2.10 - Schizophrenia Linkage and Association Studies for Exocyst Interacting Proteins
Table 2.11 - Chart of PCR Primers Used for Cloning

Table 3.1 - PCR Primers
Table 3.2 - Summary of NRG1 ICD IP-MS/MS Hit Counts by Experiment
Table 3.3 - NRG1 ICD IP-MS/MS Interacting Partner Details
Table 3.4 - Schizophrenia Linkage and Association Studies for the NRG1 ICD Interacting Proteins
Table 3.5 - Novel NRG1 ICD Interacting Proteins
Table 3.6 - The NRG1 ICD Interacting Proteins that have Known Interactions with ChIP-Seq Associated Transcription Factors
Table 3.7 - Schizophrenia Related NRG1 ICD-a ChIP-Seq Peak Proximal Genes
Table 3.8 - Known Transcription Factor Binding Sites in the ChIP-Seq Data

Table 4.1 - Schizophrenia Whole-Genome Linkage Studies Included in Meta-analysis
Table 4.2 - Nominally Significant Regions Resulting from the MFLM
Table 4.3 - Genes in Significant Regions
Table 4.4 - Summary of Regions of Interest from Various Schizophrenia Meta-Analyses
Table 4.5 - Overlap between Studies

Table 6.1 - ChIP-Seq Peak Regions and Associated Genes

List of Figures

Figure 2.1 - Overall Protein Interaction Network of the DTNBP1, Exocyst and Dynactin Complexes Identified through Immunoprecipitation-Mass Spectrometry
Figure 2.2 - Validation of Protein Interactions between DTNBP1 and the Exocyst, Dynactin, and AP3 Complexes
Figure 2.3 - Validation of the DISC1 Interaction with the Dynactin and Exocyst Complexes
Figure 2.4 - Ontological Classification of DTNBP1 and Exocyst Interacting Proteins
Figure 2.5 - Representative Gel of Immunoprecipitated DTNBP1 Protein Complexes
Figure 2.6 - Peptide Ion Chromatograms for Representative DTNBP1 Interacting Proteins
Figure 2.7 - Representative Gel of Immunoprecipitated EXOC and DCTN Protein Complexes
Figure 2.8 - Venn Diagram of Overlap between Protein Interaction Datasets
Figure 2.9 - Graph of Average Log(E) Scores for DTNBP1 Interacting Proteins
Figure 2.10 - Graph of Average Log(E) Scores for Dynactin Interacting Proteins
Figure 2.11 - Graph of Average Log(E) Scores for Exocyst Interacting Proteins

Figure 3.1 - Graph of the X!Tandem Average Log(E) Scores for NRG1 ICD Interacting Proteins
Figure 3.2 - NRG1 ICD Isoforms
Figure 3.3 - Validation of the AKAP8L Interaction with the NRG1 ICD
Figure 3.4 - Validation of the Interaction of the NRG1 ICD with SNX27
Figure 3.5 - Validation of the UTRN Interactions with the NRG1 ICD and DTNBP1
Figure 3.6 - The NRG1 ICD ChIP-Seq Peaks Upstream of the DLG4 Gene
Figure 3.7 - The NRG1 ICD ChIP-Seq Peaks Proximal to Schizophrenia Susceptibility Genes DTNBP1 and DISC1
Figure 3.8 - The NRG1 ICD Protein Interaction Network
Figure 3.9 - Representative Gel of Immunoprecipitated NRG1 Protein Complexes

Figure 4.1 - Relationship between Linkage Score and Significance Level
Figure 4.2 - An Example of Footprint and Genomic Distributions
Figure 4.3 - Schizophrenia Genomic Distributions and Final Meta-Analysis Score for All Studies on Chromosome 1
Figure 4.4 - Final Meta-Analysis P-Value Results Across the Genome

Figure 5.1 - The Overall NRG1-DTNBP1-DISC1 Protein Interaction Network
Figure 5.2 - Proteins of Interest Involved in the Vesicle Lifecycle from a Presynaptic Perspective

Abbreviations & Gene Definitions

Abbreviations

ADP  Adenosine diphosphate
AEBSF  4-(2-Aminoethyl) benzenesulfonyl fluoride hydrochloride
ATCC  American Type Culture Collection
B&H  Benjamini and Hochberg
BIND  Biomolecular interaction network database
BioGRID  The general repository for interaction datasets
BP  Biological process
CC  Cellular component
cDNA  Complementary deoxyribonucleic acid, synthesized from a mature mRNA template
ChIP  Chromatin immunoprecipitation
ChIP-seq  Chromatin immunoprecipitation followed by high throughput sequencing
cM  Centimorgan (equivalent to a 1% chance that a marker at one genetic locus on a chromosome will be separated from a marker at a second locus due to crossing over in a single generation)
DAPI  4’,6-diamidino-2-phenylindole
DAVID  Database for annotation, visualization, and integrated discovery
DIP  Database of interacting proteins
DLPFC  Dorsolateral prefrontal cortex
DNA  Deoxyribonucleic acid
DSM-IV  Diagnostic and statistical manual of mental disorders version 4
DsRed  Red fluorescent protein
ECD  Extracellular domain
EDTA  Ethylenediaminetetraacetic acid
ESI  Electrospray ionization
ESI-MS/MS  Electrospray ionization tandem mass spectrometry
FDR  False discovery rate
FLAG  A peptide tag with the octopeptide sequence N-DYKDDDDK-C
GABA  Gamma-aminobutyric acid
GAD  Genetic association database
GFP  Green fluorescent protein
GO  Gene ontology
GSMA  Genome scan meta-analysis
GST  Glutathione S-transferase
GWAS  Genome-wide association study
HEK293  Human embryonic kidney 293 cell line
HGMD  Human gene mutation database
HGNC  Human genome organization (HUGO) gene nomenclature committee
HPLC  High performance liquid chromatography
HPRD  Human protein reference database
HPS  Hermansky-Pudlak syndrome
HUGO  Human genome organization
ICD  Intracellular domain
ID  Identification
IMAGE  Integrated molecular analysis of gene expression, a public collection of genes
IntAct  Protein interaction database and analysis system
IP  Immunoprecipitation
IP-MS/MS  Immunoprecipitation followed by tandem mass spectrometry
KEGG  Kyoto encyclopedia of genes and genomes
LOD  Logarithm of odds
M2  Anti-FLAG M2 antibody
MAGS  Meta-analysis procedure for genome-wide linkage study
MALDI  Matrix assisted laser desorption / ionization
MATCH  A tool for searching transcription factor binding sites in DNA sequences
Mb  Megabase (1,000,000 bases)
MES  2-(N-morpholino)ethanesulfonic acid
MF  Molecular function
MFLM  Marker footprint linkage meta-analysis
MINT  Molecular interactions database
MIPS  Mammalian protein-protein interaction database
mRNA  Messenger ribonucleic acid
MS  Mass spectrometry
MSP  Multiple scan probability
NCBI  National Center for Biotechnology Information
NIH  National Institutes of Health
NMDA  N-methyl-D-aspartic acid
NP-40  Nonyl phenoxypolyethoxylethanol (a substitute for Nonidet P-40)
NSB  Non-specific binding
NPL  Non-parametric linkage
NSF  N-ethylmaleimide sensitive fusion protein
NuPAGE  A precast gel system for high performance polyacrylamide gel electrophoresis
PBS  Phosphate buffered saline
PCR  Polymerase chain reaction
PRIDE  Proteomics identifications database
qPCR  Quantitative polymerase chain reaction
RDist  Recombination distance
RI  Relative information
RNA  Ribonucleic acid
SDS  Sodium dodecyl sulfate
SDS-PAGE  Sodium dodecyl sulfate polyacrylamide gel electrophoresis
SNP  Single-nucleotide polymorphism
TFBS  Transcription factor binding site
TBS  Tris buffered saline
Tm  Melting temperature of an oligonucleotide; the temperature at which half of the molecules are dissociated
TRANSFAC  A public database of transcription factors, their experimentally proven binding sites, and regulated genes
UCSC  University of California, Santa Cruz
WRPC  Weighted rank pairwise correlation statistic for linkage
X57  Mouse corpus striatum hybrid cell line
Y2H  Yeast-2-hybrid

Gene Definitions

ABCA1  ATP-binding cassette sub-family A member 1
ACMSD  2-amino-3-carboxymuconate-6-semialdehyde decarboxylase
ACTG1  Actin, cytoplasmic 2
ACTR1A  Alpha-centractin
ADRA1A  Alpha-1A adrenergic receptor
AKAP  A-kinase anchor protein family
AKAP8L  A-kinase anchor protein 8-like
AKT  Protein kinase B family
ANK  Ankyrin-1
AP3  Adapter-related protein complex-3
AP3B1  Adapter-related protein complex-3 subunit beta-1
AP3B2  Adapter-related protein complex-3 subunit beta-2
AP3D1  Adapter-related protein complex-3 subunit delta-1
APOE  Apolipoprotein E
APOER2  Low-density lipoprotein receptor-related protein 8 (alias for LRP8)
APP  Amyloid beta A4 protein
ARF1  Adenosine diphosphate (ADP) ribosylation factor-1
ARFGAP1  Adenosine diphosphate (ADP) ribosylation factor guanosine triphosphate (GTP)ase-activating protein-1
ARIA  Acetylcholine receptor-inducing activity (alias for neuregulin-1)
ASAP1  Adenosine diphosphate ribosylation factor guanosine triphosphate-ase activating protein (ARFGAP) with proto-oncogene tyrosine-protein kinase Src (SRC) homology 3 (SH3) domain, ankyrin (ANK) repeat and pleckstrin homology (PH) domain-containing protein 1
BACE1  Beta-secretase 1
BLOC1  Biogenesis of lysosome-related organelles complex-1
BLOC1S3  Biogenesis of lysosome-related organelles complex-1 subunit 3
BLOC2  Biogenesis of lysosome-related organelles complex-2
BRN2  POU domain, class 3, transcription factor 2 (alias for POU3F2)
C2H2  The classical zinc finger domain, containing a zinc ion coordinated by two conserved cysteines and two conserved histidines
CAPON  Carboxyl-terminal PDZ ligand of neuronal nitric oxide synthase protein
CAPZB  F-actin-capping protein subunit beta
CCDC60  Coiled-coil domain-containing protein 60
CCT  Chaperonin containing T-complex protein 1 complex
CCT3  Chaperonin containing T-complex protein 1 complex subunit 3
CCT8  Chaperonin containing T-complex protein 1 complex subunit 8
CDH2  Cadherin-2
COG7  Conserved oligomeric Golgi complex subunit 7
COMT  Catechol-O-methyltransferase
DAAM2  Disheveled-associated activator of morphogenesis 2
DAB1  Disabled homolog 1
DAO  D-amino acid oxidase
DAOA  D-amino acid oxidase activator
DBZ  Disrupted-in-schizophrenia-1 (DISC1) binding zinc finger protein
DCTN1  Dynactin subunit 1
DCTN2  Dynactin subunit 2
DCTN5  Dynactin subunit 5
DDR1  Epithelial discoidin domain-containing receptor 1
DGC  Dystrophin-associated glycoprotein complex
DISC1  Disrupted-in-schizophrenia-1
DLG4  Disks large homolog 4
DMD  Dystrophin
DPC  Dystrophin associated protein complex
DSP  Desmoplakin
DTNBP1  Dysbindin-1
DYNLL1  Dynein light chain, cytoplasmic
EGFR  Epidermal growth factor receptor
ERBB  Epidermal growth factor receptor family
ERBB2  Receptor tyrosine-protein kinase ErbB-2
ERBB4  Receptor tyrosine-protein kinase ErbB-4
EXOC1  Exocyst complex component-1
EXOC3  Exocyst complex component-3
EXOC4  Exocyst complex component-4
EXOC7  Exocyst complex component-7
FLNA  Filamin-A
FEZ1  Fasciculation and elongation protein zeta-1
G72  D-amino acid oxidase activator
GAPDH  Glyceraldehyde-3-phosphate dehydrogenase
GGF  Glial growth factor (alias for neuregulin-1)
GRB2  Growth factor receptor-bound protein 2
GRIN3A  Glutamate [NMDA] receptor subunit 3A
GRM  Metabotropic glutamate receptor
GTF2B  Transcription initiation factor IIB
HMGN1  Non-histone chromosomal protein HMG-14
HRG  Heregulin (alias for neuregulin-1)
IGFBP2  Insulin-like growth factor-binding protein 2
IGFBP4  Insulin-like growth factor-binding protein 4
IKZF4  Zinc finger protein Eos
IL1B  Interleukin-1 beta
ILF2  Interleukin enhancer-binding factor 2
ILF3  Interleukin enhancer-binding factor 3
ITGAL  Integrin alpha-L
JAK  Janus kinase, a family of intracellular non-receptor tyrosine kinases
JARID2  Protein jumonji
JNK  c-Jun N-terminal kinase family
JUN  Transcription activator AP-1
KIR  Family of KIR proteins
KIR3  Family of G protein-gated inwardly rectifying K+ channels (GIRKs)
LIMK  LIM domain kinase 1
LMNA  Lamin-A/C
LRP8  Low-density lipoprotein receptor-related protein 8
MAP6  Microtubule-associated protein 6
MAPK  Mitogen activated protein kinase
MUTED  Protein MUTED homolog
MYO18B  Myosin-XVIIIb
NDEL1  Nuclear distribution protein nudE-like 1
NDF  Neu differentiation factor (alias for neuregulin-1)
NF45  Interleukin enhancer-binding factor 2 (also called ILF2)
NF90  Interleukin enhancer-binding factor 3 (also called ILF3)
NME2  Nucleoside diphosphate kinase B
NOS1  Nitric oxide synthase, brain
NOS1AP  Carboxyl-terminal PDZ ligand of neuronal nitric oxide synthase protein
NQO2  Ribosyldihydronicotinamide dehydrogenase [quinone]
NRG1  Neuregulin-1
OCT7  POU domain, class 3, transcription factor 2
OTX2  Homeobox protein OTX2
PAFAH1B1  Platelet-activating factor acetylhydrolase 1B subunit alpha
PCBP1  Poly(rC) binding protein 1
PCNT  Pericentrin
PDE4B  cAMP-specific 3’,5’-cyclic phosphodiesterase 4B
PDZ  A common structural domain of 80-90 amino acids
PGM5  Phosphoglucomutase-like protein 5
PH  Pleckstrin homology domain
PI3K  Phosphoinositide 3-kinase family
PKA  Protein kinase A
PKP2  Plakophilin-2
PLDN  Pallidin
PLEC1  Plectin-1
PLXNA2  Plexin-A2
POU  A bipartite DNA binding domain
POU3F2  POU domain, class 3, transcription factor 2
PPM1B  Protein phosphatase 1B
PRKDC  DNA-dependent protein kinase catalytic subunit
PRMT5  Protein arginine N-methyltransferase 5
PSD95  Disks large homolog 4
PTMA  Prothymosin alpha
RAB11A  Ras-related protein Rab-11A
RAB1B  Ras-related protein Rab-1B
RAF  RAF proto-oncogene serine/threonine protein kinase
RAS  A family of genes encoding small GTPases
RB1  Retinoblastoma-associated protein
RELA  Transcription factor p65
RELN  Reelin
RGS4  Regulator of G-protein signaling 4
SEMA3A  Semaphorin-3A
SGSM1  Small G protein signaling modulator 1
SH3  SRC homology 3 domain
SNAP25  Synaptosomal-associated protein 25
SNAPIN  SNARE-associated protein snapin
SNARE  Soluble N-ethylmaleimide sensitive fusion protein (NSF) attachment protein receptors
SNX  Sorting nexin family of proteins
SNX27  Sorting nexin-27
SOX  Sex determining region Y-box family of proteins
SOX11  Transcription factor SOX-11
STAT  Signal transducer and activator of transcription protein family
STXBP1  Syntaxin-binding protein 1
SYN1  Synapsin-1
SYPL1  Synaptophysin-like protein 1
TMEM33  Transmembrane protein 33
TUBA1C  Tubulin alpha-1C chain
TUBB2A  Tubulin beta-2A chain
TUBB2B  Tubulin beta-2B chain
UTRN  Utrophin
VAMP2  Vesicle associated membrane protein 2
VLDLR  Very low-density lipoprotein receptor
YTHDC2  Probable ATP-dependent RNA helicase YTHDC2
YWHAE  14-3-3 protein epsilon

Acknowledgements

I would like to thank my graduate supervisors, Robert Holt and Gregg Morin, for the opportunity to pursue graduate studies at the Genome Sciences Centre (GSC), for their wonderful generosity, patience, support, encouragement, and guidance throughout the duration of my studies, and for providing me with a unique and fabulous opportunity to define my own scientific path and learn the tools necessary to travel along it. Thanks also to my thesis committee members, Leonard Foster, William Honer, and Steven Jones, for your guidance, support, and encouragement, and to the Director of the Genetics Graduate Program, Hugh Brock, for operating a well-organized department.

I am grateful to the British Columbia Mental Health Foundation and Canadian Institutes for Health Research Institute of Neurosciences, Mental Health and Addiction for providing three years of stipend funding, plus funding for travel and equipment.

Thank you to my co-workers and friends in the laboratory at the GSC, particularly Gary Wilson, Michael Kuzyk, Annie Moradian, Grace Cheng, Michelle Tang, Doug Freeman, Suganthi Chittaragan, and Claire Hou for their incredible patience, support, and guidance.
Thank you also to my fellow graduate students and friends, especially Monica Sleumer, Obi Griffith, Malachi Griffith, Ben and Oanh Good, Sheri Chen, James Roland, Heesun Shin, George Yang, Vilte Barakauskas, Jennifer Nickel, and Stana Djurdjevic, whose friendship, advice, and support have been invaluable throughout this process. The research described in this thesis would not have been possible without the help and advice of numerous others at the GSC and elsewhere. It has been a pleasure to have worked with so many outstanding people. Finally, I would like to thank my sisters and their partners, Katherine Mead and Sean Baldwin, and Samantha and Matteo Galvan, and my parents, Ian and Lynda Mead, for their love and support throughout everything I do in life.

Dedication

I would like to dedicate this thesis to my parents, Ian and Lynda Mead, for instilling in me from such a young age the drive to follow my own path, as life is a journey, not a destination, with happiness being our highest reward.

“Twenty years from now you will be more disappointed by the things you didn't do than by the ones you did do. So throw off the bowlines. Sail away from the safe harbor. Catch the trade winds in your sails. Explore. Dream. Discover.” Mark Twain

Co-Authorship Statement

Together with my supervisors Rob Holt and Gregg Morin, I was responsible for the identification and design of the research program described in this thesis. I was primarily responsible for performing the research, data analyses, and manuscript preparation. Each chapter of the thesis was prepared as a multi-author publication. I was primarily responsible for all the research performed in each of these publications; however, the co-authors on these publications did contribute analyses, text, figures, tables, editorial suggestions, funding and supervision. Their specific contributions are briefly summarized here.
Rob Holt and Gregg Morin contributed to study design, concepts, text, figures, editorial suggestions, funding and supervision for all chapters. Michael Kuzyk and Annie Moradian performed the mass spectrometry analyses, advised on subsequent analysis of those data, contributed to images and provided editorial suggestions for Chapters 2 and 3. Gary Wilson provided advice and guidance on laboratory analysis and provided editorial suggestions for Chapters 2 and 3. Inanc Birol provided advice, guidance and editorial suggestions for Chapter 4. Mikhail Bilenky performed the findpeaks analysis and provided editorial suggestions for Chapter 4. Many others made minor contributions to the research described herein (see author lists and acknowledgements in the individual publications for details).

1 Introduction

Human genetic variants contribute to a vast multitude of diseases. Methods for investigating the genetic etiology of disease were initially focused on Mendelian disorders and have had great success within that realm [1]. These methods have been less successful with investigations of complex diseases, where multiple susceptibility genes may participate in a set of functional pathways and the impairment of one or more of these pathways or functions produces disease. Schizophrenia is a complex psychiatric disorder that is highly heritable with a strong genetic component [2,3]. While a multitude of linkage and association studies have been performed on numerous schizophrenia families and cohorts, there has been little consistency among results, and in many studies the results did not meet statistical significance. There is a need to move beyond traditional methods in order to discover the underlying genetics of disease. By combining multiple genome-wide linkage studies in a linkage meta-analysis it may be possible to refine existing linkage regions and identify novel linkage regions that have consistent, weak signals across multiple studies.
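As a purely illustrative sketch of how consistent but weak signals can add up across studies, the following uses Fisher's combined probability test on per-study p-values for a single genomic region. This is a generic textbook technique and is not the meta-analysis method developed in this thesis; the function name and the example p-values are hypothetical, and the test assumes the studies are independent.

```python
import math


def fisher_combined_pvalue(p_values):
    """Combine independent p-values with Fisher's method.

    Under the null hypothesis, X = -2 * sum(ln p_i) follows a chi-square
    distribution with 2k degrees of freedom, where k is the number of studies.
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one p-value")
    if any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2

    # Closed-form chi-square survival function for even degrees of freedom 2k:
    # P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# Hypothetical p-values for one genomic region in four independent scans.
# None is striking alone, but the combined evidence is stronger.
region_p_values = [0.04, 0.10, 0.08, 0.20]
combined = fisher_combined_pvalue(region_p_values)
print(f"combined p-value: {combined:.4f}")
```

The closed-form survival function keeps the sketch free of external dependencies; it is valid here only because Fisher's statistic always has an even number of degrees of freedom.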
This would provide better definition of linkage regions across multiple populations and potentially expand researchers' focus to regions that may contain valid susceptibility genes that would otherwise go undetected. There are a number of different terms used to describe how robustly a gene has been linked to schizophrenia. From the least robust to the most, these include: “potential candidate schizophrenia genes”, which are genes that are proposed to be involved in schizophrenia due to some characteristic but which have not yet been directly investigated for disease involvement; “candidate schizophrenia genes”, which are genes that have had some direct evidence of disease involvement (e.g. one positive association result), but for which the evidence is not yet sufficient for them to be considered susceptibility genes; and “schizophrenia susceptibility genes”, for those few genes that have had considerable evidence showing contribution to schizophrenia etiology, typically replication in independent cohorts plus functional data. A small number of schizophrenia susceptibility genes have been identified, including dysbindin (DTNBP1), neuregulin-1 (NRG1), disrupted-in-schizophrenia-1 (DISC1), and catechol-O-methyl transferase (COMT). Based on the concept that complex disease is the disruption of one or more biological functions rather than the result of perturbation in one or more genes, identification of common interaction partners, functions, or pathways that known susceptibility genes participate in is critical to the identification of the genetic underpinnings of schizophrenia. This introduction reviews the issues and tools available for investigating the genetics of complex disease, and provides background on schizophrenia, which was the focus of this thesis.

1.1 Thesis Overview

The central objective of this thesis is the identification of potential candidate genes contributing to schizophrenia etiology.
The first part of this thesis focuses on two of the most promising schizophrenia susceptibility genes, DTNBP1 and NRG1, and investigates their protein-protein interactions using immunoprecipitation (IP) and mass spectrometry (MS) techniques. The ability to identify proteins in a high-throughput manner, without prior inference as to the identity of the protein, using MS in combination with the human genome sequence and annotations has been developed fairly recently [4, 5]. The identified interacting proteins not only provide information on DTNBP1 and NRG1 function, but their underlying genes may also be considered potential schizophrenia candidate genes. The second part of this thesis focuses on leveraging available whole genome linkage data, one of the only data resources that allows direct assessment of the connection between genetics and disease without bias to specific genomic regions. A bioinformatics approach was developed and used to combine available whole genome linkage analyses for the purpose of refining and identifying genomic regions that have shown consistent linkage over multiple studies.

1.2 Complex Disease

Complex genetic disease refers to common familial illnesses that do not show a simple Mendelian pattern of inheritance [6-8]. Some examples of complex disease include: coronary heart disease, deafness, epilepsy, hypertension, rheumatoid arthritis, type I and II diabetes, asthma, many cancers, and most psychiatric disorders, including schizophrenia [6]. Unlike monogenic diseases where a single genetic mutation may cause the disease phenotype, complex diseases involve a wide variety of risk factors found in both our environment and genetics [1, 8]. Several attributes that distinguish complex diseases from Mendelian disorders include locus heterogeneity, allelic heterogeneity, threshold inheritance, variable penetrance, epigenetics, epistasis, gene-environment interactions, temporal gene expression, and senescence [9-16].
These differences make identification of the genetic etiology of complex disease much more difficult. The potential effects of these complexities need to be incorporated into the analysis of complex disease systems, or at the very least, considered when interpreting results. Genetic heterogeneity in particular (where variation at different genetic loci can give rise to the same phenotype) is a common denominator in complex traits and can be considered to be the most important obstacle to overcome [17]. Another critical barrier to complex disease research, and in particular schizophrenia research, is the underlying phenotypic heterogeneity of the affected population. In general, disease is investigated as a set of binary traits where people are considered to be with or without disease, but this simple binary system does not apply well to complex disease, which can consist of a spectrum of phenotypes [1]. In addition, for most complex diseases the pathophysiology is poorly understood. The identification of biomarkers and biological measures (e.g. blood glucose in diabetes) that allow cases to be stratified into homogeneous classes is imperative to the successful interpretation of the complexities of multifactorial diseases [6, 17]. There is also debate as to the underlying genetic theory for many complex diseases. The common disease-common allele theory states that disease results from the chance accumulation of multiple common variants, each with a small effect on susceptibility [6]. The rare allele-common disease theory states that disease is caused by multiple different highly penetrant, rare, and severe mutations [10]. These polarized views remain unresolved and, in actuality, complex diseases are likely to be some combination of both theories [6, 10].
The culmination of these complex disease attributes is that the investigation of individual genes in complex disease becomes much more difficult due to weak and inconsistent disease risk for each gene within affected populations. The combined effect of multiple genetic perturbations within one or more pathways may result in the disease phenotype, or the underlying genetic susceptibility may stem from one or more possible mutations in many different members of a pathway. Taken in the context of one or more related functional pathways, subsets of genes involved in disease susceptibility provide an increased ability to infer a role in disease mechanisms when analyzed as a group, and are therefore more tractable for researchers to investigate [18]. Further, while simple loss of gene function may underlie a specific subset of the phenotypic defects, more subtle gene mutations that affect protein interactions, gene expression and translation, or protein stability without loss of gene function may be important. Even if a confirmed susceptibility gene for a disease is of minor effect, the biochemical pathways and molecular mechanisms it is involved in may prove relevant to the disorder in general [19]. It is informative to view complex diseases as a disruption of one or more biological functions rather than as a result of perturbations in single or small groups of genes [20]. There are potentially many pathways contributing to the optimal performance of a specific biological function, and therefore there are potentially many different pathways that may lead to disease [20].

1.3 Schizophrenia

Some of the most heritable complex diseases are the psychiatric disorders, including schizophrenia. Heritability is the proportion of phenotypic variation in the population due to genetic variation [21].
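The proportion just described can be written compactly. The symbols below are standard quantitative-genetics notation introduced here for illustration and do not appear in the original text: $V_G$ is the total genetic variance, $V_E$ the environmental variance, and $V_P$ the total phenotypic variance. The decomposition of $V_P$ assumes, as a simplification, no gene-environment covariance or interaction.

```latex
H^2 = \frac{V_G}{V_P}, \qquad V_P = V_G + V_E
```

Under this reading, a heritability estimate of 80% corresponds to $V_G / V_P = 0.8$, meaning that four fifths of the phenotypic variance observed in the studied population is attributed to genetic variance.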
In a broad sense, the heritability of a trait is the degree to which it is genetically determined and is expressed as a ratio of the total genetic variance to phenotypic variance [21]. Schizophrenia has a high heritability of 80-87% [22-25], similar to that of bipolar disorder (79-93%) [22, 26, 27], in contrast to other complex diseases such as Alzheimer’s disease (29-79%) [28-32], asthma (48-79%) [33] and breast cancer (25-50%) [34-36]. Schizophrenia is a psychiatric disorder identified 100 years ago by Kraepelin that affects approximately 1% of the population [37]. Psychiatric disorders, including schizophrenia, have added complexity in that they rely on non-biological descriptive measures (for example the Diagnostic and Statistical Manual of Mental Disorders fourth edition (DSM-IV) [38]) for the purposes of diagnosis, and these measures have shown limited effectiveness for segregating individuals into phenotypically or genotypically homogeneous disease populations [6]. In addition, no conclusive consistent physiological markers have yet been defined for schizophrenia [39]. These factors increase the complexity of investigating schizophrenia, as the underlying affected populations may be widely heterogeneous. While no conclusive physiological markers have been identified for schizophrenia, there has been a great deal of research focused on their identification. Alterations in the cytoarchitecture of various brain areas (hippocampus, prefrontal cortex, dorsal thalamus) have been a long term focus of schizophrenia research and evidence supports the existence of these alterations as a potential physiological representation of schizophrenia [40]. Neuropathology research has focused on the extended limbic system (hippocampus, dorsolateral prefrontal cortex (DLPFC) and cingulate gyrus) [37] and evidence that the prefrontal cortex is a site of abnormal brain function in schizophrenia is overwhelming [41].
One of many models for schizophrenia development is that it is a neurodevelopmental disorder where dysconnectivity or misconnectivity of neurons affects the precise organization of the neural circuitry and perhaps its plasticity [37], leading to abnormal synaptic connectivity [19]. Dysconnectivity refers to the abnormal functional integration of brain processes due to aberrant neuronal wiring, aberrant synaptic plasticity, or both [42]. Synapses are specialized junctions between neurons, or between neurons and adjacent cells, that mediate transmission of signals. Signals are transduced by the release of neurotransmitters from the presynapse of one cell, which are received through specialized neurotransmitter receptors on the postsynapse of adjacent cells. The human brain is thought to contain tens of billions of neurons, each having approximately 7,000 synapses, resulting in hundreds of trillions of synapses [43]. Synaptic plasticity occurs throughout life and refers to changes in the strength of the synapse (defined as the change in transmembrane potential resulting from activation) and its ability to process signal information. There are many mechanisms involved in affecting synaptic plasticity, one of which is change in the quantity of neurotransmitter released. Alterations in neurotransmitter systems (dopamine, glutamate, gamma-aminobutyric acid (GABA)-ergic, serotonergic, cholinergic, and opioid) provide evidence of synaptic involvement in pathophysiological processes leading to symptoms of schizophrenia [39, 44, 45]. Dopamine, glutamate, and GABA are the systems that have received the most attention. Hyperactivation of the dopamine system was the first hypothesis to be proposed, once it was established that traditional antipsychotics are principally dopamine D2 receptor antagonists [46]. Subsequently, the glutamatergic system has received attention [19].
The hypofunction of the glutamate system in schizophrenia was first inferred from observations that N-methyl-D-aspartate (NMDA) receptor antagonists (ketamine, phencyclidine, and MK-801) induce schizophrenia-like symptoms in normal individuals [46]. Glutamate has been described as the major excitatory neurotransmitter in the mammalian brain and is thought to be utilized by 40% of all synapses [47]. There is a variety of evidence that further supports impaired glutamatergic transmission involvement in schizophrenia. Pyramidal cells, which use glutamate as their neurotransmitter, contribute to the interconnectivity between the cerebral cortex and limbic system, brain regions that have been implicated in the pathophysiology of schizophrenia [47]. Many of the genes with the strongest associations with schizophrenia (COMT, D-amino acid oxidase activator (DAOA or G72), DISC1, DTNBP1, metabotropic glutamate receptor 3 (GRM3), NRG1, and regulator of G-protein signaling 4 (RGS4)) are involved in the glutamatergic signaling pathway [48]. COMT acts directly on monoaminergic neurotransmission and likely affects other synaptic populations, including glutamatergic synapses [19]. DAOA functions as an activator for D-amino acid oxidation [49]. Impaired DAOA activity is thought to reduce the availability of D-serine (an NMDA receptor co-agonist) and thereby reduce glutamatergic transmission. DISC1 has several interaction partners in common with glutamate receptors [48]. DTNBP1 regulates glutamate storage and release [48]. GRM3 is a metabotropic glutamate receptor whose activation inhibits glutamate release. NRG1 is present in glutamatergic synaptic vesicles, affects NMDA receptors [19], and also interacts indirectly with the postsynaptic density proteins associated with ionotropic glutamate receptors through ErbB4 to alter synaptic plasticity [19]. RGS4 is a negative regulator of G-protein coupled receptors including the metabotropic glutamate receptors [19].
Finally, hypofunction of the GABA system, one of the most reproducible neuroanatomic alterations in schizophrenia, was shown to be caused by reduced NMDA receptor activation of GABA interneurons [46]. Any gene product related to neurotransmitters or their receptors, for example any gene product that affects the quantity of neurotransmitters in the cell, their transportation, storage, and time and location of their release, or the location or number of their receptors, may be considered a potential schizophrenia candidate gene. Researchers continue to investigate the physical attributes of the schizophrenic brain in an attempt to identify the elusive physiological marker for schizophrenia that would allow more definitive segregation of the affected population. The heritability of schizophrenia indicates a strong genetic component; however, the specific genetic risk factors remain largely unknown [50-52]. In addition to a strong genetic component, environmental risk factors contribute to schizophrenia etiology. Although far from conclusive, studies indicate environmental risk factors may include obstetric complications, stress, drug use, immigration, season of birth, urban upbringing, head injury, viral infection, and history of trauma [53]. Although environmental factors do play a role in the development of this disorder, the specific factors involved and how they contribute to disease are uncertain. Individual environmental risk factors are thought to contribute a minor component of risk for development of schizophrenia [54]. For the purpose of this thesis, I have focused on the underlying genetics of schizophrenia etiology, but it cannot be discounted that there are many more factors contributing to disease than can be attributed to a purely genetic origin. Normal brain development and function are complex and involve a large number of exquisitely timed steps [55].
It has therefore been suggested that alterations at various developmental stages could contribute to the development of schizophrenia [55] and that the genetic defects contributing to the disease may affect the timing of gene expression rather than gene coding sequence. The results of more than 20 genome-wide linkage scans in more than 1,200 families with schizophrenia provide evidence confirming that schizophrenia is a complex disease where multiple genes modify susceptibility, but any single gene is neither necessary nor sufficient to cause the disorder [15, 56]. There is accumulated evidence of involvement of a small subset of genes such as DTNBP1, DISC1, and NRG1 [50] in schizophrenia such that they are now considered schizophrenia susceptibility genes; however, the picture is incomplete, and it is still uncertain how these genes contribute to the development of the disease.

1.4 Schizophrenia Susceptibility Genes

1.4.1 Dysbindin (DTNBP1)

A more detailed introduction to DTNBP1 will be provided in Chapter 2, which focuses on DTNBP1 protein-protein interactions. This section will provide a brief introduction as to why DTNBP1 was chosen as one of the genes of focus for the investigation of novel schizophrenia susceptibility genes, pathways, and function. DTNBP1 is located within one of the most consistently replicated schizophrenia linkage regions (6p22.3) [57-64] and has one of the most replicated schizophrenia association findings [65-74]. DTNBP1 is a widely expressed cytoplasmic protein involved in vesicle trafficking, and the specific biological functions it is associated with are expected to vary with cell and tissue types and the vesicle populations therein. Involvement of a protein at any part of the vesicle lifecycle is of interest from a schizophrenia standpoint. Vesicles store and transport material within the cell.
In neurons, vesicles store and transport neurotransmitters for release at the synapse, a function which is known to be impaired in the schizophrenic brain. The DTNBP1 knockout mouse model (sandy) shows increased dopamine turnover in specific brain regions [75]. Defects in neurosecretion and vesicular morphology in neuroendocrine cells and hippocampal synapses have also been identified at the single vesicle level in sandy mice, which implicate DTNBP1 in the regulation of exocytosis and vesicle biogenesis in endocrine cells and neurons [76]. These findings are consistent with the hypothesis that defective synaptic transmission and neurotransmitter release is a pathogenic mechanism in schizophrenia [50, 77]. In addition, schizophrenia patients have been found to have decreased expression of DTNBP1 at presynaptic glutamatergic terminals in the hippocampus [78, 79] and DLPFC [80]. This evidence culminates in DTNBP1 currently being considered one of the most promising schizophrenia susceptibility genes, and thus it was chosen as a focus within this thesis.

1.4.2 Neuregulin-1 (NRG1)

A more detailed introduction to NRG1 will be provided in Chapter 3, which focuses on NRG1 protein-protein and protein-DNA interactions. This section will provide a brief introduction as to why NRG1 was chosen as a gene of focus for the investigation of novel schizophrenia susceptibility genes, pathways, and function. Our knowledge of NRG1 implicates it in many fundamental neuronal functions. In particular, NRG1 is known to be involved in neural development [81], neuronal differentiation, migration and survival [82-84], synaptic maturation and plasticity [85, 86], and myelination [87]. NRG1 is also one of the best replicated schizophrenia susceptibility genes [88-102]. Pharmacogenetics studies indicate that the antipsychotics haloperidol, risperidone and clozapine increase expression of NRG1, and of the ErbB receptors it is known to interact with, in rat hippocampus [103].
Schizophrenia NRG1 at-risk haplotypes have been shown to have decreased efficacy of glutamatergic and GABAergic neurotransmission [86, 104]. Finally, altered NRG1 signaling has been implicated in abnormal oligodendrocyte development and myelination as well as reduced oligodendrocyte numbers, which are all thought to play a role in the pathophysiology of schizophrenia [105]. This combination of factors makes NRG1 one of the most convincing schizophrenia susceptibility genes identified to date, and for that reason it was chosen for investigation in this thesis. NRG1 is a widely expressed transmembrane protein with both extracellular and intracellular domains that is involved in cell-cell communication (among other functions). There has been a great deal of research in a variety of arenas focused on NRG1, including schizophrenia, heart disease, stroke, Hirschsprung’s disease, and cancer. However, the majority of this research has focused on the extracellular domain (ECD) and its interaction with its receptor, another candidate schizophrenia gene, ErbB4 [106-108]. Very little research has focused on the intracellular domain (ICD); however, there is evidence that once cleaved, the ICD is transported to the nucleus and affects transcription [109, 110]. For the investigations described in this thesis, I chose to focus on the protein and DNA interactions of the NRG1 ICD. The findings described in this thesis for NRG1 ICD may be informative not only in the context of schizophrenia etiology, but also for other complex diseases.

1.4.3 Disrupted-In-Schizophrenia-1 (DISC1)

DISC1 is also considered a promising schizophrenia susceptibility gene and although it was not investigated directly, it was found to have several common protein interacting partners with DTNBP1 and NRG1 based on the findings described in Chapters 2 and 3.
The DISC1 gene was named Disrupted-in-Schizophrenia-1 because it was identified at the site of a balanced translocation, t(1;11)(q42.1;q14.3), that co-segregates with schizophrenia and other psychiatric disorders in a large Scottish pedigree [111, 112]. Several subsequent studies also found evidence of DISC1 association with schizophrenia [113-118]. A number of DISC1 interacting proteins have been identified, including PDE4B and FEZ1, which have also had positive schizophrenia association results [119, 120]. The biochemical investigation of DISC1 is still at an early stage and it is not yet understood how DISC1 contributes to schizophrenia etiology. However, there is evidence that DISC1 is involved in brain development, including neuronal migration, neurite outgrowth and neural maturation through interaction with several cytoskeletal proteins [121].

1.5 Traditional Disease Gene Finding Methods

Linkage and association studies are two main types of traditional molecular-genetic techniques that have been used to investigate candidate disease genes [15]. These techniques determine if a genetic locus or variant segregates with a given disease in the study population. A linkage study attempts to detect genomic regions that harbor genes contributing to disease by identifying sites in the genome that show evidence of segregating with illness in families that have more than one affected member [122]. The major strength of linkage analysis is in mapping rare disorders through large families with many affected individuals [1]. Over 1,000 human monogenic disease genes have been identified through this type of analysis [123, 124]. While linkage studies can be quite powerful and have had excellent success with some Mendelian disorders, they also have a number of weaknesses. They need to assume a mode of inheritance [125], are costly, require a large investment in recruiting families that contain affected members, and have low resolution.
At best, linkage studies may narrow disease regions to 10-15 Mb and require further fine mapping and hunting to determine the causal variant. Genetic heterogeneity is also a major problem for linkage, and is one reason why studies in different populations produce different results. At the same time, linkage studies benefit from being performed within large families, where disease is likely to segregate with the same genetic (or epigenetic) aberration. While linkage studies can identify large regions of the genome that co-segregate with disease in large families, association studies allow specific alleles to be tested for association with disease in the population. Association studies determine if a gene is involved in disease through genetic testing of variants in diseased (case) compared to non-diseased (control) individuals [1]. Recent advances in technology now allow the typing of over 1 million markers for only a few hundred dollars per person [125], making genome-wide association studies (GWAS) feasible. GWAS is a type of association study where regularly spaced genetic markers across the entire genome are tested for association with a specific disease or trait. GWAS has the benefit of being able to test variants across the entire genome rather than focusing on specific regions indicated from linkage studies or other information. However, the large number of tests performed by GWAS (one test per variant) means that multiple testing correction must be performed to accommodate the increased potential for false discovery. The underlying rationale of GWAS is that of common disease-common allele, and there was an expectation that many more loci associated with disease would be identified through GWAS. Unfortunately, for many complex traits and diseases (including schizophrenia), even after analysis by GWAS, only a small portion of estimated heritability has been explained [126], leading to a new concept called missing heritability.
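To make the case-control comparison and the multiple-testing burden described above concrete, the following minimal sketch computes an allelic odds ratio, a chi-square test on a 2x2 table of allele counts, and a Bonferroni-corrected significance threshold. The allele counts are invented for illustration, the function names are hypothetical, and Bonferroni is only one of several possible corrections; none of this is taken from the studies cited in this thesis.

```python
import math


def odds_ratio(case_risk, case_other, control_risk, control_other):
    """Odds of carrying the risk allele in cases relative to controls."""
    return (case_risk * control_other) / (case_other * control_risk)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 degree of freedom, no continuity correction).

    Table layout: a, b = risk / other allele counts in cases;
                  c, d = risk / other allele counts in controls.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error at alpha."""
    return alpha / n_tests


# Hypothetical allele counts for a single marker.
cases = (300, 700)      # risk allele, other allele
controls = (250, 750)

marker_or = odds_ratio(cases[0], cases[1], controls[0], controls[1])
stat, p_value = chi_square_2x2(cases[0], cases[1], controls[0], controls[1])
threshold = bonferroni_threshold(0.05, 1_000_000)  # about 1 million markers

print(f"odds ratio = {marker_or:.2f}, chi-square = {stat:.2f}, p = {p_value:.4f}")
print(f"genome-wide threshold = {threshold:.1e}, significant: {p_value < threshold}")
```

With these invented counts the marker would pass a conventional single-test threshold of 0.05 but fall far short of the corrected genome-wide threshold, which illustrates why variants of modest effect are hard to detect without very large samples.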
A number of risk haplotypes have been identified for NRG1 with odds ratios varying from 1.22 to 3.1 [93, 97, 99, 127]. Odds ratios (OR) from some of the best schizophrenia candidate / susceptibility genes include DTNBP1 (OR, 1.76) [70], BLOC1S3 (OR, 1.45) [128], DAO (OR, 1.71) [129], and GRM3 (OR, 1.94 to 2.18) [130]. Rare variants in DISC1 are thought to contribute up to 2% of risk for schizophrenia; however, common variants have an odds ratio of only 1.3 [131]. Epistatic effects between COMT and DISC1 produce an odds ratio of 2.46 [130]. There are many possible explanations for missing heritability, including the involvement of a large number of variants of small effect that have yet to be found, the involvement of variants that are not tested by GWAS (including structural variants and rare variants with possibly large effects), gene-gene interactions, epigenetic effects, and the influence of shared environment on the calculation of heritability. With the exception of epigenetic effects and environmental influence, the effects potentially contributing to the missing heritability in schizophrenia would benefit from improved knowledge of the protein interactions, functions, and pathways of known schizophrenia susceptibility genes, as the gene products involved likely represent some of the missing heritability. There are a number of issues that must be considered when interpreting association studies. Isolation of a genetic variant which has a large difference in frequency between cases and controls provides insufficient evidence to assume a disease susceptibility role because the variant could simply be in linkage disequilibrium with the true causative variant. Careful construction of matched case and control sets is very important, since drawing sets from inherently different genetic populations may generate false positives. The replication studies that are required to validate initial findings are expensive, difficult and often conflicting.
Association studies may fail to identify associations between genetic markers and disease phenotype when multiple, independent disease mutations are present (genetic heterogeneity) [125]. Genetic heterogeneity is a major confounding factor for association studies, and disease variants identified in one population may not be present in other populations, resulting in a lack of reproducibility of findings. Association studies may not have sufficient power to identify rare causal variants. Association between genetic markers and disease phenotype can be present for reasons other than involvement in the disease, including population stratification and chance, leading to false positive associations [125], although use of the transmission disequilibrium test, which relies on genotyping trios (the proband, mother, and father), mitigates this risk [132].

Unfortunately, for the investigation of schizophrenia genes, linkage and association results have not been entirely consistent, which is not unusual for complex disorders [15]. Genetic heterogeneity, an important confounding factor in both linkage and association studies, is a known characteristic of schizophrenia, and may exist at a greater level than previously thought [133]. In situations where there are complicating factors like those common to complex diseases (multiple potential disease-predisposing variants of modest individual effect, gene-gene interactions, gene-environment interactions, or inter-population (allelic or locus) heterogeneity), linkage and association methods require very large sample sizes to have a chance of a successful outcome [125, 134]. In fact, schizophrenia provides a striking example of the limitations in the ability of traditional methods to identify the genetic etiology of a highly heritable complex disorder.
Existing linkage studies have resulted in claims of linkage for 21 of the 23 pairs of chromosomes; however, the majority of genome scans have failed to replicate these findings [122]. No strong linkage signal appears to be emerging for schizophrenia and more importantly, there is no obvious criterion for distinguishing signal from noise [122]. This leads to several possible conclusions, including that many genes of small effect are relevant, that there is significant heterogeneity across populations or that samples much larger than those currently available will be required to detect reliable linkage. Some researchers have attempted to address these issues by using quantitative phenotypes to subdivide schizophrenia patients based on the theory that the genetic underpinnings of these phenotypes are a subset of those contributing to schizophrenia etiology.

There is concern about using linkage and association to determine candidate genes in complex disease as even those few regions that do show strong results will not necessarily account for the majority of genetic liability and may not segregate in many affected individuals [41]. However, even if a confirmed susceptibility gene for schizophrenia proved to be of minor effect, the biochemical pathways and molecular mechanisms it implicates may prove relevant to the disorder in general [19]. Conversely, candidate regions defined by linkage studies alone are not sufficient as determinants of potential genes. Proof of this is the discovery of D-amino acid oxidase (DAO), found at locus 12q24 [135], as a candidate schizophrenia susceptibility gene. Based solely on the schizophrenia candidate regions from linkage studies, DAO was not previously considered to be a schizophrenia susceptibility gene. Chumakov et al. [135] revealed DAO as a susceptibility gene while investigating the protein DAOA, a potential candidate gene for schizophrenia.
DAOA was found to have positive association within schizophrenia populations and was therefore subject to a yeast two-hybrid analysis to determine potential interacting proteins. One of the interacting proteins identified was found to be DAO. Upon repeating their association analysis with DAO they found positive association with schizophrenia. This not only supports the need to further refine existing and identify novel schizophrenia linkage regions, but supports the idea that identification of interactions and pathways may be of great importance in finding candidate susceptibility genes.

1.6 Recombination

Genetic recombination occurs during mitosis and meiosis and is the process by which genetic material is broken and then joined (recombined) at a location different than its origin. In meiosis, recombination facilitates chromosomal crossover which leads to offspring having different combinations of genes than their parents. This is the predominant source of genetic variation between parents and offspring. Recombination can theoretically occur at any location along the chromosome and the frequency of recombination between two locations generally depends on distance [136]. However, recombination occurs with different frequencies at different locations along the genome. Recombination is an important aspect to consider when investigating linkage or association with disease. The higher the recombination rate in a region of interest, the more likely recombination has occurred between any two points within that region. One resource for information on the actual rates of recombination in the human genome is the HapMap project, which is an international project focused on the identification and cataloging of genetic patterns, similarities, and differences in human beings [136].
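The relationship between local recombination rate, physical distance, and the chance that recombination has occurred between two points can be sketched as follows. This minimal example assumes that recombination rates are available per genomic segment in centimorgans per megabase, and it uses Haldane's map function, which assumes no crossover interference, to convert genetic distance into a recombination fraction. The function names and the segment values are hypothetical, and the choice of map function is an assumption made for illustration rather than a description of the method developed in Chapter 4.

```python
import math


def genetic_distance_cm(segments):
    """Sum genetic distance in centimorgans over consecutive segments.

    Each segment is a (length_mb, rate_cm_per_mb) pair.
    """
    return sum(length_mb * rate for length_mb, rate in segments)


def recombination_fraction(distance_cm):
    """Haldane map function: r = 0.5 * (1 - exp(-2d)), with d in Morgans."""
    d_morgans = distance_cm / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * d_morgans))


# Hypothetical interval between two markers: 1 Mb at 1 cM/Mb followed by
# 0.5 Mb spanning a region with a higher rate of 4 cM/Mb.
interval = [(1.0, 1.0), (0.5, 4.0)]
distance = genetic_distance_cm(interval)
print(f"{distance:.1f} cM, recombination fraction {recombination_fraction(distance):.4f}")
```

The recombination fraction approaches 0.5 as genetic distance grows, which corresponds to two points being inherited independently of one another.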
The HapMap project has determined the frequencies of millions of sequence variants as well as the degree of association between them for individuals from various populations, including Africa, Asia, and Europe [136]. In Chapter 4, one of the key unique aspects of the whole genome linkage meta-analysis method that I developed is the incorporation of recombination, as defined by the HapMap project, to determine how non-overlapping markers from different studies can be combined. By using HapMap recombination rates, the probability of recombination between two points can be estimated, reducing the possibility of incorrectly combining markers from different studies when they have a low probability of representing the same linkage signal.

1.7 Non-Traditional Disease Gene Finding Tools

There is a need to supplement traditional genetic methods for investigating complex diseases. Advances in genomic, proteomic, and metabolomic technologies have provided researchers with new tools to investigate complex disease, resulting in an explosion of data [137] and allowing for a more detailed investigation into the underlying genetic mechanics of complex disorders. Sophisticated molecular studies, including microarrays and proteomics, as well as appropriate transgenic and in-vitro characterizations of genes, their functions, and their interactions are required for the evaluation of the speculative integration of genetics and neurobiology [19]. While proteomic techniques including mass spectrometry, yeast-2-hybrid (Y2H), and protein microarrays provide information on protein interactions and expression, genomic techniques such as massively parallel sequencing, single nucleotide polymorphism (SNP) microarrays, and expression microarrays allow identification of disease-associated mutations and mRNA expression patterns. In addition, gene co-expression and differential expression data help inform the generation of biological system maps.
Large-scale open-access projects also have a huge potential to impact genetic disease research (e.g. the Human Genome Project and HapMap) [1, 17, 20]. Extensions of methodologies designed for monogenic inheritance will, when applied to complex disease, have much greater impact if they leverage knowledge of protein interaction pathways and networks. Repositories are being developed to house a variety of types of information so that it is more easily accessible to researchers, including pathway information (e.g. Kyoto encyclopedia of genes and genomes (KEGG) [138]), protein interaction data (e.g. Human protein reference database (HPRD) [139], and many other more focused repositories [140]), disease association data (e.g. genetic association database (GAD http://geneticassociationdb.nih.gov/) [141]), and human genetic mutation data (e.g. Human gene mutation database (HGMD) [124]). Protein interaction networks have been built from high throughput datasets for model organisms [142-145] and visualization tools are being developed so that pathways can more easily be generated, visualized, and manipulated (e.g. Cytoscape; [146]). Researchers need to consider all available evidence in the search for candidate genes in complex disease rather than discarding genes simply because they do not appear in candidate regions. An improved understanding of the specific relationships or pathways connecting genes and gene products would help provide context for future schizophrenia research.

1.7.1 Protein-protein interactions

Genetic mutations can affect the protein product of the gene in a number of ways, including protein abundance, posttranslational modifications, and the ability of proteins to interact with other molecules in the cell [147]. Changes in the properties of one protein can affect not only the protein itself, but also properties of other proteins [147].
Proteins can participate in multiple protein complexes, each of which may have different cellular roles [148-150]. In this way, a single protein can have multiple different functions according to its interaction partners and localization [151]. Identification of protein-protein interactions is an important way to understand a protein’s function [151]. Therefore, a targeted investigation into the protein-protein interactions of a disease susceptibility gene product provides insight into the underlying functions impaired in disease, and also provides a list of other proteins that may be impacted by mutations in the disease susceptibility gene, or conversely, a list of other proteins that may impact the disease susceptibility gene should their parent gene contain genetic mutations.

There are a large number of public protein interaction databases that collect data from various types of protein-protein interaction experiments, including biomolecule interaction network database (BIND) [152], database of interacting proteins (DIP) [153], protein interaction database and analysis system (IntAct) [154], molecular interactions database (MINT) [155], mammalian protein-protein interaction database (MIPS) [156], the general repository for interaction datasets (BioGRID) [157], and HPRD [139]. The information contained in these databases can be leveraged in complex disease investigations as a central location for protein interaction information that may provide context for disease genes where possible. Databases of the human protein interactome, however, are largely incomplete and contain false interaction data; therefore, these databases may only provide a starting point for information gathering.

1.7.1.1 High-throughput protein-protein interaction detection methods

There are several methods for investigating protein-protein interactions for a gene product of interest.
The high-throughput techniques that are most commonly used include Y2H and immunoprecipitation coupled to tandem mass spectrometry (IP-MS/MS) [156]. The Y2H assay capitalizes on the premise that transcription factors have a DNA binding domain and an activation domain [151]. In the Y2H assay, the DNA binding domain and the activation domain are separated and each is fused to one of two potentially interacting proteins [159]. If the proteins interact, they bring together the two domains of the transcription factor, which then transactivates an easily detectable reporter gene. Some benefits of using Y2H are that it is highly scalable, large numbers of proteins can be processed at the same time, and only binary interactions are identified. The main criticism of this technique is that it can have high false-positive and false-negative rates [160]. These high error rates are due to the fundamental nature of the screen. Y2H investigates interactions between overexpressed fusion proteins in the yeast nucleus. This environment is fundamentally different from that of proteins expressed at endogenous levels, in vivo, in human cells, at their natural subcellular localization, and all of these differences can contribute to false positive and negative interactions. Overexpression can result in non-specific interactions; the fused domains may interfere with the binding of true interacting partners and promote spurious interactions; mammalian proteins are sometimes not correctly modified in yeast (e.g. post-translational modifications); and proteins may interact in the yeast nucleus that are normally never physically co-localized or temporally co-expressed in the cell.

Immunoprecipitation is one of the most commonly used methods for examining protein-protein interactions [161]. IP-MS/MS methods have been shown to allow the efficient and sensitive detection of protein binding partners [148, 162].
For the IP, protein complexes are purified using antibodies that either recognize the endogenously expressed protein (or its mutant form) or recognize an affinity tag that is fused to the protein of interest. Affinity tags are generally made of short hydrophilic peptides (e.g. FLAG, hemagglutinin, or poly-His) or small proteins (e.g. Glutathione S-transferase (GST), thioredoxin, or Green Fluorescent Protein (GFP)) [151]. Sample preparation greatly affects the quality and coverage of results; therefore, after IP a second fractionation step is generally required. There are many methods to accommodate a second fractionation step, including one- or two-dimensional sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE), protein chromatography, peptide fractionation, isoelectric focusing (IEF), and strong cation exchange (SCX). The type of fractionation generally depends on the type of sample being analyzed, availability of instrumentation, cost, and desired level of coverage. After a second level of fractionation, the protein samples are digested into small peptides that are then identified using MS based analysis. Routine analysis of protein and peptide molecules by MS has only been possible for the past couple of decades, following the advent of electrospray ionization (ESI) [4] and matrix-assisted laser desorption / ionization (MALDI) [5], which allow the ionization of large molecules. Identification of proteins via MS analysis is another recent development, made feasible by the sequencing and annotation of complete genomes, which has produced a wealth of sequence databases that are the foundation of the high-throughput bioinformatics protein and peptide analysis methods [151].
Access to technology and expertise, reduced potential for false positives, and the ability to investigate interactions within a human context using cell lines resulted in the decision to use IP via 3XFLAG affinity tags followed by one-dimensional SDS-PAGE, trypsin digestion, high performance liquid chromatography (HPLC), and MS analysis as the primary technique for protein-protein interaction investigation within this thesis. The 3XFLAG affinity tag was chosen as the primary tag for IP due to its efficiency, robustness, and ease of elution. Similar to Y2H, however, the use of a tagged protein results in protein expression levels that are not at physiological levels, potentially producing spurious interactions, and there is potential for the epitope tag to interfere with true interactions. The 3XFLAG affinity tag used here has the advantage of being a small hydrophilic peptide of only ~3 kDa [151], reducing the probability of physically impeding an interaction. Both N- and C-terminally tagged proteins were analyzed to improve the odds of identifying interaction partners that bind at the ends of the proteins and to avoid false positive and negative interactions due to the potentially altered protein structure. One-dimensional SDS-PAGE was performed to reduce the complexity of samples for MS analysis. The entire gel lane was excised and tryptic peptides extracted from the gel slices. Further fractionation was performed using HPLC separation. The combination of SDS-PAGE and HPLC fractionation of the protein and peptide samples generally provides sufficient fractionation that good coverage of abundant peptides is possible by MS analysis for most protein complexes. The fractionated peptide samples were introduced to the MS using electrospray ionization (ESI), which is a technique that allows for ionization of macromolecules. MS in this thesis actually represents a number of steps that allow for peptide sequence determination.
After ionization, in the first stage of MS a spectrum of the mass/charge and abundance of ionized peptides is determined. In the second stage of MS, specific peptide ion mass/charge windows are selected for fragmentation and analysis. Collision-induced fragmentation of the peptide ion allows a peptide to be fragmented into many of its possible composite ions, called b- and y-ions for those ions generated from a charged N- and C-terminal fragment, respectively. The mass/charge values of the b- and y-ions are then analyzed in the second MS stage. The combination of the peptide mass/charge values and the spectrum of b- and y-ion mass/charge values generated is then submitted to proteomics programs like Mascot [163] and X!Tandem [164], which use annotated genome sequences to predict peptide and protein identity. To reduce the false discovery rate, multiple replicate experiments and controls were performed and only those proteins that met specific criteria were considered possible protein interacting partners. To be considered a possible interacting partner, a protein had to be identified in a minimum of two experiments, be identified in no more than one negative control, and have at least one experimental identification with an X!Tandem log(E) score of less than -3.

1.7.1.2 Protein-protein interaction validation

While high-throughput methods offer the researcher the potential to discover multiple protein-protein interactions within a small number of experiments, they can often produce results that contain false positives even after processing for non-specific interactions; therefore it is important to perform some level of verification to increase confidence that two proteins are true interacting partners. While there are a variety of methods that can be used for verification of protein interactions, the primary method used in this thesis was IP followed by western blot analysis (IP-western).
For IP-westerns, protein complexes were purified using antibodies that either recognized the endogenously expressed protein or an affinity tag fused to the protein of interest. The purified sample was then separated by 1D SDS-PAGE, transferred to a nitrocellulose membrane and probed using antibodies that specifically recognize the protein of interest. In this way, identification of potential protein interaction partners was performed by IP of the protein of interest and identification of interacting proteins through comparative MS analysis, while verification of interaction partners was performed by IP of the interacting partner followed by a western blot to probe for the initial protein of interest.

1.7.2 Protein-DNA interactions

As mentioned previously, protein abundance may also play an important role in disease etiology. There are many ways in which the abundance of the functional form of a protein can be altered including the accessibility of the gene to transcriptional machinery, targeting of transcriptional machinery to the genomic region, competitive expression between isoforms, modifications to the transcriptional, transportation, and translational machinery that participates in producing the protein, correct folding of the protein, transportation of the protein to its proper destination, post-translational modifications, and its degradation. Most of these functions are accommodated through protein-protein interactions; however, protein-DNA interactions also play an important role in the recruitment or blocking of transcriptional machinery at specific locations. Due to recent technology advances we now have the ability to investigate protein-DNA interactions in a high-throughput manner. Broadly defined, a transcription factor is a protein that interacts with specific DNA sequences to effect transcriptional change. Transcription factors can effect gene transcription alone or in complexes.
Transcription factors are generally composed of at least two types of domains: a DNA binding domain that targets a specific genomic site and a transcriptional regulation domain that recruits or blocks RNA polymerase from initiating transcription [165]. Other proteins in transcription factor complexes may perform other roles including, for example, coactivation, repression, chromatin remodeling, and histone acetylation and deacetylation, but do not require DNA binding domains. Transcription factors bind a variety of types of DNA sequences, including promoters, enhancers, and silencers. In higher eukaryotes these can be located as close as a few hundred base pairs from a gene’s transcriptional start site in the case of promoters, or a great genomic distance away in the case of enhancers and silencers, and can be found upstream, downstream, or within the introns of genes [165].

To identify protein-DNA interactions there are two general strategies. One can either identify the genomic locations or DNA sequences that a specific transcription factor binds to (transcription factor centric), or identify all the transcription factors that may bind to a specific DNA sequence (gene centric) [165]. For the purposes of the investigation described within this thesis, the focus was on identifying all the genes whose expression may be affected by the NRG1 ICD, for which there is evidence of involvement in transcription factor complexes. A transcription factor centric approach was taken using chromatin-immunoprecipitation (ChIP) followed by high-throughput sequencing (ChIP-seq). For ChIP, an antibody for the transcription factor of interest, or in this case a member of a transcription factor complex, is typically used to precipitate and purify DNA bound to the transcription factor (complex).
These small DNA fragments may then be identified using a variety of methods, including polymerase chain reaction (PCR), microarrays, serial analysis of gene expression, or high throughput DNA sequencing [165]. For the research within this thesis, high-throughput sequencing was used to identify the genomic regions bound by the transcription factor complex containing the protein of interest. This was only possible due to the recent development of next-generation massively parallel sequencing systems, which have revolutionized the sequencing field. In the case of ChIP-seq, this technology has not only decreased the cost and facilitated replicate experiments, but has also allowed for greater sensitivity and resolution in the identification of genome-wide binding sites than ever before [166].

Our knowledge of transcription factors and their binding sites in humans is largely incomplete [165]. There are databases available, however, that compile information on transcription factors, including a public database of transcription factors, their experimentally-proven binding sites, and regulated genes (TRANSFAC) [167]. There are a variety of tools built to interface with TRANSFAC that enable users to easily perform a variety of different types of queries. One such tool, called MATCH, can be used to identify transcription factor binding sites (TFBS) within a set of DNA sequences. For the research in this thesis, I used MATCH and information within the TRANSFAC database to analyze the ChIP-seq results and identify known TFBS found within the ChIP-seq sequences. To determine the genes likely to be affected by the binding of transcription factor complexes to genomic sites, genes most proximal to the protein-DNA binding sites were identified. Available data on gene expression pattern changes that result from mutations or changes in abundance of a candidate transcription factor protein can be used to increase confidence in specific protein-DNA binding events.
Methods available to test expression changes include quantitative real time polymerase chain reaction (qPCR) and differential expression by microarray analysis. qPCR uses short oligonucleotide primers that span exon boundaries of genes whose expression pattern is being tested and quantitatively compares the rate of amplification between the reverse transcribed mRNA of different sample conditions for the protein of interest. Differential expression analysis, in which quantitative measurements of the expression of thousands of genes are evaluated simultaneously using microarray technology, can be used to assess differences in gene expression between different sample conditions for the candidate transcription factor protein of interest.

1.7.3 Gene function

Determining the function of a susceptibility gene can contribute to our understanding of disease etiology and pathology and therefore is of great interest. As discussed above, proteins can have multiple functions, and our knowledge of protein function is largely incomplete. However, most proteins perform functions by interacting with other proteins and interacting proteins are more likely to share a common function than non-interacting proteins [168]. Therefore, we can increase our understanding of the function of a protein of interest by identifying its interacting partners.

We have seen an explosion in the accumulation of biological data in the past few decades. The gene ontology (GO) has emerged as a core repository for the functional annotation of gene products. GO provides structured datasets describing gene product attributes in three non-overlapping molecular biology domains (molecular function (MF), biological process (BP), and cellular component (CC)) [169]. Several tools have been developed that allow researchers to perform statistical analyses on GO data.
One set of tools that was used in this thesis was the Database for Annotation, Visualization and Integrated Discovery (DAVID) bioinformatics resources, which provides a gene functional classification tool allowing statistical functional analyses of large gene lists [168].

1.7.4 Visualization of biological information using networks

Traditionally, researchers have operated at the individual gene or protein level. However, with the explosion of biological information and the development of high-throughput methodologies that allow interrogation of whole genome attributes within individual experiments (e.g. genome-wide association studies, microarray technology), there is a need to be able to visualize results and relationships in a way that encompasses as much of the information as possible. Visualizing protein-protein and protein-DNA interactions as networks provides context for researchers to analyze how genes and gene products relate to each other. A network is typically represented as a set of points or nodes that are connected by lines or edges and can comprise one or more types of relationships. In the case of protein-protein and protein-DNA interactions, networks are displayed such that the nodes are specific genes or their gene products and the edges are their physical interactions. Networks pervade all aspects of human health [20], and have shown us that most gene products are highly interconnected with other gene products. The greater the number of functional links associated with a gene containing a defect, the higher the probability of impact on a larger portion of the network. In this way, a defect in one gene can affect the activity of genes that otherwise carry no defects, but instead have a functional relationship with the defective gene [20].
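The node and edge representation described above can be sketched in a few lines of Python. This minimal example stores an undirected interaction network as a dictionary of neighbor sets, counts the links of a node, and lists the nodes that lie within a given number of interaction steps of a node carrying a defect. The node labels P1 to P6 and the function names are hypothetical placeholders and do not represent interactions reported in this thesis.

```python
from collections import defaultdict, deque


def build_network(edges):
    """Build an undirected network as {node: set of neighboring nodes}."""
    network = defaultdict(set)
    for a, b in edges:
        network[a].add(b)
        network[b].add(a)
    return dict(network)


def degree(network, node):
    """Number of direct interaction partners of a node."""
    return len(network.get(node, ()))


def reachable_within(network, start, max_steps):
    """Nodes reachable from start in at most max_steps edges (start excluded)."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == max_steps:
            continue  # do not expand beyond the requested number of steps
        for neighbor in network.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    seen.discard(start)
    return seen


# Hypothetical physical interactions between gene products.
network = build_network([("P1", "P2"), ("P1", "P3"), ("P3", "P4"), ("P5", "P6")])
print(degree(network, "P1"))
print(sorted(reachable_within(network, "P1", 2)))
```

In this toy network a defect in P1 could directly influence P2 and P3 and indirectly influence P4, while P5 and P6 form a separate component and are not connected to it.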
To better understand the various disease mechanisms involved in complex disease in particular, it is not sufficient to know the precise list of disease genes; rather, we require a detailed understanding of the pathways that are influenced by the genes [20]. By extension, identifying potential disease mechanisms, or subsets of disease genes in the context of a network, can highlight molecular pathways for potential disease involvement.

1.8 Statistical Analyses

The explosion of biological data and the use of large scale analyses necessitate the use of computational and statistical techniques to determine result significance. Various computational re-sampling methods have emerged as common bioinformatic tools for the estimation of result significance, including random permutation analysis. For the purposes of the work done in this thesis, statistical significance was determined through random permutation-based analyses. Random permutation analysis is a statistical method used to determine how likely it is that the result of the original analysis arose by random chance. To assess this, the raw inputs that produced the result are randomized and the analysis is re-run multiple times. The more randomized inputs that produce a result similar to the original result, the less significant the original result. Another statistical test, the Chi-squared test for independence, was also used in Chapters 2 and 3 to determine if the ratio of genes in schizophrenia linkage regions whose gene products were found to interact with my protein of interest was significant when compared to the background of all linkage regions identified in the genome.

When multiple hypotheses are being tested at the same time, for example in GWAS, where each SNP is being tested for association with a specific trait or disease, there is a need to adjust the threshold for significance as there is a greater chance of false discovery than in individual tests.
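The random permutation analysis described earlier in this section can be sketched as follows. This minimal example assumes a simple statistic, the absolute difference in means between two groups, and estimates significance by shuffling group membership and counting how often a shuffled data set gives a statistic at least as large as the observed one. The function name, the statistic, the data, and the add-one adjustment to the empirical p-value are assumptions made for illustration and do not describe the specific permutation schemes used in later chapters.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=10000, seed=0):
    """Empirical p-value for the absolute difference in group means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_a = len(group_a)

    def statistic(values):
        first, second = values[:n_a], values[n_a:]
        return abs(sum(first) / len(first) - sum(second) / len(second))

    pooled = list(group_a) + list(group_b)
    observed = statistic(pooled)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomize which values belong to which group
        if statistic(pooled) >= observed:
            at_least_as_extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical measurements for two groups.
p = permutation_p_value([10, 11, 12, 13, 14, 15], [1, 2, 3, 4, 5, 6], n_permutations=2000)
print(f"empirical p-value: {p:.4f}")
```

The more often the shuffled data reproduce a difference as large as the observed one, the larger the returned value and the weaker the evidence that the original result is not due to chance.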
Traditionally, a Bonferroni correction would be used to determine a threshold of significance that accounts for multiple tests [170, 171]. The Bonferroni correction attempts to maintain the family-wise error rate by testing each hypothesis at a statistical significance level of [1 / the number of tests] times what it would be if only one hypothesis were tested. However, in cases where thousands or more tests are being performed, the Bonferroni correction is so highly stringent that in some cases there is no possibility of finding significance for truly positive results. A variety of other methods have been developed that are much less stringent while still adjusting for multiple tests and are therefore more amenable to situations with large numbers of tests. These multiple test correction methods include those which estimate false discovery rate (FDR), for example using the Benjamini and Hochberg method [172]. FDR methods are less conservative and control the expected proportion of incorrectly rejected null hypotheses (type I errors, false positives). The Benjamini and Hochberg method is a sequential p-value procedure in which the p-values of all tests are ranked from smallest to largest; the largest p-value remains as it is, and each smaller p-value is multiplied by the total number of tests divided by its rank. Within this thesis, where multiple testing was an issue in determining statistical significance, the Benjamini and Hochberg method was used to adjust for multiple tests [172].

1.9 Thesis Chapters Summary

The objective of this thesis was to identify novel potential schizophrenia candidate genes using non-traditional methods of disease gene investigation. Two main approaches were taken.
The first approach is based on the idea that the combined effect of multiple genetic perturbations within specific functional pathways may result in the disease phenotype, or that the underlying genetic susceptibility for disease may stem from mutations in one or more members of specific functional pathways. Therefore, the gene encoding any protein that interacts with a promising schizophrenia susceptibility gene product should be considered a potential schizophrenia candidate gene.

The second approach takes advantage of whole genome linkage analysis, a method that directly investigates the relationship between disease and genetics across the entire genome in an unbiased way. Information from multiple whole genome linkage analyses can be combined to estimate schizophrenia linkage signals across the genome. To overcome the lack of replication between linkage studies and to refine existing linkage regions by identifying linkage signals that are consistent across multiple studies, a new method of whole genome linkage meta-analysis was developed.

Chapter 2 focuses on an investigation of protein-protein interactions for the schizophrenia susceptibility gene DTNBP1. A set of 83 high confidence potential interacting proteins was identified for the DTNBP1 protein using IP-MS/MS. A subset of these interacting partners was chosen for verification using IP-western analysis. Novel interactions with members of the dynactin and exocyst complexes were identified and verified in this way. These interactions are of particular interest as both dynactin and exocyst complex members were previously identified as interacting partners of the schizophrenia susceptibility gene DISC1 through a Y2H investigation [173]. To further expand the protein interaction network, IP-MS/MS analyses of members of the dynactin and exocyst complexes were performed.
To determine whether the resulting DTNBP1, dynactin complex, and exocyst complex protein interaction sets were overrepresented with proteins linked to schizophrenia, a statistical analysis was performed based on linkage data found in the psychiatric genetics evidence project linkage database. A gene ontology analysis was also performed on the resulting protein interaction sets to identify potential functional themes. The resulting DTNBP1, dynactin complex, and exocyst complex interaction network links two of the best schizophrenia susceptibility genes currently identified, DTNBP1 and DISC1, encompasses many aspects of the vesicle life cycle, and involves many cytoskeletal proteins.

Chapter 3 focuses on the intracellular domain (ICD) of the schizophrenia susceptibility gene NRG1 to identify its protein-protein and protein-DNA interactions. A set of 22 novel high confidence potential interacting proteins was identified for NRG1 ICD using IP-MS/MS. A subset of these interactions was chosen for verification using IP-western analyses, including utrophin (UTRN), a known interacting partner of DTNBP1. To determine whether the resulting protein interaction sets were overrepresented with proteins found in genomic regions known to be linked to schizophrenia, a statistical analysis was performed based on linkage data found in the psychiatric genetics evidence project linkage database. A gene ontology analysis was also performed on the resulting protein interaction sets to identify potential functional themes. The resulting protein interaction network links two of the best schizophrenia susceptibility genes currently identified, NRG1 and DTNBP1, and includes cytoskeletal and transport related proteins as well as many nuclear and transcription factor proteins, supporting a role for NRG1 ICD in the nucleus and its involvement in transcription factor complexes.
There is evidence that the NRG1 ICD is translocated to the nucleus and affects transcription [109, 110, 174]. To investigate NRG1 as a potential transcription factor, a ChIP-seq analysis was performed, producing a set of 5,674 potential genomic binding sites. The ChIP-seq analysis identified genomic binding sites proximal to genes whose expression patterns are known to be influenced by NRG1 ICD nuclear localization, including DLG4 and ERBB2 [110]. These 5,674 regions were investigated using MATCH to identify transcription factors with known transcription factor binding sites (TFBS) within the ChIP-seq sequences. The MATCH analysis identified the IK1 binding site as one of 164 TFBS overrepresented in the ChIP-seq peak sequences. The IK1 TFBS has previously been shown to be a binding site for a NRG1 ICD / IKZF4 transcription factor complex [110]. The genes proximal to these binding sites were investigated to determine: a) if there was existing literature evidence of NRG1 impacting their expression, and b) if there was an overrepresentation of genes with a known association to schizophrenia based on the annotations contained within GAD. The implication of these results is that, once cleaved, the NRG1 ICD translocates to the nucleus where it binds with transcription factors to form transcription factor complexes that effect transcriptional change.

Chapter 4 describes a novel whole genome linkage meta-analysis method called Marker Footprint Linkage Meta-analysis (MFLM) that allows multiple linkage studies to be combined in a way that incorporates the use of recombination rates determined by the HapMap project. Linkage meta-analysis methods have many problems that must be overcome. First and foremost, it is difficult to gain access to linkage data from existing studies, including published datasets.
For most linkage studies it is not possible to procure detailed results that include complete family trees along with all individual marker results for each individual such that the raw data can be pooled to perform a meta-analysis. In my experience, even requests for summary statistics of marker ID and their associated linkage scores could not be met. Secondly, there are many different types of linkage scores resulting from different studies, including p-values, logarithm of odds (LOD) scores, and non-parametric linkage (NPL) scores, which complicates the comparison of scores between studies. Thirdly, different linkage studies do not always use identical marker sets and therefore the meta-analysis method must have a way of determining how to combine signals from markers in similar regions but different studies that do not overlap. Finally, issues inherent in complex disease are often difficult to address, including underlying genetic heterogeneity and multiple genes of low penetrance. The MFLM method attempts to overcome some of these issues by: a) only requiring the summary statistics of marker ID and linkage score, b) converting the linkage score into a genomic probability score based on the relationship between the different types of linkage scores and their significance threshold as described in Schulze and McMahon [175], and c) using the recombination rates determined by the HapMap project to determine how to combine non-overlapping markers between studies. Summary statistics from eight schizophrenia whole genome linkage studies were procured and combined using the MFLM method. The results identified 24 nominally significant regions including one novel region, 9q31.1.

1.10 References

1. Iyengar, S.K., The quest for genes causing complex traits in ocular medicine: successes, interpretations, and challenges. Arch Ophthalmol, 2007. 125(1): p. 11-8. 2. Plomin, R., M.J. Owen, and P. McGuffin, The genetic basis of complex human behaviors. Science, 1994.
264(5166): p. 1733-9. 3. Owen, M.J., N. Craddock, and M.C. O'Donovan, Schizophrenia: genes at last? Trends Genet, 2005. 21(9): p. 518-25. 4. Fenn, J.B., et al., Electrospray ionization for mass spectrometry of large biomolecules. Science, 1989. 246(4926): p. 64-71. 5. Tanaka, K., et al., Protein and polymer analyses up to m/z 100 000 by laser ionization time-of- flight mass spectrometry. Rapid Communications in Mass Spectrometry, 2005. 2(8): p. 151-153. 6. Craddock, N., M.C. O'Donovan, and M.J. Owen, Phenotypic and genetic complexity of psychosis. Invited commentary on ... Schizophrenia: a common disease caused by multiple rare alleles. Br J Psychiatry, 2007. 190: p. 200-3. 7. Lander, E.S. and N.J. Schork, Genetic dissection of complex traits. Science, 1994. 265(5181): p. 2037-48. 8. Barnetche, T., P.A. Gourraud, and A. Cambon-Thomsen, Strategies in analysis of the genetic component of multifactorial diseases; biostatistical aspects. Transpl Immunol, 2005. 14(3-4): p. 255-66. 9. Schork, N.J., Genetics of complex disease: approaches, problems, and solutions. Am J Respir Crit Care Med, 1997. 156(4 Pt 2): p. S103-9. 10. McClellan, J.M., E. Susser, and M.C. King, Schizophrenia: a common disease caused by multiple rare alleles. Br J Psychiatry, 2007. 190: p. 194-9. 11. Lander, E. and L. Kruglyak, Genetic dissection of complex traits: guidelines for interpreting and reporting linkage results. Nat Genet, 1995. 11(3): p. 241-7. 12. Gulko, P.S., Contribution of genetic studies in rodent models of autoimmune arthritis to understanding and treatment of rheumatoid arthritis. Genes Immun, 2007. 13. Botstein, D. and N. Risch, Discovering genotypes underlying human phenotypes: past successes for mendelian disease, future approaches for complex disease. Nat Genet, 2003. 33 Suppl: p. 228-37. 14. Goldstein, D.B., G.L. Cavalleri, and K.R. Ahmadi, The genetics of common diseases: 10 million times as hard. Cold Spring Harb Symp Quant Biol, 2003. 68: p. 395-401. 15. Maier, W., A. 
Zobel, and M. Rietschel, Genetics of schizophrenia and affective disorders. Pharmacopsychiatry, 2003. 36 Suppl 3: p. S195-202. 16. Cordell, H.J., Epistasis: what it means, what it doesn't mean, and statistical methods to detect it in humans. Hum Mol Genet, 2002. 11(20): p. 2463-8. 17. Bomprezzi, R., P.E. Kovanen, and R. Martin, New approaches to investigating heterogeneity in complex traits. J Med Genet, 2003. 40(8): p. 553-9. 18. Slattery, M.L., et al., Associations among IRS1, IRS2, IGF1, and IGFBP3 genetic polymorphisms and colorectal cancer. Cancer Epidemiol Biomarkers Prev, 2004. 13(7): p. 1206-14. 19. Harrison, P.J. and M.J. Owen, Genes for schizophrenia? Recent findings and their pathophysiological implications. Lancet, 2003. 361(9355): p. 417-9. 20. Barabasi, A.L., Network medicine--from obesity to the "diseasome". N Engl J Med, 2007. 357(4): p. 404-7. 21. King, R.C. and W.D. Stansfield, A Dictionary of Genetics, in A Dictionary of Genetics. 1997, Oxford University Press, Inc: New York. p. 439. 22. Cardno, A.G., et al., Heritability estimates for psychotic disorders: the Maudsley twin psychosis series. Arch Gen Psychiatry, 1999. 56(2): p. 162-8. 23. Cardno, A.G. and Gottesman, II, Twin studies of schizophrenia: from bow-and-arrow concordances to star wars Mx and functional genomics. Am J Med Genet, 2000. 97(1): p. 12-7. 24. Sullivan, P.F., K.S. Kendler, and M.C. Neale, Schizophrenia as a complex trait: evidence from a meta-analysis of twin studies. Arch Gen Psychiatry, 2003. 60(12): p. 1187-92. 25. Giegling, I., et al., Systems biology and complex neurobehavioral traits. Pharmacopsychiatry, 2008. 41 Suppl 1: p. S32-6. 26. McGuffin, P., et al., The heritability of bipolar affective disorder and the genetic relationship to unipolar depression. Arch Gen Psychiatry, 2003. 60(5): p. 497-502. 27. Kieseppa, T., et al., High concordance of bipolar I disorder in a nationwide sample of twins. Am J Psychiatry, 2004. 161(10): p. 1814-21. 28.
Breitner, J.C., et al., Alzheimer's disease in the National Academy of Sciences-National Research Council Registry of Aging Twin Veterans. III. Detection of cases, longitudinal results, and observations on twin concordance. Arch Neurol, 1995. 52(8): p. 763-71. 29. Raiha, I., et al., Alzheimer's disease in Finnish twins. Lancet, 1996. 347(9001): p. 573-8. 30. Bergem, A.L., K. Engedal, and E. Kringlen, The role of heredity in late-onset Alzheimer disease and vascular dementia. A twin study. Arch Gen Psychiatry, 1997. 54(3): p. 264-70. 31. Gatz, M., et al., Heritability for Alzheimer's disease: the study of dementia in Swedish twins. J Gerontol A Biol Sci Med Sci, 1997. 52(2): p. M117-25. 32. Gatz, M., et al., Role of genes and environments for explaining Alzheimer disease. Arch Gen Psychiatry, 2006. 63(2): p. 168-74. 33. Pinto, L.A., R.T. Stein, and M. Kabesch, Impact of genetics in childhood asthma. J Pediatr (Rio J), 2008. 84(4 Suppl): p. S68-75. 34. Ju, W., et al., An epidemiology and molecular genetic study on breast cancer susceptibility. Chin Med Sci J, 2000. 15(4): p. 231-7. 35. Czene, K., P. Lichtenstein, and K. Hemminki, Environmental and heritable causes of cancer among 9.6 million individuals in the Swedish Family-Cancer Database. Int J Cancer, 2002. 99(2): p. 260-6. 36. Locatelli, I., et al., A correlated frailty model with long-term survivors for estimating the heritability of breast cancer. Stat Med, 2007. 26(20): p. 3722-34. 37. Harrison, P.J., The neuropathology of schizophrenia. A critical review of the data and their interpretation. Brain, 1999. 122 ( Pt 4): p. 593-624. 38. American Psychiatric Association., Diagnostic criteria from DSM-IV-TR. 2000, Washington, D.C.: American Psychiatric Association. xii, 370 p. 39. Stefansson, H., et al., Neuregulin 1 and schizophrenia. Ann Med, 2004. 36(1): p. 62-71. 40. Owen, M.J., M.C. O'Donovan, and P.J. Harrison, Schizophrenia: a genetic disorder of the synapse? BMJ, 2005. 330(7484): p. 158-9. 41.
Weinberger, D.R., et al., Prefrontal neurons and the genetics of schizophrenia. Biol Psychiatry, 2001. 50(11): p. 825-44. 42. Stephan, K.E., K.J. Friston, and C.D. Frith, Dysconnection in schizophrenia: from abnormal synaptic plasticity to failures of self-monitoring. Schizophr Bull, 2009. 35(3): p. 509-27. 43. Drachman, D.A., Do we have brain to spare? Neurology, 2005. 64(12): p. 2004-5. 44. Laruelle, M., L.S. Kegeles, and A. Abi-Dargham, Glutamate, dopamine, and schizophrenia: from pathophysiology to treatment. Ann N Y Acad Sci, 2003. 1003: p. 138-58. 45. Svenningsson, P., et al., Diverse psychotomimetics act through a common signaling pathway. Science, 2003. 302(5649): p. 1412-5. 46. Gaspar, P.A., et al., Molecular mechanisms underlying glutamatergic dysfunction in schizophrenia: therapeutic implications.  J. Neurochem, 2009. 111(4): p. 891-900. 47. Tsai, G. and J.T. Coyle, Glutamatergic mechanisms in schizophrenia. Annu Rev Pharmacol Toxicol, 2002. 42: p. 165-79. 48. Carter, C.J., Schizophrenia susceptibility genes converge on interlinked pathways related to glutamatergic transmission and long-term potentiation, oxidative stress and oligodendrocyte viability. Schizophr Res, 2006. 86(1-3): p. 1-14. 49. Boks, M.P., et al., Reviewing the role of the genes G72 and DAAO in glutamate neurotransmission in schizophrenia. Eur Neurophychopharmacol, 2007. 17(9): p. 567-72. 50. Harrison, P.J. and D.R. Weinberger, Schizophrenia genes, gene expression, and neuropathology: on the matter of their convergence. Mol Psychiatry, 2005. 10(1): p. 40-68; image 5. 51. Sklar, P., Linkage analysis in psychiatric disorders: the emerging picture. Annu Rev Genomics Hum Genet, 2002. 3: p. 371-413. 52. Owen, M.J., H.J. Williams, and M.C. O'Donovan, Schizophrenia genetics: advancing on two fronts. Curr Opin Genet Dev, 2009. 19(3): p. 266-70. 53. Austin, J., Schizophrenia: An update and review. Genetic Counseling, 2005. 14(5): p. 329-340. 54. Tsuang, M., Schizophrenia: genes and environment. 
Biol Psychiatry, 2000. 47(3): p. 210-20. 55. Corfas, G., K. Roy, and J.D. Buxbaum, Neuregulin 1-erbB signaling and the molecular/cellular basis of schizophrenia. Nat Neurosci, 2004. 7(6): p. 575-80. 56. Cloninger, C.R., The discovery of susceptibility genes for mental disorders. Proc Natl Acad Sci U S A, 2002. 99(21): p. 13365-7. 57. Antonarakis, S.E., et al., Schizophrenia susceptibility and chromosome 6p24-22. Nat Genet, 1995. 11(3): p. 235-6. 58. Moises, H.W., et al., An international two-stage genome-wide search for schizophrenia susceptibility genes. Nat Genet, 1995. 11(3): p. 321-4. 59. Straub, R.E., et al., A potential vulnerability locus for schizophrenia on chromosome 6p24-22: evidence for genetic heterogeneity. Nat Genet, 1995. 11(3): p. 287-93. 60. Turecki, G., et al., Schizophrenia and chromosome 6p. Am J Med Genet, 1997. 74(2): p. 195-8. 61. Schwab, S.G., et al., A genome-wide autosomal screen for schizophrenia susceptibility loci in 71 families with affected siblings: support for loci on chromosome 10p and 6. Mol Psychiatry, 2000. 5(6): p. 638-49. 62. Straub, R.E., et al., Genome-wide scans of three independent sets of 90 Irish multiplex schizophrenia families and follow-up of selected regions in all families provides evidence for multiple susceptibility genes. Mol Psychiatry, 2002. 7(6): p. 542-59. 63. Hallmayer, J.F., et al., Genetic evidence for a distinct subtype of schizophrenia characterized by pervasive cognitive deficit. Am J Hum Genet, 2005. 77(3): p. 468-76. 64. Maziade, M., et al., Shared and specific susceptibility loci for schizophrenia and bipolar disorder: a dense genome scan in Eastern Quebec families. Mol Psychiatry, 2005. 10(5): p. 486-99. 65. Straub, R.E., et al., Genetic variation in the 6p22.3 gene DTNBP1, the human ortholog of the mouse dysbindin gene, is associated with schizophrenia. Am J Hum Genet, 2002. 71(2): p. 337-48. 66.
Schwab, S.G., et al., Support for association of schizophrenia with genetic variation in the 6p22.3 gene, dysbindin, in sib-pair families with linkage and in an additional sample of triad families. Am J Hum Genet, 2003. 72(1): p. 185-90. 67. Tang, J.X., et al., Family-based association study of DTNBP1 in 6p22.3 and schizophrenia. Mol Psychiatry, 2003. 8(8): p. 717-8. 68. Van Den Bogaert, A., et al., The DTNBP1 (dysbindin) gene contributes to schizophrenia, depending on family history of the disease. Am J Hum Genet, 2003. 73(6): p. 1438-43. 69. van den Oord, E.J., et al., Identification of a high-risk haplotype for the dystrobrevin binding protein 1 (DTNBP1) gene in the Irish study of high-density schizophrenia families. Mol Psychiatry, 2003. 8(5): p. 499-510. 70. Funke, B., et al., Association of the DTNBP1 locus with schizophrenia in a U.S. population. Am J Hum Genet, 2004. 75(5): p. 891-8. 71. Kirov, G., et al., Strong evidence for association between the dystrobrevin binding protein 1 gene (DTNBP1) and schizophrenia in 488 parent-offspring trios from Bulgaria. Biol Psychiatry, 2004. 55(10): p. 971-5. 72. Numakawa, T., et al., Evidence of novel neuronal functions of dysbindin, a susceptibility gene for schizophrenia. Hum Mol Genet, 2004. 13(21): p. 2699-708. 73. Williams, N.M., et al., Identification in 2 independent samples of a novel schizophrenia risk haplotype of the dystrobrevin binding protein gene (DTNBP1). Arch Gen Psychiatry, 2004. 61(4): p. 336-44. 74. Li, T., et al., Identifying potential risk haplotypes for schizophrenia at the DTNBP1 locus in Han Chinese and Scottish populations. Mol Psychiatry, 2005. 10(11): p. 1037-44. 75. Chagnon, Y.C., et al., Differential RNA expression between schizophrenic patients and controls of the dystrobrevin binding protein 1 and neuregulin 1 genes in immortalized lymphocytes. Schizophr Res, 2008. 100(1-3): p. 281-90. 76.
Chen, X.W., et al., DTNBP1, a schizophrenia susceptibility gene, affects kinetics of transmitter release. J Cell Biol, 2008. 181(5): p. 791-801. 77. Frankle, W.G., J. Lerma, and M. Laruelle, The synaptic hypothesis of schizophrenia. Neuron, 2003. 39(2): p. 205-16. 78. Talbot, K., et al., Dysbindin-1 is reduced in intrinsic, glutamatergic terminals of the hippocampal formation in schizophrenia. J Clin Invest, 2004. 113(9): p. 1353-63. 79. Weickert, C.S., et al., Reduced DTNBP1 (dysbindin-1) mRNA in the hippocampal formation of schizophrenia patients. Schizophr Res, 2008. 98(1-3): p. 105-10. 80. Weickert, C.S., et al., Human dysbindin (DTNBP1) gene expression in normal brain and in schizophrenic prefrontal cortex and midbrain. Arch Gen Psychiatry, 2004. 61(6): p. 544-55. 81. Meyer, D. and C. Birchmeier, Multiple essential functions of neuregulin in development. Nature, 1995. 378(6555): p. 386-90. 82. Liu, X., et al., Domain-specific gene disruption reveals critical regulation of neuregulin signaling by its cytoplasmic tail. Proc Natl Acad Sci U S A, 1998. 95(22): p. 13024-9. 83. Rio, C., et al., Neuregulin and erbB receptors play a critical role in neuronal migration. Neuron, 1997. 19(1): p. 39-50. 84. Vaskovsky, A., et al., ErbB-4 activation promotes neurite outgrowth in PC12 cells. J Neurochem, 2000. 74(3): p. 979-87. 85. Huang, Y.Z., et al., Regulation of neuregulin signaling by PSD-95 interacting with ErbB4 at CNS synapses. Neuron, 2000. 26(2): p. 443-55. 86. Li, B., et al., The neuregulin-1 receptor erbB4 controls glutamatergic synapse maturation and plasticity. Neuron, 2007. 54(4): p. 583-97. 87. Garratt, A.N., S. Britsch, and C. Birchmeier, Neuregulin, a factor with many functions in the life of a schwann cell. Bioessays, 2000. 22(11): p. 987-96. 88. Badner, J.A. and E.S. Gershon, Meta-analysis of whole-genome linkage scans of bipolar disorder and schizophrenia. Mol Psychiatry, 2002. 7(4): p. 405-11. 89. 
Bakker, S.C., et al., Neuregulin 1: genetic support for schizophrenia subtypes. Mol Psychiatry, 2004. 9(12): p. 1061-3. 90. Blouin, J.L., et al., Schizophrenia susceptibility loci on chromosomes 13q32 and 8p21. Nat Genet, 1998. 20(1): p. 70-3. 91. Corvin, A.P., et al., Confirmation and refinement of an 'at-risk' haplotype for schizophrenia suggests the EST cluster, Hs.97362, as a potential susceptibility gene at the neuregulin-1 locus. Mol Psychiatry, 2004. 9(2): p. 208-13. 92. Kendler, K.S., et al., Evidence for a schizophrenia vulnerability locus on chromosome 8p in the Irish Study of High-Density Schizophrenia Families. Am J Psychiatry, 1996. 153(12): p. 1534-40. 93. Li, T., et al., Identification of a novel neuregulin 1 at-risk haplotype in Han schizophrenia Chinese patients, but no association with the Icelandic/Scottish risk haplotype. Mol Psychiatry, 2004. 9(7): p. 698-704. 94. Petryshen, T.L., et al., Support for involvement of neuregulin 1 in schizophrenia pathophysiology. Mol Psychiatry, 2005. 10(4): p. 366-74, 328. 95. Pulver, A.E., et al., Schizophrenia: a genome scan targets chromosomes 3p and 8p as potential sites of susceptibility genes. Am J Med Genet, 1995. 60(3): p. 252-60. 96. Pulver, A.E., et al., Genetic heterogeneity in schizophrenia: stratification of genome scan data using co-segregating related phenotypes. Mol Psychiatry, 2000. 5(6): p. 650-3. 97. Stefansson, H., et al., Association of neuregulin 1 with schizophrenia confirmed in a Scottish population. Am J Hum Genet, 2003. 72(1): p. 83-7. 98. Tang, J.X., et al., Polymorphisms within 5' end of the neuregulin 1 gene are genetically associated with schizophrenia in the Chinese population. Mol Psychiatry, 2004. 9(1): p. 11-2. 99. Williams, N.M., et al., Support for genetic variation in neuregulin 1 and susceptibility to schizophrenia. Mol Psychiatry, 2003. 8(5): p. 485-7. 100. Yang, J.Z., et al., Association study of neuregulin 1 gene with schizophrenia. Mol Psychiatry, 2003. 8(7): p.
706-9. 101. Zhao, X., et al., A case control and family based association study of the neuregulin1 gene and schizophrenia. J Med Genet, 2004. 41(1): p. 31-4. 102. Norton, N., H.J. Williams, and M.J. Owen, An update on the genetics of schizophrenia. Curr Opin Psychiatry, 2006. 19(2): p. 158-64. 103. Wang, X.D., et al., Chronic antipsychotic drug administration alters the expression of neuregulin 1beta, ErbB2, ErbB3, and ErbB4 in the rat prefrontal cortex and hippocampus. Int J Neuropsychopharmacol, 2008. 11(4): p. 553-61. 104. Woo, R.S., et al., Neuregulin-1 enhances depolarization-induced GABA release. Neuron, 2007. 54(4): p. 599-610. 105. Segal, D., et al., Oligodendrocyte pathophysiology: a new view of schizophrenia. Int J Neuropsychopharmacol, 2007. 10(4): p. 503-11. 106. Norton, N., et al., Evidence that interaction between neuregulin 1 and its receptor erbB4 increases susceptibility to schizophrenia. Am J Med Genet B Neuropsychiatr Genet, 2006. 141B(1): p. 96-101. 107. Silberberg, G., et al., The involvement of ErbB4 with schizophrenia: association and expression studies. Am J Med Genet B Neuropsychiatr Genet, 2006. 141B(2): p. 142-8. 108. Nicodemus, K.K., et al., Further evidence for association between ErbB4 and schizophrenia and influence on cognitive intermediate phenotypes in healthy controls. Mol Psychiatry, 2006. 11(12): p. 1062-5. 109. Bao, J., et al., Back signaling by the Nrg-1 intracellular domain. J Cell Biol, 2003. 161(6): p. 1133-41. 110. Bao, J., et al., Activity-dependent transcription regulation of PSD-95 by neuregulin-1 and Eos. Nat Neurosci, 2004. 7(11): p. 1250-8. 111. St Clair, D., et al., Association within a family of a balanced autosomal translocation with major mental illness. Lancet, 1990. 336(8706): p. 13-6. 112. Blackwood, D.H., et al., Schizophrenia and affective disorders--cosegregation with a translocation at chromosome 1q42 that directly disrupts brain-expressed genes: clinical and P300 findings in a family.
Am J Hum Genet, 2001. 69(2): p. 428-33. 113. Hodgkinson, C.A., et al., Disrupted in schizophrenia 1 (DISC1): association with schizophrenia, schizoaffective disorder, and bipolar disorder. Am J Hum Genet, 2004. 75(5): p. 862-72. 114. Thomson, P.A., et al., Association between the TRAX/DISC locus and both bipolar disorder and schizophrenia in the Scottish population. Mol Psychiatry, 2005. 10(7): p. 657-68, 616. 115. Callicott, J.H., et al., Variation in DISC1 affects hippocampal structure and function and increases risk for schizophrenia. Proc Natl Acad Sci U S A, 2005. 102(24): p. 8627-32. 116. Cannon, T.D., et al., Association of DISC1/TRAX haplotypes with schizophrenia, reduced prefrontal gray matter, and impaired short- and long-term memory. Arch Gen Psychiatry, 2005. 62(11): p. 1205-13. 117. Hennah, W., et al., DISC1 association, heterogeneity and interplay in schizophrenia and bipolar disorder. Mol Psychiatry, 2008. 118. Zhang, F., et al., Genetic association between schizophrenia and the DISC1 gene in the Scottish population. Am J Med Genet B Neuropsychiatr Genet, 2006. 141B(2): p. 155-9. 119. Yamada, K., et al., Association analysis of FEZ1 variants with schizophrenia in Japanese cohorts. Biol Psychiatry, 2004. 56(9): p. 683-90. 120. Pickard, B.S., et al., The PDE4B gene confers sex-specific protection against schizophrenia. Psychiatr Genet, 2007. 17(3): p. 129-33. 121. Matsuzaki, S. and M. Tohyama, Molecular mechanism of schizophrenia with reference to disrupted-in-schizophrenia 1 (DISC1). Neurochem Int, 2007. 51(2-4): p. 165-72. 122. Crow, T.J., How and why genetic linkage has not solved the problem of psychosis: review and hypothesis. Am J Psychiatry, 2007. 164(1): p. 13-21. 123. Yue, P., E. Melamud, and J. Moult, SNPs3D: candidate gene and SNP selection for association studies. BMC Bioinformatics, 2006. 7: p. 166. 124. Stenson, P.D., et al., Human Gene Mutation Database (HGMD): 2003 update. Hum Mutat, 2003. 21(6): p. 577-81. 125. Elston, R.C. and M. 
Anne Spence, Advances in statistical human genetics over the last 25 years. Stat Med, 2006. 25(18): p. 3049-80. 126. Manolio, T.A., et al., Finding the missing heritability of complex diseases. Nature, 2009. 461(7265): p. 747-53. 127. Tosato, S., P. Dazzan, and D. Collier, Association between the neuregulin 1 gene and schizophrenia: a systematic review. Schizophr Bull, 2005. 31(3): p. 613-7. 128. Morris, D.W., et al., Dysbindin (DTNBP1) and the biogenesis of lysosome-related organelles complex 1 (BLOC-1): main and epistatic gene effects are potential contributors to schizophrenia susceptibility. Biol Psychiatry, 2008. 63(1): p. 24-31. 129. Wood, L.S., E.H. Pickering, and B.M. Dechairo, Significant support for DAO as a schizophrenia susceptibility locus: examination of five genes putatively associated with schizophrenia. Biol Psychiatry, 2007. 61(10): p. 1195-9. 130. Nicodemus, K.K., et al., Evidence for statistical epistasis between catechol-O-methyltransferase (COMT) and polymorphisms in RGS4, G72 (DAOA), GRM3, and DISC1: influence on risk of schizophrenia. Hum Genet, 2007. 120(6): p. 889-906. 131. Song, W., et al., Identification of high risk DISC1 structural variants with a 2% attributable risk for schizophrenia. Biochem Biophys Res Commun, 2008. 367(3): p. 700-6. 132. Schaid, D.J., Transmission disequilibrium, family controls, and great expectations. Am J Hum Genet, 1998. 63(4): p. 935-41. 133. Sebat, J., D.L. Levy, and S.E. McCarthy, Rare structural variants in schizophrenia: one disorder, multiple mutations; one mutation, multiple disorders. Trends Genet, 2009. 25(12): p. 528-35. 134. Wellcome Trust Case Control Consortium., Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature, 2007. 447(7145): p. 661-78. 135. Chumakov, I., et al., Genetic and physiological data implicating the new human gene G72 and the gene for D-amino acid oxidase in schizophrenia. Proc Natl Acad Sci U S A, 2002. 99(21): p. 13675-80.
136. The International HapMap Project. Nature, 2003. 426(6968): p. 789-96. 137. Khalil, I.G. and C. Hill, Systems biology for cancer. Curr Opin Oncol, 2005. 17(1): p. 44-8. 138. Kanehisa, M. and S. Goto, KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res, 2000. 28(1): p. 27-30. 139. Peri, S., et al., Development of human protein reference database as an initial platform for approaching systems biology in humans. Genome Res, 2003. 13(10): p. 2363-71. 140. Potapov, A., et al., EndoNet: an information resource about endocrine networks. Nucleic Acids Res, 2006. 34(Database issue): p. D540-5. 141. Becker, K.G., et al., The genetic association database. Nat Genet, 2004. 36(5): p. 431-2. 142. Giot, L., et al., A protein interaction map of Drosophila melanogaster. Science, 2003. 302(5651): p. 1727-36. 143. Lee, I., et al., A probabilistic functional network of yeast genes. Science, 2004. 306(5701): p. 1555-8. 144. Tong, A.H., et al., Global mapping of the yeast genetic interaction network. Science, 2004. 303(5659): p. 808-13. 145. Li, S., et al., A map of the interactome network of the metazoan C. elegans. Science, 2004. 303(5657): p. 540-3. 146. Shannon, P., et al., Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res, 2003. 13(11): p. 2498-504. 147. Gstaiger, M. and R. Aebersold, Applying mass spectrometry-based proteomics to genetics, genomics and network biology. Nat Rev Genet, 2009. 10(9): p. 617-27. 148. Glatter, T., et al., An integrated workflow for charting the human interaction proteome: insights into the PP2A system. Mol Syst Biol, 2009. 5: p. 237. 149. Gingras, A.C., et al., A novel, evolutionarily conserved protein phosphatase complex involved in cisplatin sensitivity. Mol Cell Proteomics, 2005. 4(11): p. 1725-40. 150. 
Goudreault, M., et al., A PP2A phosphatase high density interaction network identifies a novel striatin-interacting phosphatase and kinase complex linked to the cerebral cavernous malformation 3 (CCM3) protein. Mol Cell Proteomics, 2009. 8(1): p. 157-71. 151. Abu-Farha, M., F. Elisma, and D. Figeys, Identification of protein-protein interactions by mass spectrometry coupled techniques. Adv Biochem Eng Biotechnol, 2008. 110: p. 67-80. 152. Bader, G.D., D. Betel, and C.W. Hogue, BIND: the Biomolecular Interaction Network Database. Nucleic Acids Res, 2003. 31(1): p. 248-50. 153. Xenarios, I., et al., DIP, the Database of Interacting Proteins: a research tool for studying cellular networks of protein interactions. Nucleic Acids Res, 2002. 30(1): p. 303-5. 154. Hermjakob, H., et al., IntAct: an open source molecular interaction database. Nucleic Acids Res, 2004. 32(Database issue): p. D452-5. 155. Zanzoni, A., et al., MINT: a Molecular INTeraction database. FEBS Lett, 2002. 513(1): p. 135-40. 156. Guldener, U., et al., MPact: the MIPS protein interaction resource on yeast. Nucleic Acids Res, 2006. 34(Database issue): p. D436-41. 157. Stark, C., et al., BioGRID: a general repository for interaction datasets. Nucleic Acids Res, 2006. 34(Database issue): p. D535-9. 158. Fields, S. and O. Song, A novel genetic system to detect protein-protein interactions. Nature, 1989. 340(6230): p. 245-6. 159. Gietz, R.D., et al., Identification of proteins that interact with a protein of interest: applications of the yeast two-hybrid system. Mol Cell Biochem, 1997. 172(1-2): p. 67-79. 160. Deane, C.M., et al., Protein interactions: two methods for assessment of the reliability of high throughput observations. Mol Cell Proteomics, 2002. 1(5): p. 349-56. 161. Masters, S.C., Co-immunoprecipitation from transfected cells. Methods Mol Biol, 2004. 261: p. 337-50. 162. Rigaut, G., et al., A generic protein purification method for protein complex characterization and proteome exploration.
Nat Biotechnol, 1999. 17(10): p. 1030-2. 163. Hirosawa, M., et al., MASCOT: multiple alignment system for protein sequences based on three-way dynamic programming. Comput Appl Biosci, 1993. 9(2): p. 161-7. 164. Craig, R. and R.C. Beavis, TANDEM: matching proteins with tandem mass spectra. Bioinformatics, 2004. 20(9): p. 1466-7. 165. Walhout, A.J., Unraveling transcription regulatory networks by protein-DNA and protein-protein interaction mapping. Genome Res, 2006. 16(12): p. 1445-54. 166. Barski, A. and K. Zhao, Genomic location analysis by ChIP-Seq. J Cell Biochem, 2009. 107(1): p. 11-8. 167. Wingender, E., et al., TRANSFAC: a database on transcription factors and their DNA binding sites. Nucleic Acids Res, 1996. 24(1): p. 238-41. 168. Huang da, W., et al., The DAVID Gene Functional Classification Tool: a novel biological module-centric algorithm to functionally analyze large gene lists. Genome Biol, 2007. 8(9): p. R183. 169. Harris, M.A., et al., The Gene Ontology (GO) database and informatics resource. Nucleic Acids Res, 2004. 32(Database issue): p. D258-61. 170. Bonferroni, C.E., Teoria statistica delle classi e calcolo delle probabilità. R. Ist. Super. Sci. Econ. Commer. Firenze, 1936. 8: p. 3-62. 171. Bland, J.M. and D.G. Altman, Multiple significance tests: the Bonferroni method. BMJ, 1995. 310(6973): p. 170. 172. Benjamini, Y. and Y. Hochberg, Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc [Ser B], 1995. 57(1): p. 289-300. 173. Camargo, L.M., et al., Disrupted in Schizophrenia 1 Interactome: evidence for the close connectivity of risk genes and a potential synaptic basis for schizophrenia. Mol Psychiatry, 2007. 12(1): p. 74-86. 174. Iivanainen, E., et al., Intra- and extracellular signaling by endothelial neuregulin-1. Exp Cell Res, 2007. 313(13): p. 2896-909. 175. Schulze, T.G. and F.J. McMahon, Genetic linkage and association studies in bipolar affective disorder: a time for optimism. 
Am J Med Genet C Semin Med Genet, 2003. 123(1): p. 36-47.

2 Cytosolic Protein Interactions of the Schizophrenia Susceptibility Gene Dysbindin¹

2.1 Introduction

Schizophrenia is a complex psychiatric disorder with a strong genetic component and unknown etiology [1, 2]. A number of genes have been identified as schizophrenia susceptibility genes, including dysbindin (DTNBP1), neuregulin-1 (NRG1), catechol-O-methyltransferase (COMT), and disrupted-in-schizophrenia-1 (DISC1), among others [3]. DTNBP1 is one of the most robust schizophrenia susceptibility genes identified to date, located within one of the most consistently replicated schizophrenia linkage regions (6p22.3) [4-11] and having one of the most replicated schizophrenia association findings [12-21]. The DTNBP1 knockout mouse (sandy) shows increased dopamine turnover in specific brain regions [22], and schizophrenia patients have been found to have decreased expression of DTNBP1 at presynaptic glutamatergic terminals in the hippocampus [23, 24] and dorsolateral prefrontal cortex [25]. Defects in neurosecretion and vesicular morphology in neuroendocrine cells and hippocampal synapses have been identified at the single vesicle level in sandy mice and implicate DTNBP1 in the regulation of exocytosis and vesicle biogenesis in endocrine cells and neurons [26]. DTNBP1 was initially discovered as a binding partner of the dystrobrevin proteins (DTNA, DTNB) [27]. Mutations in the DTNBP1 gene have been shown to cause Hermansky-Pudlak Syndrome (HPS) type 7, one of eight known human HPS types caused by defects in intracellular protein trafficking resulting from the dysfunction of lysosome-related organelles. 
DTNBP1 is a member of the biogenesis of lysosome-related organelles complex-1 (BLOC1) [28], which is known to interact with the biogenesis of lysosome-related organelles complex-2 (BLOC2) and adaptor-related protein complex-3 (AP3) [29], and functions in organelle biogenesis and the protein transport pathway [30]. BLOC1 subunits are implicated in synaptic mechanisms [31]; pallidin (PLDN) is involved in mediating vesicle docking and fusion [32] and disruption of soluble NSF attachment protein receptor (SNARE)-associated protein (SNAPIN) causes defective secretion of neurotransmitters in mice [33]. Recently, Hikita et al. (2009) identified syntaxin-binding protein 1 (STXBP1 or Munc18-1) as a DTNBP1 interacting protein in an analysis of DTNBP1 membrane interactions. STXBP1 has been implicated in the exocytosis of synaptic vesicles [34-36].

¹ A version of this chapter has been published: Carri-Lyn R. Mead, Michael A. Kuzyk, Annie Moradian, Gary M. Wilson, Robert A. Holt, Gregg B. Morin. Cytosolic Protein Interactions of the Schizophrenia Susceptibility Gene Dysbindin. Journal of Neurochemistry, 2010.

DISC1 was initially identified at the site of a balanced translocation, t(1;11)(q42.1;q14.3), that co-segregates with schizophrenia and other psychiatric disorders in a large Scottish pedigree [37, 38]. Several studies provide evidence of DISC1 association with schizophrenia [39-44]. DISC1 has been shown to interact with multiple proteins including nuclear distribution protein nudE-like 1 (NDEL1) [45, 46], platelet-activating factor acetylhydrolase IB subunit alpha (PAFAH1B1, also called LIS1) [47], cAMP-specific 3',5'-cyclic phosphodiesterase 4B (PDE4B) [48], growth factor receptor-bound protein 2 (GRB2) [49, 50], fasciculation and elongation protein zeta-1 (FEZ1) [51], DISC1-binding zinc-finger protein (DBZ) [52], and pericentrin (PCNT or kendrin) [53]. PDE4B and FEZ1 have also had positive schizophrenia association results [54, 55]. 
Through these interactions it is postulated that DISC1 is involved in brain development, including neuronal migration, neurite outgrowth and neural maturation, through interaction with several cytoskeletal proteins [56].

Defective synaptic transmission and neurotransmitter release are hypothesized to be pathogenic mechanisms in schizophrenia [3, 57]. Defects in presynaptic vesicle proteins have been associated with schizophrenia [58-60] and the schizophrenic brain has been shown to possess reduced levels of mRNA and/or proteins involved in synaptic vesicle fusion [61-64]. Our understanding of the vesicle life cycle involved in neurotransmitter release remains incomplete. For example, the disruption of DTNBP1 in sandy mice causes defects in neurosecretion and vesicle morphology in neuroendocrine cells and hippocampal synapses at the single vesicle level, including larger vesicle size, slower quantal vesicle release, lower release probability, and a smaller total population of the readily releasable vesicle pool [26]. However, the detailed mechanisms by which these DTNBP1-associated phenomena may contribute to neurotransmitter release or schizophrenia pathogenesis are unknown.

We reasoned that a better understanding of DTNBP1 interacting proteins would facilitate the investigation of DTNBP1 pathways and functions and potentially provide insight into schizophrenia etiology. Using proteomic techniques we have identified interactions of DTNBP1 with vesicle trafficking components. Intriguingly, these data also identify potential common links for two of the best-supported schizophrenia susceptibility genes, DTNBP1 and DISC1, to portions of the vesicle trafficking system. 
2.2 Materials and Methods

2.2.1 Cloning

cDNAs for two isoforms of DTNBP1, version 1 (NM_032122) and version 3 (NM_183041), and for exocyst component 4 (EXOC4), exocyst component 3 (EXOC3), AP3 subunit beta-1 (AP3B1), AP3 subunit beta-2 (AP3B2), dynactin subunit 2 (DCTN2), and alpha-centractin (ACTR1A, a member of the dynactin complex) were procured (IMAGE clones 4139934, 6183004, 5590332, 3914400, 7939584, and 3347881, respectively; American Type Culture Collection (ATCC), Manassas, VA, USA). The open reading frames were amplified (see Table 2.11 for PCR primer sets) and ligated into the V954 donor vector of the Creator Splice system [65]. The cDNA-containing cassettes were transferred into mammalian expression acceptor vectors for fusion to both N- and C-terminal 3xFLAG tags (V180 and V181) by Cre-lox recombination. DTNBP1 version 1 was also transferred to GFP-fusion vectors (V4 and V6). Completed constructs were sequence verified prior to use.

2.2.2 Antibodies

Antibodies were anti-DTNBP1 (goat, sc-46931, Santa Cruz Biotechnology, Santa Cruz, CA, USA), anti-GFP (rabbit, sc-8334, Santa Cruz Biotechnology), anti-FLAG (mouse, M2, Sigma-Aldrich), anti-DISC1 (goat, ab41985, Abcam Inc, Cambridge, MA, USA), and fluorescent secondary antibodies (IR-700, Roche, Basel, Switzerland).

2.2.3 Cell culture

HEK293 (human embryonic kidney, ATCC) and X57 (mouse striatal, obtained from Marian DiFiglia at Massachusetts General Hospital, Boston, MA, USA) cells were maintained at 37˚C in a 5% CO2 atmosphere in Dulbecco’s Modified Eagle’s Medium with GlutaMAX (Invitrogen, Carlsbad, CA, USA) supplemented with 10% fetal bovine serum (Invitrogen).

2.2.4 Immunoprecipitation, protein complex preparation, and mass spectrometry

The 3xFLAG-tagged cDNA expression vectors were transfected into HEK293 (for DTNBP1) or X57 (for exocyst and dynactin complex proteins) cells using Lipofectamine 2000 reagent (Invitrogen) in Opti-MEM reduced serum media (Invitrogen). 
Cells were separately transfected with the parent acceptor vector for the negative control sample. Approximately 48 hours after transfection, confluent cells were harvested using chilled (4˚C) phosphate-buffered saline (1 mM NaH2PO4, 155 mM NaCl, 3 mM Na2HPO4, pH 7.4) and the pellets were stored at -80˚C. Cell pellets were resuspended in lysis/wash buffer with protease inhibitors (Tris-buffered saline (20 mM Tris-base, 100 mM NaCl, pH 7.4), 1 mM EDTA, 1% NP-40, 10 μg/ml leupeptin, 10 μg/ml aprotinin, 10 μg/ml pepstatin, 1 mM AEBSF, 2 mM Na3VO4, 10 mM β-glycerophosphate), forced through a 20-gauge syringe three times, and rocked for 30 minutes at 4˚C. The cell debris was removed by centrifugation (15 min at 12,000 rpm) and the supernatant was filtered through a 0.45 μm membrane. The extracts were immunoprecipitated with anti-FLAG M2 agarose resin (Sigma-Aldrich, St. Louis, MO, USA) and proteins were released with FLAG peptide, as described previously [66]. The immunoprecipitate eluates, from the 3xFLAG-tagged cDNA and empty vector control transfections, were lyophilized, rehydrated in loading buffer (Tris-Cl/SDS (58 mM Tris-Cl, 0.05% SDS, pH 6.8), 5% glycerol, 1.7% SDS, 0.1 M dithiothreitol, 0.03 μM bromophenol blue) and heated at 85˚C for 10 minutes. The eluates were separated by one-dimensional SDS-PAGE using a 4-12% gradient NuPAGE gel (Invitrogen) with NuPAGE MES running buffer (Invitrogen) for ~1.5 hr at 150 V. The gel was then stained in colloidal Coomassie (20% ethanol, 1.6% phosphoric acid, 8% ammonium sulfate, 0.08% Coomassie Brilliant Blue G-250) and de-stained with distilled H2O. The entirety of each lane was excised and divided into 16 pieces; each slice was finely diced and transferred to a 96-well plate. Automated in-gel dehydration, alkylation, trypsin digestion, and extraction were performed (Progest; Genomic Solutions, Ann Arbor, MI, USA). The extracts were lyophilized and re-suspended in 3% acetonitrile, 1.5% formic acid. 
High performance liquid chromatography-electrospray-tandem mass spectrometry (HPLC-ESI-MS/MS) was performed on a 4000 QTrap mass spectrometer (Applied Biosystems/Sciex, Foster City, CA), except for three DTNBP1 immunoprecipitations, which were analyzed on an LCQ Deca (Thermo-Fisher). Both instruments were coupled to an Agilent 1100 Nano-HPLC (Agilent, Santa Clara, CA, USA) using a nano-ESI interface. For the 4000 QTrap, samples were desalted using an on-line trap column (Agilent, Zorbax, 300SB-C18, 5 μm, 5 x 0.3 mm) and chromatography coupled to electrospray was performed on a 75 μm x 150 mm reverse phase column (Jupiter 4μ Proteo 90 Å, Phenomenex, Torrance, CA, USA) using Buffer A (5% HPLC grade acetonitrile (Fisher, Ottawa, Canada), 0.1% formic acid (Fluka, Sigma-Aldrich)) and Buffer B (90% acetonitrile, 0.1% formic acid) in a linear gradient of 0-20% B for 37 minutes, 20-39% B for 16 minutes, and 39-90% B for 8 minutes at a flow rate of 300 nl/min. ESI-MS/MS was performed with ESI at 1,850 V, interface heater at 150 °C, at 4.7 × 10⁻⁵ torr with nitrogen (99.999%, Praxair, Danbury, CT, USA) for nebulizer gas (0.5 ml/min) and curtain gas (2 L/min). Data were collected using a 400-1,600 m/z Enhanced MS scan followed by an Enhanced Resolution scan to select the top five +2 and +3 ions for collision-induced dissociation and a final Enhanced Product Ion MS scan. For the LCQ Deca, the chromatography was performed with flow splitting to produce a 0.2 µl/min rate. The samples were washed on a trap column (Jupiter 4µ, Proteo 90 Å, Phenomenex) using buffer A (5% HPLC grade acetonitrile, 0.1% formic acid and 0.02% TFA) for 5 minutes. Chromatography was then performed (column: 75 µm x 100 mm; Jupiter 4µ, Proteo 90 Å, Phenomenex) using buffer B (90% acetonitrile, 0.1% formic acid, 0.02% TFA) in a linear gradient of 0-35% B for 37 minutes, 35-65% B for 8 minutes and 65-100% B for 2 minutes. The capillary was at 160˚C and ESI at 1,800 V. 
The MS was operated at 1-1.9 × 10⁻⁵ torr in data-dependent acquisition mode with a full scan MS (400-1,400 m/z) followed by MS/MS of the three most intense precursor ions using collision-induced dissociation (helium). The MS/MS spectra emanating from the gel slices for each lane were concatenated and searched against the Ensembl human or mouse databases, as appropriate, using the Mascot (Matrix Science, Boston, MA, USA) and X!Tandem (http://www.thegpm.org/TANDEM/) search engines. Search parameters were 0.3 Da and 0.4 Da for precursor and product ion mass tolerance, respectively; trypsin digestion; one missed cleavage; oxidation (methionine); deamidation (asparagine, glutamine); phosphorylation (serine, threonine, and tyrosine); and carbamidomethylation of cysteine. The raw data are available through the PRIDE database (http://www.ebi.ac.uk/pride/).

2.2.5 Identification of candidate interacting proteins

Using in-house SpecterWeb software we processed the identified proteins from the MS/MS analyses by subtracting: 1) proteins found in more than one negative empty vector control sample; 2) common non-specific binding proteins (heat shock, transcriptional and translational machinery, keratins, and protein arginine N-methyltransferase (PRMT5)); 3) proteins found in only 1 experimental sample; and 4) proteins without 2 observations each having X!Tandem log(E) scores ≤ -3. The candidate interacting proteins were then subdivided into two tiers, where first tier proteins were those having a minimum of 2 unique peptides in each of at least 2 replicates, as identified by X!Tandem. The average X!Tandem log(E) score was calculated for each candidate protein across the experiments where it was observed. The list of candidate interacting proteins and the analysis data for the DTNBP1, dynactin and exocyst experiments are reported in Tables 2.2A, 2.3A, 2.4A, and Figures 2.9, 2.10, and 2.11, respectively. 
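The four subtraction criteria and the two-tier split described in Section 2.2.5 can be illustrated with a short sketch. This is not the SpecterWeb implementation, whose internals are not described here; the `Observation` record, the `filter_candidates` function, the use of a simple name set for non-specific binders, and all protein names and scores in the toy data are assumptions made purely for illustration. It also assumes one observation per protein per sample.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Observation:
    """One protein identification in one sample (hypothetical record layout)."""
    protein: str
    sample: str
    log_e: float          # X!Tandem log(E) score; more negative is better
    unique_peptides: int


def filter_candidates(experimental, controls, nonspecific):
    """Apply the four subtraction criteria, then assign tiers."""
    # Which control samples did each protein appear in?
    control_samples = {}
    for obs in controls:
        control_samples.setdefault(obs.protein, set()).add(obs.sample)

    by_protein = {}
    for obs in experimental:
        by_protein.setdefault(obs.protein, []).append(obs)

    candidates = {}
    for protein, obs_list in by_protein.items():
        # 1) found in more than one negative control sample
        if len(control_samples.get(protein, ())) > 1:
            continue
        # 2) common non-specific binder (assumed to be given as a name set)
        if protein in nonspecific:
            continue
        # 3) found in only one experimental sample
        if len({o.sample for o in obs_list}) < 2:
            continue
        # 4) fewer than two observations with log(E) <= -3
        if sum(o.log_e <= -3 for o in obs_list) < 2:
            continue
        # Tier 1: at least 2 unique peptides in each of at least 2 replicates
        tier = 1 if sum(o.unique_peptides >= 2 for o in obs_list) >= 2 else 2
        candidates[protein] = {
            "tier": tier,
            "mean_log_e": mean(o.log_e for o in obs_list),
        }
    return candidates


# Toy data (entirely made up) exercising each rule once.
experimental = [
    Observation("PROT_A", "ip1", -12.0, 4), Observation("PROT_A", "ip2", -8.0, 3),
    Observation("PROT_B", "ip1", -4.0, 1), Observation("PROT_B", "ip2", -5.0, 2),
    Observation("KERATIN_X", "ip1", -20.0, 6), Observation("KERATIN_X", "ip2", -18.0, 5),
    Observation("PROT_C", "ip1", -9.0, 3), Observation("PROT_C", "ip2", -7.0, 2),
    Observation("PROT_D", "ip1", -6.0, 2),
    Observation("PROT_E", "ip1", -2.0, 2), Observation("PROT_E", "ip2", -5.0, 2),
]
controls = [
    Observation("PROT_C", "ctl1", -5.0, 2), Observation("PROT_C", "ctl2", -6.0, 2),
    Observation("PROT_A", "ctl1", -3.5, 1),
]
candidates = filter_candidates(experimental, controls, nonspecific={"KERATIN_X"})
```

In this toy run only `PROT_A` (tier 1) and `PROT_B` (tier 2) survive: `PROT_A` appears in a single control sample and is therefore retained, whereas `PROT_C` appears in two controls and is subtracted.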
2.2.6 Validation of protein interactions through immunoprecipitation-western analysis

The 3xFLAG-tagged constructs were co-transfected with GFP-tagged DTNBP1 (N-tagged for X57 experiments, C-tagged for HEK293 experiments) into HEK293 or X57 cells as described above. Anti-FLAG immunoprecipitation and separation by one-dimensional SDS-PAGE were performed as described above. The negative controls were cells transfected with the parent acceptor vectors. An aliquot of the input lysate without immunoprecipitation was used as a positive control. The proteins were electro-transferred to nitrocellulose for ~16 hrs at 100 mA in Towbin buffer (0.25 M Tris-base, 2 M glycine, pH 8.5, 20% methanol). The nitrocellulose was blocked using NuPAGE Odyssey blocking buffer (Invitrogen) before being probed using the appropriate antibody (1:200 anti-GFP antibody, or 1:500 anti-DISC1 antibody) for ~16 hrs at 4˚C in blocking buffer. The filter was washed (three times, 5 minutes each) with TBS-Tween buffer (Tris-buffered saline, 0.01% Tween-20) and then probed with the appropriate secondary antibody for 30 minutes at 22˚C. The nitrocellulose was then washed with TBS-Tween buffer (three times, 5 minutes each) and imaged using an Odyssey system (LI-COR Biosciences, Lincoln, NE, USA).

2.2.7 Linkage and association with schizophrenia analysis

The genetic association database (GAD, http://geneticassociationdb.nih.gov/) and the psychiatric genetics evidence project linkage database (https://slep.unc.edu/evidence/) were used to determine whether any of the protein interaction sets were overrepresented with proteins having previous linkage or association evidence of involvement with schizophrenia. A query of the GAD database using the HGNC names showed that few of the proteins have been investigated in schizophrenia association studies; thus GAD association data did not contribute to the schizophrenia over-representation analysis. 
Therefore, only the linkage database was queried, using the cytogenetic location associated with each protein, in order to perform the overrepresentation analysis. For AP3, BLOC1, CCT, dynactin, and exocyst, all complex members were included in the analysis if any member of the complex had been identified in the protein interaction set. A Chi-squared test was used to determine the significance of the number of schizophrenia linked cytogenetic regions observed for each protein interaction set, given the number of cytogenetic regions across the entire human genome with evidence of schizophrenia linkage.

2.3 Results

2.3.1 DTNBP1 protein interactions

DTNBP1 interacts with vesicle trafficking complexes: exocyst, dynactin, AP3, CCT and cytoskeletal components.

To obtain high confidence protein interaction partners for DTNBP1 we used a co-immunoprecipitation comparative mass spectrometry (MS) analysis procedure in which peptides from immunoprecipitated DTNBP1 protein complexes were compared to peptides found in protein complexes from control immunoprecipitates. HEK293 cells were transfected with FLAG epitope-tagged versions of DTNBP1 isoform 1 and isoform 3 or an empty vector control. Protein complexes from cytosolic fractions of the experimental and control cells were immunoprecipitated by an anti-FLAG antibody and size fractionated using one-dimensional SDS-PAGE. To obtain high sensitivity, accuracy and coverage, we processed the entire lanes rather than excising only individual visually observable bands, and compared the peptide (protein) composition between the experimental and control samples using the MS data. Seven independent experiments were performed: two experiments each for N- and C-tagged DTNBP1 isoform 1 and N-tagged isoform 3, and one experiment for C-tagged isoform 3, which were compared to six empty vector control experiments. 
Figure 2.5 shows a representative gel image of a control and an experimental sample, and Figure 2.6 compares representative ion chromatograms, MS spectra and MS/MS fragment assignments of two peptides for each of six candidate interacting proteins between the experimental gel slice and the cognate control gel slice. This demonstrates the observation of an assigned peptide in the experimental sample and its absence in the control sample. From these experiments we identified 83 unique candidate DTNBP1-interacting proteins that met score quality thresholds (see Methods) and were observed in at least two independent experiments (Table 2.1, Figure 2.1, Table 2.2). Fourteen of the proteins were previously identified as DTNBP1 interacting proteins, including all seven members of BLOC1 [28], all six members of the AP3 complex, and DNA-dependent protein kinase catalytic subunit (PRKDC) [67], which had previously been shown to interact with either BLOC1 [29] or DTNBP1 itself [68]. The remaining 68 novel DTNBP1 interacting proteins include several members of the exocyst and dynactin complexes, and components of the cytoskeleton, including actin and tubulin and the CCT complex. These interactions support a role for DTNBP1 in vesicle trafficking. Six DTNBP1 interactions were chosen for validation, including interactions with two members each of the dynactin (ACTR1A, DCTN2), exocyst (EXOC3, EXOC4), and AP3 (AP3B1, AP3B2) complexes (Table 2.1). Epitope-tagged versions of these proteins and DTNBP1 were co-expressed in HEK293 and X57 cells. Immunoprecipitation and western blotting confirmed the interaction of DTNBP1 with the six proteins in all 12 experiments with both HEK293 and X57 cells (Figure 2.2). Until recently (Hikita et al. 2009), the interaction of DTNBP1 with individual members of the AP3 complex had not been shown. 
A Y2H experiment identified a candidate interaction between DTNBP1 and EXOC4 [58]; however, this interaction had not previously been validated in mammalian cells. The average log(E) scores for most of the 68 novel DTNBP1 interactors were within the range of scores for the 18 known and/or validated DTNBP1 interacting proteins (Figure 2.9, and Table 2.2). This indicates that most of the proteins identified are likely to be true DTNBP1 interaction partners.

2.3.2 Exocyst and dynactin protein interactions

The exocyst and dynactin complexes interact with the CCT complex and vesicular transport, trafficking, and transporter associated proteins.

In order to extend the DTNBP1-associated protein interaction network, co-immunoprecipitation comparative mass spectrometry experiments were performed in X57 cells for two exocyst complex proteins (EXOC3, EXOC4) and two dynactin complex proteins (ACTR1A, DCTN2). Two experiments were performed for each of these four ‘bait’ proteins and compared to two empty vector controls; Figure 2.7 shows a representative gel image. The analysis yielded 56 and 31 unique protein identifications for the exocyst and dynactin complexes, respectively, after processing for background, quality, and experimental replication (see Methods). Of these, 48 and 23 are novel interactions, respectively (Table 2.1, 2.3, and 2.4). Similar to the DTNBP1 data, the average log(E) scores for the novel exocyst and dynactin interacting proteins overlapped with the scores for their known interactors (Figures 2.10 and 2.11). In addition, 47 and 15 of the interactions in the exocyst and dynactin data, respectively, were identified by both bait proteins (Table 2.1, 2.3, and 2.4), and thus represent independent corroboration of these candidate protein-protein interactions.

2.3.3 Verification of DISC1 interactions with exocyst and dynactin complexes

DISC1 interacts with the exocyst and dynactin complexes. 
Interactions between DISC1 and DCTN2 of the dynactin complex, and between DISC1 and both EXOC1 and EXOC7 of the exocyst complex, were previously shown by Y2H experiments [58]. To verify that these interactions occur in mammalian cells, we expressed epitope-tagged ACTR1A of the dynactin complex and EXOC3 of the exocyst complex in HEK293 cells, immunoprecipitated the tagged ACTR1A and EXOC3 complexes and then confirmed the presence of the DISC1 protein by western blot with an anti-DISC1 antibody (Figure 2.3). With this result we have shown that both DTNBP1 and DISC1 interact with the exocyst and dynactin complexes in human cells (Figure 2.1).

2.3.4 Gene ontology analysis of interacting proteins

Gene ontology (GO) analysis shows that exocyst and DTNBP1 interacting proteins share vesicle mediated transport and localization terms.

The online DAVID Bioinformatics Resource (http://david.abcc.ncifcrf.gov/) was used to identify significant GO terms for the DTNBP1, dynactin, and exocyst interacting proteins. DAVID recognized 76 of the 83 proteins in the DTNBP1 interacting protein list and produced significant GO terms for all 3 ontological categories (biological process (BP), cellular component (CC), molecular function (MF)) (Table 2.5). DAVID recognized all 31 proteins in the dynactin interacting protein list but only produced significant GO terms for the CC and MF categories; no significant GO terms were found for the BP category (Table 2.6). DAVID recognized 50 of the 56 proteins in the exocyst interacting protein list and produced significant GO terms for the BP and CC categories (Table 2.7). The biological process GO terms significantly overrepresented by DTNBP1 and exocyst interacting proteins are shown as an ontological tree in Figure 2.4. 
Thirteen of the 15 BP terms for the DTNBP1 interacting proteins were also identified as significant BP terms for the exocyst dataset; these include several localization and transport terms, including vesicle-mediated transport and secretion by cell. This extended dataset confirms not only that DTNBP1 is involved in vesicle trafficking, but also that it interacts with many other proteins that play a role in vesicle trafficking.

2.3.5 DTNBP1 and dynactin interacting proteins linked to schizophrenia

Genes encoding DTNBP1 and dynactin interaction networks are significantly overrepresented within chromosomal regions linked to schizophrenia.

We used whole genome linkage data to determine if the genes that encode members of our DTNBP1, exocyst complex, and dynactin complex protein interaction sets tend to be located in cytogenetic regions linked to schizophrenia. Genetic linkage data were obtained from the Psychiatric Genetics Evidence Project Linkage Database (https://slep.unc.edu/evidence/) (Tables 2.8, 2.9, and 2.10). This database compiles findings from manual reviews of 144 papers in psychiatric genetics across a variety of disorders and includes studies on genome wide linkage and association, copy number variation, and gene expression. It shows that 366 of a total of 826 cytogenetic regions have been found to have strong or suggestive linkage to schizophrenia in whole genome scans. The DTNBP1 and the dynactin protein interaction networks (including all core complex members) comprise 96 and 34 total proteins, respectively, with 55 and 24 of these in schizophrenia linked cytogenetic regions (Chi-squared test; p = 0.017 and 0.006, respectively). In the 71 member exocyst network, 34 genes were in schizophrenia linked cytogenetic regions, but this result was not significant (Chi-squared test; p = 0.619). 
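The logic of this overrepresentation test can be illustrated with a short sketch using the counts given above. It assumes a one-degree-of-freedom Chi-squared goodness-of-fit test of each network against the genome-wide background rate (366 of 826 regions), without continuity correction. The exact test configuration used for the reported p-values is not specified, so the values computed here are expected to agree with them in direction and significance rather than digit for digit. The function name is illustrative.

```python
import math

# Genome-wide background from the linkage database, as stated in the text.
BACKGROUND_LINKED = 366
BACKGROUND_TOTAL = 826


def linkage_overrepresentation(linked, total,
                               bg_linked=BACKGROUND_LINKED,
                               bg_total=BACKGROUND_TOTAL):
    """Chi-squared goodness-of-fit (1 d.f.) of a network against the background.

    Assumption: linked/unlinked counts for the network are compared with the
    counts expected if its genes fell in linked regions at the background rate.
    """
    rate = bg_linked / bg_total
    expected_linked = total * rate
    expected_unlinked = total - expected_linked
    unlinked = total - linked

    stat = ((linked - expected_linked) ** 2 / expected_linked
            + (unlinked - expected_unlinked) ** 2 / expected_unlinked)
    # For 1 degree of freedom, P(chi2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# (genes in schizophrenia-linked regions, network size), from the text.
networks = {"DTNBP1": (55, 96), "dynactin": (24, 34), "exocyst": (34, 71)}
results = {name: linkage_overrepresentation(*counts)
           for name, counts in networks.items()}

for name, (stat, p_value) in results.items():
    print(f"{name}: chi2 = {stat:.2f}, p = {p_value:.3f}")
```

Under these assumptions the DTNBP1 and dynactin networks fall below the conventional 0.05 threshold and the exocyst network does not, in line with the reported pattern.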
2.4 Discussion

It has been suggested that compromised neurotransmission due to aberrations in synaptic trafficking of endosomal vesicles and their neurotransmitter related cargoes may contribute to the etiology of schizophrenia [69]. DTNBP1 is linked genetically to schizophrenia and has an essential role in synaptic vesicle trafficking and homeostatic modulation of neurotransmission [70]. Here, we investigate DTNBP1 protein interactions to identify additional proteins, pathways, and functions of possible relevance to schizophrenia. Our data significantly expand the overall protein interaction network for DTNBP1 and show that it is involved throughout the vesicle life cycle and vesicle trafficking system, from vesicle biogenesis and cargo sorting (BLOC1, DTNBP1, AP3) to vesicle trafficking (dynactin, tubulin/actin proteins), to membrane targeting and vesicle tethering (exocyst).

Our results include human cytosolic protein interactions for DTNBP1 obtained from endogenous cellular complexes and complement a recent study that focused on membrane-associated protein interactions of DTNBP1 [68]. Our results are consistent with previous studies that show interactions between DTNBP1 and members of the AP3 complex [29, 68]. Since we focused on cytosolic interactions of DTNBP1 while Hikita et al. (2009) investigated membrane interactions of DTNBP1, a number of membrane or membrane associated proteins they observed were not replicated in our analysis, including STXBP1, glyceraldehyde-3-phosphate dehydrogenase (GAPDH), ADP-ribosylation factor 1 (ARF1), ADP-ribosylation factor GTPase-activating protein 1 (ARFGAP), and Arf-GAP with SH3 domain, ANK repeat and PH domain-containing protein 1 (ASAP1). Conversely, most of the proteins found in our study were not found in the Hikita et al. (2009) study, including any members of BLOC1, of which DTNBP1 is known to be a member [28], and members of the exocyst and dynactin complexes. 
These differences are not surprising given the different experimental approaches, but further highlight that DTNBP1 plays a role in many aspects of the vesicle lifecycle, both within the cytoplasm as well as at the membrane surface.  In this way, our results complement the findings in Hikita et al. 2009 and implicate DTNBP1 in vesicle associated protein trafficking processes throughout the cell [68].  Our data indicate DTNBP1 interacts with several actin and tubulin proteins and with members of the CCT and dynactin complexes, which are fundamental components of the cytoskeleton and microtubule matrix. The CCT complex is a large multi-subunit complex that mediates protein folding for a variety of proteins, including several actin, tubulin and cell cycle regulator proteins [71] and plays an important role in the assembly of the cytoskeleton and in cell division [72].  The 11 subunit dynactin complex [73] recruits and links the dynein motor protein to vesicles and to microtubules to facilitate cargo transport along the cytoskeletal matrix [74].  In neuronal axons, dynactin is involved in retrograde as well as anterograde transport [75, 76].  Mutations in dynactin subunits can cause defects in axonal transport; for instance, individuals with mutations in the DCTN1 subunit suffer from motor neuron disease [77].  The function of the dynactin complex may vary based on location, but in all cases it is thought to facilitate or regulate dynein or kinesin II targeting and recruitment.  Our data also show the exocyst complex contacts the cytoskeletal transport system.  In fact, the three proteins in common between DTNBP1 and the exocyst and dynactin complexes are CCT proteins (CCT3 and CCT8) and CAPZB, an actin-capping protein and a member of the dynactin complex (see Table 2.8).  We have also verified a DISC1 interaction with dynactin.  
DISC1 has been shown to play an important role in microtubule dynamics through its interactions with pericentrin (PCNT) [78], and its interactions with the dynactin-dynein accessory components PAFAH1B1 and NDEL1 are thought to stabilize the motor assembly on the nuclear surface, the centrosome, and the cell cortex [45-47, 79]. Taken together, our data indicate that cytoskeletal interactions may be important for the vesicle transport functions of DTNBP1, DISC1 and exocyst.

Our data indicate DTNBP1 makes contact with several members of the exocyst complex. We also verified in mammalian cells an interaction of DISC1 with the exocyst complex. The exocyst complex is composed of 8 subunits and, through SNARE protein recruitment, is thought to integrate signals from several different signaling pathways to determine the location, timing and number of secretory vesicles docked with the plasma membrane [80]. In animal neurons, the exocyst complex is required for neurite branching and synaptogenesis, but not synaptic vesicle release at mature synapses [81, 82].

We also reproduced the interaction of DTNBP1 with the AP3 complex. AP3 is one of four adaptor protein complexes (AP1-4) that act as scaffolds, coordinating membrane lipids, membrane protein sorting signals, components of the vesicle fusion machinery, and additional components of the vesicle formation apparatus [83-85]. The AP3 complex is involved in the generation of synaptic vesicles in neurons. Mutations in the AP3 complex δ subunit AP3D1 result in the mocha mouse, affect spontaneous and evoked release at hippocampal mossy fiber synapses [86], and show a selective increase in the content of synaptic vesicle cargoes.

Our data potentially extend the involvement of DTNBP1 and DISC1 into vesicle tethering through the exocyst complex and the regulation of active vesicle transport through the dynactin complex. 
While our data show both DTNBP1 and DISC1 interact with components of the dynactin and exocyst complexes, it is not known if DTNBP1 and DISC1 reside simultaneously in common protein complexes.  Overall, our data indicate that DTNBP1 is involved throughout the vesicle transport process in vesicle generation, transport and membrane tethering through interactions with the BLOC1, AP3, exocyst and dynactin complexes and that DISC1 is also involved in the vesicle lifecycle at the transport and membrane tethering stages.  Defects in any of the many factors that coordinately regulate neurotransmitter vesicle cycling could contribute to the etiology of schizophrenia [69].  DTNBP1 over-expression induces expression of the SNARE protein synaptosomal-associated protein 25 (SNAP25) which is involved in mediating vesicle docking and fusion at the cellular membrane, and synapsin1 (SYN1) which is involved in regulating neurotransmitter release as well as increasing glutamatergic release [19].  Defects in BLOC1 indirectly cause redistribution of cargo to the cell surface [30].  AP3 and BLOC1 are linked to the fusion machinery involved in synaptic vesicle secretion that is hypothesized to be involved in schizophrenia; brain tissue from schizophrenia patients has reduced levels of DTNBP1 in hippocampal mossy fibers [23], a phenotype that is also found in the AP3 deficient mocha brain [31].  Ablation of DTNBP1 expression in mouse and rat model systems results in the diversion of dopamine and glutamate receptors from lysosomes to the cell surface [87-89].  Our data are consistent with the possibility that disruption of a DTNBP1-exocyst interaction may contribute to the diversion of glutamate and dopamine receptor containing vesicles from lysosomes to the plasma membrane.  Overall, our results show DTNBP1 and DISC1 make multiple contacts with vesicle trafficking machinery.  
Thus, aberrant function of DTNBP1 or DISC1, or of the proteins in their interaction networks, provides multiple sites for impairment of synaptic vesicle biogenesis.  Our finding that genes for the proteins in our DTNBP1 and dynactin protein interaction networks are overrepresented in schizophrenia-linked cytogenetic regions implies that the interaction networks of DTNBP1 and DISC1 as a whole should be considered in the pathology of schizophrenia.  Defining the protein interaction networks for schizophrenia susceptibility genes like DTNBP1 will help clarify their function, and may also illuminate pathways in which individual components make small contributions to disease etiology but a large contribution when taken together.  An example of this is the epistatic genetic effect in schizophrenia of the non-DTNBP1 BLOC1 subunits BLOC1S3 and MUTED [90].  While members of the dynactin and exocyst complexes have not been considered potential candidate schizophrenia genes in the past, their functions and their interaction with two of the best-supported schizophrenia susceptibility genes strongly support future genetic investigation of these and other members of the DTNBP1, exocyst, dynactin, and DISC1 interaction networks.

2.5 Chapter 2 Tables

Table 2.1 - Protein Interactions Identified Through IP-MS
This table is a summary of the DTNBP1, exocyst, and dynactin protein interaction partners.  These candidate interacting proteins were identified from the immunoprecipitation comparative mass spectrometry data using a number of filtering and quality criteria (see Methods and Tables 2.2, 2.3, and 2.4).  Results are grouped by complex or interest.  Proteins shown in bold were used in validation experiments.  
Columns: DTNBP1 Interactors (HEK293 cells, 83 interacting proteins); Exocyst Interactors (X57 cells, 56 interacting proteins); Dynactin Interactors (X57 cells, 31 interacting proteins)

BLOC1 Complex
  DTNBP1 Interactors: CNO, BLOC1S1, BLOC1S2, BLOC1S3, DTNBP1, MUTED, PLDN, SNAPIN
  Exocyst Interactors: (none)
  Dynactin Interactors: (none)

Dynactin Complex
  DTNBP1 Interactors: ACTR1A, CAPZA1, CAPZB, DNDH1L, DCTN1, DCTN2, DCTN3
  Exocyst Interactors: CAPZB
  Dynactin Interactors: ACTR1A, CAPZB, DCTN1, DCTN2, DCTN3, DCTN4, DCTN5, DCTN6

Exocyst Complex
  DTNBP1 Interactors: EXOC3, EXOC4, EXOC6B
  Exocyst Interactors: EXOC1, EXOC2, EXOC3, EXOC4, EXOC5, EXOC6, EXOC6B, EXOC7
  Dynactin Interactors: (none)

Chaperonin Containing TCP1 Complex (CCT)
  DTNBP1 Interactors: CCT1, CCT3, CCT4, CCT5, CCT8
  Exocyst Interactors: CCT3, CCT8
  Dynactin Interactors: CCT2, CCT3, CCT6A, CCT7, CCT8

Tubulin / Actin Associated Proteins
  DTNBP1 Interactors: ACTA1, ACTA2, ACTG1, CFL1, KIF2A, MAPT, MYO1D, MYO6, MYH10, SPTBN1, TUBA1B, TUBA1C, TUBB2B, TUBB4
  Exocyst Interactors: ACTBL2, MTAP1A, TTN, TUBA1A, TUBB2C, TUBB3, TUBB6
  Dynactin Interactors: ACTR10, ACTR1B, KLHL2, PDCL3, TUBA1A, TUBB2B, TUBB3, TWF1

AP3 Complex
  DTNBP1 Interactors: AP3B1, AP3B2, AP3D1, AP3M1, AP3S1, AP3S2
  Exocyst Interactors: (none)
  Dynactin Interactors: (none)

Brain / Neuron Associated
  DTNBP1 Interactors: (none)
  Exocyst Interactors: ENO1, GDF7, PRPH1, PHGDH
  Dynactin Interactors: CDH15

Vesicular Transport / Trafficking Associated / Transporter
  DTNBP1 Interactors: AP2A1, COG5, COG7, KCNMA1, KCTD17, SCYL2, SLC1A5, SLC27A4
  Exocyst Interactors: ABCF2, ARF3, ARF4, GAPDH, HERC1, KCTD2, LRP1, RAB1B, SEC61A1, SLC16A1, SLC16A3, UNC80
  Dynactin Interactors: CLCN1, KCTD2

Other Proteins
  DTNBP1 Interactors: AFG3L2, BCAS4, BTBD7, C11orf48, C17orf59, CDIPT, CSNK1A1, CSNK1A1L, CYP4F8, FASN, HERC5, IFIT1, IGF2BP3, MTDH, NT5C2, PCBP1, PGAM5, PHGDH, PKM2, PPM1B, PRKDC, PTPLAD1, QPCTL, SDCCAG3, STAT1, TMCO1, TMEM33, TOP1, UPF0402, YTHDC2, YWHAE, ZNF281
  Exocyst Interactors: ARID1B, CAD, CAND1, DDOST, DHCR7, GALK1, GAPDHS, HSD17B12, LDHA, MAGEE2, MEST, PDIA3, PFKP, PKM2, PPM1B, PRPSAP1, PRPSAP2, RPN2, SSR1, TPI1, YWHAQ, YWHAZ
  Dynactin Interactors: CAD, FEM1A, GAPDHS, MSH5, PFKP, SSR1, TPI1

Table 2.2 - DTNBP1 Immunoprecipitation-Mass Spectrometry Data
Table 2.2A: This table shows a summary of the number of unique protein hits resulting from the seven DTNBP1 experiments performed.  The columns from left to right show the number of hits (unique protein assignments) after each successive subtraction to remove background proteins and to apply filtering and quality criteria (see Methods, NSB = non-specific background).  
# total hits # hits after vector HEK293 NSB removal # hits after common NSB removal # hits observed in at least 2 replicates # hits remaining that meet minimal quality requirements 1,238 938 765 103 83  Table 2.2B: This table shows a summary of the 83 candidate DTNBP1 protein interactors in HEK293 cells.  Results are grouped by complex or interest.  The 36 second tier interacting proteins are shaded in grey, leaving 47 first tier interacting proteins.  X!Tandem Interacting Proteina HGNC (Uniprot ID) DTNBP1 Isoform Epitope Tag # Times Observed Per # Experiments Mascot Score b log(E) c # Unique Peptides d Average log(E) Tier BLOC1 Complex 3,256 -208.9 22 C-term 3XFLAG 2/2 1,449 -129.3 10 2,702 -140.2 15 Isoform 1  N-term 3XFLAG 2/2 3,183 -138.0 14 210 -52.2 7 N-term 3XFLAG 2/2 312 -26.9 3 DTNBP1 (Q96EV8) Isoform 3 C-term 3XFLAG 1/1 647 -47.3 6 -106.1 1 - -88.6 9 N-term 3XFLAG 2/2 - -86.6 10 Isoform 1 C-term 3XFLAG 1/2 - -59.9 6 SNAPIN (O95295) Isoform 3 C-term 3XFLAG 1/1 - -14.8 2 -62.5 1 156 -58.0 7 C-term 3XFLAG 2/2 125 -39.4 5 110 -43.3 5 CNO (Q9NUP1) Isoform 1 N-term 3XFLAG 2/2 63 -23.7 3 -41.1 1 353 -44.6 5 N-term 3XFLAG 2/2 251 -38.8 5 268 -21.4 3 BLOC1S2 (Q6QNY1) Isoform 1 C-term 3XFLAG 2/2 58 -2.8 1 -26.9 1 204 -37.5 5 C-term 3XFLAG 2/2 174 -25.2 4 254 -29.9 4 MUTED (Q8TDH9) Isoform 1 N-term 3XFLAG 2/2 102 -18.6 3 -27.8 1 X!Tandem Interacting Proteina HGNC (Uniprot ID) DTNBP1 Isoform Epitope Tag # Times Observed Per # Experiments Mascot Score b log(E) c # Unique Peptides d Average log(E) Tier 112 -36.8 5 C-term 3XFLAG 2/2 130 -29.0 4 80 -20.4 3 BLOC1S3 (Q6QNY0) Isoform 1 N-term 3XFLAG 2/2 120 -14.1 2 -25.1 1 190 -35.9 5 N-term 3XFLAG 2/2 141 -29.1 4 51 -19.0 3 BLOC1S1 (P78537) Isoform 1 C-term 3XFLAG 2/2 108 -10.0 2 -23.5 1 77 -32.3 4 N-term 3XFLAG 2/2 216 -24.2 3 91 -4.1 1 PLDN (Q9UL45) Isoform 1 C-term 3XFLAG 2/2 91 -1.0 1 -15.4 1 AP3 Complex C-term 3XFLAG 1/1 460 -86.9 11 Isoform 3 N-term 3XFLAG 1/2 48 -2.2 1 217 -60.9 8 C-term 3XFLAG 2/2 168 -47.4 6 112 
-10.3 2 AP3B1 (O00203) Isoform 1 N-term 3XFLAG 2/2 49 -2.9 1 -35.1 1 C-term 3XFLAG 1/1 144 -52.6 7 130 -3.1 1 Isoform 3 N-term 3XFLAG 2/2 76 -3.0 1 77 -18.8 3 C-term 3XFLAG 2/2 146 -12.0 2 AP3S1 (Q92572)  Isoform 1 N-term 3XFLAG 1/2 78 - 1 -17.9 1 677 -194.2 22 N-term 3XFLAG 2/2 544 -154.2 18 420 -18.4 3 Isoform 1 C-term 3XFLAG 2/2 203 -57.6 7 AP3D1 (O14617) Isoform 3 C-term 3XFLAG 1/1 330 -71.3 9 -99.1 1 - -55.1 7 N-term 3XFLAG 2/2 - -1.8 1 Isoform 1 C-term 3XFLAG 1/2 - -18.9 3 AP3M1 (Q9Y2T2) Isoform 3 N-term 3XFLAG 1/2 - -1.5 1 -19.3 1 - -9.2 2 N-term 3XFLAG 2/2 65 -2.9 1 Isoform 1 C-term 3XFLAG 1/2 - -1.6 1 AP3S2 (P59780) Isoform 3 C-term 3XFLAG 1/1 - -1.1 1 -3.7 2 Isoform 3 C-term 3XFLAG 1/1 144 -24.9 3 AP3B2 (Q13367) Isoform 1 C-term 3XFLAG 1/2 79 -18.4 1 -21.7 2 Dynactin Complex 110 -21.6 3 Isoform 1 C-term 3XFLAG 2/2 191 -19.0 3 76 -2.7 1 N-term 3XFLAG 2/2 136 - 9 DCTN2 (Q13561) Isoform 3 C-term 3XFLAG 1/1 86 -2.7 1 -11.5 1  45  X!Tandem Interacting Proteina HGNC (Uniprot ID) DTNBP1 Isoform Epitope Tag # Times Observed Per # Experiments Mascot Score b log(E) c # Unique Peptides d Average log(E) Tier Isoform 3 C-term 3XFLAG 1/1 217 -72.6 9 172 -51.8 7 DCTN1 (Q14203) Isoform 1 C-term 3XFLAG 2/2 182 -21.4 3 -48.6 1 Isoform 3 N-term 3XFLAG 1/2 - -19.6 3 ACTR1A (P61163) Isoform 1 C-term 3XFLAG 1/2 - -1.9 1 -10.8 1 Isoform 1 C-term 3XFLAG 1/2 57 -9.5 2 DCTN3 (O75935) Isoform 3 C-term 3XFLAG 1/1 67 -2.1 1 -5.8 1 Isoform 3 C-term 3XFLAG 1/1 82 -10.9 1 CAPZA1 (P52907) Isoform 1 N-term 3XFLAG 1/2 - -1.9 1 -6.4 2 - -9.2 2 DNHD1L (Q8TEE6) Isoform 1 N-term 3XFLAG 2/2 - -1.6 1 -5.4 2 Isoform 3 C-term 3XFLAG 1/2 57 -9.4 2 CAPZB (P47756) Isoform 1 C-term 3XFLAG 1/2 43 -1.1 1 -5.25 2 Exocyst Complex N-term 3XFLAG 1/2 247 -87.4 10 Isoform 1 C-term 3XFLAG 1/2 134 -26.7 4 EXOC4 (Q96A65) Isoform 3 C-term 3XFLAG 1/1 48 -9.8 2 -41.3 1 80 -17.1 3 EXOC3 (O60645) Isoform 1 N-term 3XFLAG 2/2 79 -9.4 2 -13.3 1 Isoform 3 C-term 3XFLAG 1/2 48 -10.9 2 EXOC6B (Q9Y2D4) Isoform 1 C-term 
3XFLAG 1/2 57 -2.4 1 -6.7 2 Chaperonin Containing TCP1 Complex (CCT) Isoform 3 C-term 3XFLAG 1/1 - -12.2 2 C-term 3XFLAG 1/2 - -10.7 2 - -9.4 2 CCT1 (P17987) Isoform 1 N-term 3XFLAG 2/2 - -9.2 2 -10.4 1 Isoform 3 C-term 3XFLAG 1/1 114 -44.7 6 C-term 3XFLAG 1/2 39 -2.2 1 CCT5 (P48643) Isoform 1 N-term 3XFLAG 1/2 70 -2.0 1 -16.3 1 Isoform 3 C-term 3XFLAG 1/1 52 -11.3 2 C-term 3XFLAG 1/2 76 -4.0 1 CCT4 (P50991) Isoform 1 N-term 3XFLAG 1/2 43 -1.4 1 -5.6 1 Isoform 3 C-term 3XFLAG 1/1 242 -69.0 3 CCT8 (P50990) Isoform 1 N-term 3XFLAG 1/2 - -3.1 1 -36.1 2 Isoform 3 C-term 3XFLAG 1/1 70 -21.8 3 CCT3 (P49368) Isoform 1 N-term 3XFLAG 1/2 - -2.3 1 -12.1 2 Tubulin / Actin Associated Proteins 676 -120.3 13 C-term 3XFLAG 2/2 138 -27.7 4 258 -80.7 9 Isoform 1 N-term 3XFLAG 2/2 319 -73.9 8 122 - 0 TUBA1B (P68363) Isoform 3 N-term 3XFLAG 2/2 66 - 1 -60.8 1  46  X!Tandem Interacting Proteina HGNC (Uniprot ID) DTNBP1 Isoform Epitope Tag # Times Observed Per # Experiments Mascot Score b log(E) c # Unique Peptides d Average log(E) Tier C-term 3XFLAG 1/1 634 -66.2 8 Isoform 3 N-term 3XFLAG 1/2 106 - 1 251 -22.7 2 ACTG1 (P63261) C-term 3XFLAG 2/2 Isoform 1 140 - 1 N-term 3XFLAG 1/2 210 - 4 -44.5 1 Isoform 3 C-term 3XFLAG 1/1 236 -31.3 2 67 -13.5 2 CFL1 (P23528) Isoform 1 -15.6 N-term 3XFLAG 2/2 - -2.1 1 C-term 3XFLAG 1/2 53 - 4 1 Isoform 3 C-term 3XFLAG 1/1 359 -126.6 16 C-term 3XFLAG 1/2 115 -3.1 1 SPTBN1 (Q01082) Isoform 1 N-term 3XFLAG 1/2 52 -1.7 1 -43.8 1 Isoform 3 C-term 3XFLAG 1/2 863 -303.3 23 MYH10 (P35580) -177.7 Isoform 1 N-term 3XFLAG 1/2 148 -52.0 3 1 Isoform 3 C-term 3XFLAG 1/1 - -10.1 2 MYO6 (Q9UM54) Isoform 1 -9.0 N-term 3XFLAG 1/2 - -7.9 2 1 Isoform 3 C-term 3XFLAG 211 -72.51/1 1 105 -22.5 1 C-term 3XFLAG 2/2 508 - 2 341 - 0 TUBB4 (P04350) Isoform 1 N-term 3XFLAG 2/2 324 - 2 -47.5 2 660 -160.5 2 C-term 3XFLAG 2/2 217e - 1 Isoform 1 N-term 3XFLAG 1/2 408 e - 0 TUBB2B (Q9BVA1) Isoform 3 C-term 3XFLAG 1/1 226 -64.1 1 -112.3 2 N-term 3XFLAG 1/2 263 -79.0 1 Isoform 1 C-term 
3XFLAG 1/2 608 - 0 TUBA1C (Q9BQE3) Isoform 3 C-term 3XFLAG 1/1 225 -38.8 5 -58.9 2 C-term 3XFLAG 1/1 439 -47.7 1 ACTA1 (P68133) Isoform 3 N-term 3XFLAG 1/2 - -1.9 1 -24.8 2 N-term 3XFLAG 1/2 - -41.2 1 ACTA2 (P62736) Isoform 1 C-term 3XFLAG 1/2 - -29.4 4 -35.3 2 Isoform 3 C-term 3XFLAG 1/1 102 -11.6 2 MYO1D (O94832) Isoform 1 N-term 3XFLAG 1/2 - -1.4 1 -6.5 2 - -9.4 2 MAPT (P10636) Isoform 1 C-term 3XFLAG 2/2 - -2.1 1 -5.8 2 45 -3.6 1 KIF2A (O00139) Isoform 1 N-term 3XFLAG 2/2 51 -1.3 1 -2.4 2  47  X!Tandem Interacting Proteina HGNC (Uniprot ID) DTNBP1 Isoform Epitope Tag # Times Observed Per # Experiments Mascot Score b log(E) c # Unique Peptides d Average log(E) Tier Vesicular Transport / Trafficking Associated / Transporter Proteins 1,123 -111.6 12 N-term 3XFLAG 2/2 805 -85.5 8 589 -60.9 6 Isoform 1 C-term 3XFLAG 2/2 177 -26.3 4 C-term 3XFLAG 1/1 390 -75.2 8 128 -10.4 1 KCTD17 (Q8N5Z5) Isoform 3 N-term 3XFLAG 2/2 131 -18.5 3 -55.5 1 66 -70.0 9 N-term 3XFLAG 2/2 43 -19.3 3 SLC27A4 (Q6P1M0) Isoform 1 C-term 3XFLAG 1/1 - -11.9 2 -33.73 1 Isoform 1 C-term 3XFLAG 1/2 709 -123.3 15 SCYL2 (Q6P3W7) Isoform 3 C-term 3XFLAG 1/1 253 -45.9 6 -84.6 1 N-term 3XFLAG 1/2 - -18.9 3 KCNMA1 (Q12791) Isoform 1 C-term 3XFLAG 1/2 - -8.7 2 -13.8 1 Isoform 3 C-term 3XFLAG 1/1 186 -54.5 7 AP2A1 (O95782) Isoform 1 N-term 3XFLAG 1/2 - -2.8 1 -28.7 2 57 -10.2 2 COG7 (P83436) Isoform 1 N-term 3XFLAG 2/2 66 -3.3 1 -6.8 2 67 -10.0 2 COG5 (Q9UP83) Isoform 1 N-term 3XFLAG 2/2 - -1.5 1 -5.8 2 Isoform 3 C-term 3XFLAG 1/1 84 -3.2 1 SLC1A5 (Q15758) Isoform 1 N-term 3XFLAG 1/2 42 -1.3 1 -2.3 2 Other Proteins 124 -44.4 5 N-term 3XFLAG 2/2 - -33.4 5 66 -19.9 3 Isoform 1 C-term 3XFLAG 2/2 - -13.2 2 PKM2 (P14618) Isoform 3 C-term 3XFLAG 1/1 184 -42.0 5 -30.6 1 158 -54.4 7 C-term 3XFLAG 2/2 - -11.6 2 90 -25.7 4 Isoform 1 N-term 3XFLAG 2/2 71 -17.2 3 CSNK1A1 (P48729) Isoform 3 N-term 3XFLAG 1/2 57 -1.5 1 -22.1 1 900 -125.5 13 C-term 3XFLAG 2/2 649 -70.3 9 Isoform 1 N-term 3XFLAG 1/2 64 -9.0 2 PPM1B 
(O75688) Isoform 3 C-term 3XFLAG 1/1 418 -122.2 13 -81.8 1 378 -112.7 12 N-term 3XFLAG 2/2 472 -71.5 8 46 -9.2 2 C17orf59 (Q96GS4) Isoform 1 C-term 3XFLAG 2/2 134 -4.2 1 -49.4 1  48  X!Tandem Interacting Proteina HGNC (Uniprot ID) DTNBP1 Isoform Epitope Tag # Times Observed Per # Experiments Mascot Score b log(E) c # Unique Peptides d Average log(E) Tier C-term 3XFLAG 1/2 358 -75.9 9 307 -68.8 9 Isoform 1 N-term 3XFLAG 2/2 122 -40.5 6 PRKDC (P78527) Isoform 3 C-term 3XFLAG 1/1 83 -13.4 2 -49.7 1 119 -27.7 2 N-term 3XFLAG 2/2 56 - 1 Isoform 1 C-term 3XFLAG 1/2 46 -11.3 2 PCBP1 (Q15365) Isoform 3 C-term 3XFLAG 1/1 - -1.3 1 -13.4 1 - -22.5 3 N-term 3XFLAG 2/2 88 -12.8 2 57 -2.9 1 BCAS4 (Q8TDM0) Isoform 1 C-term 3XFLAG 2/2 58 - 0 -12.7 1 Isoform 3 C-term 3XFLAG 1/1 - -44.4 6 C-term 3XFLAG 1/2 - -14.5 2 FASN (P49327) Isoform 1 N-term 3XFLAG 1/2 85 - 0 -29.5 1 Isoform 3 C-term 3XFLAG 1/1 142 -46.8 6 104 -34.2 5 AFG3L2 (Q9Y4W6) Isoform 1 C-term 3XFLAG 2/2 - -3.0 1 -28.0 1 90 -19.4 3 Isoform 1 C-term 3XFLAG 2/2 198 -17.9 3 STAT1 (P42224) Isoform 3 C-term 3XFLAG 1/1 41 -2.3 1 -13.2 1 116 -18.6 3 N-term 3XFLAG 2/2 105 -11.5 2 PGAM5 (Q96HS1) Isoform 1 C-term 3XFLAG 1/2 81 -10.7 2 -13.6 1 Isoform 1 C-term 3XFLAG 1/2 - -51.0 7 BTBD7 (Q9P203) Isoform 3 N-term 3XFLAG 1/2 - -47.4 7 -49.2 1 Isoform 3 C-term 3XFLAG 1/1 178 -52.9 7 YWHAE (P62258) Isoform 1 N-term 3XFLAG 1/2 80 -12.1 2 -32.50 1 166 -36.4 5 YTHDC2 (Q9H6S0) Isoform 1 N-term 3XFLAG 2/2 103 -19.0 3 -27.7 1 113 -28.8 2 IGF2BP3 (O00425) Isoform 1 N-term 3XFLAG 2/2 121 -26.4 2 -27.6 1 - -24.0 4 ZNF281 (Q9Y2X9) Isoform 1 C-term 3XFLAG 2/2 - -16.5 3 -20.3 1 64 -11.7 2 N-term 3XFLAG 2/2 46 -3.2 1 QPCTL (Q9NXS2) Isoform 1 C-term 3XFLAG 1/2 - -3.7 1 -6.2 2 C-term 3XFLAG 1/2 89 -3.5 1 - -2.4 1 SDCCAG3 (Q96C92) Isoform 1 N-term 3XFLAG 2/2 - -1.3 1 -2.4 2 Isoform 1 C-term 3XFLAG 1/2 - -44.2 1 CSNK1A1L (Q8N752) Isoform 3 C-term 3XFLAG 1/1 - -16.7 3 -30.5 2 Isoform 1 N-term 3XFLAG 1/2 122 -27.4 4 TOP1 (P11387) Isoform 3 C-term 3XFLAG 
1/1 - -1.0 1 -14.2 2  49  X!Tandem Interacting Proteina HGNC (Uniprot ID) DTNBP1 Isoform Epitope Tag # Times Observed Per # Experiments Mascot Score b log(E) c # Unique Peptides d Average log(E) Tier - -24.2 3 TMCO1 (Q9UM00) Isoform 1 N-term 3XFLAG 2/2 - -2.9 1 -13.6 2 Isoform 3 C-term 3XFLAG 1/1 105 -20.8 3 PHGDH (O43175) Isoform 1 C-term 3XFLAG 1/2 - -1.5 1 -11.2 2 Isoform 1 C-term 3XFLAG 1/2 227 -19.8 3 NT5C2 (P49902) Isoform 3 C-term 3XFLAG 1/1 - -2.4 1 -11.1 2 Isoform 1 N-term 3XFLAG 1/2 116 -11.5 2 TMEM33 (P57088) Isoform 3 C-term 3XFLAG 1/1 - -1.3 1 -6.4 2 98 -11.0 2 MTDH (Q86UE4) Isoform 1 N-term 3XFLAG 2/2 78 -5.2 1 -8.1 2 91 -10.9 2 HERC5 (Q9UII4) Isoform 1 N-term 3XFLAG 2/2 41 -1.4 1 -6.2 2 - -9.9 2 UPF0402 (Q96FH0) Isoform 1 N-term 3XFLAG 2/2 - -1.5 1 -5.7 2 Isoform 3 C-term 3XFLAG 1/1 41 -9.5 2 CDIPT (O14735) Isoform 1 N-term 3XFLAG 1/2 51 -1.9 1 -5.7 2 N-term 3XFLAG 1/2 - -8.7 2 IFIT1 (P09914) Isoform 1 C-term 3XFLAG 1/2 - -2.4 1 -5.6 2 C-term 3XFLAG 1/2 - -8.0 2 CYP4F8 (P98187) Isoform 1 N-term 3XFLAG 1/2 - -1.3 1 -4.7 2 - -3.9 1 C11orf48 (Q9BQE6) Isoform 1 N-term 3XFLAG 2/2 - -3.3 1 -3.6 2 N-term 3XFLAG 1/2 106 -3.9 1 PTPLAD1 (Q9P035) Isoform 1 C-term 3XFLAG 1/2 49 -1.1 1 -2.5 2  a – Interacting proteins are those meeting score quality thresholds (see Methods). b – Mascot score as calculated by the Matrix Science Mascot software algorithm against the Ensembl human protein database. c – The log (E) value is an estimate of the probability that the protein assignment happened randomly as calculated by the X!Tandem algorithm against the Ensembl human protein database. d - Number of uniquely assigned peptides assigned to an individual accession that contribute to the log(E) score as calculated by the X!Tandem algorithm unless no X!Tandem score was found, in which case the number of unique peptides contributing to the Mascot score is shown in italics.  
Additional non-unique peptides which map to multiple proteins within the Ensembl human protein database may contribute to individual X!Tandem or Mascot scores and are not reported here.
e – This protein identification is likely a result of Mascot assigning ion scores from non-unique peptides to multiple proteins.

Table 2.3 - Dynactin Immunoprecipitation-Mass Spectrometry Data
Table 2.3A: This table shows a summary of the number of unique protein hits resulting from the four dynactin complex experiments performed.  The columns from left to right show the number of hits (unique protein assignments) after each successive subtraction to remove background proteins and to apply filtering and quality criteria (see Methods, NSB = non-specific background).

# hits total: 556
# hits after vector X57 NSB removal: 538
# hits after common NSB removal: 463
# hits observed in at least 2 replicates: 37
# hits remaining that meet minimal quality requirements: 31

Table 2.3B: This table shows a summary of the 31 candidate ACTR1A and DCTN2 interacting proteins in X57 cells. Bait protein = the tagged immunoprecipitated protein, either ACTR1A or DCTN2.  The 9 second tier interacting proteins are shaded in grey, leaving 22 first tier interacting proteins. 
X!Tandem Interacting Proteina HGNC (Uniprot ID) Bait Protein # Times Observed Per # Experiments Mascot Score b log(E) c # Unique Peptides d Average log(E) Tier Dynactin Complex 3,263 -476.4 47 ACTR1A N-term 3XFLAG 2/2 1,480 -316.9 36 3,162 -438.1 43 DCTN1 (O08788) DCTN2 N-term 3XFLAG 2/2 1,815 -307.9 35 -384.8 1 2,795 -226.4 22 ACTR1A N-term 3XFLAG 2/2 1,684 -135.7 15 1,489 -196.0 21 ACTR1A (P61164) DCTN2 N-term 3XFLAG 2/2 1,002 -123.3 3 -170.4 1 2,208 -215.6 21 ACTR1A N-term 3XFLAG 2/2 578 -128.6 14 3,650 -177.4 19 DCTN2 (Q99KJ8) DCTN2 N-term 3XFLAG 2/2 3,087 -111.3 12 -158.2 1 745 -134.5 16 DCTN2 N-term 3XFLAG 2/2 591 -104.3 13 902 -124.7 13 CAPZB (P47757) ACTR1A N-term 3XFLAG 2/2 305 -115.8 14 -119.8 1 1,515 -161.7 18 DCTN2 N-term 3XFLAG 2/2 540 -78.7 9 862 -98.3 11 DCTN4 (Q8CBY8) ACTR1A N-term 3XFLAG 2/2 211 -52.5 7 -97.8 1  51  X!Tandem Interacting Proteina HGNC (Uniprot ID) Bait Protein # Times Observed Per # Experiments Mascot Score b log(E) c # Unique Peptides d Average log(E) Tier 585 -81.4 9 ACTR1A N-term 3XFLAG 2/2 275 -53.1 7 600 -45.9 6 DCTN3 (Q9Z0Y1) DCTN2 N-term 3XFLAG 2/2 627 -37.9 5 -54.6 1 151 -21.7 3 DCTN2 N-term 3XFLAG 2/2 105 -6.0 1 105 -13.9 2 DCTN5 (Q9QZB9) ACTR1A N-term 3XFLAG 2/2 149 -3.1 1 -11.2 1 100 -21.5 3 ACTR1A N-term 3XFLAG 2/2 74 -4.1 1 DCTN6 (Q9WUB4) DCTN2 N-term 3XFLAG 1/2 80 -12.2 2 -12.6 1 Chaperonin Containing TCP1 Complex (CCT) 1,109 -162.5 16 CCT2 (P80314) ACTR1A N-term 3XFLAG 2/2 724 -115.3 12 -138.9 1 1,130 -158.8 17 CCT8 (P42932) ACTR1A N-term 3XFLAG 2/2 510 -109.0 14 -133.9 1 1,124 -149.0 16 CCT3 (P80318) ACTR1A N-term 3XFLAG 2/2 688 -74.7 9 -111.9 1 874 -111.5 11 CCT7 (P80313) ACTR1A N-term 3XFLAG 2/2 252 -94.1 11 -102.8 1 557 -72.9 9 CCT6A (P80317) ACTR1A N-term 3XFLAG 2/2 184 -56.9 7 -64.9 1 Tubulin / Actin Associated Proteins 1,255 -196.0 21 DCTN2 N-term 3XFLAG 2/2 434 -84.7 11 453 -103.1 12 ACTR10 (Q9QZB7) ACTR1A N-term 3XFLAG 2/2 177 -41.6 6 -106.4 1 1,530 -180.7 7 DCTN2 N-term 3XFLAG 2/2 914 -134.8 16 2,069 -174.2 
4 ACTR1B (Q8R5C5) ACTR1A N-term 3XFLAG 2/2 1,244 -89.7 2 -144.9 1 292 -50.6 6 ACTR1A N-term 3XFLAG 2/2 280 -36.9 4 141 -13.1 2 TWF1 (Q91YR1) DCTN2 N-term 3XFLAG 2/2 83 -9.9 2 -27.6 1 204 -24.8 4 ACTR1A N-term 3XFLAG 2/2 233 -10.5 2 PDCL3 (Q8BVF2) DCTN2 N-term 3XFLAG 1/2 80 -10.9 2 -15.4 1  52  X!Tandem Interacting Proteina HGNC (Uniprot ID) Bait Protein # Times Observed Per # Experiments Mascot Score b log(E) c # Unique Peptides d Average log(E) Tier 941 -180.3 6 TUBB3 (Q9ERD7) ACTR1A N-term 3XFLAG 2/2 753 -129.4 4 -154.9 1 231 -43.9 5 KLHL2 (Q8JZP3) DCTN2 N-term 3XFLAG 2/2 120 -37.8 5 -40.85 1 819 -165.1 2 TUBB2B (Q9CWF2) ACTR1A N-term 3XFLAG 2/2 823 -150.5 1 -157.8 2 1,264 -138.2 1 TUBA1A (P68369) ACTR1A N-term 3XFLAG 2/2 780 -114.4 1 -126.3 2 Brain / Neuron Associated Proteins DCTN2 N-term 3XFLAG 1/2 - -8.0 2 CDH15 (P33146) ACTR1A N-term 3XFLAG 1/2 - -1.1 1 -4.6 2 Vesicular Transport / Trafficking Associated / Transport Proteins ACTR1A N-term 3XFLAG 1/2 - -16.0 3 CLCN1 (P35523) DCTN2 N-term 3XFLAG 1/2 - -8.1 2 -12.1 1 - -50.4 3 KCTD2 (Q14681) ACTR1A N-term 3XFLAG 2/2 - -1.5 1 -26.0 2 Other Proteins 128 -71.8 10 CAD (076014) ACTR1A N-term 3XFLAG 2/2 113 -32.3 5 -52.1 1 132 -48.7 7 PFKP (Q9WUA3) ACTR1A N-term 3XFLAG 2/2 92 -18.4 3 -33.6 1 41 -4.0 1 DCTN2 N-term 3XFLAG 2/2 41 -2.7 1 42 -2.7 1 FEM1A (Q9Z2G1) ACTR1A N-term 3XFLAG 2/2 52 -1.4 1 -2.7 2 DCTN2 N-term 3XFLAG 1/2 - -10.0 2 - -3.2 1 MSH5 (Q9QUM7) ACTR1A N-term 3XFLAG 2/2 - -1.3 1 -4.8 2 ACTR1A N-term 3XFLAG 1/2 85 -40.5 3 GAPDHS (Q64467) DCTN2 N-term 3XFLAG 1/2 73 -23.1 1 -31.8 2 DCTN2 N-term 3XFLAG 1/2 40 -8.3 2 TPI1 (P17751) ACTR1A N-term 3XFLAG 1/2 91 -2.5 1 -5.4 2 44 -4.4 1 SSR1 (Q9CY50) ACTR1A N-term 3XFLAG 2/2 49 -2.8 1 -3.6 2   53 a – Interacting proteins are those meeting score quality thresholds (see Methods). b – Mascot score as calculated by the Matrix Science Mascot software algorithm against the Ensembl mouse protein database. 
c – The log(E) value is an estimate of the probability that the protein assignment happened randomly as calculated by the X!Tandem algorithm against the Ensembl mouse protein database.
d – Number of uniquely assigned peptides assigned to an individual accession that contribute to the log(E) score as calculated by the X!Tandem algorithm unless no X!Tandem score was found, in which case the number of unique peptides contributing to the Mascot score is shown in italics.  Additional non-unique peptides which map to multiple proteins within the Ensembl mouse protein database may contribute to individual X!Tandem or Mascot scores and are not reported here.

Table 2.4 - Exocyst Immunoprecipitation-Mass Spectrometry Data
Table 2.4A: This table shows a summary of the number of unique protein hits resulting from the four exocyst complex experiments performed.  The columns from left to right show the number of hits (unique protein assignments) after each successive subtraction to remove background proteins and to apply filtering and quality criteria (see Methods, NSB = non-specific background).

# hits total: 631
# hits after vector X57 NSB removal: 574
# hits after common NSB removal: 479
# hits observed in at least 2 replicates: 73
# hits remaining that meet minimal quality requirements: 56

Table 2.4B: This table shows a summary of the 56 candidate EXOC3 and EXOC4 interacting proteins in X57 cells. Bait protein = the tagged immunoprecipitated protein, either EXOC3 or EXOC4.  The 24 second tier interacting proteins are shaded in grey, leaving 32 first tier interacting proteins.  
X!Tandem Interacting Protein a HGNC (Uniprot ID) Bait Protein # Times Observed Per # Experiments Mascot Score b log(E) c # Unique Peptides d Average log(E) Tier Exocyst Complex 2,733 -461.1 38 EXOC4 N-term 3XFLAG 2/2 2,764 -425.6 34 305 -134.9 15 EXOC4 (O35382) EXOC3 N-term 3XFLAG 2/2 414 -81.5 9 -275.8 1 4,158 -327.8 32 EXOC3 N-term 3XFLAG 2/2 3,192 -295.1 28 311 -78.1 9 EXOC3 (Q6KAR6) EXOC4 N-term 3XFLAG 2/2 193 -46.3 6 -186.8 1 378 -99.4 11 EXOC3 N-term 3XFLAG 2/2 242 -68.2 9 94 -18.1 3 EXOC2 (Q9D4H1) EXOC4 N-term 3XFLAG 2/2 42 -17.9 3 -50.9 1 110 -50.1 7 EXOC3 N-term 3XFLAG 2/2 312 -49.9 6 - -18.7 3 EXOC1 (Q8R3S6) EXOC4 N-term 3XFLAG 2/2 55 -1.2 1 -28.8 1 831 -182.4 18 EXOC7 (O35250) EXOC4 N-term 3XFLAG 2/2 469 -142.2 16 -162.3 1 472 -117.9 13 EXOC5 (Q3TPX4) EXOC4 N-term 3XFLAG 2/2 372 -113.8 13 -115.9 1  55 X!Tandem Interacting Proteina HGNC (Uniprot ID) Bait Protein # Times Observed Per # Experiments Mascot Score b log(E) c # Unique Peptides d Average log(E) Tier 184 -34.7 5 EXOC6 (Q8R313) EXOC4 N-term 3XFLAG 2/2 86 -32.7 -33.7 1 5 65 -19.6 3 EXOC6B (Q6PCN5) EXOC4 N-term 3XFLAG 2/2 54 -8.6 1 -12.2 2 Dynactin Complex EXOC4 N-term 3XFLAG 1/2 154 -18.9 3 CAPZB (P47757) EXOC3 N-term 3XFLAG 75 -15.4 3 -17.2 1 1/2 Chaperonin Containing TCP1 Complex (CCT) 103 -39.8 5 EXOC4 N-term 3XFLAG 2/2 113 -11.1 2 CCT3 (P80318) 133 -12.8 2 EXOC3 N-term 3XFLAG 2/2 137 -11.6 2 -18.8 1 280 -37.6 5 EXOC4 N-term 3XFLAG 2/2 81 -27.1 4 170 -11.3 2 CCT8 (P42932) EXOC3 N-term 3XFLAG -19.7 1 2/2 57 -2.8 1 Tubulin / Actin Associated Proteins 931 -147.4 16 EXOC4 N-term 3XFLAG 2/2 505 -80.3 10 708 -104.7 11 TUBA1A (P68369) EXOC3 N-term 3XFLAG 2/2 494 -85.4 10 -104.5 1 600 -128.1 1 EXOC4 N-term 3XFLAG 2/2 412 -100.7 1 388 -104.8 13 TUBB2C (P68372) EXOC3 N-term 3XFLAG 2/2 656 -90.7 2 -106.1 1 321 -111.4 11 EXOC4 N-term 3XFLAG 2/2 531 -109.1 2 363 -80.3 3 TUBB3 (Q9ERD7) EXOC3 N-term 3XFLAG 2/2 643 -64.8 2 -91.4 1 EXOC4 N-term 3XFLAG 1/2 - -64.6 9 TTN (A2ASS6) EXOC3 N-term 3XFLAG 1/2 - -61.7 9 
-48.7 1 210 -69.0 1 EXOC4 N-term 3XFLAG 2/2 130 e - 1 246 -49.4 1 TUBB6 (Q922F4) EXOC3 N-term 3XFLAG 2/2 282 e - 2 -59.2 2  56  X!Tandem Interacting Proteina HGNC (Uniprot ID) Bait Protein # Times Observed Per # Experiments Mascot Score b log(E) c # Unique Peptides d Average log(E) Tier 91 -11.6 1 EXOC3 N-term 3XFLAG 2/2 127 -11.3 1 240e - 1 ACTBL2 (Q8BFZ3) EXOC4 N-term 3XFLAG 2/2 211 e - 2 -11.5 2 EXOC3 N-term 3XFLAG 1/2 59 -28.2 4 MTAP1A (Q9QYR6) EXOC4 N-term 3XFLAG 1/2 44 -9.2 1 -18.7 2 Brain / Neuron Associated Proteins 64 -19.3 3 EXOC4 N-term 3XFLAG 2/2 65 -10.0 2 71 -13.3 2 ENO1 (P17182) EXOC3 N-term 3XFLAG 2/2 - -3.1 1 -11.4 1 EXOC3 N-term 3XFLAG 1/2 157 -36.6 5 107 -26.4 4 PRPH1 (P15331) EXOC4 N-term 3XFLAG 2/2 65 - 4 -24.0 1 EXOC4 N-term 3XFLAG 1/2 - -25.5 4 - -15.8 3 GDF7 (P43029) EXOC3 N-term 3XFLAG 2/2 - -10.1 2 -17.1 1 EXOC4 N-term 3XFLAG 1/2 57 -3.1 1 PHGDH (Q61753) EXOC3 N-term 3XFLAG 1/2 45 -1.4 1 -2.3 2 Vesicular Transport / Trafficking Associated / Transporter Proteins 149 -29.2 4 EXOC3 N-term 3XFLAG 2/2 58 - 2 56 -18.8 3 ARF3 (P61205) EXOC4 N-term 3XFLAG 2/2 58 -3.9 1 -17.3 1 79 -18.2 2 EXOC4 N-term 3XFLAG 2/2 58 - 0 102 -10.5 2 ARF4 (P61750) EXOC3 N-term 3XFLAG 2/2 77 - 1 -14.4 1 75 -11.0 2 EXOC4 N-term 3XFLAG 2/2 64 -3.3 1 126 -10.2 2 SEC61A1 (P61620) EXOC3 N-term 3XFLAG 2/2 85 -5.0 1 -7.4 1 270 -90.1 11 EXOC4 N-term 3XFLAG 2/2 256 -43.9 6 HERC1 (Q99KS8) EXOC3 N-term 3XFLAG 1/2 - -1.1 1 -45.0 1  57  X!Tandem Interacting Proteina HGNC (Uniprot ID) Bait Protein # Times Observed Per # Experiments Mascot Score b log(E) c # Unique Peptides d Average log(E) Tier EXOC4 N-term 3XFLAG 2/2 345 -51.9 3 KCTD2 (Q8CEZ0) EXOC3 N-term 3XFLAG 1/2 294 -39.0 3 -45.5 1 EXOC4 N-term 3XFLAG 1/2 - -51.1 7 - -46.6 7 UNC80 (Q8BLN6) EXOC3 N-term 3XFLAG 2/2 - -38.1 6 -45.3 1 EXOC3 N-term 3XFLAG 2/2 128 -20.2 3 SLC16A3 (P57787) EXOC4 N-term 3XFLAG 1/2 45 -8.2 2 -14.2 1 104 -7.9 1 EXOC4 N-term 3XFLAG 2/2 73 -3.5 1 90 -4.9 1 SLC16A1 (P53986) EXOC3 N-term 3XFLAG 2/2 44 -2.2 
1 -4.6 2 EXOC3 N-term 3XFLAG 1/2 - -24.1 1 614 -17.7 1 GAPDH (P16858) EXOC4 N-term 3XFLAG 2/2 151 - 0 -20.9 2 102 -9.1 1 EXOC3 N-term 3XFLAG 2/2 82 -6.1 1 RAB1B (Q9D1G1) EXOC4 N-term 3XFLAG 1/2 74 - 1 -7.6 2 - -9.5 2 LRP1 (Q91ZX7) EXOC4 N-term 3XFLAG 2/2 - -1.1 1 -5.3 2 EXOC4 N-term 3XFLAG 1/2 93 -5.0 1 ABCF2 (Q99LE6) EXOC3 N-term 3XFLAG 1/2 48 -4.0 1 -4.5 2 Other Proteins 324 -53.2 6 EXOC4 N-term 3XFLAG 2/2 100 -50.1 7 - -3.0 1 PPM1B (P36993) EXOC3 N-term 3XFLAG 2/2 71 -1.5 1 -27.0 1 64 -33.8 5 EXOC3 N-term 3XFLAG 2/2 64 -9.0 2 140 -31.0 4 CAD (076014) EXOC4 N-term 3XFLAG 2/2 90 -18.2 3 -23.0 1 104 -11.9 2 EXOC3 N-term 3XFLAG 2/2 100 -9.8 2 79 -10.2 2 DDOST (O54734) EXOC4 N-term 3XFLAG 2/2 53 - 3 -10.6 1  58  X!Tandem Interacting Proteina HGNC (Uniprot ID) Bait Protein # Times Observed Per # Experiments Mascot Score b log(E) c # Unique Peptides d Average log(E) Tier 95 -21.8 3 EXOC3 N-term 3XFLAG 2/2 52 - 2 YWHAZ (P63101) EXOC4 N-term 3XFLAG 1/2 94 -11.4 2 -16.6 1 EXOC4 N-term 3XFLAG 1/2 - -61.4 9 ARID1B (Q8NFD5) EXOC3 N-term 3XFLAG 1/2 - -52.2 8 -56.8 1 EXOC4 N-term 3XFLAG 1/2 173 -48.9 6 PDIA3 (P27773) EXOC3 N-term 3XFLAG 1/2 174 -44.3 6 -46.6 1 EXOC3 N-term 3XFLAG 1/2 131 -28.3 4 PRPSAP1 (Q9D0M1) EXOC4 N-term 3XFLAG 1/2 172 -16.2 3 -22.3 1 EXOC4 N-term 3XFLAG 1/2 61 -18.8 3 PKM2 (P52480) EXOC3 N-term 3XFLAG 1/2 65 -10.2 2 -14.5 1 129 -26.4 1 EXOC3 N-term 3XFLAG 2/2 189 - 1 92 -21.6 2 GAPDHS (Q64467) EXOC4 N-term 3XFLAG 2/2 102 - 0 -24.0 2 96 -10.7 2 EXOC3 N-term 3XFLAG 2/2 - -1.0 1 53 -2.8 1 HSD17B12 (O70503) EXOC4 N-term 3XFLAG 2/2 83 -2.5 1 -6.5 2 77 -19.2 3 EXOC4 N-term 3XFLAG 2/2 48 -2.7 1 49 -2.1 1 LDHA (P00338) EXOC3 N-term 3XFLAG 2/2 61 - 2 -8.0 2 79 -11.8 2 EXOC3 N-term 3XFLAG 2/2 81 -10.2 1 YWHAQ (P68254) EXOC4 N-term 3XFLAG 1/2 59 - 1 -11.0 2 47 -11.4 2 EXOC3 N-term 3XFLAG 2/2 60 -1.8 1 SSR1 (Q9CY50) EXOC4 N-term 3XFLAG 1/2 53 -2.0 1 -5.1 2  59  X!Tandem Interacting Proteina HGNC (Uniprot ID) Bait Protein # Times Observed Per # Experiments Mascot 
Score b log(E) c # Unique Peptides d Average log(E) Tier - -9.3 2 EXOC3 N-term 3XFLAG 2/2 - -1.1 1 MAGEE2 (Q52KG3) EXOC4 N-term 3XFLAG 1/2 - -2.5 1 -4.3 2 49 -9.1 2 EXOC4 N-term 3XFLAG 2/2 68 - 3 PFKP (Q9WUA3) EXOC3 N-term 3XFLAG 1/2 - -1.4 1 -5.3 2 46 -8.5 2 EXOC3 N-term 3XFLAG 2/2 75 -2.3 1 GALK1 (Q9R0N0) EXOC4 N-term 3XFLAG 1/2 - -3.8 1 -4.9 2 70 -5.9 1 EXOC4 N-term 3XFLAG 2/2 98 -2.8 1 MEST (Q07646) EXOC3 N-term 3XFLAG 1/2 43 -2.1 1 -3.6 2 EXOC4 N-term 3XFLAG 1/2 94 -18.9 3 PRPSAP2 (Q8R574) EXOC3 N-term 3XFLAG 1/2 59 -10.9 1 -14.9 2 79 -11.4 2 CAND1 (Q6ZQ38) EXOC3 N-term 3XFLAG 2/2 - -1.0 1 -6.2 2 EXOC4 N-term 3XFLAG 1/2 79 -9.6 2 TPI1 (P17751) EXOC3 N-term 3XFLAG 1/2 64 -4.2 1 -6.9 2 EXOC3 N-term 3XFLAG 1/2 - -9.1 2 RPN2 (Q9DBG6) EXOC4 N-term 3XFLAG 1/2 46 -4.8 1 -7.0 2 1/2 - -3.1 1 DHCR7 (O88455) EXOC4 N-term 3XFLAG 1/2 - -2.8 1 -3.0 2  a – Interacting proteins are those meeting score quality thresholds (see Methods). b – Mascot score as calculated by the Matrix Science Mascot software algorithm against the Ensembl mouse protein database. c – The log (E) value is an estimate of the probability that the protein assignment happened randomly as calculated by the X!Tandem algorithm against the Ensembl mouse protein database. d - Number of uniquely assigned peptides assigned to an individual accession that contribute to the log(E) score as calculated by the X!Tandem algorithm unless no X!Tandem score was found, in which case the number of unique peptides contributing to the Mascot score is shown in italics.  Additional non-unique  60 peptides which map to multiple proteins within the Ensembl mouse protein database may contribute to individual X!Tandem or Mascot scores and are not reported here. e – This protein identification is likely a result of Mascot assigning ion scores from non-unique peptides to multiple proteins.  
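The "Average log(E)" column in Tables 2.2B, 2.3B and 2.4B appears to be the arithmetic mean of the per-observation X!Tandem log(E) values for each protein, with observations that have no X!Tandem score (shown as "-") left out. This is an inference from the tabulated numbers, not a rule stated in the table captions. The short Python sketch below illustrates that reading using two rows from Table 2.4B. The function name `average_log_e` is hypothetical, and `None` is an assumed stand-in for a missing score.

```python
from typing import Optional, Sequence


def average_log_e(values: Sequence[Optional[float]]) -> float:
    """Mean of the available log(E) values; None marks a missing X!Tandem score."""
    scored = [v for v in values if v is not None]
    if not scored:
        raise ValueError("no X!Tandem log(E) values to average")
    return sum(scored) / len(scored)


# EXOC4 in Table 2.4B: four observations, all with X!Tandem scores.
exoc4 = average_log_e([-461.1, -425.6, -134.9, -81.5])

# ARF4 in Table 2.4B: two of the four observations have no X!Tandem score.
arf4 = average_log_e([-18.2, None, -10.5, None])

print(f"EXOC4 mean log(E): {exoc4:.3f} (table reports -275.8)")
print(f"ARF4 mean log(E): {arf4:.3f} (table reports -14.4)")
```

Under this assumption, both computed means agree with the tabulated averages to within rounding at one decimal place.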
Table 2.5 - Results of GO Analysis for DTNBP1 Interacting Proteins
This table displays the results of the DAVID Gene Ontology analysis of DTNBP1 immunoprecipitation results (76/83 identifications found by DAVID with Homo sapiens background).

Gene Ontology Term                                # Contributing  % of Genes Annotated  Multiple Test Corrected
                                                  Genes           to this Term          P-Value (Benjamini)

Biological Process
  cellular localization                           19              25                    2.6x10^-5
  establishment of cellular localization          19              25                    3.4x10^-5
  vesicle-mediated transport                      14              18.4                  1.6x10^-4
  cellular component organization and biogenesis  29              38.2                  6.9x10^-4
  protein transport                               14              18.4                  3.4x10^-3
  intracellular transport                         14              18.4                  3.7x10^-3
  cytoskeleton organization and biogenesis        12              15.8                  4.7x10^-3
  establishment of protein localization           14              18.4                  5.1x10^-3
  protein localization                            14              18.4                  7.4x10^-3
  macromolecule localization                      14              18.4                  1.3x10^-2
  transport                                       25              32.9                  2.8x10^-2
  establishment of localization                   25              32.9                  4.1x10^-2
  localization                                    27              35.5                  4.2x10^-2
  secretion                                       9               11.8                  4.4x10^-2
  secretion by cell                               8               10.5                  4.8x10^-2

Cellular Component
  cytoplasm                                       57              75                    8.2x10^-11
  cytoplasmic part                                40              52.6                  4.5x10^-7
  cytoskeletal part                               17              22.4                  1.0x10^-5
  vesicle                                         13              17.1                  2.1x10^-5
  intracellular part                              62              81.6                  2.3x10^-5
  cytoplasmic vesicle                             13              17.1                  2.4x10^-5
  protein complex                                 27              35.5                  2.4x10^-5
  coated membrane                                 7               9.2                   3.6x10^-5
  membrane coat                                   7               9.2                   3.6x10^-5
  macromolecular complex                          29              38.2                  6.0x10^-5
  cytoskeleton                                    19              25                    6.4x10^-5
  actin cytoskeleton                              10              13.2                  1.1x10^-4
  cytoplasmic membrane-bound vesicle              11              14.5                  1.2x10^-4
  membrane-bound vesicle                          11              14.5                  1.2x10^-4
  intracellular                                   62              81.6                  1.5x10^-4
  intracellular organelle part                    33              43.4                  1.6x10^-4
  organelle part                                  33              43.4                  1.6x10^-4
  Golgi apparatus                                 13              17.1                  7.2x10^-4
  microtubule cytoskeleton                        10              13.2                  2.3x10^-3
  cell cortex                                     5               6.6                   5.8x10^-3
  microtubule associated complex                  6               7.9                   7.0x10^-3
  chaperonin-containing T-complex                 3               3.9                   1.0x10^-2
  coated vesicle                                  6               7.9                   1.1x10^-2
  intracellular non-membrane-bound organelle      20              26.3                  1.2x10^-2
  non-membrane-bound organelle                    20              26.3                  1.2x10^-2
  dynactin complex                                3               3.9                   1.2x10^-2
  microtubule                                     7               9.2                   1.3x10^-2
  pigment granule                                 5               6.6                   1.4x10^-2
  melanosome                                      5               6.6                   1.4x10^-2
  coated pit                                      4               5.3                   3.9x10^-2
  cytosol                                         9               11.8                  4.3x10^-2
  coated vesicle membrane                         4               5.3                   5.0x10^-2

Molecular Function
  protein binding                                 54              71.1                  2.8x10^-6
  binding                                         67              88.2                  6.9x10^-4
  nucleotide binding                              24              31.6                  1.3x10^-2

Table 2.6 - Results of GO Analysis for Dynactin