UBC Faculty Research and Publications

Neurocarta: aggregating and sharing disease-gene relations for the neurosciences
Portales-Casamar, Elodie; Ch’ng, Carolyn; Lui, Frances; St-Georges, Nicolas; Zoubarev, Anton; Lai, Artemis Y; Lee, Mark; Kwok, Cathy; Kwok, Willie; Tseng, Luchia; Pavlidis, Paul
Feb 26, 2013


DATABASE  Open Access

Neurocarta: aggregating and sharing disease-gene relations for the neurosciences

Elodie Portales-Casamar, Carolyn Ch’ng, Frances Lui, Nicolas St-Georges, Anton Zoubarev, Artemis Y Lai, Mark Lee, Cathy Kwok, Willie Kwok, Luchia Tseng and Paul Pavlidis*

Abstract

Background: Understanding the genetic basis of diseases is key to the development of better diagnoses and treatments. Unfortunately, only a small fraction of the existing data linking genes to phenotypes is available through online public resources and, when available, it is scattered across multiple access tools.

Description: Neurocarta is a knowledgebase that consolidates information on genes and phenotypes across multiple resources and allows tracking and exploring of the associations. The system enables automatic and manual curation of evidence supporting each association, as well as user-enabled entry of their own annotations. Phenotypes are recorded using controlled vocabularies such as the Disease Ontology to facilitate computational inference and linking to external data sources. The gene-to-phenotype associations are filtered by stringent criteria to focus on the annotations most likely to be relevant. Neurocarta is constantly growing and currently holds more than 30,000 lines of evidence linking over 7,000 genes to 2,000 different phenotypes.

Conclusions: Neurocarta is a one-stop shop for researchers looking for candidate genes for any disorder of interest. In Neurocarta, they can review the evidence linking genes to phenotypes and filter out the evidence they’re not interested in. In addition, researchers can enter their own annotations from their experiments and analyze them in the context of existing public annotations.
Neurocarta’s in-depth annotation of neurodevelopmental disorders makes it a unique resource for neuroscientists working on brain development.

Keywords: Phenotype, Genes, Knowledgebase, Brain development

Background

There is a tremendous amount of research focusing on understanding the genetic basis of disease. Studies use a wide range of strategies, including targeted gene approaches, genome-wide screens, and animal models. As such studies continue to proliferate and provide insights on specific disorders, it is important to integrate the information in order to make the best use of the data and increase the level of insight that can be gained from new studies. Knowledge that crosses studies and disorders can be used to perform meta-analyses, to uncover commonalities among conditions, and to tease apart the factors that contribute to phenotypes that make up a disorder. However, currently, information about the genetic and molecular basis of diseases is distributed among a range of specialized or generic data resources, hindering its optimal use [1].
Examples of more or less generic databases are Online Mendelian Inheritance in Man (OMIM) [2], the Rat Genome Database (RGD) [3], and the Comparative Toxicogenomics Database (CTD) [4]. While these different resources overlap in their disease coverage and data sources, they are also complementary in that each of the curation teams making the annotations has different criteria for inclusion and different biases. Other databases are dedicated to specific disorders; these include the Simons Foundation Autism Research Initiative Gene Database (SFARI Gene) for autism [5], PDGene for Parkinson’s disease [6], AlzGene for Alzheimer’s disease [7], MSGene for multiple sclerosis [8], ADHDgene for Attention Deficit Hyperactivity Disorder [9], and CADgene for Coronary Artery Disease [10].

* Correspondence: paul@chibi.ubc.ca
Centre for High-Throughput Biology and Department of Psychiatry, University of British Columbia, 2125 East Mall, Vancouver, BC V6T1Z4, Canada
© 2013 Portales-Casamar et al.; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Portales-Casamar et al. BMC Genomics 2013, 14:129
http://www.biomedcentral.com/1471-2164/14/129

The resource we describe was motivated by the establishment of a large Canadian research network, “NeuroDevNet”, with the goal of translating knowledge into improved diagnosis, prevention and treatment of neurodevelopmental disorders [11,12]. To facilitate the design and interpretation of genetics studies, we recognized the need for a resource that captures existing information, but none of the resources mentioned above was sufficiently comprehensive.
This was in part because genetic investigations of two of the disorders of interest to NeuroDevNet, Fetal Alcohol Spectrum Disorder (FASD) and Cerebral Palsy (CP), were not well covered by any existing database.

Neurocarta is an online resource focusing on the genetic basis of neurodevelopmental disorders. In addition to containing manually curated information on disorders of interest to neurodevelopmental researchers, Neurocarta aggregates data from multiple disease gene resources so that the neurodevelopmental annotations can be examined in the context of other disease annotations, providing a better understanding of how generic the function of the gene might be.

Construction and content

Database schema and implementation

Neurocarta was developed as an extension of Gemma [13], a database and software system for the meta-analysis of functional genomics data. Figure 1 shows a simplified schematic of our data model used to capture information linking genes and phenotypes. The “Gene” information is retrieved automatically as part of the Gemma framework from the NCBI Gene database [14]. Gemma currently focuses on a set of selected species: human, mouse, rat, zebrafish, fly, worm, and yeast. The “Phenotype” information includes terms describing diseases, symptoms, and abnormal physical characteristics, drawn from three distinct ontologies: (i) Disease Ontology [15]; (ii) Human Phenotype Ontology [16]; and (iii) Mammalian Phenotype Ontology [17]. The “Evidence” corresponds to annotations linking a specific gene to a specific phenotype. This evidence can be of several types: (i) Literature (reference from PubMed [18]); (ii) Experimental (details about experimental design, from a published article or not); and (iii) User comment. Where possible, links are provided to the original source of the evidence (e.g., public database, review article). Evidence can be defined as “positive” or “negative”, where “negative” means that the evidence shows that there is no association between a gene and a phenotype.
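The Gene, Phenotype, and Evidence entities described above can be pictured with a minimal sketch. It is only an illustration of the relationships named in the text, not the actual Gemma schema; all class names, field names, and the example record are assumptions introduced here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EvidenceType(Enum):
    """The three evidence types named in the text."""
    LITERATURE = "literature"        # reference from PubMed
    EXPERIMENTAL = "experimental"    # details about experimental design
    USER_COMMENT = "user comment"


@dataclass(frozen=True)
class Gene:
    ncbi_id: int   # NCBI Gene identifier
    symbol: str
    taxon: str     # e.g. "human", "mouse", "rat"


@dataclass(frozen=True)
class Phenotype:
    uri: str       # ontology term URI (DO, HPO, or MP)
    label: str


@dataclass(frozen=True)
class Evidence:
    """One line of evidence linking a specific gene to a specific phenotype."""
    gene: Gene
    phenotype: Phenotype
    evidence_type: EvidenceType
    is_negative: bool = False          # True: evidence shows no association
    pubmed_id: Optional[str] = None    # hypothetical field
    source_url: Optional[str] = None   # link to the original source, if any


# Illustrative record only; it is not taken from the Neurocarta database.
example = Evidence(
    gene=Gene(ncbi_id=5728, symbol="PTEN", taxon="human"),
    phenotype=Phenotype(
        uri="http://purl.obolibrary.org/obo/DOID_0060041",
        label="autism spectrum disorder",
    ),
    evidence_type=EvidenceType.LITERATURE,
)
```

Keeping the positive/negative flag on the evidence record, rather than on the gene or the phenotype, mirrors the description that each line of evidence is defined as positive or negative on its own.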
All evidence annotations use standardized terminologies such as the Ontology for Biomedical Investigations [19] to facilitate users’ interpretation and enable computational analysis. Currently we do not attempt to capture information on the specific genetic variants associated with the disease, as such information is frequently not readily available in computable form, making acquisition challenging.

Neurocarta benefits from the registration system implemented in Gemma [13], which gives users the option of registering and entering their own annotations. The annotations can be set to be either public or private. When private, the owner can decide whom to share them with, using a group-based authorization framework.

Database content

Our database currently contains more than 30,000 lines of evidence linking over 7,000 genes to 2,000 different phenotypes (for detailed statistics, see http://www.chibi.ubc.ca/Gemma/neurocartaStatistics.html). Figure 2 shows the distribution of genes (2A) and phenotypes (2B) based on how many distinct associations they are a part of. Tables 1 and 2 detail the top ten genes and phenotypes, respectively, with the most distinct associations. The associations are derived from manual annotations from the literature and automatic annotations from public databases.

Data extraction from external sources

We have defined stringent criteria for automatic inclusion of data from external sources, with the goal of limiting the inclusion of unreliable data or information that we deem of limited utility to our target audience. In this section we provide details of procedures for each resource. As we are continuing to add resources to the system, information on the inclusion criteria and import procedures is also maintained on the Neurocarta website at http://gemma-doc.chibi.ubc.ca/neurocarta/data-sources.

OMIM [2]: The OMIM data files (morbidmap.txt and mim2gene.txt) are downloaded from the OMIM FTP site.
We extract unique mappings between Phenotype MIM numbers and Gene MIM numbers from morbidmap.txt and map the genes to their NCBI identifiers using mim2gene.txt.

RGD [3]: The RGD Gene-Disease association files (homo_genes_rdo, mus_genes_rdo, rattus_genes_rdo) are downloaded from the RGD FTP site. Annotations with the following evidence codes are ignored: ISS (redundant across species), NAS (non-traceable author’s statements are debatable), and IEA (electronic annotations come from other sources, GAD for example, and we prefer to get these annotations directly from the source). Annotations without a PubMed reference are ignored as well.

Figure 1 Gene-to-phenotype association data model in Neurocarta.

CTD [4]: The CTD Gene-Disease association file (CTD_genes_diseases.tsv) is downloaded from the CTD website. We only consider curated annotations with Direct Evidence set to “marker/mechanism” or “therapeutic”, and at least one PubMed reference.

Disease-specific databases: The SFARI [5] annotation files (autism-gene-dataset.csv, gene-score.csv) are downloaded from the SFARI Gene website. Each PubMed reference is imported as separate literature evidence in Neurocarta, with the option of it being defined as “negative” whenever specified in the annotation file. The PDGene [6], AlzGene [7], and MSGene [8] “Top Results” are extracted from their respective websites. All three databases assess their results for their epidemiological credibility using two methods: (1) the HuGENet interim criteria for the cumulative assessment of genetic associations [20,21], and (2) Bayesian analyses [22,23]. Only meta-analysis results with P-values < 0.00001 are considered. The “Hot gene list” from ADHDgene [9] is extracted from their website. This list includes all genes that have been identified in at least five independent studies.
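The inclusion rules stated so far for RGD, CTD, and the meta-analysis databases can be summarised in a short sketch. It assumes each annotation has already been parsed into plain values; the function names and argument names are hypothetical and do not correspond to column headers in the downloaded files or to the actual import code.

```python
from typing import Sequence

# Evidence codes that the text says are ignored for RGD annotations.
RGD_IGNORED_CODES = frozenset({"ISS", "NAS", "IEA"})

# Direct Evidence values that the text says are kept for CTD annotations.
CTD_KEPT_EVIDENCE = frozenset({"marker/mechanism", "therapeutic"})

# Meta-analysis results are considered only below this P-value.
META_ANALYSIS_P_CUTOFF = 1e-5


def keep_rgd(evidence_code: str, pubmed_ids: Sequence[str]) -> bool:
    """Keep an RGD annotation unless its code is ignored or it lacks a PubMed reference."""
    return evidence_code not in RGD_IGNORED_CODES and len(pubmed_ids) > 0


def keep_ctd(direct_evidence: str, pubmed_ids: Sequence[str]) -> bool:
    """Keep a CTD annotation with accepted direct evidence and at least one PubMed reference."""
    return direct_evidence in CTD_KEPT_EVIDENCE and len(pubmed_ids) > 0


def keep_meta_analysis(p_value: float) -> bool:
    """Keep a PDGene, AlzGene, or MSGene meta-analysis result only if P < 0.00001."""
    return p_value < META_ANALYSIS_P_CUTOFF
```

Expressing each rule as a small predicate keeps the criteria for each source separate, which matches the way the text documents them resource by resource.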
The ALSoD [24] top 20 genes are identified through the credibility score analysis provided on their website. The genes are ranked by the number of affected patients and by the number of mutations per gene, and the ranks are summed to determine the final rank for each gene. For the IDGene [25] and EpiGAD [26] databases, we wanted to extract more information than what was readily accessible through their respective websites. We manually reviewed the genes listed in each database and used that information as a seed for targeted PubMed searches and manual curation of relevant publications.

Disease mapping from external sources to Disease Ontology (DO) terminology

For the disorder-specific databases we use the corresponding terms in DO (e.g., “autism spectrum disorder” for SFARI and “amyotrophic lateral sclerosis” for ALSoD). As described next, for other databases we used a combination of automatic and semi-automatic methods for mapping.

OMIM, RGD, and CTD: These three resources provide OMIM or MeSH terms that we mapped to DO terms as follows. First, we use the Xref mappings provided in the Human_DO.obo ontology file, which cover about 50% of the phenotype-gene mappings in these resources. For the remaining mappings, which use terms lacking a DO Xref, we use the NCBO Annotator Web service [27] followed by manual quality control to resolve partial matches, increasing coverage substantially. In total about 2/3 of the phenotype-gene associations present in OMIM, RGD, or CTD could be mapped to a DO term. The unmapped remainder is due to non-disease terms that are listed in OMIM but not in DO (e.g., “Blood type”, “Ig levels”), disease terms missing from DO (mostly syndromic, e.g., TARP syndrome, Jawad syndrome), or missed mappings. We have notified the DO maintainers of these gaps and expect to eventually be able to import a greater fraction of these annotations into Neurocarta.

Figure 2 Distribution of genes (2A) and phenotypes (2B) based on their number of distinct associations. Each ontology term is considered a distinct phenotype regardless of its position in the ontology tree. Therefore, a gene will be counted as associated with two distinct phenotypes if different lines of evidence link it to a child term and its parent term.

Table 1 Top ten genes with the most associated phenotypes

Gene symbol  Gene name                                               NCBI ID  # of phenotypes
TNF          tumor necrosis factor                                   7124     111
PTGS2        prostaglandin-endoperoxide synthase 2                   5743     109
MMP9         matrix metallopeptidase 9                               4318     82
IL6          interleukin 6                                           3569     79
PTEN         phosphatase and tensin homolog                          5728     75
HLA-DRB1     major histocompatibility complex, class II, DR beta 1   3123     75
IL1B         interleukin 1, beta                                     3553     73
TP53         tumor protein p53                                       7157     66
MTHFR        methylenetetrahydrofolate reductase                     4524     66
TGFB1        transforming growth factor, beta 1                      7040     66

Manual curation of the literature

While the Neurocarta framework is generic, our curation team is focusing on annotations relevant to our primary research interest, neurodevelopmental disorders. In-depth annotations have been produced on the following Disease Ontology terms (including respective children terms): (i) “Autism Spectrum Disorder” (ASD; DOID_0060041); (ii) “Cerebral Palsy” (CP; DOID_1969); (iii) “Fetal Alcohol Spectrum Disorder” (FASD; DOID_0050696); (iv) “Epilepsy” (DOID_1826); and (v) “Intellectual disability” (DOID_1059). When necessary, phenotype descriptions were complemented with more descriptive Human or Mammalian Phenotype Ontology terms such as “Memory impairment” (HP_0002354), “EEG abnormality” (HP_0002353), or “decreased brain size” (MP_0000774). Curators review the literature using PubMed searches across all fields (that is, the default PubMed setting) using queries such as “epilepsy” AND “genetics”.
We avoid making searches that are gene-centric, except as a secondary mechanism to find additional citations on a gene-phenotype relationship identified through initial screening. When possible, review papers are used to identify primary research papers, which are then curated as “Experimental Type Evidence”. The curators record details about the experiment using controlled vocabularies, categorized as (for example) “Bio Source”, “Experiment Design”, or “Developmental Stage”. The criterion for inclusion is an experimentally-supported statement linking the gene to the phenotype. The exception is genome-wide studies whose results have not yet been confirmed by follow-up experiments. The curated papers involve a wide variety of experiments including both animal models and human studies. For the former, if the authors describe the animal model as a specific model for the disorder of interest, the curators associate the gene studied in the paper directly with the human disease. If the authors describe an endophenotype that is related to the disease, the gene is associated with the endophenotype only. In some cases, review papers are used as the source of the annotations instead of drilling down to the original research papers. In that case, the annotation is curated as “Literature Type Evidence” with no details about the experiments.

To help users navigate through the evidence, we are, when possible, associating phenotypes with genes in a species-specific way. So, for instance, if the evidence comes from an experiment done in rats, it will be linked in Neurocarta to the rat gene.

Utility and discussion

User interface

Figure 3 shows the main Neurocarta user interface, which is divided into three panels. The left panel lists all phenotypes currently annotated in our system, displayed as a tree of terms in the ontologies, or as a simple list. Clicking the checkbox next to a phenotype term selects it; one or more phenotypes can be selected, and the selection determines what is displayed in the other two panels.
The top-right panel shows the list of genes associated with the selected phenotype(s). If more than one phenotype is selected, only genes that are associated with all of the phenotypes are listed (i.e., the intersection of genes associated with each phenotype). A download button allows users to download the displayed gene lists. Once a gene of interest is selected, the bottom-right panel shows the list of evidence for all phenotype associations annotated for this gene, each row being expandable to provide more details. Evidence for the currently selected phenotype(s) and their children terms is highlighted in red. Evidence for other phenotypes associated with the gene is shown in black. Evidence inferred from an orthologous gene, as defined in the NCBI Homologene resource [14], is displayed in grey. Links are provided to the original source of the evidence when available.

Table 2 Top ten phenotypes with the most associated genes

Phenotype                          Term URI                                     # of genes
prostate cancer                    http://purl.obolibrary.org/obo/DOID_10283    602
breast cancer                      http://purl.obolibrary.org/obo/DOID_1612     531
hypertension                       http://purl.obolibrary.org/obo/DOID_10763    439
autism spectrum disorder           http://purl.obolibrary.org/obo/DOID_0060041  394
type 2 diabetes mellitus           http://purl.obolibrary.org/obo/DOID_9352     389
asthma                             http://purl.obolibrary.org/obo/DOID_2841     389
obesity                            http://purl.obolibrary.org/obo/DOID_9970     363
peripheral nervous system disease  http://purl.obolibrary.org/obo/DOID_574      296
ovarian cancer                     http://purl.obolibrary.org/obo/DOID_2394     273
Alzheimer’s disease                http://purl.obolibrary.org/obo/DOID_10652    259
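The rule that selecting several phenotypes lists only the genes shared by all of them amounts to a set intersection. The sketch below illustrates that rule under stated assumptions: the function name, the dictionary layout, and the toy gene and phenotype labels are hypothetical, and each gene set is assumed to have been expanded beforehand to cover the phenotype’s children terms.

```python
from typing import Dict, Iterable, Set


def genes_for_selection(gene_sets: Dict[str, Set[str]],
                        selected: Iterable[str]) -> Set[str]:
    """Return the genes associated with every selected phenotype."""
    sets = [gene_sets.get(phenotype, set()) for phenotype in selected]
    if not sets:
        return set()  # nothing selected, nothing to list
    return set.intersection(*sets)


# Toy data with placeholder labels; not taken from Neurocarta.
toy_gene_sets = {
    "phenotype A": {"GENE1", "GENE2", "GENE3"},
    "phenotype B": {"GENE2", "GENE3", "GENE4"},
}
shared = genes_for_selection(toy_gene_sets, ["phenotype A", "phenotype B"])
```

With a single phenotype selected, the function simply returns that phenotype’s gene set, which matches the single-selection behaviour described for the top-right panel.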
Users can filter the data displayed to restrict it to a specific species, or to the annotations they have entered in Neurocarta.

Use cases

Neurocarta was originally conceived to help researchers identify candidate genes that might be involved in their disorder of interest, based on existing knowledge, and put that information in the context of other phenotypes associated with the genes. Neurocarta allows users to extract the list of genes that have been associated with a specific disorder, look at the detail of the evidence, and apply further selection criteria. It can also be used to identify relevant literature pertaining to a gene or phenotype of interest. By aggregating data from multiple sources, we enable a global view of each gene’s involvement in diseases, facilitating the identification of genes specifically involved in one disorder versus genes involved in many disease processes. Such candidate gene lists can be used by researchers who perform genome-wide studies, helping them identify the most likely candidates in their results. They can also inform more targeted approaches as to which genes to include in the study. Another unique aspect of Neurocarta is the ability that users have to enter their own annotations. This enables them to share unpublished results with collaborators as well as put their findings in the context of existing data and facilitate interpretation.

Investigating gene-to-phenotype associations for neurodevelopmental disorders

We gathered positive associations from Neurocarta (March 12, 2012). There were 14,983 unique gene-phenotype associations, consisting of many-to-many relationships between 4,560 genes and 1,555 phenotypes (Disease Ontology terms only). We decided to focus our analysis on neurodevelopmental disorders since we performed in-depth annotations on them.
We categorized the genes based on which disease they were annotated for (ASD, CP, or FASD) and defined them as being “specific” if they were only associated with this one disease (Table 3). We observed that ASD has the largest fraction of specific genes. To try to better understand where the difference might come from, we decided to investigate whether or not biases might be present in the data. Previous work in our lab [28] showed that genes associated with diseases tend to be more “multifunctional” (i.e., they have more Gene Ontology (GO) [29] annotations). Therefore, we predicted that genes in Neurocarta would have a multifunctionality bias, and that the genes associated with multiple disorders would be even more multifunctional. Indeed, we found that genes in Neurocarta tend to be associated with a large number of GO annotations on average (aggregate multifunctionality score = 0.8, where 1.0 is the highest possible bias and 0.5 would be no bias). When we separated the genes specific to one disease versus the rest, we confirmed our hypothesis that genes associated with multiple disorders tend to be more multifunctional than specific genes (Figure 4; Mann–Whitney test, p-values: ASD = 2.6 × 10^-15; FASD = 5.9 × 10^-3; CP = 7.8 × 10^-2). In addition, our results suggested that genes associated with FASD were more multifunctional than those associated with ASD or CP. We hypothesized that this might be due to the experimental approaches used to study FASD. Indeed, 98% of the FASD studies in Neurocarta are targeted (i.e., they use a candidate gene approach), compared with only 55% for ASD and 72% for CP. The multifunctionality bias in FASD candidate genes might thus be due to researchers choosing well-characterized genes for their studies rather than using the genome-wide approaches mostly used in the ASD research represented in the database.

Figure 3 Neurocarta user interface.
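The definition of a “specific” gene used in this analysis, a gene associated with only one disease, can be illustrated with a small sketch. It assumes the input maps every disease under consideration to its set of associated genes, so that specificity is judged against all keys of that mapping; the function name and the toy data are hypothetical and do not reproduce the values in Table 3.

```python
from collections import Counter
from typing import Dict, Set, Tuple


def specificity_table(
    disease_genes: Dict[str, Set[str]]
) -> Dict[str, Tuple[int, int, float]]:
    """For each disease return (# specific genes, total genes, % specific)."""
    # Count in how many diseases each gene appears.
    disease_count = Counter(
        gene for genes in disease_genes.values() for gene in genes
    )
    table = {}
    for disease, genes in disease_genes.items():
        specific = sum(1 for gene in genes if disease_count[gene] == 1)
        percent = 100.0 * specific / len(genes) if genes else 0.0
        table[disease] = (specific, len(genes), round(percent, 1))
    return table


# Toy data with placeholder gene labels; not taken from Neurocarta.
toy = {
    "ASD": {"G1", "G2", "G3"},
    "FASD": {"G3", "G4"},
    "CP": {"G4"},
}
result = specificity_table(toy)
```

In this toy input, G3 and G4 each appear under two diseases, so only G1 and G2 count as specific, both for ASD.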
This might also explain the fact that, in Neurocarta, ASD has the largest fraction of specific genes, as mentioned above.

Table 3 Genes in Neurocarta associated with neurodevelopmental disorders

Disease category  # of specific genes  Total genes  % of specific genes
ASD               189                  321          69.8
FASD              27                   106          25.5
CP                23                   124          22.1

ASD = Autism Spectrum Disorder; FASD = Fetal Alcohol Spectrum Disorder; CP = Cerebral Palsy.

Figure 4 Genes associated with multiple diseases in Neurocarta are more multifunctional than specific genes. Mann–Whitney test: * P ≤ 0.1; ** P ≤ 0.01; *** P ≤ 0.001.

Comparison to similar existing resources

Several genotype to phenotype databases have been created with the idea of aggregating data from several sources in a common standardized online tool [1]. Some of the existing tools rely entirely on OMIM annotations and only provide a more sophisticated portal to access data [29-31]. Others automatically aggregate data from a collection of resources, including OMIM, but they either focus only on human annotations [32], or on only one major phenotype database for a selection of model organisms [33]. Finally, some of the tools have been designed for human genetic association studies only and aggregate data from automatic or curated review of the literature [34-37]. Neurocarta is unique compared to these various initiatives in that it aggregates data from different organisms (human and animal models of diseases) and different kinds of studies (from genetic association to basic molecular experiments). It puts side by side data automatically extracted from public resources and data manually curated from selected papers in the literature. All data goes through a review process where only the most reliable annotations are kept to reduce noise in the system.
Finally, Neurocarta is the only publicly-available online tool that allows users to enter their own genotype to phenotype associations, share them with other users, and analyze them in the context of all existing annotations.

Future development

We are planning several lines of improvements to Neurocarta’s data and software layers. Neurocarta currently includes very little data from genome-wide association studies because of the high rate of false positives that can arise from these data. We have included data from PDGene, AlzGene, and MSGene but only the top results that reached significance in their meta-analyses of the data. In the case of ADHDgene, we have decided to incorporate their Hot Gene list even though it was only based on the number of studies a gene was identified in. We are currently investigating different options to incorporate the most significant results from additional genetic association data from public resources such as GAD [35], GWASdb [36], or the GWAS catalog [37]. Another development that we feel will add value to Neurocarta is to incorporate automated Gemma differential expression analysis results. Neurocarta is part of Gemma but currently does not take full advantage of this integration. In Gemma, gene expression datasets comparing control vs. disease cases are tagged and easily identifiable. We will apply differential expression analysis to these datasets using stringent thresholds to identify genes differentially expressed in specific diseases. We will then incorporate this analysis result as a new type of evidence linking genes to phenotypes in Neurocarta. Finally, a challenge in making the best use of the data is that different sources have different levels of evidence quality associated with them. For example, human geneticists would generally rate evidence from animal models as weak.
Neurocarta does not directly capture such distinctions, so we are in the process of devising an evidence-rating scheme that will be used to automatically rank genes with respect to their strength of evidence in association with each disorder.

Conclusions

Neurocarta is a new online resource linking genes to phenotypes. It brings together data from a wide variety of public resources and from manual curation of the literature. It is unique in that it allows users to enter their own annotations and keep them private if they wish to. In-depth annotations of genes involved in brain development disorders are available but Neurocarta is not restricted to a single disease. Instead, Neurocarta enables users to visualize all diseases their gene of interest might be associated with. This allows users not only to extract candidate gene lists from the system, but also to identify which of these genes are the most specific to their disorder of interest and to quickly find papers supporting these associations. Our analysis of the data in the context of neurodevelopmental disorders demonstrates that existing annotations linking genes to phenotypes are skewed toward genes that are well known and involved in many biological functions.
Neurocarta exposes this problem and makes it easier for researchers to focus their attention on more “specific” genes.

Availability and requirements

Neurocarta is publicly available at http://neurocarta.chibi.ubc.ca.

Abbreviations

ADHDgene: Attention deficit hyperactivity disorder gene database; AlzGene: Alzheimer’s disease gene database; ASD: Autism spectrum disorder; CP: Cerebral palsy; CTD: Comparative toxicogenomics database; FASD: Fetal alcohol spectrum disorder; GAD: Genetic association database; GO: Gene ontology; GWASdb: Genome-wide association study database; IEA: Inferred from electronic annotation; ISS: Inferred from sequence or structural similarity; MSGene: Multiple sclerosis gene database; NAS: Non-traceable author statement; NCBI: National Center for Biotechnology Information; OMIM: Online mendelian inheritance in man; PDGene: Parkinson’s disease gene database; RGD: Rat genome database; SFARI: Simons Foundation Autism Research Initiative.

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

EPC oversaw the development of Neurocarta and drafted the manuscript. CC conducted the data analysis. FL, NSG, and AZ developed Neurocarta. AL, ML, CK, WK, LT curated the data and tested the user interface. PP led the project and helped draft the manuscript. All authors read and approved the final manuscript.

Acknowledgements

This work was supported by the NeuroDevNet Network of Centres of Excellence (Neuroinformatics Core and Opportunities Initiatives grants to PP) and NIH grant RO1-GM076990 to PP. We thank Sanja Rogic for comments on the manuscript, Thea Van Rossum for software and graphics contributions, James Reynolds, Michael Shevell, Lonnie Zwaigenbaum, Steven Scherer, and Marie-Pierre Dubé for advice and encouragement.

Received: 31 July 2012 Accepted: 23 February 2013 Published: 26 February 2013

References

1. Thorisson GA, Muilu J, Brookes AJ: Genotype-phenotype databases: challenges and solutions for the post-genomic era.
Nat Rev Genet 2009, 10:9–18.
2. Amberger J, Bocchini CA, Scott AF, Hamosh A: McKusick’s Online Mendelian Inheritance in Man (OMIM). Nucleic Acids Res 2009, 37:D793–D796.
3. Laulederkind SJF, Tutaj M, Shimoyama M, Hayman GT, Lowry TF, Nigam R, Petri V, Smith JR, Wang S-J, De Pons J, Dwinell MR, Jacob HJ: Ontology searching and browsing at the Rat Genome Database. Database 2012, 2012:bas016.
4. Davis AP, King BL, Mockus S, Murphy CG, Saraceni-Richards C, Rosenstein M, Wiegers T, Mattingly CJ: The Comparative Toxicogenomics Database: update 2011. Nucleic Acids Res 2011, 39:D1067–D1072.
5. Banerjee-Basu S, Packer A: SFARI Gene: an evolving database for the autism research community. Dis Model Mech 2010, 3:133–135.
6. Lill CM, Roehr JT, McQueen MB, Kavvoura FK, Bagade S, Schjeide B-MM, Schjeide LM, Meissner E, Zauft U, Allen NC, Liu T, Schilling M, Anderson KJ, Beecham G, Berg D, Biernacka JM, Brice A, DeStefano AL, Do CB, Eriksson N, Factor SA, Farrer MJ, Foroud T, Gasser T, Hamza T, Hardy JA, Heutink P, Hill-Burns EM, Klein C, Latourelle JC, Maraganore DM, Martin ER, Martinez M, Myers RH, Nalls MA, Pankratz N, Payami H, Satake W, Scott WK, Sharma M, Singleton AB, Stefansson K, Toda T, Tung JY, Vance J, Wood NW, Zabetian CP, Young P, Tanzi RE, Khoury MJ, Zipp F, Lehrach H, Ioannidis JPA, Bertram L: Comprehensive research synopsis and systematic meta-analyses in Parkinson’s disease genetics: The PDGene database. PLoS Genet 2012, 8:e1002548.
7. Bertram L, McQueen MB, Mullin K, Blacker D, Tanzi RE: Systematic meta-analyses of Alzheimer disease genetic association studies: the AlzGene database. Nat Genet 2007, 39:17–23.
8. Lill CM, Roehr JT, McQueen MB, Bagade S, Schjeide BM, Zipp F, Bertram L: The MSGene Database. Alzheimer Research Forum. Available at http://www.msgene.org/.
9. Zhang L, Chang S, Li Z, Zhang K, Du Y, Ott J, Wang J: ADHDgene: a genetic database for attention deficit hyperactivity disorder. Nucleic Acids Res 2011, 40:D1003–D1009.
10. Liu H, Liu W, Liao Y, Cheng L, Liu Q, Ren X, Shi L, Tu X, Wang QK, Guo A-Y: CADgene: a comprehensive database for coronary artery disease genes. Nucleic Acids Res 2010, 39:D991–D996.
11. Shevell M, Goldowitz D: Inter-disciplinary research in the pediatric neurosciences: the NeuroDevNet model, Introduction. Semin Pediatr Neurol 2011, 18:1.
12. Portales-Casamar E, Evans A, Wasserman W, Pavlidis P: The NeuroDevNet Neuroinformatics Core. Semin Pediatr Neurol 2011, 18:17–20.
13. Zoubarev A, Hamer KM, Keshav KD, McCarthy EL, Santos JRC, Rossum TV, McDonald C, Hall A, Wan X, Lim R, Gillis J, Pavlidis P: Gemma: A resource for the re-use, sharing and meta-analysis of expression profiling data. Bioinformatics 2012, 28:2272–2273.
14. Sayers EW, Barrett T, Benson DA, Bolton E, Bryant SH, Canese K, Chetvernin V, Church DM, Dicuccio M, Federhen S, Feolo M, Fingerman IM, Geer LY, Helmberg W, Kapustin Y, Krasnov S, Landsman D, Lipman DJ, Lu Z, Madden TL, Madej T, Maglott DR, Marchler-Bauer A, Miller V, Karsch-Mizrachi I, Ostell J, Panchenko A, Phan L, Pruitt KD, Schuler GD, Sequeira E, Sherry ST, Shumway M, Sirotkin K, Slotta D, Souvorov A, Starchenko G, Tatusova TA, Wagner L, Wang Y, Wilbur WJ, Yaschenko E, Ye J: Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 2012, 40:D13–25.
15. Schriml LM, Arze C, Nadendla S, Chang Y-WW, Mazaitis M, Felix V, Feng G, Kibbe WA: Disease Ontology: a backbone for disease semantic integration. Nucleic Acids Res 2012, 40:D940–946.
16. Robinson PN, Mundlos S: The human phenotype ontology. Clin Genet 2010, 77:525–534.
17. Smith CL, Eppig JT: The mammalian phenotype ontology: enabling robust annotation and comparative analysis. Wiley Interdiscip Rev Syst Biol Med 2009, 1:390–399.
18. Lu Z: PubMed and beyond: a survey of web tools for searching biomedical literature. Database (Oxford) 2011, 2011:baq036.
19. Brinkman RR, Courtot M, Derom D, Fostel JM, He Y, Lord P, Malone J, Parkinson H, Peters B, Rocca-Serra P, Ruttenberg A, Sansone S-A, Soldatova LN, Stoeckert CJ Jr, Turner JA, Zheng J: Modeling biomedical experimental processes with OBI. J Biomed Semantics 2010, 1(Suppl 1):S7.
20. Ioannidis JPA, Boffetta P, Little J, O’Brien TR, Uitterlinden AG, Vineis P, Balding DJ, Chokkalingam A, Dolan SM, Flanders WD, Higgins JPT, McCarthy MI, McDermott DH, Page GP, Rebbeck TR, Seminara D, Khoury MJ: Assessment of cumulative evidence on genetic associations: interim guidelines. Int J Epidemiol 2008, 37:120–132.
21. Khoury MJ, Bertram L, Boffetta P, Butterworth AS, Chanock SJ, Dolan SM, Fortier I, Garcia-Closas M, Gwinn M, Higgins JPT, Janssens ACJW, Ostell J, Owen RP, Pagon RA, Rebbeck TR, Rothman N, Bernstein JL, Burton PR, Campbell H, Chockalingam A, Furberg H, Little J, O’Brien TR, Seminara D, Vineis P, Winn DM, Yu W, Ioannidis JPA: Genome-wide association studies, field synopses, and the development of the knowledge base on genetic variation and human diseases. Am J Epidemiol 2009, 170:269–279.
22. Ioannidis JPA: Effect of formal statistical significance on the credibility of observational associations. Am J Epidemiol 2008, 168:374–383. discussion 384–390.
23. Stephens M, Balding DJ: Bayesian statistical methods for genetic association studies. Nat Rev Genet 2009, 10:681–690.
24. Abel O, Powell JF, Andersen PM, Al-Chalabi A: ALSoD: A user-friendly online bioinformatics tool for amyotrophic lateral sclerosis genetics. Hum Mutat 2012, 33:1345–1351.
25. ID Database Home. http://gfuncpathdb.ucdenver.edu/iddrc/iddrc/home.php.
26. Tan NCK, Berkovic SF: The Epilepsy Genetic Association Database (epiGAD): analysis of 165 genetic association studies, 1996–2008. Epilepsia 2010, 51:686–689.
27. Musen MA, Noy NF, Shah NH, Whetzel PL, Chute CG, Story M-A, Smith B: The National Center for Biomedical Ontology. J Am Med Inform Assoc 2012, 19:190–195.
28. Gillis J, Pavlidis P: The Impact of Multifunctional Genes on “Guilt by Association” Analysis. PLoS One 2011, 6:e17258.
29. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G: Gene ontology: tool for the unification of biology, The Gene Ontology Consortium. Nat Genet 2000, 25:25–29.
30. Gefen A, Cohen R, Birk OS: Syndrome to gene (S2G): in-silico identification of candidate genes for human diseases. Hum Mutat 2010, 31:229–236.
31. Van Triest HJW, Chen D, Ji X, Qi S, Li-Ling J: PhenOMIM: an OMIM-based secondary database purported for phenotypic comparison. Conf Proc IEEE Eng Med Biol Soc 2011, 2011:3589–3592.
32. Wall DP, Pivovarov R, Tong M, Jung J-Y, Fusaro VA, DeLuca TF, Tonellato PJ: Genotator: A disease-agnostic tool for genetic annotation of disease. BMC Med Genomics 2010, 3:50.
33. Groth P, Pavlova N, Kalev I, Tonov S, Georgiev G, Pohlenz H-D, Weiss B: PhenomicDB: a new cross-species genotype/phenotype resource. Nucleic Acids Res 2007, 35:D696–699.
34. Yu W, Clyne M, Khoury MJ, Gwinn M: Phenopedia and Genopedia: Disease-Centered and Gene-Centered Views of the Evolving Knowledge of Human Genetic Associations. Bioinformatics 2010, 26:145–146.
35. Becker KG, Barnes KC, Bright TJ, Wang SA: The Genetic Association Database. Nat Genet 2004, 36:431–432.
36. Li MJ, Wang P, Liu X, Lim EL, Wang Z, Yeager M, Wong MP, Sham PC, Chanock SJ, Wang J: GWASdb: a database for human genetic variants identified by genome-wide association studies. Nucleic Acids Res 2012, 40:D1047–1054.
37. Hindorff LA, Sethupathy P, Junkins HA, Ramos EM, Mehta JP, Collins FS, Manolio TA: Potential etiologic and functional implications of genome-wide association loci for human diseases and traits.
Proc Natl Acad SciU S A 2009, 106:9362–9367.doi:10.1186/1471-2164-14-129Cite this article as: Portales-Casamar et al.: Neurocarta: aggregating andsharing disease-gene relations for the neurosciences. BMC Genomics2013 14:129.Submit your next manuscript to BioMed Centraland take full advantage of: • Convenient online submission• Thorough peer review• No space constraints or color figure charges• Immediate publication on acceptance• Inclusion in PubMed, CAS, Scopus and Google Scholar• Research which is freely available for redistributionSubmit your manuscript at www.biomedcentral.com/submitPortales-Casamar et al. BMC Genomics 2013, 14:129 Page 8 of 8http://www.biomedcentral.com/1471-2164/14/129






