UBC Faculty Research and Publications

AIR: A batch-oriented web program package for construction of supermatrices ready for phylogenomic analyses
Kumar, Surendra; Skjæveland, Åsmund; Orr, Russell J; Enger, Pål; Ruden, Torgeir; Mevik, Bjørn-Helge; Burki, Fabien; Botnen, Andreas; Shalchian-Tabrizi, Kamran
Oct 28, 2009


BioMed Central
BMC Bioinformatics
Open Access
Software

AIR: A batch-oriented web program package for construction of supermatrices ready for phylogenomic analyses

Surendra Kumar1, Åsmund Skjæveland1, Russell JS Orr1, Pål Enger1,2, Torgeir Ruden2, Bjørn-Helge Mevik2, Fabien Burki3, Andreas Botnen2 and Kamran Shalchian-Tabrizi*1

Address: 1Microbial Evolution Research Group (MERG), Department of Biology, University of Oslo, Norway, 2Centre of Information Technology, University of Oslo, Norway and 3Department of Botany, University of British Columbia, Vancouver, British Columbia, Canada

Email: Surendra Kumar - surendra.kumar@bio.uio.no; Åsmund Skjæveland - asmund.skjaveland@bio.uio.no; Russell JS Orr - russell.orr@bio.uio.no; Pål Enger - pal.enger@usit.uio.no; Torgeir Ruden - t.a.ruden@usit.uio.no; Bjørn-Helge Mevik - b.h.mevik@usit.uio.no; Fabien Burki - burkif@interchange.ubc.ca; Andreas Botnen - andreas.botnen@gmail.com; Kamran Shalchian-Tabrizi* - Kamran@bio.uio.no

* Corresponding author

Abstract

Background: Large multigene sequence alignments have over recent years been increasingly employed for phylogenomic reconstruction of the eukaryote tree of life. Such supermatrices of sequence data are preferred over single gene alignments as they contain vastly more information about ancient sequence characteristics, and are thus more suitable for resolving deeply diverging relationships. However, as alignments are expanded, increasing numbers of sites with misleading phylogenetic information are also added. Therefore, a major goal in phylogenomic analyses is to maximize the ratio of information to noise; this can be achieved by the reduction of fast evolving sites.

Results: Here we present a batch-oriented web-based program package, named AIR, that allows 1) transformation of several single genes to one multigene alignment, 2) identification of evolutionary rates in multigene alignments and 3) removal of fast evolving sites.
These three processes can be done with the programs AIR-Appender, AIR-Identifier, and AIR-Remover (AIR), which can be used independently or in a semi-automated pipeline. AIR produces user-friendly output files with filtered and non-filtered alignments where residues are colored according to their evolutionary rates. Other bioinformatics applications linked to the AIR package are available at the Bioportal http://www.bioportal.uio.no, University of Oslo; together these greatly improve the flexibility, efficiency and quality of phylogenomic analyses.

Conclusion: The AIR program package allows for efficient creation of multigene alignments and better assessment of evolutionary rates in sequence alignments. Removing fast evolving sites with the AIR programs has been employed in several recent phylogenomic analyses, resulting in improved phylogenetic resolution and increased statistical support for branching patterns among the early diverging eukaryotes.

Published: 28 October 2009
BMC Bioinformatics 2009, 10:357 doi:10.1186/1471-2105-10-357
Received: 21 April 2009
Accepted: 28 October 2009
This article is available from: http://www.biomedcentral.com/1471-2105/10/357
© 2009 Kumar et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Background

A well-resolved phylogenetic tree demonstrating the relationships between species is one of the most important goals in evolutionary biology, and the foundation for comparative studies in many fields in life science.
Multiple gene sequence data are increasingly being used to resolve phylogenetic relationships, and frequently more than 50 genes are analyzed to address key questions about the early evolution of eukaryotes [1-8]. Recent studies have for instance shown support for the grouping of known eukaryotes into a handful of supergroups [2,5,9-15]. The main reason for constructing multigene data instead of using single gene data in phylogenetic reconstruction is to collect enough information to improve the phylogenetic signal [9,16]. Accordingly, as the number of genes increases, phylogenetic relationships tend to be better resolved and to receive higher statistical support [2,5,16-18]. However, simply adding genes to an alignment to increase statistical support does not necessarily lead to more accurate results; inconsistencies in datasets may instead lead to higher support for an incorrect topology. Reducing such stochastic errors is an important step in improving the phylogenetic resolution of the sequence data [16,19-21]. Consistency in the data may be improved by the removal of the fastest evolving sites, as such sites may be disproportionately affected by substitution saturation, which causes homoplasies [22,23]. However, so far only a few bioinformatics programs have been reported that allow for the concatenation of multiple single gene alignment files, identification of fast evolving sites, and removal of fast evolving sites in accordance with the user's needs.

Here we present a bioinformatics package, named AIR, that combines all these possibilities. AIR is divided into three applications: AIR-Appender, AIR-Identifier and AIR-Remover (Figure 1). AIR-Appender performs separate processing of data by appending single gene alignment files to a multi-gene alignment.
AIR-Identifier identifies fast evolving sites by calculating site rates, and AIR-Remover removes fast evolving sites from an alignment. The AIR programs are interlinked with other applications useful in the field of phylogenomics (i.e., multi-gene BLAST, contig assembly of Sanger and 454 sequences, alignment and phylogeny) through the Bioportal at the University of Oslo.

Implementation

The AIR package is implemented on the Bioportal at the University of Oslo. The Bioportal is a web-based bioinformatics service freely available to academic users at the following URL: http://www.bioportal.uio.no/. The Bioportal uses SQL for maintaining information about users, files, databases, and jobs. The Bioportal resources are deployed on Linux with Apache HTTP server 2.2. The critical scripts to maintain the Bioportal, e.g. cron job scripts and post-processing scripts, are written in Perl v5.8 and Python 2.3. The web interface for all available applications on the Bioportal is written in PHP 4.3.

Each user of the Bioportal has access to several file directories and file administration functions. All files used as input for analyses are stored in project folders defined by the users. Once users have created a project folder, they can upload data files into it. They can then use the web interface created for each application on the Bioportal to select their files, applications (here for example AIR-Appender, AIR-Identifier, or AIR-Remover) and parameter settings. For each analysis a working folder is created in the working directory 'job admin'. A 'copy home' function in the 'job admin' can be used to transfer files from working directories to project folders; hence result files from one process can be used as input files in subsequent analyses, linking different applications in a semi-automated pipeline. For instance, alignments made by MAFFT [24] can be used for phylogenetic analyses by one of the available phylogenetic programs, e.g. RAxML, Treefinder or MrBayes [25-27]. The Bioportal tutorial is available at the Bioportal website.

All successfully submitted Bioportal jobs are run in the background; the execution time of each process varies depending on the file size and the nature of the selected applications. To keep track of the status of submitted jobs, a manager module has been developed on the Bioportal; this updates the users about the current status of all jobs. Upon completion the results are returned to the respective working directory, where files can be downloaded in a compressed 'zip' format.

Currently the Bioportal is the largest high-performance computing environment in Norway. The available computer resources are 320 dedicated cores on the TITAN cluster at the University of Oslo. In addition, the Bioportal has access to all free or idle TITAN cores if needed (4000 at present). The TITAN cluster has Linux nodes with 16 gigabytes of memory and 2× quad-core CPUs or 2× dual-core CPUs.

Results

Appending single gene alignments

AIR-Appender merges multiple single gene alignment files into one multigene alignment; the program looks for species with identical names and merges their sequences. If a taxon is missing from any of the single gene alignments, the program automatically fills in the missing data with question marks '?'. The junction between genes is marked with a double hyphen for easy identification of the sequence borders. The resulting output of AIR-Appender is a single FASTA- and PAML-formatted file containing the multiple gene alignment (out.fasta in Figure 1); this can be used for downstream processing with AIR-Identifier (or other programs available on the Bioportal) or downloaded to a local computer as a compressed zip file.

Figure 1. Overview of the AIR package. Overview of the functionalities and programs in the AIR package installed on the Bioportal. The colored boxes depict input files (red), output files (green), and the AIR programs (blue). Text in italics gives the filename and extension of the output files of the AIR programs. A) AIR-Appender uses several single gene alignments for construction of a multigene alignment. B) AIR-Identifier uses the output file from AIR-Appender and a file containing one or more phylogenetic trees for calculating site rates and rate categories. C) AIR-Remover deletes fast evolving sites according to settings defined by the user. The output files from each of the AIR programs can be used in subsequent analyses by copying the files from the work directory to a project folder on the Bioportal using the copy home function. Five main output files are produced by AIR. Two of these are graphical HTML files with information about site rates and fast evolving sites (rates.html) and about sites removed from the alignment (outfile.html). The file 'rates.html' shows the rate categories in different colors (up to 8 categories), while 'outfile.html' shows the removed sites in red (e.g. if categories 7 and 8 are removed, their sites are shown in red) and the remaining sites in blue. The files 'rates' and 'out.ctl' are produced by the PAML programs, which are implemented in AIR-Identifier, while 'outfile.ali' is the multigene alignment with fast evolving sites removed.

Identifying site rate

After the user has made the multi-gene sequence file, site rates (i.e. posterior mean values) can be identified for nucleotide, codon and amino acid sequences with the program AIR-Identifier.
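As a point of reference for the multi-gene sequence file mentioned here, the merging step described under "Appending single gene alignments" can be illustrated with a short Python sketch. It concatenates single gene alignments by taxon name, pads taxa missing from a gene with '?', and joins genes with a double hyphen. This is not AIR-Appender's actual code: the function names parse_fasta and append_alignments, the toy sequences, and the use of the literal string '--' as the junction are assumptions made for this example.

```python
from collections import OrderedDict


def parse_fasta(text):
    """Parse FASTA text into an ordered mapping of taxon name -> sequence."""
    records = OrderedDict()
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def append_alignments(alignments, junction="--", missing="?"):
    """Concatenate single gene alignments taxon by taxon.

    Hypothetical helper: taxa absent from a gene are padded with `missing`,
    and genes are separated by `junction` (assumed here to be '--').
    """
    # Each alignment must contain sequences of one common length.
    gene_lengths = []
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within one gene must be aligned")
        gene_lengths.append(lengths.pop())

    # Collect every taxon name in first-seen order.
    taxa = []
    for aln in alignments:
        for name in aln:
            if name not in taxa:
                taxa.append(name)

    merged = OrderedDict()
    for name in taxa:
        parts = [
            aln.get(name, missing * length)
            for aln, length in zip(alignments, gene_lengths)
        ]
        merged[name] = junction.join(parts)
    return merged


# Toy input: 'Mus' lacks gene 2 and 'Danio' lacks gene 1.
gene1 = parse_fasta(">Homo\nMKTA\n>Mus\nMKSA\n")
gene2 = parse_fasta(">Homo\nLLV\n>Danio\nLIV\n")
supermatrix = append_alignments([gene1, gene2])
for name, seq in supermatrix.items():
    print(">" + name)
    print(seq)
```

Matching is done on exact taxon names, so in a sketch like this a spelling difference between two gene files would produce two separate, mostly padded rows rather than one merged row.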
AIR-Identifier applies the PAML programs codeml (for codon and amino acid sequences) and baseml (for nucleotide sequences) [28,29]. The control file (out.ctl in Figure 1) is critical, as it is here that the user defines the set of parameters to be used for estimation of site rates by codeml or baseml. These programs are usually only available via the command line, and thus setting parameters for a successful run can be a cumbersome task. We have therefore developed AIR-Identifier as a user-friendly web interface for the PAML programs; here the users can define the parameters and their respective values (Figure 2). For instance, the evolutionary model for calculation of site rates and the number of rate categories (normally 8 categories) for the analysis can be defined. Users still have the option to upload their own control file to the Bioportal.

Figure 2. AIR-Identifier web interface. AIR-Identifier web interface on the Bioportal, where the user can select input files (i.e. sequence alignments and a tree file containing phylogenetic trees) and parameters for three types of data, i.e. nucleotides, codons, and amino acids. The sequence files can be in FASTA or PAML format, while single or multiple trees in the tree file must be in Newick format and supplied in a single file.

Two types of files are used to calculate the site rates: 1) a multigene alignment in FASTA format with file extension '.fasta' or in PAML format, and 2) a corresponding file containing a phylogenetic tree. The tree file should be generated with a suitable phylogenetic program; the codeml and baseml programs are not recommended for reconstructing trees (see the PAML manual [30]). The tree topologies accepted are typically specified using parenthesis notation such as the Newick tree format [31].
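To make the role of the control file more concrete, the sketch below writes a small codeml-style control file from Python. The option names (seqfile, treefile, outfile, runmode, seqtype, fix_alpha, alpha, ncatG, RateAncestor) are those documented in the PAML manual [30]; the function name render_codeml_ctl, the file names, and the chosen values are illustrative assumptions, and the settings that AIR-Identifier actually writes to out.ctl are not specified in the text. A real run would normally need further options (for example a substitution model), so this is a partial example only.

```python
def render_codeml_ctl(seqfile, treefile, seqtype="aminoacid", ncat_g=8,
                      outfile="mlc"):
    """Return the text of a minimal codeml-style control file.

    Hypothetical helper for illustration; values are assumptions,
    not the settings used by AIR-Identifier.
    """
    # codeml encodes the data type numerically: 1 = codons, 2 = amino acids.
    seqtype_codes = {"codon": 1, "aminoacid": 2}
    if seqtype not in seqtype_codes:
        raise ValueError("seqtype must be 'codon' or 'aminoacid'")

    options = [
        ("seqfile", seqfile),        # alignment file
        ("treefile", treefile),      # user tree(s) in Newick format
        ("outfile", outfile),        # main result file
        ("runmode", 0),              # 0 = evaluate the user-supplied tree
        ("seqtype", seqtype_codes[seqtype]),
        ("fix_alpha", 0),            # 0 = estimate the gamma shape parameter
        ("alpha", 0.5),              # starting value for alpha (assumed)
        ("ncatG", ncat_g),           # number of discrete gamma rate categories
        ("RateAncestor", 1),         # 1 = also report rates for sites
    ]
    width = max(len(key) for key, _ in options)
    lines = ["{} = {}".format(key.rjust(width), value)
             for key, value in options]
    return "\n".join(lines) + "\n"


ctl = render_codeml_ctl("out.fasta", "trees.nwk", seqtype="aminoacid")
print(ctl)
```

Keeping the options in an ordered list of pairs makes it straightforward to expose each one as a form field, which is the kind of mapping a web front end to a command-line program needs.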
It should be noted that some widely used programs such as PAUP or MacClade [32,33] can produce tree files with limited compatibility, whereas other programs such as PHYLOBAYES v. 2.3 [34] or RAxML-VI-HPC [27] generate output files that are ready to use. Trees with or without branch lengths are accepted by AIR-Identifier.

It can often be difficult to decide which phylogeny should be used for estimating rates, especially when a dataset gives differing trees under different evolutionary models, parameters and tree searching algorithms. It has also been proposed that the selection of phylogeny can have a major impact on rate estimation [21]. For this reason we have constructed AIR-Identifier to calculate site rates and rate categories from multiple phylogenetic trees.

The AIR-Identifier program produces two output files: 1) a rate file, which contains information about the evolutionary rate (rate category) for each site in the alignment (rates in Figure 1); and 2) an HTML file (rates.html in Figure 1) that visually presents the rate pattern in the alignment, allowing users to easily evaluate the importance of the various rate categories and the distribution of site rates along the alignment before sites are removed; the file also includes a graphical overview of the alignment in which the different rate categories are color-coded.

Removing fast evolving sites

AIR-Remover is developed for the removal of fast evolving sites. Sites can be removed based on either site rate or rate category. AIR-Remover uses the alignment file and the corresponding rates file obtained as output from AIR-Identifier. The users can then decide which rates or categories of the fastest evolving sites should be removed. Multiple categories can be removed by entering comma-separated numbers. The users can also remove a fraction of the fastest evolving sites by defining a percentage of the total rate distribution; it is possible to remove, e.g.,
the 5% fastest evolving sites (Figure 3). AIR-Remover produces a main result file containing the ready-to-use alignment (outfile.ali in Figure 1) and an HTML file (outfile.html in Figure 1) that lets users see the removed sites colored in red within their alignment.

Figure 3. AIR-Remover web interface. AIR-Remover uses rates generated with AIR-Identifier (Figure 1) and the corresponding multigene alignment in PAML format. Sites can be removed on the basis of site rates or rate categories.

Discussion and conclusion

The AIR package has been extensively used in recently published phylogenomic studies of deeply diverging eukaryote lineages [2,18]. In the study of Burki et al. 2008, a global eukaryote phylogeny was reconstructed from a dataset of 135 genes and 65 taxa, resulting in 73% bootstrap support for a single "megagroup" comprising nearly all photosynthetic lineages (including the supergroups Plantae, chromalveolates and Rhizaria). When the fast evolving sites were identified and removed from the alignment with AIR, the same topology was recovered but with substantially increased bootstrap support (97%) for the observed relationship. In the study of Minge et al. 2008, the evolutionary position of an enigmatic lineage named Breviata was investigated using 78 genes and 38 taxa. The lineage was placed with strong bootstrap support as sister to the supergroup Amoebozoa; however, statistical testing (the AU test [35]) of alternative placements in the eukaryote tree could not reject a sister relationship to another supergroup, the Excavata. Once fast evolving sites were removed using AIR, the AU test could reject an affinity to the Excavata, and Breviata was placed with the Amoebozoa with higher bootstrap support. Interestingly, the removal of additional fast evolving sites (altogether the 3 fastest rate categories) reduced the bootstrap support for the monophyly of Breviata and Amoebozoa, suggesting that the removal of too many categories or sites can reduce relevant phylogenetic information in the data. This demonstrates the need for the detailed information about rates in the alignment that AIR provides.

The great need for efficient bioinformatic tools for constructing multi-gene alignments for phylogenomic inference has over the last years been met by several new applications, such as Concatenator, IDEA, SCaFoS, PHYLIP and ASAP [36-40]. Several of these have functionalities that overlap with the AIR package, but AIR is unique in combining key steps for constructing multi-gene alignments with evolutionary rate estimation. Most importantly, AIR allows trimming of alignments according to the evolutionary rates and the users' preferences. Site rate estimation can be based on multiple phylogenies, which accounts for uncertainties in the phylogeny. Several different criteria can be used for removing sites, based either on rate categories or on site rates, which reduces the risk of removing too many or too few sites from the alignment. The site removal process is easy to monitor using the colored alignment output files from AIR.

In contrast to the vast majority of other programs, the AIR package is easily accessible on the web and does not require cumbersome installation on local computers. AIR is implemented on the Bioportal, where users have their own file directories and can access several widely used programs in molecular evolution and ecology. The result files from the AIR programs can also be easily downloaded and applied in downstream analyses at other web-based bioinformatics services (such as http://www.phylo.org and http://bioweb2.pasteur.fr/). This makes the AIR package user-friendly and efficient. As AIR will process files on a large computer cluster, with the prospect of being linked to a larger grid infrastructure in future, there is currently no restriction on the size of the input sequences.

Availability and requirements

Project name: AIR version 1.1
Project home page: http://www.bioportal.uio.no
Operating system(s): Platform independent
Programming language: SQL, Perl, Python and PHP
Other requirements: Apache webserver
License: GNU - GPL
Any restrictions to use by non-academics: AIR-Identifier uses PAML with license for academic use. Non-academic users can still use AIR-Appender and AIR-Remover at http://app3.titan.uio.no/biotools/. A test dataset for all programs of AIR is available at http://www.bioportal.uio.no/onlinemat/online_material.php.

Authors' contributions

SK conducted the programming of AIR-Appender, AIR-Identifier and AIR-Remover, wrote the paper and implemented the applications on the Bioportal. ÅS contributed with programming of AIR-Appender. RO and FB tested the AIR programs and contributed to the writing of the manuscript. PE contributed with programming and implementation of AIR on the Bioportal. ÅS, PE, TR, BHM and AB programmed the Bioportal. KST funded and designed the project, supervised the process, and wrote the first draft of the AIR paper. KST and AB initiated the Bioportal service, and KST is leading the development of the service. All authors read and approved the final manuscript.

Acknowledgements

We would like to thank Marianne Minge and Jon Bråte for valuable suggestions and testing of the AIR package. The Bioportal has been developed as a collaboration between bioinformatics groups at USIT headed by Jostein Sundet and Hans Eide and a bioinformatics group in the KST lab. We thank the Center of Technology at the University of Oslo for maintenance of the TITAN clusters and the Research Council of Norway for financing computers through AVIT and FUGE grants to a consortium headed by Kjetill S. Jakobsen at the University of Oslo. This work is supported by a University of Oslo start grant to KST and PhD funding for Surendra Kumar. The Bioportal service is financially supported by the EMBIO, MLS and FUGE initiatives at the University of Oslo.

References

1. Burki F, Pawlowski J: Monophyly of Rhizaria and multigene phylogeny of unicellular bikonts. Mol Biol Evol 2006, 23(10):1922-1930.
2. Burki F, Shalchian-Tabrizi K, Pawlowski J: Phylogenomics reveals a new 'megagroup' including most photosynthetic eukaryotes. Biol Lett 2008, 4(4):366-369.
3. Gadagkar SR, Rosenberg MS, Kumar S: Inferring species phylogenies from multiple genes: concatenated sequence tree versus consensus gene tree. J Exp Zoolog B Mol Dev Evol 2005, 304(1):64-74.
4. Philippe H, Lartillot N, Brinkmann H: Multigene analyses of bilaterian animals corroborate the monophyly of Ecdysozoa, Lophotrochozoa, and Protostomia. Mol Biol Evol 2005, 22(5):1246-1253.
5. Rodriguez-Ezpeleta N, Brinkmann H, Burger G, Roger AJ, Gray MW, Philippe H, Lang BF: Toward resolving the eukaryotic tree: the phylogenetic positions of jakobids and cercozoans. Curr Biol 2007, 17(16):1420-1425.
6. Ruiz-Trillo I, Roger AJ, Burger G, Gray MW, Lang BF: A phylogenomic investigation into the origin of metazoa. Mol Biol Evol 2008, 25(4):664-672.
7. Shalchian-Tabrizi K, Brate J, Logares R, Klaveness D, Berney C, Jakobsen KS: Diversification of unicellular eukaryotes: cryptomonad colonizations of marine and fresh waters inferred from revised 18S rRNA phylogeny. Environ Microbiol 2008, 10(10):2635-2644.
8. Shalchian-Tabrizi K, Minge MA, Espelund M, Orr R, Ruden T, Jakobsen KS, Cavalier-Smith T: Multigene phylogeny of choanozoa and the origin of animals. PLoS ONE 2008, 3(5):e2098.
9. Delsuc F, Brinkmann H, Philippe H: Phylogenomics and the reconstruction of the tree of life. Nat Rev Genet 2005, 6(5):361-375.
10. Nikolaev SI, Berney C, Fahrni JF, Bolivar I, Polet S, Mylnikov AP, Aleshin VV, Petrov NB, Pawlowski J: The twilight of Heliozoa and rise of Rhizaria, an emerging supergroup of amoeboid eukaryotes. Proc Natl Acad Sci USA 2004, 101(21):8066-8071.
11. Philippe H, Lopez P, Brinkmann H, Budin K, Germot A, Laurent J, Moreira D, Muller M, Le Guyader H: Early-branching or fast-evolving eukaryotes? An answer based on slowly evolving positions. Proc Biol Sci 2000, 267(1449):1213-1221.
12. Burki F, Shalchian-Tabrizi K, Minge M, Skjaeveland A, Nikolaev SI, Jakobsen KS, Pawlowski J: Phylogenomics reshuffles the eukaryotic supergroups. PLoS ONE 2007, 2(8):e790.
13. Shalchian-Tabrizi K, Kauserud H, Massana R, Klaveness D, Jakobsen KS: Analysis of environmental 18S ribosomal RNA sequences reveals unknown diversity of the cosmopolitan phylum Telonemia. Protist 2007, 158(2):173-180.
14. Rodríguez-Ezpeleta N, Brinkmann H, Burey SC, Roure B, Burger G, Löffelhardt W, Bohnert HJ, Philippe H, Lang BF: Monophyly of primary photosynthetic eukaryotes: green plants, red algae, and glaucophytes. Curr Biol 2005, 15(14):1325-1330.
15. Keeling PJ: Diversity and evolutionary history of plastids and their hosts. Am J Bot 2004, 91:1481-1493.
16. Dutilh BE, Huynen MA, Bruno WJ, Snel B: The consistent phylogenetic signal in genome trees revealed by reducing the impact of noise. J Mol Evol 2004, 58(5):527-539.
17. Bapteste E, Brinkmann H, Lee JA, Moore DV, Sensen CW, Gordon P, Durufle L, Gaasterland T, Lopez P, Muller M, et al.: The analysis of 100 genes supports the grouping of three highly divergent amoebae: Dictyostelium, Entamoeba, and Mastigamoeba. Proc Natl Acad Sci USA 2002, 99(3):1414-1419.
18. Minge AM, Silberman JD, Orr RJ, Cavalier-Smith T, Shalchian-Tabrizi K, Burki F, Skjaeveland A, Jakobsen KS: Evolutionary position of breviate amoebae and the primary eukaryote divergence. Proc Biol Sci 2009, 276(1657):597-604.
19. Brinkmann H, Giezen M van der, Zhou Y, Poncelin de Raucourt G, Philippe H: An empirical assessment of long-branch attraction artefacts in deep eukaryotic phylogenomics. Syst Biol 2005, 54(5):743-757.
20. Pisani D: Identifying and removing fast-evolving sites using compatibility analysis: an example from the Arthropoda. Syst Biol 2004, 53(6):978-989.
21. Rodriguez-Ezpeleta N, Brinkmann H, Roure B, Lartillot N, Lang BF, Philippe H: Detecting and overcoming systematic errors in genome-scale phylogenies. Syst Biol 2007, 56(3):389-399.
22. Brinkmann H, Philippe H: Archaea sister group of Bacteria? Indications from tree reconstruction artifacts in ancient phylogenies. Mol Biol Evol 1999, 16(6):817-825.
23. Burleigh JG, Mathews S: Phylogenetic signal in nucleotide data from seed plants: implications for resolving the seed plant tree of life. Am J Bot 2004, 91(10):1599-1613.
24. Katoh K, Kuma K, Toh H, Miyata T: MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res 2005, 33(2):511-518.
25. Ronquist F, Huelsenbeck JP: MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 2003, 19(12):1572-1574.
26. Jobb G, von Haeseler A, Strimmer K: TREEFINDER: a powerful graphical analysis environment for molecular phylogenetics. BMC Evol Biol 2004, 4:18.
27. Stamatakis A: RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 2006, 22(21):2688-2690.
28. Yang Z: PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci 1997, 13(5):555-556.
29. Yang Z: PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol 2007, 24(8):1586-1591.
30. Yang Z: 2007 [http://abacus.gene.ucl.ac.uk/software/pamlDOC.pdf].
31. The Newick tree format [http://evolution.genetics.washington.edu/phylip/newicktree.html]
32. Maddison WP, Maddison DR: MacClade 4: Analysis of Phylogeny and Character Evolution. Sinauer Associates, Sunderland, MA; 2000.
33. Swofford DL: PAUP*: Phylogenetic Analysis Using Parsimony (* and other methods). ver. 4.0b10 edition. Sinauer Associates, Inc. Publishers, Sunderland, MA; 2003.
34. Lartillot N, Philippe H: Computing Bayes factors using thermodynamic integration. Syst Biol 2006, 55(2):195-207.
35. Shimodaira H: An approximately unbiased test of phylogenetic tree selection. Syst Biol 2002, 51(3):492-508.
36. Pina-Martins F, Paulo OS: Concatenator: Sequence Data Matrices Handling Made Easy. Mol Ecol Resour 2008, 8(6):1254-1255.
37. Egan A, Mahurkar A, Crabtree J, Badger JH, Carlton JM, Silva JC: IDEA: Interactive Display for Evolutionary Analyses. BMC Bioinformatics 2008, 9(1):524.
38. Roure B, Rodriguez-Ezpeleta N, Philippe H: SCaFoS: a tool for selection, concatenation and fusion of sequences for phylogenomics. BMC Evol Biol 2007, 7(Suppl 1):S2.
39. Felsenstein J: PHYLIP (Phylogeny Inference Package) version 3.6. Distributed by the author. Department of Genome Sciences, University of Washington, Seattle; 2005.
40. Sarkar IN, Egan MG, Coruzzi G, Lee EK, DeSalle R: Automated simultaneous analysis phylogenetics (ASAP): an enabling tool for phlyogenomics. BMC Bioinformatics 2008, 9:103.

