Open Collections

UBC Theses and Dissertations

Population dynamics and metabolic potential of a pilot-scale microbial community performing enhanced… Lawson, Christopher Evan 2014

Population dynamics and metabolic potential of a pilot-scale microbial community performing enhanced biological phosphorus removal

by

CHRISTOPHER EVAN LAWSON

B.A.Sc. (Civil Engineering), University of British Columbia, 2010

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

MASTER OF APPLIED SCIENCE

in

THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES (Civil Engineering)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

September 2014

© Christopher Evan Lawson, 2014

Abstract

Enhanced biological phosphorus removal (EBPR) is an environmental biotechnology of global importance, essential for protecting receiving waters from eutrophication and enabling phosphorus recovery. Current understanding of EBPR is largely based on empirical evidence and black-box models that fail to appreciate the driving force responsible for nutrient cycling and ultimate phosphorus removal, namely microbial communities. Accordingly, this thesis focused on understanding the microbial ecology of a pilot-scale microbial community performing EBPR to better link bioreactor processes to underlying microbial agents.

Initially, temporal changes in microbial community structure and activity were monitored in a pilot-scale EBPR treatment plant by examining the ratio of small subunit ribosomal RNA (SSU rRNA) to SSU rRNA gene over a 120-day study period. Although the majority of operational taxonomic units (OTUs) in the EBPR ecosystem were rare, many maintained high potential activities, suggesting that rare OTUs made significant contributions to protein synthesis potential. Few significant differences in OTU abundance and activity were observed between bioreactor redox zones, although differences in temporal activity were observed among phylogenetically cohesive OTUs.
Moreover, observed temporal activity patterns could not be explained by measured process parameters, suggesting that alternate ecological forces shaped community interactions in the bioreactor milieu.

Subsequently, a metagenome was generated from pilot plant biomass samples using 454 pyrosequencing. Comparison of microbial community metabolism across multiple metagenomes from different environments revealed that EBPR community function was enriched in biofilm formation, phosphorus metabolism, and aromatic compound degradation, reflective of local bioreactor conditions. Population genomes binned from metagenomic contigs showed that M. parvicella genomes displayed remarkable genomic cohesion across EBPR ecosystems, with functional differences relating to biofilm formation and antibiotic resistance that likely reflect adaptation to habitat-specific selection pressures. Additionally, novel metabolic insights into Gordonia spp. in the EBPR ecosystem suggested a potential role in polyphosphate and triacylglycerol cycling.

Overall, these findings offer valuable insight into EBPR microbial ecology and will guide future studies aimed at monitoring spatiotemporal patterns in population dynamics and gene expression. Moreover, this work demonstrates that molecular sequencing approaches can be successfully used to gain deeper insight into the microbial communities responsible for wastewater remediation.

Preface

I was responsible for the design and initiation of this research program with direct input from my supervisors, Dr. Steven Hallam and Dr. Eric Hall. My thesis committee members, Dr. William Ramey, Dr. Barry Rabinowitz, and Dr. Don Mavinic, also made significant contributions to the design of the research program. In Chapter 2, I generated and analyzed the small subunit ribosomal RNA (SSU rRNA) gene amplicons and transcripts from biomass samples collected from the UBC enhanced biological phosphorus removal (EBPR) Pilot Plant.
Melanie Scofield and Aria Hahn provided initial training on laboratory protocols and assisted with sample collection. Niels Hanson assisted with data visualization and provided bioinformatic support. Blake Strachan received training from me and assisted with sample collection and subsequent laboratory processing. Sam Bailey, Mike Harvard, Rony Das, and Fred Koch assisted with the operation and maintenance of the UBC EBPR Pilot Plant. I drafted the manuscript with direct input from Dr. William Ramey and Dr. Steven Hallam. Dr. Eric Hall, Dr. Barry Rabinowitz, and Dr. Don Mavinic also provided constructive feedback. Excerpts from Chapter 1 were presented at the 86th Annual Water Environment Federation Technical Exhibition and Conference, Chicago, Illinois, October 9th, 2013 and have been submitted for publication in a peer-reviewed journal:

Lawson, C.E., Strachan, B.J., Hanson, N.W., Hahn, A.S., Hall, E.R., Rabinowitz, B., Mavinic, D.S., Ramey, W.D., & Hallam, S.J. Microbial community structure and activity in a pilot-scale enhanced biological phosphorus removal ecosystem. In review.

In Chapter 3, I generated and analyzed the EBPR metagenome from a biomass sample collected from the UBC EBPR Pilot Plant. Masaru Nobu from the University of Illinois at Urbana-Champaign performed the metagenomic binning with my interpretation. Excerpts from Chapter 2 were presented at the 15th International Symposium on Microbial Ecology, Seoul, Korea, August 29th, 2014 and are in preparation for submission to a peer-reviewed journal.

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
List of Symbols and Abbreviations
Acknowledgements
Dedication
Chapter 1: Introduction - the microbial ecology of enhanced biological phosphorus removal
  1.1 Phosphorus: a broken biogeochemical cycle
  1.2 The removal of phosphorus from municipal wastewaters
  1.3 EBPR biochemical transformations and metabolic models
  1.4 EBPR process configurations
  1.5 Metagenomic insights: an ecosystem model for EBPR
    1.5.1 Polyphosphate-accumulating organisms
    1.5.2 Glycogen-accumulating organisms
    1.5.3 Filamentous hydrolyzing bacteria
    1.5.4 Fermenting bacteria
    1.5.5 Denitrifying bacteria
    1.5.6 Nitrifying bacteria
    1.5.7 Predators: bacteriophage and protozoa
    1.5.8 EBPR environments: from macro to micro
  1.6 Research motivation and objectives
Chapter 2: Microbial community structure and activity in a pilot-scale EBPR ecosystem
  2.1 Synopsis
  2.2 Background
  2.3 Experimental procedures
    2.3.1 Pilot plant operation and sampling
    2.3.2 Nucleic acid extraction and cDNA synthesis
    2.3.3 PCR amplification and pyrosequencing of SSU rDNA and cDNA
    2.3.4 Processing of pyrotag sequences
    2.3.5 Statistical analysis
  2.4 Results
    2.4.1 SSU rDNA and rRNA sequencing
    2.4.2 Overview of microbial community structure and activity
    2.4.3 Relative rRNA abundance across EBPR redox zones
    2.4.4 Abundance and activity of core EBPR taxa
    2.4.5 Temporal dynamics of community structure and activity
  2.5 Discussion
    2.5.1 Rare biosphere is active in EBPR ecosystems
    2.5.2 Temporal activity patterns suggest high microdiversity within genera
    2.5.3 Anticipatory life strategy for EBPR microbes?
  2.6 Concluding remarks
Chapter 3: Metagenomic analysis of a pilot-scale microbial community performing enhanced biological phosphorus removal
  3.1 Synopsis
  3.2 Background
  3.3 Experimental procedures
    3.3.1 Sampling
    3.3.2 DNA extraction and sequencing
    3.3.3 Metagenomic assembly and binning
    3.3.4 Gene annotation and pathway analysis
    3.3.5 Genome comparisons
    3.3.6 Prophage and CRISPR reconstruction
  3.4 Results and discussion
    3.4.1 Sequencing statistics
    3.4.2 Community structure: comparison of pyrotag and metagenomic results
    3.4.3 Microbial community metabolism
    3.4.4 Comparison of population genomes to existing reference genomes
    3.4.5 EBPR ecosystem bacteria-phage interactions
  3.5 Concluding remarks
Chapter 4: Conclusions and future directions
  4.1 Conclusions, limitations, and future directions
Bibliography
Appendix A – Chapter 2 supplementary material
Appendix B – Chapter 3 supplementary material

List of Tables

Table 2.1 Pilot plant process operations and performance data
Table 3.1 Metagenome assembly and sequencing statistics
Table 3.2 Comparison of community structure based on pyrotag and metagenomic methods
Table 3.3 ORFs assigned to capsular and exopolysaccharides metabolism
Table 3.4 Metagenomic contig binning statistics
Table 3.5 Polyphosphate metabolism, M. parvicella
Table 3.6 Polyphosphate metabolism, Gordonia spp.
Table A1 Sampling and sequencing statistics
Table A2 OTU richness and diversity estimates
Table A3 Abundance and activity of select EBPR taxa
Table A4 Indicator OTUs – rDNA abundance
Table A5 Indicator OTUs – SSU rRNA:rDNA ratio
Table B1 Marker genes identified in population genome bins
Table B2 Bin001 (Candidatus ‘Microthrix parvicella’) variable genomic regions
Table B3 Bin002 (Gordonia spp.) variable genomic regions
Table B4 Prophage regions from metagenome
Table B5 Summary of spacer sequences

List of Figures

Figure 1.1 Nutrient cycling in EBPR bioreactors
Figure 1.2 Original EBPR biochemical model
Figure 1.3 Bioreactor configurations to achieve EBPR
Figure 1.4 EBPR ecosystem distributed metabolism
Figure 1.5 Acetate uptake model in Candidatus ‘Accumulibacter phosphatis’
Figure 1.6 EBPR microbial activity dynamics
Figure 1.7 Activated sludge floc composition
Figure 2.1 Generic configuration of the UBC EBPR pilot plant
Figure 2.2 Rarefaction curves for each day in time series
Figure 2.3 Relationship between SSU rDNA and rRNA frequencies
Figure 2.4 Microbial community structure and activity
Figure 2.5 Activity profiles for selected genera
Figure 2.6 Bray-Curtis dissimilarities between samples
Figure 2.7 UBC EBPR Pilot Plant phosphate (PO4) removal performance
Figure 2.8 RDA bioplots
Figure 3.1 Comparison of taxonomic composition using pyrotag and metagenomic methods
Figure 3.2 SEED subsystem comparison of microbial metagenomes with the UBC EBPR Pilot Plant metagenome
Figure 3.3 M. parvicella pathway comparison
Figure 3.4 Fine-scale comparison of M. parvicella genomes
Figure 3.5 Gordonia spp. pathway comparison
Figure 3.6 Prophage coding sequence (CDS) regions reconstructed from metagenome
Figure 3.7 Total spacers count from EBPR metagenome
Figure 3.8 CRISPR spacer-repeat loci (region G4)
Figure B1 UDP-D-xylose biosynthesis pathways in M. parvicella spp.
Figure B2 dTDP-L-rhamnose biosynthesis pathways in M. parvicella spp.

List of Symbols and Abbreviations

ADP  Adenosine diphosphate
AOB  Ammonium oxidizing bacteria
ATP  Adenosine triphosphate
BLAST  Basic Local Alignment Search Tool
BOD  Biochemical oxygen demand
BNR  Biological nutrient removal
COD  Chemical oxygen demand
CRISPR  Clustered regularly interspaced short palindromic repeats
DNA  Deoxyribonucleic acid
DAPI  4',6-diamidino-2-phenylindole
EBPR  Enhanced biological phosphorus removal
ED  Entner–Doudoroff
EMP  Embden–Meyerhof–Parnas
ePGDB  Environmental pathway/genome databases
EPS  Extracellular polymeric substances
FISH  Fluorescent in situ hybridization
GAO  Glycogen accumulating organism
GC  Guanine-cytosine
HGT  Horizontal gene transfer
HRT  Hydraulic retention time
LCFA  Long-chain fatty acid
N  Nitrogen
NADH  Nicotinamide adenine dinucleotide
NH4  Ammonium
NMDS  Non-metric multidimensional scaling
NOB  Nitrite oxidizing bacteria
NOx  Nitrate/nitrite-nitrogen
ORF  Open reading frame
OTU  Operational taxonomic unit
PAO  Polyphosphate accumulating organism
P  Phosphorus
PCR  Polymerase chain reaction
Pi  Phosphate
Pit  Inorganic phosphate transport
PMF  Proton motive force
PolyP  Polyphosphate
Pst  Phosphate-specific transport
PHA  Poly-β-hydroxyalkanoate
PHB  Poly-β-hydroxybutyrate
RDA  Redundancy analysis
RNA  Ribonucleic acid
rDNA  Ribosomal DNA
rRNA  Ribosomal RNA
RT  Reverse transcriptase
SCFA  Short-chain fatty acid
SERC  Staging Environmental Research Centre
SRT  Solids retention time
SSU  Small subunit
TAG  Triacylglycerol
TCA  Tricarboxylic acid
TKN  Total Kjeldahl Nitrogen
TSS  Total suspended solids
TP  Total Phosphorus
UBC  University of British Columbia
UCT  University of Cape Town

Acknowledgements

I would like to acknowledge my supervisor and mentor Dr. Steven Hallam for his unwavering enthusiasm and support during all my academic pursuits. His constant belief that I could successfully pursue research at the interface of microbial ecology and environmental engineering has truly motivated me to become the scientist I am today. Additionally, I would like to thank Dr. William Ramey for the enormous investment he has made toward my scientific training. I truly cherish the many evening-long conversations in Bill’s office learning about microbiology and discussing the philosophies of science. Many thanks also go to Dr. Donald Mavinic and Dr. Eric Hall for supporting me during my Master's degree and allowing me to pursue my passion for an interdisciplinary research project. Without their support, this project would not have been possible. In the same regard, I must also thank Dr. Barry Rabinowitz for his constant enthusiasm and willingness to participate in my research work, providing his invaluable insight on enhanced biological phosphorus removal. I am also grateful for having worked with members of both the Hallam Lab and Pollution Control and Waste Management group. Their constant support and constructive feedback over the years have made my time at UBC most enjoyable. Finally, I could not have completed this degree without the enormous support of my family.
In particular, I wish to thank Christine Tam and Keith Lawson for their continual motivation, support, and investment in my ongoing interests.

To my grandmother, Helen Sorensen, for always encouraging me to follow my dreams.

Chapter 1: Introduction - the microbial ecology of enhanced biological phosphorus removal

1.1 Phosphorus: a broken biogeochemical cycle

Phosphorus is an essential nutrient in all forms of life. It is a central component of DNA, cellular membranes, and ATP, and has no known substitute. However, population growth and intensive farming have resulted in a collapse of the natural phosphorus cycle, leading to the deterioration of surface water quality and a shortage of easily mineable deposits of phosphate rock (Elser and Bennett, 2011). The discharge of excess phosphorus to the aquatic environment can promote excessive algal growth and eutrophication, which exerts a substantial oxygen demand on receiving waters with the potential to upset marine food webs (Wright et al., 2012), and produces a variety of products that adversely affect the suitability of water for consumption (Orihel et al., 2012). While this sink of phosphorus creates a serious threat to the aquatic environment and human health, our major reserves of phosphate for food production are diminishing, creating a one-way flow of phosphorus from rocks to farms to lakes and oceans (Elser and Bennett, 2011). Therefore, the ability to remove and subsequently recover phosphorus from municipal wastewaters is critical to restoring the phosphorus balance, while preventing the deterioration of surface water quality and sustaining available water and nutrient resources for human societies and the biosphere.

1.2 The removal of phosphorus from municipal wastewaters

With the global population becoming increasingly urbanized, the removal of phosphorus from municipal wastewaters has become a crucial aspect of wastewater management.
Fundamental to meeting this objective is the enhanced biological phosphorus removal (EBPR) process, which is widely recognized as the most advanced and sustainable treatment technology in application for the removal and recovery of phosphorus from municipal wastewaters (Coats et al., 2011). The process leverages microbial community metabolism, resulting in the accumulation of polyphosphate within the biomass of polyphosphate accumulating organisms (PAO), where it can be subsequently recovered into a commercial-grade fertilizer via struvite crystallization (Britton et al., 2005). However, despite its successful application in Canada and abroad, development of EBPR technology has largely relied on “black-box” empirical approaches to predict complex biological processes, which tend towards oversimplification (Follows and Dutkiewicz, 2011) and neglect key microbial interactions and activities facilitating phosphorus removal (Mino and Satoh, 2006). Consequently, reliable phosphorus removal to regulated limits is difficult to achieve with EBPR alone and often requires the use of supplementary chemical precipitation (Johnson and Daigger, 2009). These chemical methods increase treatment plant operating costs due to the large volumes of chemical waste sludge generated and additionally reduce the availability of phosphorus for nutrient recovery. Unpredictable loss or reduced activity of microorganisms responsible for phosphorus removal has been the main observation associated with EBPR process instability (Gu et al., 2010). Accordingly, if EBPR is to fully achieve its potential as an effective and reliable environmental biotechnology, a more comprehensive understanding of the microbial ecology of EBPR ecosystems is needed.

1.3 EBPR biochemical transformations and metabolic models

EBPR is achieved in the activated sludge process by cycling biomass through anaerobic “feast conditions” and aerobic “famine conditions” (Barnard, 1975) (Figure 1.1).
This configuration, combined with an anoxic zone for nitrogen removal, is termed biological nutrient removal (BNR). In EBPR processes, it is presumed that system performance is largely dictated by the ability of PAO to store intracellular compounds, namely, poly-β-hydroxyalkanoate (PHA), polyphosphate (polyP), and glycogen (Smolders et al., 1995). In the anaerobic zone of EBPR systems, PAO take up short-chain fatty acids (SCFAs), such as acetate, and store them as PHAs, while degrading internally stored polyP and glycogen for energy and reducing equivalents (Smolders et al., 1994a). Effectively sequestering SCFAs during the anaerobic phase is believed to give PAO a selective advantage over other organisms present in the microbial community for subsequent growth in the aerobic phase, allowing for their proliferation in the system. Under aerobic conditions, internally stored PHA is oxidized and used for growth, conservation of energy, Pi uptake, and glycogen production (Smolders et al., 1994b).

Figure 1.1 Nutrient cycling in EBPR bioreactors (taken from McMahon and Read, 2013). Anaerobic zone receives high soluble phosphate (P) and organic carbon loading from settled wastewater influent (primary effluent). Anaerobic zone characterized by phosphate release and carbon uptake; aerobic zone characterized by phosphate uptake and subsequent P removal in waste activated sludge.

The microbiology of EBPR has been subject to ongoing review (Mino et al., 1998; Seviour et al., 2003; Oehmen et al., 2007; McMahon and Read, 2013; Kang and Noguera, 2014). Two initial metabolic models have been used or expanded upon to describe the biochemical transformations of PAO, based on bulk chemical measurements from lab-scale and pilot-scale systems; namely, the Comeau-Wentzel model (Comeau et al., 1986; Wentzel et al., 1986) and the Mino model (Mino et al., 1987).
The models are generally based on acetate as the primary substrate, although PAO can assimilate other soluble substrates, such as propionate and glucose, for storage as PHA (Jeon and Park, 2000). The major difference between the two models relates to the origin of reducing equivalents needed for PHA biosynthesis. In the Comeau-Wentzel model, reducing equivalents were assumed to be produced anaerobically by the tricarboxylic acid (TCA) cycle, whereas the Mino model proposed that reducing equivalents were generated from the consumption of internally stored carbohydrates (glycogen) based on experimental observations. Under anaerobic conditions, glycogen was assumed to be converted to pyruvate via the Embden–Meyerhof–Parnas (EMP) pathway, producing nicotinamide adenine dinucleotide (NADH). This hypothesis was later contested by Pereira et al. (1996), who used in-vivo 13C and 31P nuclear magnetic resonance (NMR) to show that acetate was mainly used for poly-β-hydroxybutyrate (PHB) storage under anaerobic conditions and that the conversion of glycogen via the Entner–Doudoroff (ED) pathway appeared to be the main source of reducing equivalents. However, more recent genomic and transcriptomic data unambiguously implicate the EMP pathway as the main route for glycolysis in model PAO (Garcia Martin et al., 2006; He et al., 2011b). It was also suggested that glycogen alone could not provide sufficient reducing equivalents to convert SCFAs to PHA (Pereira et al., 1996), implying that other mechanisms, such as the TCA or glyoxylate cycle, were active (Louie et al., 2000; Burow et al., 2008). Therefore, it is likely that both glycogen and variants of the TCA cycle are utilized to generate reducing power, combining the initial ideas of both the Comeau-Wentzel and Mino models.
The main biochemical transformations in the anaerobic and aerobic zones of EBPR systems are summarized as follows (Figure 1.2; adapted from McMahon et al., 2010):

Anaerobic (feast) environment:
i. SCFAs are rapidly assimilated (by active transport driven by the proton motive force, PMF) and stored as PHAs, whose chemical composition depends on the feed carbon substrate. PHB is synthesized with acetate as the carbon source; poly-β-hydroxyvalerate (PHV) and poly-β-hydroxy-2-methylvalerate (PH2MV) are produced with propionate as the carbon source.
ii. Adenosine triphosphate (ATP; the cell's “energy currency”) is produced by the transfer of an energy-rich phosphate group from intracellular polyP to adenosine diphosphate (ADP), resulting in the release of cations and Pi to the bulk liquid.
iii. Intracellularly stored glycogen is degraded for the production of ATP and reducing equivalents (NADH).

Aerobic (famine) environment:
iv. Stored PHA is catabolized through the TCA cycle as a carbon and energy source for biomass growth. Portions of the carbon and ATP produced are used for regeneration of polyP and glycogen.
v. Pi levels in the bulk liquid decrease, coupled to an increase in intracellular polyP levels.
vi. Biomass storage carbohydrates (i.e. glycogen) are replenished.

Figure 1.2 Original EBPR biochemical model adapted from Comeau et al. (1986) and Mino et al. (1987). Ac-: acetate; H+: hydrogen ion; PAO: polyphosphate accumulating organism; polyP: polyphosphate; gly: glycogen; TCA: tricarboxylic acid. [Figure panels: Anaerobic Phase; Aerobic Phase]

1.4 EBPR process configurations

Several bioreactor configurations exist for the process design of EBPR treatment plants (Figure 1.3).
Design and development of EBPR treatment processes have largely been empirical (Oldham and Stevens, 1984; Oldham, 1985; Barnard, 1998); however, several key design considerations are based on the initially proposed EBPR metabolic models (Section 1.3; Fuhs and Chen, 1975; Comeau et al., 1986; Wentzel et al., 1986). These considerations include the following.

i. Sufficient availability of SCFAs to drive PAO carbon storage and phosphorus release (Rabinowitz and Oldham, 1986). Approximately 7 to 10 mg of acetate are needed to remove 1 mg of phosphorus, based on experimental evidence (Grady et al., 2011). To ensure optimal phosphorus removal, particularly in regions with colder wastewater temperatures, SCFAs are added to the anaerobic zone using external carbon sources (e.g. sodium acetate) or pre-fermentation of primary sludge (Rabinowitz and Oldham, 1986).
ii. Alternating anaerobic-aerobic regimes to drive PHA and polyP cycling (Nicholls and Osborn, 1979). Spatial separation of electron donor and electron acceptor selects for bacteria capable of polymer storage, ultimately resulting in excess phosphorus assimilation due to polyP synthesis requirements in the aerobic zone.
iii. Strict maintenance of anaerobic conditions in the anaerobic zone (Barnard, 1976). Oxygen or nitrate entering the anaerobic zone is believed to provide denitrifying bacteria (Section 1.5.5) with an alternative electron acceptor for growth and organic carbon consumption, reducing SCFA availability for PAO. Oxygen or nitrate can potentially enter the anaerobic zone through aggressive mixing or recycle activated sludge lines, which should therefore be minimized.

Figure 1.3 Bioreactor configurations to achieve EBPR. Advantages and disadvantages of each configuration are summarized in Grady et al. (2011). Q: process flow rate; inf: influent; acetate: external acetate addition; IR: internal recycle flow; RAS: recycle activated sludge; eff: effluent.
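
As a rough illustration of the acetate requirement in design consideration (i), the following Python sketch estimates the daily acetate mass needed to remove a given phosphorus load, using the range of 7 to 10 mg of acetate per mg of phosphorus quoted above. The function name, flow rate, and phosphorus concentrations are hypothetical values chosen only for illustration; they are not data from the UBC EBPR Pilot Plant, and the calculation ignores SCFAs already present in the influent.

```python
# Ratio range quoted in the text (Grady et al., 2011): mg acetate per mg P removed.
ACETATE_PER_P_LOW = 7.0
ACETATE_PER_P_HIGH = 10.0


def acetate_demand_kg_per_day(flow_m3_per_day, p_in_mg_per_l, p_out_mg_per_l):
    """Return (low, high) acetate demand in kg/d for the requested P removal.

    All arguments are hypothetical design inputs, not measured values.
    """
    if p_out_mg_per_l > p_in_mg_per_l:
        raise ValueError("effluent P cannot exceed influent P")
    # mg/L equals g/m^3, so (mg/L) * (m^3/d) gives g/d; divide by 1000 for kg/d.
    p_removed_kg_per_day = (p_in_mg_per_l - p_out_mg_per_l) * flow_m3_per_day / 1000.0
    return (ACETATE_PER_P_LOW * p_removed_kg_per_day,
            ACETATE_PER_P_HIGH * p_removed_kg_per_day)


# Hypothetical example: 1000 m^3/d, influent 5 mg P/L, effluent target 1 mg P/L.
low, high = acetate_demand_kg_per_day(1000.0, 5.0, 1.0)
print(f"Acetate demand: {low:.0f} to {high:.0f} kg/d")
```

For these assumed inputs, 4 kg of phosphorus must be removed per day, which corresponds to roughly 28 to 40 kg of acetate per day. In practice part of this demand may be met by SCFAs in the influent or from pre-fermentation of primary sludge, so the result is best read as an upper-bound estimate of the total SCFA requirement.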
1.5 Metagenomic insights: an ecosystem model for EBPR

Recent advances in molecular biology and sequencing throughput have led microbiologists and engineers to appreciate EBPR as a model ecosystem for understanding microbial community metabolism and environmental biotechnology (Nielsen et al., 2012). Previous methodologies based on cultivation-dependent approaches have limited the study of in situ microbial communities performing EBPR at wastewater treatment plants due to isolation biases and low taxonomic resolution (Seviour and Nielsen, 2010). Indeed, decades of research were spent searching for the “super bug” responsible for EBPR, once believed to be Acinetobacter spp. (Fuhs and Chen, 1975). Cultivation-independent approaches based on high-throughput sequencing have now revealed that the majority of microorganisms in natural and engineered ecosystems are uncultured (Amann et al., 1998; Hugenholtz et al., 1998; Rappé and Giovannoni, 2003). These approaches are increasingly being employed to resolve central questions in microbial ecology; namely, who are the key microbial players, what are their functions, and how do they interact? The answers can in turn be used to realize engineering objectives related to managing microbial communities for society's benefit (e.g. biological nutrient removal).

Culture-independent analysis of microbial communities across multiple full-scale wastewater treatment plants has revealed that PAO represent only a minor fraction of microorganisms in EBPR ecosystems (Nielsen et al., 2010; Nielsen et al., 2012). These studies have also shown that EBPR ecosystems are diverse, but share a core microbiome (defined at the genus level), despite differences in treatment plant layout, operations, and wastewater characteristics (Nielsen et al., 2010; Zhang et al., 2012; Nielsen et al., 2012).
Core microorganisms of known functional relevance to EBPR ecosystems include PAO, glycogen-accumulating organisms (GAO), hydrolyzers, fermenters, nitrifiers, denitrifiers, and predators (viruses and protozoa) (Figure 1.4). A summary of the core EBPR microbiome is presented below.

1.5.1 Polyphosphate-accumulating organisms

Culture-independent methods have identified the betaproteobacterial Rhodocyclus-related Candidatus ‘Accumulibacter phosphatis’ (hereafter, Accumulibacter) and the actinobacterial genus Tetrasphaera as important PAOs, based on their abundance and activity in full-scale EBPR ecosystems (Hesselmann et al., 1999; Maszenan et al., 2000; Zilles et al., 2002; Kong et al., 2004; Kong et al., 2005). While the ecophysiology of Accumulibacter agrees with the initially proposed metabolic models (Section 1.3), the ecophysiology of Tetrasphaera is markedly different (He and McMahon, 2012; Kristiansen et al., 2012). Kristiansen et al. (2012) proposed that, under anaerobic conditions, Tetrasphaera PAO synthesize glycogen granules using energy generated through polyP degradation and glucose fermentation. Under subsequent aerobic conditions, stored glycogen is catabolized to provide energy for growth and for the polyP replenishment needed for subsequent anaerobic metabolism (Kristiansen et al., 2012). Both Accumulibacter and Tetrasphaera are also believed to be capable of denitrification (Kong et al., 2004; Flowers et al., 2008; Kristiansen et al., 2012). Whether this is completed independently or in concert with other microbial partners has yet to be determined (Flowers et al., 2013; Kim et al., 2013).

Figure 1.4 EBPR ecosystem distributed metabolism. Hydrolyzing bacteria, such as Chloroflexi, convert larger macromolecules into soluble carbon molecules that are subsequently fermented into short-chain fatty acids by fermenting bacteria, such as Streptococcus.
PAO, such as Accumulibacter, assimilate SCFAs and store them as poly-β-hydroxyalkanoates under anaerobic conditions. Both bacteriophage and grazers (predators) modulate EBPR community dynamics.

It is important to note that other microorganisms have also been observed to accumulate polyP granules based on in situ 4',6-diamidino-2-phenylindole (DAPI) staining and genomic evidence, including Candidatus ‘Microthrix parvicella’ (hereafter, Microthrix) (Erhart et al., 1997; McIlroy et al., 2013; Wang et al., 2014), Gordonia amarae-like organisms (hereafter, Gordonia) (Wong et al., 2005; Beer et al., 2006), and Dechloromonas (Goel et al., 2005; Kong et al., 2007). However, direct evidence of their involvement in phosphorus removal and continuous polyP cycling has yet to be shown. Nevertheless, it seems likely that PAO are not phylogenetically cohesive units, but rather consist of several diverse taxonomic groups that vary among different treatment systems (Mino, 1998; Seviour and Nielsen, 2010).

1.5.2 Glycogen-accumulating organisms

Glycogen-accumulating organisms are considered competitors to PAO in EBPR ecosystems, based on their ability to compete for SCFAs. The two main GAO often identified in lab-scale and some full-scale EBPR ecosystems are the gammaproteobacterial Candidatus ‘Competibacter phosphatis’ (hereafter, Competibacter) and the tetrad-forming Alphaproteobacteria related to Defluviicoccus vanus (Crocetti et al., 2002; Wong et al., 2004; McIlroy and Seviour, 2009). Ecophysiological differentiation between GAO and PAO results from the ability of GAO to perform anaerobic-aerobic cycling of PHA and glycogen, but not polyP (Cech and Hartman, 1993; Wong and Liu, 2006). Because GAO are competitors, it is presumed that their populations should be minimized for stable phosphorus removal. Known factors that control the balance between GAO and PAO populations include SCFA/P ratio, carbon source, pH, and temperature, among other factors, as reviewed by Oehmen et al. (2007).
Recent genomic comparisons have revealed that key metabolic differences between GAO and PAO relate to their phosphate transport systems (McIlroy et al., 2014; Nobu et al., 2014). Here, the genomes of known PAO (i.e. Accumulibacter and Tetrasphaera) encode both the high-affinity phosphate-specific transport (Pst) system and the low-affinity inorganic phosphate transport (Pit) system, whereas GAO genomes only encode the Pst system (Garcia Martin et al., 2006; McIlroy et al., 2014; Nobu et al., 2014). Indeed, the Pit system is also missing in Microthrix, which can accumulate polyP, but is not believed to participate in polyP cycling linked to anaerobic SCFA uptake (Andreasen and Nielsen, 2000; McIlroy et al., 2013). In PAO, the Pit system is believed to be essential for generating the proton motive force under anaerobic conditions through export of Pi in symport with protons (Figure 1.5; Saunders et al., 2007; Burow et al., 2008). As such, several researchers have hypothesized that the Pit system is a prerequisite for polyP cycling in EBPR (McIlroy et al., 2014; Nobu et al., 2014); however, further studies are needed to test this hypothesis.

Figure 1.5 Acetate uptake model in Candidatus ‘Accumulibacter phosphatis’ (taken from Saunders et al., 2007). Proton motive force is generated by export of phosphate in symport with hydrogen ions.

1.5.3 Filamentous hydrolyzing bacteria

Earlier studies based on microscopic identification observed numerous filament morphologies in activated sludge plants, including those configured for EBPR (Eikelboom, 2000). These investigations have been greatly improved by the application of culture-independent methods, which overcome difficulties associated with identification of bacteria based on morphological features (Nielsen et al., 2008).
While excess filamentous bacteria result in serious operational problems at wastewater treatment plants due to sludge settling issues and foaming (Seviour and Nielsen, 2010), they are essential for hydrolysis of macromolecules and the generation of low-molecular weight soluble substrates utilized by other community members, such as PAO. Many non-filamentous bacteria have also been implicated in the hydrolysis of macromolecules, including members affiliated with the phyla Bacteroidetes, Firmicutes, and Proteobacteria (Xia et al., 2008). Macromolecule degradation is accomplished by secretion of exoenzymes by hydrolyzing bacteria, including lipases, proteases, esterases, chitinases, galactosidases, glucuronidases, and phosphatases (Kragelund et al., 2007). These exoenzymes typically remain associated with their producer cell (surface-associated) or diffuse into the adjacent extracellular polymeric substance (EPS) layer, and function by repeatedly fragmenting macromolecules into molecules small enough for assimilation (Confer and Logan, 1998; Wingender et al., 1999).

The abundant macromolecules entering wastewater treatment plants are typically lipids, polysaccharides, and proteins. Bacteria specialized in lipid degradation in EBPR ecosystems include Microthrix (Nielsen et al., 2002) and the Mycolata (mainly Gordonia and Mycobacterium) (Kragelund et al., 2007). Previous in situ experimental data and genomic information have provided significant insight into the ecophysiology of the lipid-accumulating Microthrix (Rossetti et al., 2005; McIlroy et al., 2013). Metabolic models proposed by McIlroy et al. (2013) suggest that under anaerobic conditions, Microthrix preferentially takes up long-chain fatty acids (LCFAs) and accumulates them as triacylglycerols, using energy generated through trehalose and/or polyP degradation or partial oxidation of LCFAs.
Under subsequent aerobic conditions, stored triacylglycerols are processed via β-oxidation and ethylmalonyl-CoA pathways into the TCA cycle, providing energy and precursor metabolites for growth (McIlroy et al., 2013). Comparison of the genome of the isolated Microthrix RN1 strain to two metagenomes recovered from full-scale Danish EBPR plants indicates that limited metabolic differences exist between Microthrix strains. This suggests that proposed metabolic models are generally applicable to Microthrix strains across EBPR ecosystems, and that removal of lipids could be a potential control strategy for excess Microthrix growth resulting in sludge bulking and foaming (McIlroy et al., 2013). In comparison, variable substrate uptake patterns have been observed for the Mycolata, where some members utilize LCFAs (e.g. oleic acid) (Soddell et al., 1998) while others take up acetate or glucose (Carr et al., 2006; Kragelund et al., 2007). Indeed, this highlights the diversity of mycolic acid species in EBPR ecosystems and indicates that further ecological studies are needed to determine effective control strategies.

Filamentous bacteria implicated in the hydrolysis of polysaccharides and proteins in EBPR ecosystems are commonly affiliated with the phyla Bacteroidetes, Chloroflexi, and candidate division TM7 (Miura et al., 2007; Xia et al., 2007; Kragelund et al., 2009; Yoon et al., 2010; Albertsen et al., 2013). Microbial degradation of polysaccharides and proteins in wastewaters is essential for the generation of monosaccharides (e.g. glucose) and amino acids, often the rate-limiting step for biological nutrient removal (Dueholm et al., 2001; Morgenroth et al., 2002). Interestingly, a group of protein-hydrolyzing epiphytic rods affiliated with the family Saprospiraceae has been observed to grow attached to filaments belonging to the phyla Chloroflexi, Proteobacteria, and candidate phylum TM7 (Xia et al., 2008).
The advantage of epiphytic growth is currently unknown; however, it is hypothesized that such interactions may be symbiotic, where attachment protects epiphytic bacteria from washout and, in return, provides amino acid substrates to their filamentous hosts (Xia et al., 2008).

1.5.4 Fermenting bacteria

Fermenting bacteria carry out anaerobic degradation of wastewater organic carbon derived from the hydrolysis of macromolecules. Here, bacteria ferment simple monosaccharides, fatty acids, and amino acids to SCFAs, which are primary substrates for both PAO and denitrifying bacteria (Section 1.5.5) (Vollertsen et al., 2006). SCFA availability is often limiting in EBPR ecosystems, and operators are therefore required to add external sources or use pre-fermentation processes to achieve optimal nutrient removal (Section 1.4; Rabinowitz and Oldham, 1986), an important economic consideration in wastewater treatment.

Previous culture-independent studies using labeling and isotope experiments coupled to fluorescence in situ hybridization (FISH) have identified bacteria affiliated with the phyla Firmicutes, Actinobacteria, and Bacteroidetes as active fermenters in EBPR ecosystems (Kong et al., 2008; Nielsen et al., 2012). Kong et al. (2008) demonstrated that the genera Streptococcus and Tetrasphaera were dominant monosaccharide fermenters in full-scale EBPR plants, where the main fermentation products (in descending order) were propionic acid, lactic acid, acetic acid, and formic acid. Subsequent studies by Nielsen et al. (2012) were consistent with this, and further implicated the genera Propionicimonas (Actinobacteria) and Lactococcus (Firmicutes) in active fermentation. Indeed, these studies reveal that fermentation is diverse in EBPR ecosystems based on the identification of multiple metabolic pathways, which likely ensures that a broad mixture of fermentation products is produced to sustain other microbial populations (Kong et al., 2008; Nielsen et al., 2012).
Nonetheless, further insight into fermentation processes in EBPR ecosystems is needed to better control and optimize SCFA production for efficient phosphorus removal.

1.5.5 Denitrifying bacteria

Denitrification is widespread among the prokaryotes and allows microorganisms to cope with oxygen-limited conditions. It is the second step in nitrogen removal from municipal wastewaters (following nitrification, discussed below) and is often performed together with EBPR for complete nutrient removal. At full-scale wastewater treatment plants, the addition of an anoxic zone (defined in engineering as the presence of nitrate in the absence of oxygen) is required to select for the growth of denitrifying organisms. Denitrifying bacteria commonly identified in EBPR ecosystems are affiliated with the families Rhodocyclaceae, Comamonadaceae, and Hyphomicrobiaceae (Kong et al., 2004; Osaka et al., 2006; Hesselsoe et al., 2009); however, many other bacteria are also known to denitrify (Daims & Wagner, 2010). As such, strategies based on phylogenetic markers (e.g. SSU rRNA gene) have limited application as process diagnostics for monitoring and control of denitrification at EBPR treatment plants, because some organisms that denitrify are not dependent on denitrification and will also grow in non-denitrifying conditions. Instead, functional gene markers based on key enzymes involved in nitrate reduction have been used to monitor in situ activity of denitrifiers in environmental samples (Braker et al., 1998; Taroncher-Oldenburg et al., 2003). Such markers include core genes involved in the reduction of nitrate to nitrogen gas, including the respiratory nitrate reductase (Nar), nitrite reductase (Nir), nitric oxide reductase (Nor), and nitrous oxide reductase (Nos) genes.

1.5.6 Nitrifying bacteria

Nitrification is the first step in nitrogen removal from wastewater.
Most nitrogen in municipal wastewater enters the treatment plant as ammonium (NH4+) or nitrogen-based organic molecules (urea and proteins). Microbial degradation of urea and proteins by hydrolyzing bacteria (Section 1.5.3) results in the further release of NH4+ through the process of ammonification. In biological wastewater treatment, nitrification is achieved in the aerobic zone by maintaining sufficient solids retention times (SRT) for the proliferation of slow-growing nitrifying bacteria (Grady et al., 2011). Two main groups of bacteria are known to catalyze the transformation of ammonia to nitrate: the ammonium oxidizing bacteria (AOB) and the nitrite oxidizing bacteria (NOB) (Daims & Wagner, 2010). AOB catalyze the oxidation of ammonium to nitrite using the membrane-bound enzyme ammonia monooxygenase (AMO) and the periplasmic enzyme hydroxylamine oxidoreductase (HAO) (Hooper et al., 1997; Olson and Hooper, 1983). Bacteria affiliated with the genus Nitrosomonas are commonly considered the most important AOB in wastewater treatment plants (including EBPR), based on culture-independent surveys of 16S rRNA and amoA gene sequences (Purkhold et al., 2000). However, other microorganisms have also been implicated in ammonia oxidation at wastewater treatment plants, including the bacterial genera Nitrosococcus and Nitrosospira, as well as ammonia oxidizing archaea (Park et al., 2006; Stahl and de la Torre, 2012).

Subsequent to ammonium oxidation, NOB catalyze the conversion of nitrite to nitrate. NOB are more phylogenetically diverse than AOB; known NOB include the genera Nitrobacter, Nitrococcus, and Nitrospina, affiliated with the phylum Proteobacteria, and the genus Nitrospira, affiliated with the phylum Nitrospirae (Daims & Wagner, 2010). The key enzyme involved in nitrite oxidation is nitrite oxidoreductase (Nxr) (Spieck et al., 1996; Bock and Wagner, 2006).
Recently, recovery of the Candidatus ‘Nitrospira defluvii’ (hereafter, Nitrospira) genome through metagenomic sequencing identified variant nxr genes that differed dramatically from those of other known nitrite oxidizers from natural ecosystems (Spieck et al., 2006; Lücker et al., 2010). This was consistent with comparative genomic analyses that showed Nitrospira isolates from activated sludge and natural ecosystems were evolutionarily distant because the sludge Nitrospira genome was shaped by horizontal gene transfer (HGT) events with anaerobic ammonium-oxidizing planctomycetes (Lücker et al., 2010). The extent to which HGT has shaped other genomes in EBPR ecosystems remains to be more fully explored. As such, further insight into genomic differentiation processes likely holds promise for identifying functionally important traits specific to survival in EBPR ecosystems.

1.5.7 Predators: bacteriophage and protozoa

Aside from bacteria, bacteriophage (i.e. viruses) and eukaryotic protozoa also play important roles in EBPR ecosystems. While a large number of viruses enter EBPR ecosystems through sewage, our knowledge of bacteriophage ecology and their impact on microbial population dynamics is limited (Otawa et al., 2007). Nevertheless, recent advances in environmental genomics have shed light on important phage-bacteria interactions that occur in EBPR ecosystems (Kunin et al., 2008; Albertsen et al., 2012). For example, using a combination of metagenomics and community expression analysis, Kunin et al. (2008) revealed that bacteriophage actively prey on globally dispersed Accumulibacter populations, where Accumulibacter adapts locally to phage predation pressures. Here, key genes differentiating Accumulibacter populations were related to EPS gene cassettes, which act as phage defense mechanisms by masking bacteriophage receptors on bacterial cell surfaces (Forde and Fitzgerald, 2003; Kunin et al., 2008).
Additional phage defense mechanisms found in the Accumulibacter genomes included the recently discovered CRISPR (clustered regularly interspaced short palindromic repeats)-Cas (CRISPR-associated proteins) adaptive immunity system (Barrangou et al., 2007; Horvath et al., 2010), where the majority of CRISPR spacers were derived from local phage DNA (Kunin et al., 2008). Indeed, this suggests that bacteriophage significantly impact EBPR population dynamics and that future studies are needed to further elucidate phage-bacteria interactions.

Protozoan grazers are the main consumers of bacteria in the environment and play a major role in controlling bacterial biomass and nutrient cycling in EBPR ecosystems (Sherr et al., 2002; Moreno et al., 2010; Warren et al., 2010). Protozoa can reach concentrations of 10^5 to 10^6 cells mL^-1 in activated sludge and are dominated by a taxonomically diverse range of ciliates (Warren et al., 2010), in addition to flagellates, apicomplexans, amoebae, and pathogenic protozoa (e.g. Giardia and Cryptosporidium). Microscopic identification of protozoa by plant operators has been used as a convenient indicator for assessing changes in biochemical oxygen demand (BOD) removal performance, although limited work has extended these metrics to assessing phosphorus removal (Madoni et al., 1993; Warren et al., 2010). While most literature indicates that protozoa play direct roles in bacterial consumption and inorganic particle removal, little is known about protozoan feeding dynamics, food selectivity, and interaction with other microbial community members (Petropoulos and Gilbride, 2005; Moreno et al., 2010; Warren et al., 2010). As such, further understanding of protozoan ecology in EBPR ecosystems is needed to better predict bacterial responses to protozoan grazing.

1.5.8 EBPR environments: from macro to micro

The co-existence of microorganisms in EBPR ecosystems is influenced by conditions at both macro- and micro-scales.
Three redox conditions, separated spatially by individual anaerobic, anoxic, and aerobic zones, characterize the EBPR macro environment. While the hydraulic retention time in each zone is between 1 and 10 hours, the solids retention time can be greater than 20 days, requiring EBPR microbes to constantly adapt to changing redox conditions with each bioreactor cycle. Moreover, microorganisms must also adapt to strong nutrient gradients between bioreactor redox zones, which often requires strategies for temporary carbon sequestration and storage (e.g. PHA granules and triacylglycerol). Indeed, these dynamic conditions are believed to select for versatile microorganisms that maintain some level of activity across most redox zones in the ecosystem (Nielsen et al., 2012) (Figure 1.6). Exposure to such dynamic conditions promotes niche partitioning in EBPR ecosystems, as some microorganisms are more suited for growth in one zone than another (e.g. fermenters versus nitrifiers).

This niche partitioning is further driven by the diversity of substrates present in wastewaters and those generated through macromolecule hydrolysis (Section 1.5.3). For example, Kindaichi et al. (2013) observed that several core EBPR microbes maintained distinct and highly specialized substrate uptake patterns, even when exposed to variations in substrate and growth conditions. This suggests that strong selection pressures for metabolic specialization exist in EBPR ecosystems, likely driven by biochemical conflicts (Johnson et al., 2012). Such niche partitioning also implies that substrate type (i.e. wastewater composition) plays a substantial role in determining EBPR community structure (Kindaichi et al., 2013).

Figure 1.6 EBPR microbial activity dynamics (adapted from Nielsen et al., 2012). Most microorganisms are versatile and maintain some level of activity across the three redox conditions. Hydrolyzing bacteria, such as Microthrix, produce exoenzymes in all zones.
Fermenting bacteria, such as Streptococcus, are highly active in the anaerobic zone, but can also grow in the aerobic zone because they tolerate oxygen and nitrate but do not use them. Polyphosphate accumulating organisms (PAO), such as Accumulibacter, actively store acetate under anaerobic conditions and grow under aerobic conditions. Some Accumulibacter strains can also denitrify in the anoxic zone.

Microenvironment conditions, such as those associated with activated sludge floc, also drive community assembly in EBPR ecosystems. Flocs are typically 50-100 μm in diameter and contain a complex mixture of different microcolonies, filamentous bacteria, and EPS (Nielsen et al., 2012) (Figure 1.7). The function of activated sludge floc remains largely unknown; however, many believe microbial floc assemblage is important for biomass retention (Grady et al., 2011), resistance to chemical and enzymatic breakdown (Henriques and Love, 2007), and protection from predation (Forde and Fitzgerald, 2003). Floc formation is also proposed to be the consequence of stochastic and deterministic ecological factors that are complex and poorly understood (van der Gast et al., 2008; Ayarza et al., 2011). Elucidating these factors, and the local interactions between microorganisms in activated sludge floc, is essential for understanding controls on microbial assemblage in EBPR ecosystems.

Figure 1.7 Activated sludge floc composition (taken from Nielsen et al., 2012).

1.6 Research motivation and objectives

The overarching motivation for this research was to gain further insight into the microbial ecology of biological phosphorus removal, such that design principles for improved engineering of EBPR ecosystems can be developed.
Resulting principles will support the optimization of existing nutrient removal systems for improved treatment performance and process stability, while also enabling the development of novel EBPR process designs and control strategies.

Chapter 2 examines the activity of abundant and rare (i.e. very low abundance) microorganisms present in the University of British Columbia (UBC) EBPR Pilot Plant using pyrotag sequencing. Currently, little is known about microbial activity dynamics in engineered ecosystems, in particular among rare members of the biosphere (Sogin et al., 2006; Pedrós-Alió, 2006; Kim et al., 2013). We hypothesized that microbial activity dynamics vary in EBPR ecosystems, both spatially and temporally, and posited that rare microorganisms make important contributions to nutrient cycling, based on recent observations from natural ecosystems (Neufeld et al., 2008; Pester et al., 2010; Campbell et al., 2011; Wilhelm et al., 2014).

Chapter 3 examines the functional potential of the UBC EBPR Pilot Plant using metagenomic sequencing. To date, studies exploring the functional potential of full-scale EBPR communities have been limited, but suggest that EBPR ecosystems manifest a high degree of microdiversity and significant selection pressure from phage (Albertsen et al., 2011; Albertsen et al., 2013). We hypothesized that local selection pressures in EBPR ecosystems are responsible for genomic differentiation events observed between microbial populations, and that such differentiation is acquired through mobile gene pools, based on current ecological theory (Polz et al., 2013; Cordero and Polz, 2014).

This thesis had the following objectives:
1. Compile current knowledge on the microbial ecology of biological phosphorus removal.
2. Determine the spatial and temporal activity dynamics of microorganisms in the UBC EBPR Pilot Plant.
3.
Assess the functional potential of microbial communities in the UBC EBPR Pilot Plant and compare this potential to other EBPR ecosystems.

Chapter 2: Microbial community structure and activity in a pilot-scale EBPR ecosystem

2.1 Synopsis

Enhanced biological phosphorus removal (EBPR) relies on diverse but specialized microbial communities to mediate the cycling and ultimate removal of phosphorus from municipal wastewaters. However, little is known about microbial activity and dynamics in relation to process fluctuations in EBPR ecosystems. Here, temporal changes in microbial community structure and activity were monitored across each bioreactor zone in a pilot-scale EBPR treatment plant by examining the ratio of small subunit ribosomal RNA (SSU rRNA) to SSU rRNA gene (rDNA) over a 120-day study period. Although the majority of OTUs in the EBPR ecosystem were rare, many maintained high potential activities based on SSU rRNA:rDNA ratios, suggesting that rare OTUs contribute substantially to protein synthesis potential in EBPR ecosystems. Few significant differences in OTU abundance and activity were observed between bioreactor redox zones, although differences in temporal activity were observed among phylogenetically cohesive OTUs. Moreover, observed temporal activity patterns could not be explained by measured process parameters, suggesting that other ecological drivers, such as grazing or viral lysis, modulated community interactions. Taken together, these results point towards complex regulatory controls within the EBPR ecosystem based on “anticipatory” life strategies that attune EBPR communities to changing bioreactor redox conditions and nutrient concentrations.

Sections of this chapter have been submitted for publication and were presented at: Lawson, C.E., Rabinowitz, B., Mavinic, D.S., Ramey, W.D., Hallam, S.J. (2013).
Structure of the active microbial community in a pilot-scale enhanced biological phosphorus removal process revealed through 454-pyrotag sequencing. Proceedings of the 86th Annual Water Environment Federation Technical Exhibition and Conference, Chicago, Illinois, October 5-9, 2013.

2.2 Background

Despite its widespread usage, the stability and efficacy of EBPR can be unreliable at full-scale operations due to removal or reduced activity of taxa mediating polyphosphate storage and nutrient cycling (Neethling et al., 2005). The cause of this instability has been difficult to predict, as our knowledge of microbial community structure and activity within the EBPR ecosystem is limited. As such, there is a pragmatic interest in charting the structure, function, and activities of microbial communities in EBPR ecosystems to better link bioreactor processes to microbial agents and ultimately improve the design and operation of wastewater treatment plants.

Multiple molecular surveys targeting the small subunit ribosomal RNA gene (SSU rDNA) have been conducted to chart microbial abundance across EBPR ecosystems (Eschenhagen et al., 2003; Hall et al., 2010; Wan et al., 2011; Silvia et al., 2012), and recent studies have improved quantitative insights through the use of high-throughput amplicon sequencing (Zhang et al., 2012; Kim et al., 2013a; Saunders et al., 2013). These studies revealed that despite being diverse, EBPR ecosystems shared a core microbiome composed of relatively stable and abundant taxa (Nielsen et al., 2010; Zhang et al., 2012).
Stable isotope probing and microautoradiography have provided several metabolic linkages between bioreactor processes and active taxa, including polyphosphate accumulation (Kong et al., 2005; Kim et al., 2013b), nitrification (Dolinšek et al., 2013), denitrification (Osaka et al., 2006; Thomsen et al., 2007), hydrolysis (Kragelund et al., 2007), fermentation (Kong et al., 2008; Nielsen et al., 2012), and grazing (Moreno et al., 2010). While these studies effectively linked microbial community structure to function in EBPR ecosystems, they neither accounted for the total diversity of active microbes nor determined the dynamics of these microbial activities in relation to process fluctuations. Moreover, these studies did not consider whether the rare biosphere, defined by the long tail of low-abundance taxa identified in microbial communities (Sogin et al., 2006; Pedrós-Alió, 2006), contributed to EBPR nutrient cycling and process stability.

High-throughput sequencing combining both SSU rDNA and rRNA can be used to measure the abundance and potential activity of abundant and rare taxa within microbial communities (Campbell et al., 2011; Campbell and Kirchman, 2013; Hunt et al., 2013; Wilhelm et al., 2014). Here, the potential activity of microbial taxa was inferred based on the ratio of recovered rRNA to rDNA sequences. While this approach does not directly measure in situ microbial growth rates, rRNA can represent past, present, or emerging cellular activities and provide a robust proxy for microbial protein synthesis potential (Blazewicz et al., 2013). Moreover, monitoring rRNA dynamics over time can identify changes in ribosome synthesis and degradation, and inform hypotheses related to life strategies within microbial communities (Lepp and Schmidt, 1998; Barnard et al., 2013).
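The rRNA:rDNA ratio described above can be illustrated with a short calculation. The following is a minimal sketch, not the analysis pipeline used in this study: it assumes that read counts per OTU are first normalized to relative abundance within each library before the ratio is taken, and that OTUs with no rDNA reads are skipped because the ratio is undefined. The function names and the example counts are hypothetical.

```python
def relative_abundance(counts):
    """Convert raw read counts per OTU into fractions of the library total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library contains no reads")
    return {otu: n / total for otu, n in counts.items()}


def rrna_rdna_ratios(rrna_counts, rdna_counts):
    """Return the rRNA:rDNA relative-abundance ratio for each OTU.

    OTUs absent from the rDNA library are omitted, since dividing by a
    zero rDNA abundance would leave the ratio undefined.
    """
    rrna_rel = relative_abundance(rrna_counts)
    rdna_rel = relative_abundance(rdna_counts)
    ratios = {}
    for otu, dna_fraction in rdna_rel.items():
        if dna_fraction == 0:
            continue
        ratios[otu] = rrna_rel.get(otu, 0.0) / dna_fraction
    return ratios


# Hypothetical read counts: OTU_C is rare in the rDNA library (1% of reads)
# but contributes 10% of rRNA reads, so its ratio is high.
rdna = {"OTU_A": 900, "OTU_B": 90, "OTU_C": 10}
rrna = {"OTU_A": 500, "OTU_B": 400, "OTU_C": 100}
for otu, ratio in sorted(rrna_rdna_ratios(rrna, rdna).items()):
    print(f"{otu}: rRNA:rDNA = {ratio:.2f}")
```

In this toy example the rarest OTU has the highest ratio, which is the pattern the text refers to when it describes rare taxa with high potential activity. Ratios for OTUs with very few rDNA reads are sensitive to sampling noise, so a minimum read threshold would likely be needed in practice.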
Indeed, the cyclical exposure of microbes to dynamic EBPR redox conditions (anaerobic to aerobic) is believed to support multiple coexisting life strategies differentiated by population-level metabolic traits (e.g. aerobic respiration versus fermentation) (Nielsen et al., 2012).

In the present study, microbial community dynamics were monitored across each bioreactor zone in a pilot-scale EBPR treatment plant over a 120-day study period using 454 pyrosequencing targeting the V6-V8 region of SSU rDNA and rRNA with three-domain resolution. Resulting sequence information was used to explore the potential activity of rare and abundant taxa in EBPR ecosystems and assess whether microbial community activity exhibited temporal variation under steady-state bioreactor conditions. Resulting datasets indicate that a large proportion of rare taxa are potentially active in EBPR ecosystems, similar to natural ecosystems (Jones and Lennon, 2010; Campbell et al., 2011; Hunt et al., 2013; Wilhelm et al., 2014), and that rare taxa may make important contributions to EBPR nutrient cycling and process stability.

2.3 Experimental procedures

2.3.1 Pilot plant operation and sampling

This study was conducted at the Staging Environmental Research Centre (SERC; lat. 49.245378, long. -123.22940) located on the UBC campus. SERC is an EBPR pilot plant operating with a municipal wastewater feed and supplementary acetate addition as a SCFA source. The plant uses a University of Cape Town (UCT) activated sludge configuration with a Zee-Weed® membrane system installed in the aerobic zone for solids-liquid separation and is designed for carbon oxidation, nitrification-denitrification, and EBPR (Figure 2.1). The pilot plant was operated under steady-state conditions at a solids retention time (SRT) of 15 days for the duration of the study. Wastewater temperatures ranged from 13 to 20 °C.
Biomass samples were collected from each bioreactor zone every 2 weeks from February 2013 to May 2013 to characterize microbial community structure and activity (Table A1). To avoid RNA degradation, samples were immediately flash frozen and stored at -80 °C until further use. Table 2.1 summarizes operating conditions and performance of the treatment plant during the 120-day study period. Chemical oxygen demand (COD), orthophosphate-phosphorus (PO4-P), total phosphorus (TP), ammonium-nitrogen (NH4-N), nitrate/nitrite-nitrogen (NOx), total Kjeldahl nitrogen (TKN), total suspended solids (TSS), and SCFA were measured according to Standard Methods (APHA, 2005).

Figure 2.1 Generic configuration of the UBC EBPR pilot plant. The UBC EBPR pilot plant employs a University of Cape Town (UCT) activated sludge configuration with membrane filtration for solids-liquid separation. Carbon-rich primary effluent enters the anaerobic zone and is mixed with recycle activated sludge from the anoxic zone. Anaerobic conditions are characterized by minimal dissolved oxygen and nitrate concentrations and microbial phosphorus release. Sodium acetate is added to the anaerobic zone as an external carbon source to improve biological phosphorus removal and denitrification. Activated sludge then enters the anoxic zone and is mixed with nitrate-rich activated sludge from the aerobic zone. Dissolved oxygen concentrations are minimal. Here, anaerobic respiration of carbon with nitrate occurs; nitrogen is removed from the system as nitrogen gas. Biomass then enters the aerobic zone where nitrification, carbon oxidation, and phosphorus uptake occur. Dissolved oxygen levels are maintained via air sparging. A submerged membrane unit provides solids-liquid separation; effluent is returned to the sewer system and biomass is retained in the bioreactor, such that all microorganisms experience the same solids retention time governed by the solids wasting rate.
Phosphorus removal occurs via solids wastage.

Table 2.1 Process operations and performance data

Parameter        Unit       N     Mean             Max     Min
Influent
COD              mg/L       25    168.52 ± 41.1    579     59
SCFA             mg COD/L   24    43.24 ± 4.3      67.09   21.21
PO4-P            mg/L       98    2.40 ± 0.09      3.35    0.66
TP               mg/L       30    3.80 ± 0.3       5.52    2.35
NH4-N            mg/L       99    36.90 ± 1.09     48.60   2.82
NOX-N            mg/L       30    0.095 ± 0.017    0.36    0.00
TKN              mg N/L     19    41.93 ± 4.96     88.20   22.60
Effluent
COD              mg/L       25    29.36 ± 11.4     122     0.00
PO4-P            mg/L       98    0.29 ± 0.11      3.99    0.00
TP               mg P/L     30    0.24 ± 0.13      1.33    0.00
NH3-N            mg/L       99    0.19 ± 0.09      4.70    0.00
NOX-N            mg/L       99    16.19 ± 0.8      24.30   4.50
TKN              mg N/L     30    1.36 ± 0.35      4.06    0.00
Operational data
pH¹              pH         23    7.28 ± 0.09      7.68    6.62
Temperature      °C         106   16.37 ± 0.31     20.40   13.00
Dissolved O2     mg/L       110   2.26 ± 0.19      7.20    0.25
TSS¹             mg/L       20    4441 ± 266       5350    3230
Influent Flow    L/min      22    3.43 ± 0.03      3.6     3.3
Recycle ratio    -          -     1                -       -
SRT              days       -     15               -       -
HRT              hours      -     10               -       -

N, number of measurements. Max, maximum observed value. Min, minimum observed value. ¹Measurements were taken in the aerobic zone only.

2.3.2 Nucleic acid extraction and cDNA synthesis

Total genomic DNA and RNA were extracted from biomass samples using the FastDNA® Spin Kit for Soil (MP Biomedicals, Solon, OH, USA) and the RNeasy Mini Kit (Qiagen, Valencia, CA, USA), respectively. A 60-second bead-beating step with a spherical ceramic bead (lysing matrix E) at a speed setting of 4.0 on a FastPrep (MP Biomedicals) was used for sample lysis. An on-column DNase I digestion was applied to remove DNA contamination from RNA prior to RT-PCR (Qiagen). Genomic DNA was quality checked using agarose gel electrophoresis and total RNA was quality checked using the Bioanalyzer RNA 6000 Nano assay (Agilent, Santa Clara, CA, USA) to ensure that only high-quality nucleic acids were used for downstream analysis. Total RNA was reverse transcribed to complementary DNA (cDNA) using random hexamers and a Superscript® III first-strand synthesis kit (Invitrogen).
DNA contamination in RNA samples was determined by performing cDNA synthesis reactions without reverse transcriptase (RT). Reactions without RT were then subjected to PCR identically to reactions with RT.

2.3.3 PCR amplification and pyrosequencing of SSU rDNA and cDNA

The V6-V8 region of the SSU rRNA gene was amplified from DNA and cDNA templates using the universal primer pair 926F (5'-AAACTYAAAKGAATTGRCGG-3') and 1392R (5'-ACGGGCGGTGTGTRC-3'). Primers were modified to include 454 pyrosequencing adaptor sequences and reverse primers included a five base-pair barcode according to previously published protocols (Engelbrektson et al., 2010; Allers and Wright et al., 2012). 50 μl PCR reactions were performed in duplicate and pooled to minimize PCR bias. Each reaction used 0.6 μl Taq DNA Polymerase (5 U/μl), 5 μl 10X PCR buffer, 3 μl magnesium chloride, 4 μl 2 mM dNTP mix, 1 μl of each primer, and 10 ng template. Negative controls were included with each reaction to ensure that no contamination of DNA had occurred. Samples were diluted to 10 ng/μl and pooled in equal concentrations prior to sequencing. Emulsion PCR and sequencing were performed at Genome Quebec (Montreal, Canada) on the Roche 454 GS FLX Titanium platform (454 Life Sciences, Branford, CT, USA) according to manufacturer's instructions.

2.3.4 Processing of pyrotag sequences

A total of 1,432,464 SSU rDNA and rRNA pyrotag sequences were processed using the Quantitative Insights Into Microbial Ecology (QIIME) version 1.4.0 software package (Caporaso et al., 2010). Sequences with fewer than 150 bases, ambiguous 'N' bases, or homopolymer runs were removed before chimera detection. Chimeric sequences were identified with QIIME via ChimeraSlayer and removed prior to taxonomic assignment. A total of 633,521 rDNA and 794,255 rRNA non-chimeric sequences were clustered at 97% similarity into operational taxonomic units (OTUs).
Representative sequences from each cluster were queried against the SILVA 111 ribosomal RNA database (Quast et al., 2013) using the Basic Local Alignment Search Tool (BLAST) (Altschul et al., 1990) to assign taxonomy. Singleton OTUs (represented by one read only) were omitted from downstream analysis to reduce overprediction of rare OTUs (Kunin et al., 2010).

2.3.5 Statistical analysis

To make microbial community data more suitable for multivariate analysis, OTU matrices were standardized and Hellinger-transformed (Legendre and Legendre, 1998; Ramette, 2007). SSU rDNA values for rare OTUs that recovered rRNA but not rDNA were imputed using non-parametric multiplicative replacement (Martín-Fernández et al., 2003). Differences in microbial community structure and activity during the study period were explored using non-metric multidimensional scaling (NMDS) based on Bray-Curtis dissimilarity. A stress value was calculated which measures how far the distances in the reduced-space configuration are from being monotonic to the original distances in the OTU matrix (Legendre and Legendre, 1998). Redundancy analysis (RDA) was performed to interpret changes in microbial community structure and activity with process variables including SCFA/P ratio, temperature, dissolved oxygen, effluent nitrate/nitrite, and effluent phosphorus. Permutation tests were conducted to assess the significance of RDA constraints (Legendre et al., 2011). Indicator analysis was used to identify OTUs specifically associated with time periods predefined based on NMDS ordinations. The statistical significance of the indicator value was evaluated using a randomization procedure. NMDS and RDA were performed using the package vegan, version 2.0.8, and indicator analysis was performed using the package labdsv version 1.5.0 in R.
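As an illustration of the Hellinger transformation and Bray-Curtis dissimilarity named in Section 2.3.5, the following minimal Python sketch computes both for a small OTU table. It is not the vegan workflow used in this study: the sample names, read counts, and function names are hypothetical, and applying Bray-Curtis to the transformed rows is an assumption made only for the example.

```python
import math

def hellinger(counts):
    """Hellinger-transform one sample: square root of each relative abundance."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [math.sqrt(c / total) for c in counts]

def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two non-negative abundance vectors."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator if denominator else 0.0

# Hypothetical read counts for four OTUs in three samples (not thesis data).
samples = {
    "A1": [120, 30, 0, 50],
    "B1": [100, 40, 10, 50],
    "C1": [10, 5, 150, 35],
}

transformed = {name: hellinger(counts) for name, counts in samples.items()}
d_ab = bray_curtis(transformed["A1"], transformed["B1"])
d_ac = bray_curtis(transformed["A1"], transformed["C1"])
print(f"A1 vs B1: {d_ab:.3f}")
print(f"A1 vs C1: {d_ac:.3f}")
```

In this toy table, samples A1 and B1 share a similar composition and therefore sit closer together than A1 and C1; a pairwise matrix of such dissimilarities is the input that an NMDS ordination would rank-order.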
2.4 Results

2.4.1 SSU rDNA and rRNA sequencing

A total of 616,766 rDNA and 778,266 rRNA pyrotag sequences recovered from 72 time-resolved and replicated biomass samples were clustered with a 97% similarity cutoff into 30,946 OTUs after singleton removal (Table A2). These OTUs encompassed 53 phyla affiliated with 402 families spanning Archaea, Bacteria and Eukaryota. Relative activity of microbial populations was inferred based on the ratio of recovered SSU rRNA sequences to rDNA sequences (SSU rRNA:rDNA ratio) for a given time interval. OTUs were considered active if rRNA recovery exceeded rDNA recovery (Rodriguez-Blanco et al., 2009; Jones and Lennon, 2010). Abundant OTUs were arbitrarily defined as having a frequency >1% in at least one sample, intermediate OTUs as having a frequency between 1% and 0.1% in at least one sample, and rare OTUs as having a frequency <0.1% in all samples (Galand et al., 2009). Rarefaction curves constructed from rDNA and rRNA libraries across each bioreactor redox zone and time point indicated that the total diversity of each sample was not reached despite the depth of sequencing (Figure 2.2).

Figure 2.2 Anaerobic (red), anoxic (green), and aerobic (blue) zone rarefaction curves for each day in time series. Horizontal axis shows number of reads sampled; vertical axis indicates accumulation of OTUs. Rows 1 and 2 indicate SSU rDNA and rRNA reads, respectively. Curves that become horizontal with increasing sequencing depth indicate that full sampling depth has been reached.
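The classification rules stated in Section 2.4.1 can be sketched in a few lines of Python. This is a hedged illustration rather than the analysis code used in the study: the OTU names and frequencies are hypothetical, classification on rDNA frequency and the handling of values exactly at the 1% and 0.1% thresholds are assumptions, and the ratio is computed from mean relative frequencies for simplicity.

```python
def abundance_class(rdna_freqs):
    """Classify an OTU from its rDNA relative frequencies (fractions) across samples."""
    peak = max(rdna_freqs)
    if peak > 0.01:        # >1% in at least one sample
        return "abundant"
    if peak >= 0.001:      # between 0.1% and 1% in at least one sample
        return "intermediate"
    return "rare"          # <0.1% in all samples

def activity_status(rrna_freqs, rdna_freqs):
    """Return (rRNA:rDNA ratio, label) from mean relative frequencies."""
    mean_rrna = sum(rrna_freqs) / len(rrna_freqs)
    mean_rdna = sum(rdna_freqs) / len(rdna_freqs)
    if mean_rdna == 0:
        return None, "undefined"   # rDNA not recovered; ratio cannot be computed
    if mean_rrna == 0:
        return 0.0, "dormant"      # rDNA recovered but no rRNA
    ratio = mean_rrna / mean_rdna
    return ratio, "active" if ratio > 1 else "inactive"

# Hypothetical OTUs with frequencies from two samples each (not thesis data).
otus = {
    "otu_a": {"rdna": [0.020, 0.015], "rrna": [0.010, 0.012]},
    "otu_b": {"rdna": [0.0004, 0.0006], "rrna": [0.004, 0.009]},
    "otu_c": {"rdna": [0.003, 0.0008], "rrna": [0.0, 0.0]},
}

summary = {}
for name, freqs in otus.items():
    ratio, label = activity_status(freqs["rrna"], freqs["rdna"])
    summary[name] = (abundance_class(freqs["rdna"]), label)
    print(name, summary[name], ratio)
```

The second hypothetical OTU shows the pattern emphasized in this chapter: an OTU can be rare by rDNA frequency and still carry an rRNA:rDNA ratio well above 1.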
[Figure 2.2 graphic: six rarefaction panels (rDNA and rRNA for the anaerobic, anoxic, and aerobic zones), one curve per sampling event; x-axis: sequencing depth; y-axis: number of OTUs.]

The majority of OTUs in the EBPR bioreactor were rare; only 403 OTUs had maximal rDNA abundance greater than 0.1% at any time point sampled (Figure 2.2; Figure 2.3). This is consistent with previous reports on microbial communities from natural (Sogin et al., 2006; Gibbons et al., 2013) and engineered ecosystems (Kim et al., 2013c), where rare taxa are thought to act as a seed bank of dormant microbes capable of resuscitating in response to environmental change (Lennon and Jones, 2011). We observed that on average 29% of all OTUs present in the EBPR bioreactor were dormant (n = 8995), based on OTUs that recovered rDNA sequences but not rRNA sequences. This approach has previously been used to estimate the frequency of dormant cells in natural ecosystems (Lennon and Jones, 2010). Indeed, the majority of these OTUs (>99%) belonged to the rare EBPR biosphere, representing 8% of total community composition based on rDNA. However, many rare EBPR taxa were not dormant and maintained a consistently high level of activity in the bioreactor (see below). Interestingly, 48% of total rRNA abundance was recovered from the rare EBPR biosphere, suggesting that rare taxa contributed almost equally to the community's protein synthesis potential (Figure 2.3).
On average, approximately 90% of eukaryotic OTUs, 65% of archaeal OTUs, and 35% of bacterial OTUs from the rare biosphere were active, based on average SSU rRNA:rDNA ratios. In comparison, only 11% of OTUs with rDNA abundance >0.1% (n = 403) were active. Overall, we observed that rDNA and rRNA for individual OTUs did not correlate well within the EBPR ecosystem (Figure 2.3). This suggests that microbial abundance does not necessarily scale with potential activity in the EBPR milieu. Similar results were found for bacterial communities along an estuarine salinity gradient that experiences wide variation in environmental conditions over intermittent temporal and spatial scales (Sharp et al., 2009; Campbell and Kirchman, 2013).

Figure 2.3 Relationship between SSU rDNA and rRNA frequencies for abundant, intermediate, and rare OTUs in the pyrotag dataset, separated by bioreactor redox zone (anaerobic, anoxic, aerobic). Each point reflects paired SSU rRNA and rDNA coordinates for each OTU and time point.

2.4.2 Overview of microbial community structure and activity

The EBPR pilot plant maintained a relatively diverse microbial community structure during the study period, with rDNA datasets manifesting greater diversity overall than rRNA datasets (Table S2). Occasionally, rRNA diversity was greater than rDNA diversity, a difference attributed to increased representation of active eukaryotic taxa. Figure 2.4 summarizes abundance and activity of major taxa identified. When core taxa were defined as OTUs detected across all biomass samples (Shade and Handelsman, 2011), a core microbial community was consistently recovered throughout the 120-day study period.

Figure 2.4 Microbial community structure and activity. Relative abundance of SSU rDNA and rRNA pyrotags for microbial taxa recovered from the UBC EBPR bioreactor (indicated by circles). Circles on right panel indicate corresponding SSU rRNA:rDNA ratio.
Differences in abundance and activity between zones for all OTUs were assessed by one-way analysis of variance (ANOVA) to test for significance (p-value <0.05).

[Figure 2.4 graphic: bubble plot of major taxa, grouped as Archaea, Actinobacteria, Bacteroidetes, Firmicutes, Proteobacteria, and Eukaryota, across the aerobic, anoxic, and anaerobic redox zones. Panels: SSU rRNA and SSU rDNA relative abundance (circle scale 1%, 4%, 9%, 16%) and SSU rRNA:rDNA ratio (circle scale 1, 4, >10).]

On average, the most abundant bacterial phyla (rDNA >1%) were Bacteroidetes (29.6%), Proteobacteria (28.7%), Actinobacteria (24.6%), Planctomycetes (3.8%), Firmicutes (2.8%), Chloroflexi (1.8%), Verrucomicrobia (1.4%), and Nitrospirae (1.0%) (Figure 2.4). Thirty-nine other bacterial phyla had rDNA abundances <1%, including Gemmatimonadetes (0.79%), Acidobacteria (0.77%), Armatimonadetes (0.20%), and Candidate division TM7 (0.19%). With the exception of Bacteroidetes, these results were consistent with other high-throughput sequencing studies of activated sludge (Zhang et al., 2012; Hu et al., 2012).
Within bacteria, OTUs affiliated with Candidate phylum TM6, Armatimonadetes, Firmicutes, and Proteobacteria appeared to be highly active, whereas OTUs affiliated with Candidate division TM7, Planctomycetes, Verrucomicrobia, Bacteroidetes, Actinobacteria, and Chloroflexi appeared less active, based on observed SSU rRNA:rDNA ratios (Figure 2.4). The most abundant archaeal phyla included Euryarchaeota (0.20%) and Thaumarchaeota (0.01%), where the majority of OTUs were affiliated with the methanogenic classes Methanobacteria and Methanomicrobia (Figure 2.4). Methanogenic archaea appeared to be particularly active in the EBPR ecosystem, averaging SSU rRNA:rDNA ratios between 2 and 10. While methane production was not measured in this study, active populations of methanogens have previously been observed in activated sludge, although their role in carbon turnover was considered minor (Gray et al., 2002).

On average, the most abundant eukaryotic clades included Opisthokonta (1.8%), Stramenopiles/Alveolates/Rhizaria (SAR) (0.54%), Amoebozoa (0.04%), and Excavata (0.01%). The most abundant eukaryote phyla included LKM11 group (Fungi-related) (1.24%), Ciliophora (0.43%), Ichthyosporea (0.32%), and Rotifera (0.09%) (Figure 2.4). Recently, Evans and Seviour (2012) identified Ascomycota, Basidiomycota, and Cryptomycota as dominant fungi in an EBPR ecosystem using fungal 18S rDNA clone libraries. However, OTUs affiliated with these fungal groups were rare (<0.02%) throughout the 120-day study period. While eukaryotic OTUs represented only 2.4% of rDNA, they represented a significant proportion of the community rRNA, encompassing >35% on average. We observed that Ciliophora, Amoebozoa, Excavata (mainly Euglenida), Rotifera, and Basidiomycota were consistently active in the EBPR ecosystem, with Ciliophora and Amoebozoa averaging SSU rRNA:rDNA ratios >100 (Figure 2.4).
Ciliophora and Amoebozoa have recently been identified as active protozoa that are abundant in activated sludge and play a major role in bacterial grazing and nutrient cycling (Moreno et al., 2010).

2.4.3 Relative rRNA abundance across EBPR redox zones

To test whether potential activity shifted across EBPR redox zones, SSU rRNA:rDNA ratios were compared from the anaerobic, anoxic, and aerobic zones for each OTU at a given time interval. Anaerobic, anoxic, and aerobic zone hydraulic retention times were approximately 1, 3, and 6 hours, respectively, while the biomass solids retention time was 15 days. We observed that SSU rRNA:rDNA ratios for the majority of OTUs were not statistically different across the redox zones (Figure 2.4). These results were surprising, as we expected SSU rRNA:rDNA ratios to modulate based on changing redox conditions. Additionally, no changes in the relative abundance of SSU rRNA or rDNA were observed, suggesting that OTU abundance and ribosome levels were constant across redox zones. This contrasts with previous studies of EBPR communities enriched in Accumulibacter, which used reverse transcription quantitative PCR assays to show that rRNA abundances fluctuated with changing redox conditions in a sequencing batch reactor (He and McMahon, 2011a).

2.4.4 Abundance and activity of core EBPR taxa

The recently proposed core EBPR microbiome includes microbes mediating key bioreactor processes, including hydrolysis, fermentation, nitrification, denitrification, and biological phosphorus removal (Nielsen et al., 2010; Nielsen et al., 2012). Pyrosequencing allowed us to monitor potential activity of core EBPR taxa (genus-level) affiliated with these processes, in addition to poorly characterized taxa (Table A3). Eukaryote populations commonly implicated in bacterial grazing were also monitored.

Hydrolyzing bacteria were the most abundant taxa in the EBPR ecosystem. However, their potential activity appeared to be the lowest.
Abundant hydrolyzers were affiliated with the filamentous bacteria Microthrix and the Mycolata (mycolic-acid containing filaments). Microthrix is a genus specialized in lipid degradation (McIlroy et al., 2013), whereas the Mycolata (mainly Gordonia and Mycobacterium) are implicated in the breakdown of lipids and polysaccharides (Kragelund et al., 2007). Both groups are also commonly associated with foaming episodes and sludge separation problems (i.e. bulking) in activated sludge (Seviour and Nielsen, 2010). Despite being abundant, their average SSU rRNA:rDNA ratios were typically low (Table A3). This is consistent with previous reports that indicate Gordonia and Microthrix are either slow growing or carry low cellular levels of rRNA (Blackall et al., 1996; de los Reyes and Raskin, 2002; Rossetti et al., 2005). Other known hydrolyzers abundant in the EBPR ecosystem included filamentous bacteria affiliated with the orders Cytophagales, Flavobacteriales, and Sphingobacteriales of the Bacteroidetes and the classes Caldilineae, Thermomicrobia, and Anaerolineae of the Chloroflexi. These groups are known to encompass many of the polysaccharide and protein-hydrolyzing bacteria detected in activated sludge (Miura et al., 2007; Xia et al., 2007; Kragelund et al., 2009; Yoon et al., 2010). SSU rRNA:rDNA ratios for these groups were also low, suggesting they had reduced metabolic activities in the bioreactor or are slow growing (Table A3).

The most abundant known fermenting microorganisms detected in the EBPR ecosystem belonged to the phylum Firmicutes and were affiliated with Streptococcus and Lactococcus. Despite having average rDNA abundances <0.5%, Streptococcus and Lactococcus consistently had the highest potential activity of all bacteria present in the EBPR ecosystem, averaging SSU rRNA:rDNA ratios of >20 (Table A3).
These groups are frequently identified in EBPR ecosystems as active fermenters, producing a diverse range of fermentation products, such as short-chain fatty acids (Kong et al., 2008; Nielsen et al., 2012). Other fermenters detected in the EBPR ecosystem were affiliated with the orders Clostridiales and Bacteroidales. SSU rRNA:rDNA ratios indicated that these taxa had moderate activity levels (Table A3).

Polyphosphate accumulating organisms (PAO) detected in the EBPR ecosystem were affiliated with the genera Propionivibrio and Tetrasphaera. Species of the Propionivibrio were assigned to either uncultured Candidatus 'Accumulibacter sp.' or uncultured bacterium and maintained very high SSU rRNA:rDNA ratios (Table A3). Tetrasphaera-related PAO were observed to have very low abundance and potential activity in the EBPR ecosystem (Table A3), in contrast to other studies that have reported high levels of the Actinobacterial-PAO in full-scale systems (Kong et al., 2005; Nguyen et al., 2011; Mielczarek et al., 2013). This suggests that Tetrasphaera was likely not a significant contributor to biological phosphorus removal in our system. Competitors to PAO, known collectively as glycogen accumulating organisms (GAO), were also detected in the bioreactor, albeit at very low rDNA and rRNA abundances (Table A3). Known GAO observed were affiliated with the genus Defluviicoccus; no Competibacter-related GAO were detected. Defluviicoccus appeared to be relatively active (average SSU rRNA:rDNA ratio ~1). However, their relative rRNA abundance (proxy for protein synthesis potential) was still low (<0.1%), indicating they were likely not major competitors to PAO (Table A3).

Known nitrifying community members identified in the EBPR ecosystem included the ammonia-oxidizing bacteria (AOB) Nitrosomonas and Nitrosospira and the nitrite-oxidizing bacteria (NOB) Nitrospira and Candidatus 'Nitrotoga'.
Nitrosomonas and Nitrosospira, which are commonly cited as the main AOB in activated sludge (Zhang et al., 2011), were potentially active in the EBPR ecosystem (rRNA:rDNA ratio >2) despite belonging to the rare biosphere (Table A3). Nitrosococcus-related AOB, which have previously been observed in activated sludge (Juretschko et al., 1998), were also detected at low rDNA abundances and moderate SSU rRNA:rDNA ratios (Table A3). Similarly, ammonia-oxidizing archaea (AOA) affiliated with Thaumarchaeota and Marine Group 1 were identified in the rare EBPR biosphere, albeit at low rRNA and rDNA abundances (<0.01%). These groups have been detected in a wide variety of natural and engineered ecosystems, including activated sludge (Park et al., 2006; Stahl and Torre, 2012).

Members of the Nitrospira were abundant NOB in the bioreactor (average rDNA ~1%), consistent with previous reports (Luker et al., 2010). However, their SSU rRNA:rDNA ratio was ~0.74, on average. Indeed, this agrees with the presumed "K-strategist" lifestyle of Nitrospira, which suggests that these NOB possess a reduced maximum specific growth rate, but are well adapted to low nitrite concentrations, allowing for their proliferation in EBPR ecosystems (Nogueira and Melo, 2006). In addition to Nitrospira, Candidatus 'Nitrotoga'-related NOB were detected. Interestingly, these bacteria had the highest SSU rRNA:rDNA ratios of all nitrifying bacteria, which may reflect their high adaptability to cold-weather climates, such as those experienced in Canada (Alawi et al., 2007; Alawi et al., 2009) (Table A3).

Known denitrifying bacteria typically observed in EBPR ecosystems are affiliated with the families Rhodocyclaceae, Comamonadaceae, and Hyphomicrobiaceae (Kong et al., 2004; Osaka et al., 2006; Hesselsoe et al., 2009). Abundant denitrifiers identified in our system belonged to the Rhodocyclaceae family, including Dechloromonas, Thauera and, to a lesser extent, Zoogloea (Table A3).
Dechloromonas consistently had the highest potential activity of all denitrifiers present in the EBPR ecosystem, averaging SSU rRNA:rDNA ratios >8 (Table A3), and has also been implicated in polyphosphate accumulation (Goel et al., 2005; Kong et al., 2007). Accumulibacter has also been shown to denitrify (Flowers et al., 2009) and was also potentially active as described above.

Eukaryotes affiliated with Peritrichia, Suctoria, and Aspidisca within the Ciliophora and Vannella within the Amoebozoa were observed to be the dominant protozoan OTUs in the EBPR ecosystem (Table A3). These groups had the highest SSU rRNA:rDNA ratios of all taxa in the bioreactor and are implicated in bacterial grazing (Moreno et al., 2010; Nielsen and Seviour, 2010). Other potentially active grazing populations included Euglenida and members of the phylum Cercozoa (both protozoa), as well as Rotifera-related taxa (Rotifers) (Table A3). As the number of rRNA operons and ribosomes per cell drastically differs between eukaryotes and prokaryotes, it is difficult to compare these results to prokaryotic rRNA levels directly (Gong et al., 2013). Nonetheless, the abundance of protozoan rRNA (>35%) in the EBPR ecosystem still suggests that bacterial grazing was likely significant and plays a considerable role in controlling population dynamics and nutrient cycling. In addition to well-characterized taxa, OTUs affiliated with Nannocystis within the Myxococcales (Myxobacteria) were also abundant and potentially active in the EBPR ecosystem. Myxobacteria are known for their ability to aggregate into fruiting bodies, and secrete a number of extracellular compounds including exoenzymes, antibiotics, and bioflocculants (Zhang et al., 2002; Velicer and Vos, 2009). While this may suggest that Nannocystis-related taxa may actively contribute to floc formation in the bioreactor, further work is needed to characterize their phenotype in EBPR ecosystems.
It is important to note that activity differed among individual OTUs within the same genus for numerous core EBPR taxa (Figure 2.5). For example, OTU 54595 and 37371 were both highly abundant taxa affiliated with Microthrix that exhibited markedly different activity profiles; OTU 54595 had high SSU rRNA:rDNA ratios on days 48 and 65, whereas OTU 37371 SSU rRNA:rDNA ratios were consistently low. Similar variability among OTUs of the same genus was found for most genera, including Accumulibacter, Dechloromonas, Streptococcus, Lactococcus, and Gordonia (Figure 2.5). This suggests that a high degree of microdiversity exists within EBPR ecosystems with potential implications for bioreactor performance. Interestingly, many OTUs across different genera appeared to share a sharp increase in activity on days 48 and 65 (Figure 2.5). However, the cause of this increased activity could not be explained by measured process parameters, suggesting that other ecological drivers were at play.

Figure 2.5 Activity profiles for selected genera. Profile of relative abundance of SSU rDNA and rRNA pyrotags for selected OTUs (97% similarity cutoff) affiliated with genera over 120-day study period. Solid line indicates SSU rDNA abundance; dashed line indicates SSU rRNA abundance. Profiles on right panel indicate corresponding SSU rRNA:rDNA ratio. OTUs marked with an asterisk (*) were affiliated with uncultured Propionivibrio bacteria.

2.4.5 Temporal dynamics of community structure and activity

Although the EBPR community maintained a core population over the 120-day period, temporal dynamics in community structure and activity were still observed. This dynamic behavior was explored by ordination analysis of microbial community structure and activity at each time point using NMDS.
As shown in Figure 2.6A, the bioreactor manifested two related community structures during days 1-34 (Period I) and days 79-120 (Period III), separated by a transition period during days 48-65 (Period II). The latter part of this transition period coincided with a process upset in biological P removal performance (days 57-65), where removal efficiencies dropped from 100% to ~70% on average (Figure 2.7). Overall changes in potential activity differed from changes in microbial community structure, where activity appeared to be much more variable, likely reflecting the increased responsiveness of rRNA to EBPR ecosystem dynamics (Figure 2.6B) (Barnard et al., 2013).

Figure 2.6 Bray-Curtis dissimilarities between samples (i.e. bioreactor communities) across all time points determined by non-metric multidimensional scaling. (A) NMDS unconstrained ordination of microbial community structure (SSU rDNA abundance) and (B) activity (SSU rRNA:rDNA ratio) for each time point and redox zone.

[Figure 2.6 graphic: two NMDS ordinations (axes NMDS1 and NMDS2), plot A for community structure and plot B for community activity, with reported stress values of 0.074 and 0.098. Point labels combine redox zone (A, anaerobic; B, anoxic; C, aerobic) and sampling event (1, day 1; 2, day 19; 3, day 34; 4, day 48; 5, day 65; 6, day 79; 7, day 98; 8, day 112). Periods I, II, and III are outlined.]

Figure 2.7 UBC EBPR Pilot Plant phosphate (PO4) removal performance during the 120-day study period. Triangles represent influent PO4 concentrations, circles represent effluent PO4 concentrations, and diamonds represent PO4 percent removal efficiency. Red lines indicate sampling events.
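The percent removal efficiency plotted in Figure 2.7 is assumed here to follow the conventional definition, (influent - effluent) / influent; the formula is not spelled out in the text, so the short Python sketch below is an illustration under that assumption. The function name is hypothetical, and the inputs are the mean PO4-P concentrations reported in Table 2.1.

```python
def removal_efficiency(influent_mg_l, effluent_mg_l):
    """Percent of the influent concentration removed across the plant."""
    if influent_mg_l <= 0:
        raise ValueError("influent concentration must be positive")
    return 100.0 * (influent_mg_l - effluent_mg_l) / influent_mg_l

# Mean PO4-P concentrations from Table 2.1 (mg/L).
mean_influent_po4 = 2.40
mean_effluent_po4 = 0.29

efficiency = removal_efficiency(mean_influent_po4, mean_effluent_po4)
print(f"{efficiency:.1f}% PO4-P removal")
```

With the study-period means, this gives roughly 88% removal. Because it is a ratio of means, it will not necessarily match the average of the per-day efficiencies shown in the figure.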
[Figure 2.7 graphic: UBC EBPR Pilot Plant PO4 concentrations; x-axis: days in operation (Jan 31st to May 30th, 2013); left y-axis: PO4 (mg/L); right y-axis: % removal; series: influent, effluent, and % removal; Periods I, II, and III are marked.]

To interpret changes in microbial structure and activity with measured process parameters, redundancy analysis (RDA) was conducted. Here, an attempt was made to explain the variance in microbial community structure and activity by linear combinations of several environmental variables, including influent SCFA/total phosphorus ratio, temperature, dissolved oxygen, and effluent PO4. However, RDA revealed no canonical axes that were significant at p < 0.05 (Figure 2.8), based on permutation testing done using the anova.cca() function in the R package vegan. This indicates that variations along the canonical axes were not distinguishable from random chance alone (Legendre et al., 2011). It is possible that the lack of significant canonical axes was due to a limited number of time points (n=8) and a lack of seasonal resolution. A recent survey by Flowers et al. (2013) suggests that bacterial community structure in EBPR ecosystems can change with seasonal temperature variation. However, other studies have shown only weak correlations between microbial community structure and process parameters (Mielczarek et al., 2013). Unexpectedly, the influent SCFA/total phosphorus ratio did not correlate well with effluent phosphate concentrations (Figure 2.8). Approximately 7 to 10 mg of acetate are needed to remove 1 mg of phosphorus, based on experimental evidence (Grady et al., 2011).
Given this, SCFA availability was likely not limiting our ability to achieve low effluent phosphorus concentrations, as the loading of SCFAs to the anaerobic zone of our pilot-scale bioreactor was in excess of what is typically necessary for good biological phosphorus removal (ratio >20), due to the addition of external sodium acetate to the process. A further attempt was made to resolve changes in community structure and activity using indicator analysis. OTUs were considered indicators of a particular time period if their relative frequency during that period was >50% compared to other time periods. We observed that abundant indicator OTUs (>1% rDNA) from Periods I and III were affiliated with Dechloromonas, Thauera, and Hydrogenophilaceae (Table A4). Dechloromonas-related OTUs appeared very active, although their potential activity levels were consistently high across all Periods. No indicator OTUs were observed during Period II. Consistent with RDA, indicator analysis revealed no abundant indicator OTUs associated with changes in microbial activity (Table A5).

Figure 2.8 RDA biplots constrained by four environmental parameters: temperature (T), dissolved oxygen (DO), effluent phosphate (eff PO4), and SCFA/total phosphorus ratio (SCFA/TP). Biplot A, no statistically significant axes were found: RDA1 (p=0.217), RDA2 (p=0.274). Biplot B, no statistically significant axes were found: RDA1 (p=0.204), RDA2 (p=0.290).

2.5 Discussion

2.5.1 Rare biosphere is active in EBPR ecosystems

In natural and engineered ecosystems much of the microbial diversity is associated with low abundance taxa (i.e. rare biosphere) (Sogin et al., 2006; Shade et al., 2012; Kim et al., 2013c).
While the degree to which rare taxa contribute to ecosystem function remains poorly understood (Pedrós-Alió, 2012), recent studies suggest that the rare biosphere maintains a persistent microbial seed bank (Gibbons et al., 2013), containing both active and inactive members (Campbell et al., 2011). Our results indicate that even though 71% of the OTUs in the EBPR ecosystem maintained some activity, a large proportion were inactive or dormant (29%), possibly reflecting the large changes in nutrient and electron acceptor availability experienced across bioreactor redox zones. Cellular stress response associated with such changes is a well-documented control on dormancy in bacteria (Braeken et al., 2006; Lennon and Jones, 2011) and it is reasonable to expect that these stressors exist in EBPR ecosystems, given the rapid and constant exposure to alternating redox conditions and substrate availabilities. Other factors contributing to dormancy in natural ecosystems include environmental perturbation and predation (Lennon and Jones, 2011). As EBPR ecosystems are subject to fluctuating operating conditions, such as toxic shock loadings (Henriques and Love, 2007), and strong selective pressure from phages (Kunin et al., 2008; Albertsen et al., 2012), these factors likely contribute to dormant microbial seed bank maintenance in engineered ecosystems as well.

[Figure 2.8 panels. Biplot A: community abundance, scaling 1; RDA1 (17% explained, p=0.217), RDA2 (16% explained, p=0.274). Biplot B: community activity, scaling 1; RDA1 (18% explained, p=0.204), RDA2 (15% explained, p=0.290). Sample points: Days 1, 19, 34, 48, 65, 79, 98, and 112; vectors: DO, T, eff PO4, SCFA/P.]
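Potential activity in this chapter is inferred from the ratio of SSU rRNA to SSU rRNA gene (rDNA) relative abundance. As a minimal sketch of that calculation, the example below computes rRNA:rDNA ratios per OTU. The OTU names and counts are hypothetical, and both thresholds are assumptions made for illustration: an OTU is called rare at or below 1% rDNA relative abundance (mirroring the >1% rDNA cut-off used for abundant OTUs), and potentially active when any rRNA is detected. It is not the exact procedure used in this study.

```python
def relative_abundance(counts):
    """Convert raw read counts per OTU into fractions of the library total."""
    total = sum(counts.values())
    return {otu: n / total for otu, n in counts.items()}


def activity_table(rdna_counts, rrna_counts, rare_cutoff=0.01):
    """Return per-OTU rDNA abundance, rRNA:rDNA ratio, and rare/active flags.

    rare_cutoff and the 'any rRNA detected' activity rule are illustrative
    assumptions, not values taken from the thesis.
    """
    rdna = relative_abundance(rdna_counts)
    rrna = relative_abundance(rrna_counts)
    table = {}
    for otu, dna_frac in rdna.items():
        rna_frac = rrna.get(otu, 0.0)
        # The ratio is undefined when no rDNA reads were recovered.
        ratio = rna_frac / dna_frac if dna_frac > 0 else None
        table[otu] = {
            "rdna": dna_frac,
            "ratio": ratio,
            "rare": dna_frac <= rare_cutoff,
            "active": rna_frac > 0,
        }
    return table


# Hypothetical counts for three OTUs (not data from this study).
rdna_counts = {"OTU_a": 9000, "OTU_b": 950, "OTU_c": 50}
rrna_counts = {"OTU_a": 6000, "OTU_b": 0, "OTU_c": 4000}

for otu, row in activity_table(rdna_counts, rrna_counts).items():
    print(otu, row)
```

In this toy input, OTU_c is rare in the rDNA library yet carries a large share of the rRNA, which is the "rare but active" pattern discussed in this section.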
Active taxa have also been observed within the rare biosphere of natural ecosystems (Campbell et al., 2011; Hugoni et al., 2013; Wilhelm et al., 2014) and some studies show that rare taxa make important contributions to nutrient cycling (Neufeld et al., 2008; Pester et al., 2010). Our results indicated that many rare taxa in the EBPR ecosystem maintained high activity levels over the study period, based on SSU rRNA:rDNA ratios and increases in rRNA abundances over time. Particular examples were OTUs affiliated with fermenting bacteria Streptococcus and Lactococcus and nitrifying bacteria Nitrosomonas and Candidatus ‘Nitrotoga’. It should be noted that the number of rRNA operons and ribosomes per cell varies between taxonomic groups, especially between eukaryotes and prokaryotes, which could bias the correlation between rRNA abundance and microbial activity found here (Crosby and Criddle, 2003; Blazewicz et al., 2013; Gong et al., 2013). With this caveat in mind, our results remained consistent with a recent study that identified low-abundance bacteria affiliated with Streptococcus and Lactococcus as important glucose fermenters in EBPR ecosystems using stable-isotope probing combined with fluorescence in situ hybridization (Nielsen et al., 2012).

The occurrence of consistently rare and active taxa is interesting from a functional perspective because active taxa are expected to eventually increase in abundance over time (Pedrós-Alió, 2012). It is possible that populations within these rare and active taxa did not become abundant due to high maintenance energy demands associated with fluctuating nutrient concentrations and redox conditions across the EBPR ecosystem. Such conditions have the potential to limit growth, particularly for taxa with thermodynamically constrained energy metabolisms, such as fermentation (Russell and Cook, 1995).
Another possibility is that rare and active taxa do indeed grow, but experience prompt decay via viral lysis or protozoan grazing (i.e. “top-down” regulation). Similar controls have been invoked in marine ecosystems, where rare taxa exhibiting high activity levels have also been observed (Hunt et al., 2013). Given these constraints, our results indicate that the rare EBPR biosphere might act as a seed bank, but may also contribute to metabolic transformations within the EBPR ecosystem based on the high potential for protein synthesis observed among rare taxa.

2.5.2 Temporal activity patterns suggest high microdiversity within genera

Although it is commonly assumed in EBPR ecosystems that OTUs affiliated at the genus level share functional similarity (Zhang et al., 2012; Nielsen et al., 2012), we observed that potential activities among such OTUs varied dynamically over time. For example, two abundant OTUs affiliated with Microthrix had considerably different activity levels during the 120-day study period (Figure 2.5). While recent reports indicate that low genomic diversity exists among Microthrix strains (McIlroy et al., 2013), our results suggest that fine-scale differences in genome content or gene expression could select for differential OTU activity or function within the EBPR ecosystem. Indeed, fine-scale population differences have been observed among other phylogenetically cohesive units abundant in EBPR ecosystems, based on taxonomic (He et al., 2007; McMahon et al., 2007) and genomic comparisons (Flowers et al., 2013).

Further investigation into the activity profiles of individual OTUs revealed that some taxa shared a common spike in activity during days 48 and 65. This spike could not be explained by measured process parameters and could reflect increases in grazing activities or viral lysis.
In freshwater ecosystems, increased predation by viruses and protists has been shown to stimulate bacterial activity, likely through nutrient and substrate regeneration (Pradeep Ram and Sime-Ngando, 2008; Berdjeb et al., 2011). Interestingly, we observed a considerable increase in amoeboid protozoan activity (Vannellida-related) during days 48 and 65, consistent with increased predation potential (Figure 2.5). However, as ecological interactions between bacteria, viruses, and grazing populations are poorly understood in EBPR ecosystems, further work is needed to interpret the impact these relationships have on EBPR ecosystem function and performance.

2.5.3 Anticipatory life strategy for EBPR microbes?

Our results indicate that the relative abundance of rRNA for individual OTUs was similar across all bioreactor redox zones (anaerobic/anoxic/aerobic). This was surprising, as previous studies using EBPR communities enriched in Accumulibacter showed rRNA abundance changed with sequencing batch reactor conditions (He and McMahon, 2011a). Sequencing batch reactors provide perfect separation between redox zones, whereas the continuous flow process employed in this study constantly recycles biomass from downstream, such that there is substantial mixing between zones. Indeed, the apparent level of similarity in rRNA abundance observed in our study may be due to ribosome carry-over and/or mixing between redox zones and therefore not reflect true differences in rRNA synthesis. Similar results were found at the protein level, where synthesized proteins could not be assigned to a particular redox zone (Wilmes et al., 2008). In a follow-up study, Wexler et al. (2009) used radioactive protein labeling to confirm that differential protein synthesis occurred at very low levels. Based on this information, it is likely that some differential rRNA synthesis transpired, despite our inability to detect differences based on rRNA:rDNA ratios.
However, we hypothesize that the majority of ribosome activity is controlled at the translational level, where ribosomes become active or inactive based on a given taxon’s metabolic status. Indeed, minimizing rRNA (and therefore ribosome) synthesis across the rapid and cyclical redox changes encountered in the EBPR milieu would have minimal energy costs compared to constant production of new ribosomes during each bioreactor pass. Nonetheless, future studies are needed to elucidate rRNA regulation processes in continuous flow EBPR ecosystems; such process changes should be more observable under perturbed conditions (He and McMahon, 2011b).

2.6 Concluding remarks

In summary, using a combination of SSU rDNA and rRNA sequencing we show that both rare and abundant taxa are potentially active in the EBPR ecosystem, and posit that rare taxa play important functional roles. Furthermore, we reveal that rRNA:rDNA ratios can vary dramatically among phylogenetically cohesive OTUs. This suggests that fine-scale population differences exist in EBPR ecosystems with the potential to impact process performance. Few changes in OTU rRNA abundance were detected across the EBPR redox zones, consistent with “anticipatory” life strategies among EBPR microbiota, and observed temporal activity patterns could not be explained by measured process parameters. Taken together, these results point towards complex regulatory controls within the EBPR ecosystem that may be influenced by microniches (e.g. floc gradients) and localized spatial interactions (Shapiro and Polz, 2014). Given the abundance of potentially active eukaryotic OTUs identified in this study, these regulatory controls may include predation (e.g. grazers), in addition to other ecological drivers. It is important to note that rRNA abundances measured here reflect the potential for protein synthesis, including past, present, and future activities (Blazewicz et al., 2013).
As such, additional studies combining labeling and incubation experiments with amplicon and plurality or single cell sequencing are needed to confirm whether rare taxa do indeed make meaningful contributions to nutrient cycling in EBPR ecosystems. When conducted with temporal resolution under different perturbation scenarios, such studies have potential to define ecological design principles needed to engineer improved nutrient cycling and population stability at EBPR wastewater treatment plants.

Chapter 3: Metagenomic analysis of a pilot-scale microbial community performing enhanced biological phosphorus removal

3.1 Synopsis

In this study, a metagenome was generated from biomass samples collected from a pilot-scale EBPR treatment plant using 454 pyrosequencing. Resulting DNA sequences were used to compare the diversity of microorganisms and metabolic processes present in the EBPR ecosystem to multiple metagenomes from different environments. These comparisons revealed that EBPR community function was enriched in biofilm formation, phosphorus metabolism, and aromatic compound degradation, reflective of local bioreactor conditions. Consistent with the taxonomic composition of the pilot-scale EBPR treatment plant obtained by SSU rRNA pyrotag sequencing (Chapter 2), population genomes extracted from assembled metagenomes were affiliated with Candidatus ‘Microthrix parvicella’ and Gordonia spp. These population genomes were subsequently compared with existing reference genomes to identify conserved and variable genomic regions. Although the M. parvicella population genome displayed remarkable similarity to other M. parvicella strains, functional differences in biofilm formation and antibiotic resistance often associated with mobile genetic elements reflected adaptation to habitat-specific selection pressures.
This was further supported by the presence of prophage and phage defense mechanisms (EPS, restriction-modification systems, CRISPR) recovered from the metagenome. In the case of Gordonia spp., a potentially novel role in polyP and TAG cycling was identified. Taken together, our findings provide deeper insight into EBPR community function and enable future efforts aimed at monitoring spatiotemporal patterns in gene expression.

Sections of this Chapter will be submitted for publication and were presented at: Lawson, C.E., Hall, E.R., Mavinic, D.S., Ramey, W.D., Hallam, S.J. (2014). Metagenomic analysis of a pilot-scale microbial community performing enhanced biological phosphorus removal. Proceedings of the 15th International Symposium on Microbial Ecology, Seoul, Korea, August 24-29, 2014.

3.2 Background

Elucidating the identity and function of microorganisms driving enhanced biological phosphorus removal (EBPR) has been a long-standing research objective since the inception of the technology over four decades ago (Mino et al., 1998; Seviour et al., 2003). It is now recognized that EBPR is carried out by a diverse and stable core microbial community that mediates distinct steps in bioreactor nutrient cycling (Nielsen et al., 2012; Section 1.5). While previous investigations have probed the ecophysiology of some core microorganisms, including Accumulibacter, Tetrasphaera, and Microthrix (Nielsen et al., 2002; Zilles et al., 2002; Kong et al., 2005, 2007), limited information exists on the metabolic pathways catalyzing key biochemical transformations within the EBPR milieu. Moreover, ecological considerations that influence the assembly, partitioning, and ultimate control of microorganisms in EBPR ecosystems, such as local selection pressures, remain unclear (McMahon and Read, 2013). Recent advances in sequencing technology permit the study of microbial communities in unprecedented detail (Riesenfeld et al., 2004; Tringe et al., 2005; Garcia Martin et al., 2006).
Indeed, this enables reconstruction of the metabolic blueprints underlying microbial controls on bioreactor performance (Tyson et al., 2004; Garcia Martin et al., 2006). To date, only a few studies exploring the metabolic potential of full-scale EBPR communities using metagenomic approaches have been published (Albertsen et al., 2012; Albertsen et al., 2013). Resulting datasets indicate that EBPR ecosystems manifest a high degree of microdiversity and significant selection pressure from phage (Kunin et al., 2008). While these studies have started to generate reference genomes for core EBPR microbes, including Microthrix (McIlroy et al., 2013) and Tetrasphaera (Kristiansen et al., 2012), further efforts are needed to accurately reconstruct metabolic pathways and assess ecological and evolutionary dynamics of core microbial players in the EBPR milieu (Albertsen et al., 2012).

In this study, we examined the structure and functional potential of a pilot-scale microbial community performing EBPR using metagenomic sequencing. Resulting environmental sequence information was used to compare metabolic potential between the EBPR pilot plant and other ecosystems and to investigate genomic variation between available reference genomes and binned population genomes. Our results indicate that the pilot-scale microbial community was enriched with genes for biofilm formation, fatty acid metabolism, and aromatic compound degradation, consistent with reports from other activated sludge ecosystems (Sanapareddy et al., 2009; Albertsen et al., 2012), and highlight that local selection pressures are likely responsible for genomic differentiation within microbial populations from disparate treatment plant locations.

3.3 Experimental procedures

3.3.1 Sampling

A total of 9 biomass samples were collected from the anaerobic, anoxic, and aerobic zones (1 per zone in triplicate) of the UBC EBPR Pilot Plant (lat. 49.245378, long. -123.22940).
The plant uses a University of Cape Town (UCT) activated sludge configuration with a Zee-Weed® membrane system installed in the aerobic zone for solids-liquid separation and is designed for carbon oxidation, nitrification-denitrification, and EBPR (see Section 2.2.1). Samples were collected on March 5, 2013 from each bioreactor zone, immediately flash frozen in liquid nitrogen, and stored at -80 °C until further processing.

3.3.2 DNA extraction and sequencing

Total genomic DNA was extracted from biomass samples using the FastDNA® Spin Kit for Soil (MP Biomedicals, Solon, OH, USA). A 60 second bead-beating step with a spherical ceramic bead (lysing matrix E) at a speed setting of 4.0 on a FastPrep (MP Biomedicals) was used for sample lysis. Genomic DNA was quality checked using agarose gel electrophoresis. Library preparation and sequencing were performed at Genome Quebec (Montreal, Canada) on the Roche 454 GS FLX Titanium platform (454 Life Sciences, Branford, CT, USA) according to the manufacturer’s instructions. Resulting reads were de-replicated and filtered using a minimum length of 100 bp, and allowing for no ambiguous bases. Reads that did not meet minimum requirements were not used in downstream analysis.

3.3.3 Metagenomic assembly and binning

Filtered reads were subsequently assembled into contigs using the software package Newbler with overlap parameters of 95% minimum identity and a minimum length of 40 bp (Margulies et al., 2005). Resulting contigs were binned into population genomes using the software MaxBin 1.3, developed at the Joint BioEnergy Institute, with default parameters (Wu et al., 2014). MaxBin is based on an expectation-maximization algorithm and provides genome-related statistics, including estimated completeness, GC content, and genome size (Wu et al., 2014). The taxonomy of resulting population genome bins was assigned using the Metagenome Analyzer (MEGAN) (Huson et al., 2011).
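The read filtering described in Section 3.3.2 (de-replication, a 100 bp minimum length, and no ambiguous bases) can be sketched as follows. This is an illustrative stand-in rather than the tool used in this study: the function name `filter_reads` is hypothetical, de-replication is assumed to mean removal of exact duplicate sequences, and any character other than A, C, G, or T is treated as ambiguous.

```python
def filter_reads(reads, min_length=100):
    """Keep unique reads that are long enough and contain only A/C/G/T.

    reads: iterable of (read_id, sequence) pairs.
    Exact-duplicate removal is an assumed interpretation of de-replication.
    """
    seen = set()
    kept = []
    for read_id, seq in reads:
        seq = seq.upper()
        if len(seq) < min_length:
            continue  # shorter than the minimum length
        if set(seq) - set("ACGT"):
            continue  # contains an ambiguous base such as N
        if seq in seen:
            continue  # exact duplicate of a read already kept
        seen.add(seq)
        kept.append((read_id, seq))
    return kept


# Toy reads (hypothetical): one good, one duplicate, one short, one with N.
reads = [
    ("r1", "ACGT" * 30),
    ("r2", "ACGT" * 30),
    ("r3", "ACGT" * 10),
    ("r4", "ACGT" * 29 + "ACGN"),
]
print([read_id for read_id, _ in filter_reads(reads)])
```

Only reads that pass all three checks would be handed to the assembler; the order of the checks does not change which reads are kept.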
3.3.4 Gene annotation and pathway analysis

Gene annotation and metabolic pathway prediction were accomplished using the in-house MetaPathways 2.0 pipeline (Konwar et al., 2013; Hanson et al., 2014). Briefly, open reading frames (ORFs) were predicted using the Prokaryotic Dynamic Programming Genefinding Algorithm (Prodigal) and queried against the Kyoto Encyclopedia of Genes and Genomes (KEGG), SEED subsystems, Clusters of Orthologous Groups of proteins (COG), RefSeq, and MetaCyc protein databases using the optimized LAST algorithm for functional annotation. Taxonomic annotation of predicted ORFs was accomplished using MEGAN; nucleotide sequences were also queried against the SILVA database to identify SSU rRNA genes. Environmental Pathway/Genome Databases (ePGDB) were subsequently reconstructed from annotated ORFs using Pathway Tools (Karp et al., 2010), which predicts metabolic pathways from MetaCyc, a highly curated database of 2,151 pathways and 14,084 reactions representing all domains of life. Pathway inference was based on reaction coverage of at least 50 percent in a particular pathway and/or the presence of all “key reactions” (Karp et al., 2011).

3.3.5 Genome comparisons

Comparison of binned population genomes with isolate genomes was accomplished using the protein Basic Local Alignment Search Tool (BLASTP). Sequences for complete isolate genomes were downloaded from the National Center for Biotechnology Information (NCBI) website ( and ORFs for both the binned population and isolate genomes were predicted using Prodigal (Hyatt et al., 2010). Predicted ORFs from population and isolate genomes were compared using BLASTP and the percent amino acid similarity between best reciprocal BLASTP hits was plotted to identify regions of low similarity. Low amino acid similarity regions that were flanked by mobile genetic element (MGE) signatures (e.g.
transposases or integrases) and had variable guanine-cytosine (GC) content were considered putative genomic islands. ORFs were queried against the NCBI RefSeq or non-redundant (nr) database using BLASTP for annotation.

3.3.6 Prophage and CRISPR reconstruction

The Phage Search Tool (PHAST) was used to identify, annotate, and graphically display prophage sequences recovered from the EBPR metagenome based on data clustering algorithms (Zhou et al., 2011). PHAST was also used to evaluate the completeness of putative phage. Clustered regularly interspaced short palindromic repeats (CRISPR) were identified and reconstructed from raw metagenomic reads using Crass: the CRISPR assembly tool (Skennerton et al., 2013). Crass uses short- and long-read algorithms to scan unassembled metagenomic reads to identify and cluster CRISPR loci. Subsequently, graphical methods were used to reconstruct variable spacer arrangements that catalogue the historical interactions between host (i.e. microbe) and virus (Skennerton et al., 2013).

3.4 Results and discussion

3.4.1 Sequencing statistics

Pyrosequencing of 9 activated sludge samples collected from the anaerobic, anoxic, and aerobic zones of the UBC EBPR Pilot Plant resulted in 1,208,421 reads after quality filtering with an average read length of 712 bp (Table 3.1). Filtered reads were subsequently assembled into contigs using Newbler. Approximately 37% of reads assembled into contigs over 300 bp with an N50 contig size of 4,990 bp and a maximum contig length of 205,236 bp (Table 3.1). This represents 318 Mb of non-redundant nucleotides, encompassing the equivalent of approximately 70-80 full bacterial genomes across the three redox zones.

Table 3.1 Metagenome assembly and sequencing statistics

Statistic                  Value
Number of samples          9
Sequenced reads            1,208,421
Sequenced bases            860,108,034
Avg. read quality          33
Avg. read length (bp)      712 ± 26 (s.d.)
Contigs                    15,137
Reads assembled            446,957
N50 (bp)                   4,990
Max. contig length (bp)    205,236

ORF: open reading frame. N50: length of the shortest contig in the smallest set of largest contigs whose combined length represents ≥ 50% of the assembly.

3.4.2 Community structure: comparison of pyrotag and metagenomic results

Overall, the taxonomic composition of the EBPR microbial community based on metagenomic methods was comparable to results obtained from pyrotag sequencing (Chapter 2), although quantitative differences were observed among some phyla (Table 3.2). In particular, Bacteroidetes displayed considerable quantitative differences between methods; SSU rDNA pyrotag abundances affiliated with Bacteroidetes accounted for 25% of the community, whereas ORFs assigned to Bacteroidetes accounted for 6.5% of the community. Further examination revealed that these differences were largely within the orders Flavobacteriales and Sphingobacteriales (especially the family Saprospiraceae) (Figure 3.1). It is possible that the lower Bacteroidetes abundances based on ORF counts reflect the limited number of sequenced representatives present in available databases. Indeed, a lack of sequenced reference genomes strongly biases microbial community structure when employing metagenomic methods due to incorrect annotation of metagenomic reads (Albertsen et al., 2013). Such biases likely occurred in our dataset, as many SSU rDNA sequences were affiliated with uncultured bacteria. Additional caveats associated with pyrotag community analysis may have influenced our results. These include differences in rrn operon copy number between phylogenetic groups and PCR primer biases (Crosby and Criddle, 2003; Pinto and Raskin, 2012). PCR primer biases arise because of poor complexing between primer and template and/or poor primer extension, resulting in non-uniform amplification of SSU rRNA amplicons across all taxa (Pinto and Raskin, 2012).
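The metagenome percentages compared here are ORF counts per taxon expressed as a share of all taxonomically assigned ORFs. A short sketch of that arithmetic follows, using three ORF counts, the ORF total, and the matching pyrotag percentages reported in Table 3.2. The function name `percent_of_total` is hypothetical and the example only illustrates the calculation.

```python
# ORF counts and pyrotag percentages taken from Table 3.2.
TOTAL_ORFS = 830_533
orf_counts = {
    "Actinobacteria": 397_139,
    "Bacteroidetes": 53_895,
    "Betaproteobacteria": 131_894,
}
pyrotag_percent = {
    "Actinobacteria": 32.0,
    "Bacteroidetes": 25.0,
    "Betaproteobacteria": 14.8,
}


def percent_of_total(count, total=TOTAL_ORFS):
    """Express an ORF count as a percentage of all assigned ORFs."""
    return 100.0 * count / total


for taxon, count in orf_counts.items():
    metagenome = percent_of_total(count)
    difference = metagenome - pyrotag_percent[taxon]
    print(f"{taxon}: metagenome {metagenome:.1f}%, "
          f"pyrotags {pyrotag_percent[taxon]:.1f}%, "
          f"difference {difference:+.1f} points")
```

Expressing the gap in percentage points makes the Bacteroidetes discrepancy easy to compare with the much smaller differences seen for groups such as Betaproteobacteria.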
Table 3.2 Comparison of community structure based on pyrotag and metagenomic methods

Phyla/Class            ORF count   Metagenome   Pyrotags
Actinobacteria           397,139        47.8%      32.0%
Bacteroidetes             53,895         6.5%      25.0%
Betaproteobacteria       131,894        15.9%      14.8%
Deltaproteobacteria       39,232         4.7%       3.8%
Gammaproteobacteria       40,651         4.9%       2.7%
Alphaproteobacteria       79,151         9.5%       5.1%
Firmicutes                18,020         2.2%       2.8%
Cyanobacteria              9,639         1.2%       0.2%
Verrucomicrobia           15,329         1.8%       2.6%
Planctomycetes            17,392         2.1%       3.5%
Acidobacteria              3,437         0.4%       0.7%
Chloroflexi                3,322         0.4%       1.9%
Euryarchaeota              1,932         0.2%       0.4%
Nitrospirae                7,875         0.9%       0.8%
Other                     11,625         1.4%       3.7%
Total                    830,533       100.0%     100.0%

Figure 3.1 Comparison of taxonomic composition using pyrotag and metagenomic methods. Pyrotag abundance based on SSU rDNA abundance; metagenome abundance based on ORF counts. [Bar chart: % relative abundance of taxa grouped by phylum or class, shown for each method (metagenome, pyrotags).]

3.4.3 Microbial community metabolism

The overall functional potential of the UBC EBPR metagenome was compared to 45 microbial metagenomes collected from nine distinct biomes (Dinsdale et al., 2008), as well as two previously published wastewater treatment plant metagenomes: the Aalborg East (AAE) EBPR metagenome (Albertsen et al., 2012) and the Mallard Creek activated sludge (Non-EBPR) metagenome. Comparisons were made based on the percentage of ORFs assigned to the SEED subsystems (Figure 3.2).
Average values for the 45 microbial metagenomes were adopted from Sanapareddy et al. (2009). Photosynthesis was not predicted in any of the WWTP metagenomes, consistent with previous observations (Sanapareddy et al., 2009; Albertsen et al., 2012). ORFs involved in capsular and exopolysaccharide biosynthesis were also more enriched in the WWTP metagenomes, possibly due to the high selection pressure favoring floc formation in activated sludge systems (Figure 3.2; Table 3.3). As expected, the EBPR metagenomes had a larger fraction of phosphorus metabolism compared with the Non-EBPR metagenome. Additionally, the fraction of ORFs assigned to aromatic compound degradation was greater in the UBC EBPR and Non-EBPR metagenomes. This may be attributed to these facilities receiving higher industrial wastewater loadings (Sanapareddy et al., 2009), although aromatic compounds were not directly measured in this study.

To further examine microbial community metabolism in the UBC EBPR Pilot Plant, population genome bins were extracted from metagenomic contigs using MaxBin (Wu et al., 2014), producing four partial genomes (Table 3.4). Here, partial genomes were considered to be representative of a population, as metagenomic assembly algorithms cannot discriminate single-nucleotide polymorphisms (Sharon and Banfield, 2013). While metagenomic assembly algorithms confound analysis of fine-scale heterogeneity, generation of genomes from metagenomes does provide sufficient resolution to study population metabolic potential, diversity, and evolutionary dynamics (Sharon and Banfield, 2013). The two most complete population genomes were taxonomically affiliated with Candidatus ‘Microthrix parvicella’ (Bin 1) and Gordonia spp. (Bin 2). Genome completeness was estimated at 92.5% and 74.8%, respectively, based on the presence of 107 essential single-copy marker genes (Wu et al., 2014) (Table 3.4; Table B1). The M. parvicella genome contained 99 unique marker genes, indicating it was nearly complete (Table B1).
A total of 4 out of 99 unique markers were found in duplicate, which may represent some contamination in the population bin or indicate that M. parvicella carries multiple copies of these genes. Other population bins had either low genome completeness or were taxonomically ambiguous and were not included in downstream analysis.

Figure 3.2 SEED subsystem comparison of microbial metagenomes with the UBC EBPR Pilot Plant metagenome. SEED subsystems contain 23 functional categories of microbial metabolism. Each WWTP metagenome was compared to average values for 45 microbial metagenomes adapted from Sanapareddy et al. (2009). [Bar chart: % difference compared to the 45 microbial metagenomes for each SEED category, shown for the UBC EBPR, Non-EBPR, and AAE EBPR metagenomes.]
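The marker-gene completeness estimates above follow from simple counting. The sketch below reproduces the 92.5% figure for the M. parvicella bin from the 99 of 107 markers reported in the text, and expresses the 4 duplicated markers as a share of the markers found. The duplication percentage is only an illustrative proxy for possible contamination, not a statistic reported by MaxBin, and the function names are hypothetical.

```python
N_MARKERS = 107  # essential single-copy marker genes in the reference set


def completeness(unique_found, n_markers=N_MARKERS):
    """Percentage of the marker set detected at least once in a bin."""
    return 100.0 * unique_found / n_markers


def duplication(duplicated, unique_found):
    """Percentage of detected markers present in more than one copy."""
    return 100.0 * duplicated / unique_found


print(f"completeness: {completeness(99):.1f}%")
print(f"duplicated markers: {duplication(4, 99):.1f}%")
```

A low duplication share alongside high completeness is what one would hope to see for a bin dominated by a single population, although duplicated markers can also reflect genuine multi-copy genes, as noted above.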
Table 3.3 ORFs assigned to capsular and exopolysaccharides metabolism

SEED capsular and exopolysaccharides categories                                UBC EBPR  AAE EBPR  Non-EBPR
Alginate metabolism                                                                 857       192       184
Capsular heptose biosynthesis                                                       665       159       125
Capsular polysaccharide (CPS) of Campylobacter                                        5         2         6
Colanic acid biosynthesis                                                           300       105        99
dTDP-rhamnose synthesis                                                            1253       428       340
Gram-negative cell wall components                                                 2557       728       739
O-Methyl phosphoramidate capsule modification in Campylobacter                      102         9        10
Peptidoglycan biosynthesis                                                         4982      1244      1301
Pseudaminic acid biosynthesis                                                        14         3         4
Rhamnose containing glycans                                                        2192       678       554
Serotype determining capsular polysaccharide biosynthesis in Staphylococcus           9         0         3
Sialic acid metabolism                                                             1032       265       303
Xanthan exopolysaccharide biosynthesis and export                                    21         3         6

Table 3.4 Metagenomic contig binning statistics

Bin       Abundance   Completeness (%)^a   Genome size (bp)   GC content (%)
Bin001        16.47                 92.5          4,165,338               66
Bin002         4.43                 74.8          4,447,347               67
Bin003         3.72                 67.3          4,139,412               63
Bin004         2.53                 11.2          3,619,713               71

^a Completeness based on percentage of 107 marker genes identified (Table B1; Dupont et al., 2012)

3.4.4 Comparison of population genomes to existing reference genomes

Candidatus ‘Microthrix parvicella’ population genome

To elucidate local selection pressures driving genomic differentiation events in the EBPR ecosystem, we examined functional differences between the M. parvicella population genome recovered here and previously published M. parvicella genomes isolated from activated sludge: M. parvicella strain Bio17-1 (Muller et al., 2012) and M. parvicella strain RN1 (McIlroy et al., 2013). Genomic differences were considered to be associated with habitat-specific gene pools shaped by local bioreactor conditions (Polz et al., 2013; Cordero and Polz, 2014). Overall, the three M. parvicella-related genomes displayed high sequence similarity; strain Bio17-1 and strain RN1 shared 89 and 87 percent amino acid similarity with the M.
parvicella population genome, respectively, based on best reciprocal BLASTP matches using an e-value cutoff of 1e-6. Indeed, this agrees with recent population comparisons that highlighted the low genomic diversity between M. parvicella RN1 and related EBPR community strains from full-scale treatment plants in Denmark (McIlroy et al., 2013).

To verify whether the M. parvicella RN1 metabolic model (McIlroy et al., 2013) could be extended to related strains from the UBC EBPR Pilot Plant, metabolic pathways were compared across all available M. parvicella genomes using MetaPathways (Konwar et al., 2013; Hanson et al., 2014). From Figure 3.3, it is clear that the majority of metabolic pathways are indeed conserved across M. parvicella strains, including fatty acid β-oxidation and biosynthesis, triacylglycerol (TAG) biosynthesis and degradation, the pentose phosphate pathway, and the Embden-Meyerhof-Parnas (EMP) glycolysis pathway. Genes responsible for polyphosphate (polyP) storage were also identified, including a single polyphosphate kinase (ppk) (Table 3.5).

Figure 3.3 M. parvicella pathway comparison. Columns represent M. parvicella strains; rows represent inferred MetaCyc pathways. Branches indicate clustering of M. parvicella strains (columns) and pathways (rows) based on Manhattan method. Pathways discussed in main text are listed on the right.

Despite the remarkable metabolic similarities between M. parvicella strains, minor differences in pathway composition were observed, particularly in strain RN1. Pathways missing from strain RN1 included cell capsule and exopolysaccharide biosynthesis; for example, dTDP-L-rhamnose biosynthesis (Tsukioka et al., 1997; Graninger et al., 2002) and UDP-D-xylose biosynthesis (Coyne et al., 2011; Gu et al., 2011). Predicted ORFs for these pathways formed syntenic regions conserved across the M. parvicella population genome and strain Bio17-1, but not strain RN1 (Figure B1).
Structural variations in exopolysaccharides and/or cell capsule residues are known bacteriophage resistance mechanisms that prevent phage adsorption through physical barriers (i.e. EPS layer) or modifications to cell surface receptors (Hanlon et al., 2001; Labrie et al., 2010). Given that some phage carry substrate specific polysaccharide-degrading enzymes, including polysaccharases and lyases (Sutherland, 1995), these structural differences may be important bacterial adaptive features for evading phage predation in EBPR ecosystems, as recently proposed by others (Kunin et al., 2008; Albertsen et al., 2012; McIlroy et al., 2013).

To further assess functional differentiation between M. parvicella-related strains, fine-scale variations in genome content were examined. Our results identified 21 loci encompassing 274 ORFs in the M. parvicella population genome that were missing in strain Bio17-1 and/or strain RN1 (Figure 3.4; Table B2). These ORFs formed discrete gene clusters, often located on putative genomic islands, suggesting they were acquired together and encode some adaptive function (Figure 3.4). Consistent with pathway-level comparisons, annotation of the gene clusters revealed genes encoding cell envelope and exopolysaccharide biosynthesis enzymes. Other annotated gene clusters were associated with heavy metal/antibiotic resistance and antibiotic biosynthesis, as well as restriction-modification systems (possible phage defense). The presence of toxic compound and antibiotic resistance genes on mobile genetic elements agrees with previous studies that examined plasmids from activated sludge microbial communities (Szczepanowski et al., 2008; Zhang et al., 2011; Sentchilo et al., 2013).
Acquisition of such resistance mechanisms is likely essential for survival in EBPR ecosystems that receive high loadings of toxic chemicals and antibiotics in wastewater influent, and also contributes to the global dispersal of antibiotic resistant bacteria (Czekalski et al., 2014).

Table 3.5 Polyphosphate metabolism, M. parvicella

| Gene | Protein | EC No. | M. parvicella UBC | M. parvicella RN1^a | M. parvicella Bio17-1 |
|---|---|---|---|---|---|
| ppk1 | polyphosphate kinase 1 | | 1 | 1 | 1 |
| ppx | Exopolyphosphatase | | - | 2 | - |
| adk | Adenylate kinase | | 1 | 1 | 1 |
| pap | polyP:AMP phosphotransferase | | - | - | - |
| ppgK | polyphosphate glucokinase | | 1 | 1 | 1 |
| ppnK | polyphosphate/ATP NAD kinase | | - | 1 | - |
| phoB | Pho regulon, DNA-binding response regulator | - | 2 | 2 | 2 |
| phoR | Pho regulon, sensory histidine kinase | | - | 1 | - |
| phoH | Pho regulon, phosphate starvation-inducible protein | - | 1 | 1 | 1 |
| phoU | Chaperone-like PhoR/PhoB inhibitory protein | - | - | 1 | - |
| pstA | Pi ABC transporter - membrane subunit | - | - | 1 | - |
| pstB | Pi ABC transporter - ATP binding subunit | | 1 | 1 | 1 |
| pstC | Pi ABC transporter - membrane subunit | - | - | 1 | - |
| pstS | Pi ABC transporter - periplasmic binding protein | - | - | 1 | - |
| pitA | low-affinity Pi transport protein | - | - | - | - |
| ppa | Inorganic pyrophosphatase | | 1 | 1 | 1 |
| alp | secreted alkaline phosphatase | | - | - | - |

^a Values revised based on McIlroy et al. (2013).

Figure 3.4 Fine-scale comparison of M. parvicella genomes. X-axis indicates position along the M. parvicella population genome; y-axis indicates the percent amino acid similarity of isolate genomes (M. parvicella RN1 and Bio17-1) with the population, as determined by best reciprocal BLASTP hits. Purple vertical bars indicate putative genomic islands. Arrows indicate predicted ORFs.

Gordonia spp. population genome

Metabolic pathways and variable genomic regions were also examined in the Gordonia spp. population genome in comparison to 8 closely related Gordonia spp. isolates from different habitats. Our results show that the functional potential of Gordonia spp.
from the UBC EBPR Pilot Plant was most closely related to G. amarae (Figure 3.5), consistent with previous studies that isolated Gordonia spp. from activated sludge systems (Soddell et al., 1998; Soddell et al., 2006). Core metabolic pathways inferred across Gordonia spp. were similar to those of M. parvicella strains, including fatty-acid β-oxidation and biosynthesis, TAG biosynthesis and degradation, the pentose phosphate pathway, and the Embden-Meyerhof-Parnas glycolysis pathway. The glyoxylate cycle was also inferred for most Gordonia spp., which could play a role in generating essential carbon precursors for biosynthesis under anaerobic conditions (Burrows et al., 2008b). The overrepresentation of lipid and fatty acid metabolism is consistent with isolate experiments indicating their use as growth substrates (Soddell et al., 1998) and supports previous observations correlating Gordonia spp. abundance with lipid loading rates in full-scale wastewater treatment plants (Frigon et al., 2006). While this implies that Gordonia spp. in EBPR ecosystems may also be specialized in lipid degradation, previous ecophysiological studies showed that acetate and glucose, not lipids, were the main substrates utilized by Gordonia spp. in situ, indicating that substrate preferences among Gordonia spp. may vary considerably (Carr et al., 2006; Kragelund et al., 2007). Most Gordonia spp., including the population genome binned here, were also capable of nitrate reduction, based on the presence of a respiratory nitrate reductase (nar) gene and nitric-oxide reductase (norQ) gene. This implies that anaerobic respiration for growth under anoxic conditions is possible.

Surprisingly, no Gordonia spp. examined here, including the binned population genome, encoded the potential for polyhydroxyalkanoate (PHA) storage, based on the absence of the key enzyme PHA synthase (phaC). This contradicts previous ecophysiological studies that suggest PHA was accumulated by Gordonia spp.
in activated sludge (Nielsen et al., 2009). Recently, phaC was also found to be missing from the M. parvicella genome (McIlroy et al., 2013). The authors suggested that the Nile blue A staining methods used to identify PHA storage may have incorrectly identified TAGs as PHA, given that these methods stain both PHA and lipids (Serafim et al., 2002). Alternatively, a novel multifunctional fusion gene or putative hydrolase may have substituted for the missing PHA synthase normally encoded by phaC (McIlroy et al., 2013). Another possibility may be that the reaction catalyzed by PHA depolymerase (detected in Gordonia spp.) is reversible, although no evidence of such activity has been reported in the literature. As such, the ability of Gordonia spp. to accumulate PHA granules in EBPR ecosystems requires further investigation.

Figure 3.5 Gordonia spp. pathway comparison. Columns represent Gordonia spp. strains; rows represent inferred MetaCyc pathways. Branches indicate clustering of Gordonia spp. strains (columns) and pathways (rows) based on Manhattan method. Pathways discussed in main text are listed on the right. G. polyisoprenivorans, Gpol; G. terrae, Gter; G. rhizosphera, Grhi; G. soli, Gsol; Gordonia spp. UBC, Gubc; G. amarae, Gama; G. paraffinivorans, Gpar; G. KTR9, Gktr; G. aichiensis, Gaic.

Genes encoding the enzymes necessary for polyP storage were also present across Gordonia spp. (Table 3.6). All Gordonia spp. genomes encoded a single ppk gene and multiple copies of the polyP hydrolyzing enzyme exopolyphosphatase (ppx) (Rao et al., 2009). Regeneration of ATP from polyP requires polyphosphate:AMP phosphotransferase (PAP) activity (Rao et al., 2009). While Accumulibacter and Tetrasphaera PAO encode PAP (Garcia Martin et al., 2006; Kristiansen et al., 2012), it was not identified in the Gordonia spp. genomes examined here.
However, ppk and adenylate kinase (adk), both of which were identified in Gordonia spp. (Table 3.6), are reported to form a complex exhibiting some PAP activity (Ishige and Noguchi, 2000). This suggests that potential for ATP regeneration from polyP by Gordonia spp. in the EBPR ecosystem exists. Both the specific phosphate transport (pst) system and inorganic phosphate transport (Pit) system were also identified in most Gordonia spp. (Table 3.6). In Accumulibacter, the Pit system is believed to be essential for generating the proton motive force under anaerobic conditions through export of Pi in symport with protons (Saunders et al., 2007; Burow et al., 2008). The Pit system is missing in glycogen accumulating organisms (McIlroy et al., 2014; Nobu et al., 2014), as well as M. parvicella (McIlroy et al., 2013), leading some researchers to hypothesize that it is required for polyP cycling in EBPR. However, while Gordonia spp. have been observed to store polyP granules (Wong et al., 2005; Beer et al., 2006), their ability to continuously cycle polyP across anaerobic-aerobic conditions remains unknown.

Analysis of the variable regions across the Gordonia spp. genomes identified 39 gene clusters containing 316 ORFs (Table B3). While assigning biological roles to some of these clusters was difficult due to the large number of hypothetical proteins, many appeared to be involved in aromatic compound degradation and fatty acid metabolism. This metabolic variability suggests that substrate uptake patterns likely differ among Gordonia spp., consistent with in situ observations from activated sludge populations (Carr et al., 2006; Kragelund et al., 2007). Other genes commonly identified within the variable regions encoded PIN-domain toxin-antitoxin proteins. While the precise role of these proteins is unclear, previous reports suggest that resident toxin-antitoxin operons originally acquired through mobile gene pools may be associated with retardation of cell growth and persistence in stressful environments (Arcus et al., 2005; Gerdes et al., 2005). This agrees with the low activity levels often observed among Gordonia spp. in activated sludge (de los Reyes and Raskin, 2002; Nielsen et al., 2008); however, much remains to be learned about the ecological role of toxin-antitoxin systems in bacteria (Arcus et al., 2005).

Table 3.6 Polyphosphate metabolism, Gordonia spp.

| Gene | EC No.* | Gordonia spp. UBC | G. aichiensis | G. amarae | G. KTR9 | G. paraffinivorans | G. polyisoprenivorans | G. rhizosphera | G. soli | G. terrae |
|---|---|---|---|---|---|---|---|---|---|---|
| ppk1 | | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| ppx | | 3 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| adk | | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| pap | | - | - | - | - | - | - | - | - | - |
| ppgK | | - | - | - | - | - | - | - | - | - |
| ppnK | | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| phoB | - | 2 | 1 | 2 | 1 | 0 | 0 | 0 | 0 | 2 |
| phoR | | - | - | - | - | - | - | - | - | - |
| phoH | - | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| phoU | - | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| pstA | - | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| pstB | | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| pstC | - | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| pstS | - | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| pitA | - | 1 | 2 | 1 | 1 | 2 | 1 | 0 | 1 | 2 |
| ppa | | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| alp | | 1 | 1 | 1 | 1 | 1 | 2 | 2 | 2 | 1 |

*Protein names provided on Table 3.5.

3.4.5 EBPR ecosystem bacteria-phage interactions

Given the prevalence of phage defense mechanisms in the EBPR community, attempts were made to identify and annotate prophage sequences in the metagenome using the Phage Search Tool (PHAST) (Zhou et al., 2011). PHAST identified 5 partial prophage regions in the metagenome, 3 of which were affiliated with the M. parvicella population genome (Figure 3.6; Table B4). Overall, phage recovered from the metagenome appeared to be novel, as no consistent classification of viral coding sequences could be assigned. Functional content of the phage encoded mainly structural components, replication machinery, and hypothetical proteins.
Interestingly, one prophage encoded the terminal quinol oxidase cytochrome bd, which facilitates microaerophilic respiration and nitric oxide resistance in bacteria (Mason et al., 2009). Such enzymes have potential to confer a fitness advantage in bacterial hosts under the microaerophilic conditions present in EBPR ecosystems. Such potential has previously been reported in marine ecosystems, where viral metabolic reprogramming of host carbon flux towards energy production and viral genome replication was proposed under sunlit and dark ocean conditions (Hurwitz et al., 2013). Nevertheless, our interpretation remains largely speculative and requires further investigation.

To further explore bacteria-phage interactions in the EBPR ecosystem, clustered regularly interspaced short palindromic repeats (CRISPR) were identified and reconstructed from raw metagenomic reads using Crass (Skennerton et al., 2013). CRISPR are an adaptive prokaryotic immune system found in half of all sequenced bacterial and archaeal genomes. They recognize and cleave foreign DNA entering the cell using unique “spacer” sequences (Barrangou et al., 2007; Horvath and Barrangou, 2010). Spacer sequences (i.e. short pieces of excised foreign DNA) are incorporated into the host genome between direct repeat clusters, providing a catalogue of the dynamic and rapidly evolving interactions between host and virus (Horvath and Barrangou, 2010). A total of 40 unique CRISPR were identified in the EBPR metagenome, each differentiated by a specific direct repeat sequence and multiple spacers (Figure 3.7). Average direct repeat and spacer lengths were 35 bp and 33 bp, respectively, although some variation in sequence length was observed (Table B5). While most spacers did not map to the assembled metagenomic contigs, the two largest spacer-repeat arrays matched contigs in the M. parvicella population genome (Figure 3.8).
Here, some spacers had greater abundance within a specific CRISPR locus, indicating that some phage attacks were more widespread among the population than others. Moreover, many CRISPR formed a large number of unconnected spacer arrangements, suggesting that discrete CRISPR loci originated from different strains in a population (Skennerton et al., 2013). Attempts were also made to map spacer sequences to phage regions reconstructed from the metagenome; however, no matches were found. This implies that reconstructed prophage regions may not represent the most recent lytic events that occurred in the pilot-scale EBPR ecosystem.

Figure 3.6 Prophage coding sequence regions reconstructed from the metagenome. A detailed description of each region can be found on Table B4. Each row represents one partial prophage. *Prophage found in M. parvicella UBC population genome.

Figure 3.7 Total spacer count from EBPR metagenome. X-axis indicates direct repeat ID arbitrarily assigned by Crass (Skennerton et al., 2013); y-axis indicates number of spacers associated with each direct repeat sequence.

Figure 3.8 CRISPR spacer-repeat loci (region G4) in M. parvicella population genome. Arrows indicate direct repeats; circles indicate spacers; diamonds indicate flanking sequences. Spacer abundance indicated by colour gradient.

3.5 Concluding remarks

In summary, our results show that metagenomic and pyrotag sequencing approaches provide comparable estimates of EBPR community structure; however, biases may exist due to the underrepresentation of some taxa in available sequence databases (e.g. RefSeq) and/or because of non-uniform amplification of SSU rRNA amplicons using PCR methods.
Additionally, we show that EBPR microbial communities are enriched in cell capsule and exopolysaccharide biosynthesis, consistent with local selection pressures favoring floc formation. Other enriched functions were associated with phosphorus metabolism and aromatic compound degradation, possibly reflecting microbial adaptation to local bioreactor conditions and influent composition, respectively. Recovery of population genomes from metagenomic contigs revealed that M. parvicella strains from different geographical locations manifest remarkable genomic similarity, in agreement with previous reports (McIlroy et al., 2013). While this suggests that the proposed M. parvicella metabolic model can be broadly applied across EBPR ecosystems, fine-scale genomic differences relating to EPS formation and toxic compound/antibiotic resistance indicate that further work is needed to understand the eco-evolutionary dynamics, such as viral lysis or predation, that tune M. parvicella population structure. Novel metabolic insights into Gordonia spp. in the EBPR ecosystem suggest a potential role for polyP cycling that should be further explored, and the presence of phage and phage defense mechanisms (EPS, restriction-modification systems, CRISPR) highlights the need to further elucidate the role that viruses potentially play in modulating microbial community dynamics in EBPR ecosystems. Taken together, our findings provide insight into EBPR community function and enable future efforts aimed at monitoring spatiotemporal patterns in gene expression using metatranscriptomics to elucidate regulatory controls on community metabolism.

Chapter 4: Conclusions and future directions

Enhanced biological phosphorus removal (EBPR) is an environmental biotechnology of global importance, essential for protecting receiving waters from eutrophication and enabling phosphorus recovery (Nielsen et al., 2012).
Current understanding of EBPR technology is largely based on empirical evidence and black-box models that fail to appreciate the intricate microbial community interactions responsible for nutrient cycling and ultimate phosphorus removal. This empirical approach has limited further development of EBPR technology, which can experience unpredictable process failures and struggles to meet increasingly stringent effluent regulations as a stand-alone process. Accordingly, for EBPR to realize its full potential as an efficient and reliable environmental biotechnology, greater understanding of the microbial ecology of these engineered ecosystems is needed. Insights into the structure and function of microbial communities performing EBPR are starting to emerge as reviewed in Chapter 1, including the description of a core EBPR microbiome (Nielsen et al., 2010; 2012).

This thesis aimed to build on previous efforts by exploring the temporal and spatial activity dynamics of the core EBPR microbiome using 454-pyrotag sequencing (Chapter 2) and by examining the metabolic potential of a pilot-scale EBPR community through metagenomic approaches (Chapter 3). Recent ecological theory was incorporated into the interpretation of the research findings, such that rules governing the assembly and control of microbial communities can be harnessed for engineering purposes.

4.1 Conclusions, limitations, and future directions

The findings presented in Chapter 2 represent the first high-throughput examination of microbial activity dynamics in an EBPR ecosystem. Using a combination of SSU rDNA and rRNA sequencing, our investigation expanded the current knowledge of active microbial players participating in EBPR, and revealed that rare (i.e. very low abundance) microorganisms have the potential to contribute to nutrient cycling and process stability in these engineered ecosystems.
At present, rare microorganisms are generally ignored in engineered ecosystems such as EBPR, a misconception that stems from the assumption that low abundance equates to limited functional importance. Our results further illustrate that microbial activity can be highly dynamic, even among phylogenetically cohesive units. This suggests that fine-scale population differences exist in EBPR ecosystems with potential to impact process performance. We also revealed that rRNA abundance for individual taxa remained constant across bioreactor redox zones (anaerobic/anoxic/aerobic) in the continuous flow process employed in this study, indicating that EBPR communities are attuned to changing bioreactor redox conditions and nutrient concentrations (Section 2.5.3).

It is important to note that rRNA abundances measured in this study reflect the potential for protein synthesis, including past, present, and future activities (Blazewicz et al., 2013). As such, additional studies combining labeling and incubation experiments with RNA sequencing are needed to confirm whether rare taxa do indeed make bona fide contributions to nutrient cycling in EBPR ecosystems. These studies could be complemented by metatranscriptomic approaches that measure total RNA expression to better understand microbial community responses to bioreactor dynamics. Such responses should also be tested under different perturbation scenarios in order to elucidate the environmental cues controlling population dynamics and gene expression in EBPR ecosystems. Specific perturbations of interest include sudden dilution of nutrients (C, N, and Pi), seasonal and weekly nutrient changes, changes in wastewater temperature, and the recycle of nitrates to the anaerobic zone.

The metabolic potential of a pilot-scale EBPR microbial community was examined in Chapter 3 using 454-pyrosequencing.
Here, major functions enriched in the EBPR community were related to extracellular polymeric substances (EPS), phosphorus metabolism, and aromatic compound degradation, likely reflecting microbial adaptation to the incoming wastewater composition (i.e. substrate) and local bioreactor selection pressures (Sanapareddy et al., 2009; Albertsen et al., 2012). This was further explored by comparative analysis of population genomes binned from the assembled metagenome. Here, a population genome for Candidatus ‘Microthrix parvicella’ showed remarkable similarity to previously sequenced strains sourced from disparate EBPR ecosystems, indicating that existing M. parvicella metabolic models (McIlroy et al., 2013) may be broadly applicable. Fine-scale genomic comparisons also revealed that differentiation between Microthrix strains was related to bacteriophage and toxin/antibiotic resistance, highlighting that local selection pressures from phage and toxic compounds likely contributed to community dynamics. This was further supported by the presence of prophage and phage defense mechanisms (EPS, restriction-modification systems, CRISPR) recovered from the metagenome. Comparative analysis of a Gordonia spp. population genome revealed that these EBPR populations encode the metabolic potential for polyP cycling, a previously unrecognized finding. This supports the notion that PAO may consist of several diverse phylogenetic groups that vary among different treatment systems (Mino et al., 1998; McMahon et al., 2010).

The metagenomic work presented here provides much needed insight into EBPR community function; however, several constraints need to be considered in future studies. For example, in this study low sequencing depth prevented examination of the rare biosphere, which was shown in Chapter 2 to potentially play key roles in EBPR ecosystems.
As such, future investigations should combine 454 pyrosequencing, which offers the longer reads needed for assembly (~700 bp), with “deep” sequencing approaches such as the Illumina HiSeq or MiSeq platforms that generate orders of magnitude more short-read paired-end data. In addition to sequencing depth, a lack of indigenous reference genomes limited the accuracy of binning and gene annotation (Chapter 3). This can be partially overcome by single-cell genomic approaches that enable whole genome amplification and sequencing of individual microorganisms (Rinke et al., 2013).

Bibliography

Alawi, M., Lipski, A., Sanders, T., and Spieck, E. (2007) Cultivation of a novel cold-adapted nitrite oxidizing betaproteobacterium from the Siberian Arctic. ISME J. 1: 256–264.
Alawi, M., Off, S., Kaya, M., and Spieck, E. (2009) Temperature influences the population structure of nitrite-oxidizing bacteria in activated sludge. Environ. Microbiol. Rep. 1: 184–90.
Albertsen, M., Hansen, L.B.S., Saunders, A.M., Nielsen, P.H., and Nielsen, K.L. (2012) A metagenome of a full-scale microbial community carrying out enhanced biological phosphorus removal. ISME J. 6: 1094–106.
Albertsen, M., Hugenholtz, P., Skarshewski, A., Nielsen, K.L., Tyson, G.W., and Nielsen, P.H. (2013) Genome sequences of rare, uncultured bacteria obtained by differential coverage binning of multiple metagenomes. Nat. Biotechnol. 31: 533–8.
Albertsen, M., Saunders, A.M., Nielsen, K.L., and Nielsen, P.H. (2013) Metagenomes obtained by “deep sequencing” - what do they tell about the enhanced biological phosphorus removal communities? Wat. Sci. Tech. 68: 1959–68.
Allers, E., Wright, J.J., Konwar, K.M., Howes, C.G., Beneze, E., Hallam, S.J., and Sullivan, M.B. (2012) Diversity and population structure of Marine Group A bacteria in the Northeast subarctic Pacific Ocean. ISME J. 1–13.
Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. (1990) Basic Local Alignment Search Tool. J. Mol. Biol. 215: 403–410.
APHA (2005) Standard Methods for the Examination of Water and Waste Water. American Public Health Association, American Water Works Association and Water Environment Federation, Washington, DC.
Amann, R., Lemmer, H., and Wagner, M. (1998) Monitoring the community structure of wastewater treatment plants: a comparison of old and new techniques. FEMS Microbiol. Ecol. 25: 205–215.
Andreasen, K. and Nielsen, P.H. (2000) Growth of Microthrix parvicella in Nutrient Removal Activated Sludge Plants: Studies of in situ Physiology. Wat. Res. 34:
Arcus, V.L., Rainey, P.B., and Turner, S.J. (2005) The PIN-domain toxin-antitoxin array in mycobacteria. Trends Microbiol. 13: 360–5.
Ayarza, J.M. and Erijman, L. (2011) Balance of neutral and deterministic components in the dynamics of activated sludge floc assembly. Microb. Ecol. 61: 486–95.
Ayarza, J.M., Guerrero, L.D., and Erijman, L. (2010) Nonrandom assembly of bacterial populations in activated sludge flocs. Microb. Ecol. 59: 436–44.
Barnard, J.L. (1976) A Review of Biological Phosphorus Removal in the Activated Sludge Process. Water SA 2: 136–144.
Barnard, J.L. (1998) The Development of Nutrient-Removal Processes (Abridged). J.CIWEM 12: 330–337.
Barnard, R.L., Osborne, C.A., and Firestone, M.K. (2013) Responses of soil bacterial and fungal communities to extreme desiccation and rewetting. ISME J. 7: 2229–41.
Barrangou, R., Christophe, F., Deveau, H., Richards, M., Boyaval, P., Moineau, S., et al. (2007) CRISPR Provides Acquired Resistance Against Viruses in Prokaryotes. Science 315: 1709–1712.
Beer, M., Stratton, H.M., Griffiths, P.C., and Seviour, R.J. (2006) Which are the polyphosphate accumulating organisms in full-scale activated sludge enhanced biological phosphate removal systems in Australia? J. Appl. Microbiol. 100: 233–43.
Berdjeb, L., Pollet, T., Domaizon, I., and Jacquet, S. (2011) Effect of grazers and viruses on bacterial community structure and production in two contrasting trophic lakes. BMC Microbiol. 11: 88.
Blackall, L.L., Stratton, H., Bradford, D., Dot, T.D., Sjörup, C., Seviour, E.M., and Seviour, R.J. (1996) “Candidatus Microthrix parvicella”, a filamentous bacterium from activated sludge sewage treatment plants. Int. J. Syst. Bacteriol. 46: 344–6.
Blazewicz, S.J., Barnard, R.L., Daly, R.A., and Firestone, M.K. (2013) Evaluating rRNA as an indicator of microbial activity in environmental communities: limitations and uses. ISME J. 1–8.
Bock, E. and Wagner, M. (2006) The Prokaryotes. In, Dworkin, M., Falkow, S., Rosenberg, E., Schleifer, K.-H., and Stackebrandt, E. (eds). Springer, Berlin, Heidelberg, pp. 457–495.
Braeken, K., Moris, M., Daniels, R., Vanderleyden, J., and Michiels, J. (2006) New horizons for (p)ppGpp in bacterial and plant physiology. Trends Microbiol. 14: 45–54.
Braker, G. and Fesefeldt, A. (1998) Development of PCR Primer Systems for Amplification of Nitrite Reductase Genes (nirK and nirS) To Detect Denitrifying Bacteria in Environmental Samples. Appl. Environ. Microbiol. 64: 3769–3775.
Britton, A., Koch, F.A., Mavinic, D.S., Adnan, A., Oldham, W.K., and Udala, B. (2005) Pilot-scale struvite recovery from anaerobic digester supernatant at an enhanced biological phosphorus removal wastewater treatment plant. J. Environ. Eng. Sci. 4: 265–277.
Burow, L.C., Mabbett, A.N., McEwan, A.G., Bond, P.L., and Blackall, L.L. (2008) Bioenergetic models for acetate and phosphate transport in bacteria important in enhanced biological phosphorus removal. Environ. Microbiol. 10: 87–98.
Campbell, B.J. and Kirchman, D.L. (2013) Bacterial diversity, community structure and potential growth rates along an estuarine salinity gradient. ISME J. 7: 210–20.
Campbell, B.J., Yu, L., Heidelberg, J.F., and Kirchman, D.L. (2011) Activity of abundant and rare bacteria in a coastal ocean. PNAS 108: 12776–12781.
Caporaso, J.G., Paszkiewicz, K., Field, D., Knight, R., and Gilbert, J.A. (2012) The Western English Channel contains a persistent microbial seed bank. ISME J. 6: 1089–93.
Cech, J.S. and Hartman, P. (1993) Competition between polyphosphate and polysaccharide accumulating bacteria in enhanced biological phosphate removal systems. Wat. Res. 27: 1219–1225.
Cech, J.S., Hartman, P., and Macek, M. (1994) Bacteria and protozoa population dynamics in biological phosphate removal systems. Wat. Sci. Tech. 29: 109–117.
Coats, E.R., Watkins, D.L., and Kranenburg, D. (2011) A Comparative Environmental Life-Cycle Analysis for Removing Phosphorus from Wastewater: Biological versus Physical/Chemical Processes. Water Environ. Res. 83: 750–760.
Comeau, Y., Hall, K.J., Hancock, R.E.W., and Oldham, W.K. (1986) Biochemical model for enhanced biological phosphorus removal. Water Res. 20: 1511–1521.
Confer, D.R. and Logan, B.E. (1998) Location of Protein and Polysaccharide Hydrolytic Activity in Suspended and Biofilm Wastewater Cultures. Water Res. 32: 31–38.
Cordero, O.X. and Polz, M.F. (2014) Explaining microbial genomic diversity in light of evolutionary ecology. Nat. Rev. Microbiol. 12: 263–73.
Coyne, M.J., Fletcher, C.M., Reinap, B., and Comstock, L.E. (2011) UDP-Glucuronic Acid Decarboxylases of Bacteroides fragilis and Their Prevalence in Bacteria. J. Bacteriol. 193: 5252–5259.
Crocetti, G.R., Banfield, J.F., Keller, J., Bond, P.L., and Blackall, L.L. (2002) Glycogen-accumulating organisms in laboratory-scale and full-scale wastewater treatment processes. Microbiology 148: 3353–64.
Crosby, L.D. and Criddle, C.S. (2003) Understanding bias in microbial community analysis techniques due to rrn operon copy number heterogeneity. Biotechniques 34: 790–4, 796, 798 passim.
Czekalski, N., Gascón Díez, E., and Bürgmann, H. (2014) Wastewater as a point source of antibiotic-resistance genes in the sediment of a freshwater lake. ISME J. 8: 1381–90.
Daims and Wagner (2010) The microbiology of nitrogen removal. In Microbial Ecology of Activated Sludge (eds. Seviour, R.J., and Nielsen, P.H.). IWA, London, United Kingdom.
De los Reyes, F.L. and Raskin, L. (2002) Role of filamentous microorganisms in activated sludge foaming: relationship of mycolata levels to foaming initiation and stability. Water Res. 36: 445–59.
Dinsdale, E.A., Edwards, R.A., Hall, D., Angly, F., Breitbart, M., Brulc, J.M., et al. (2008) Functional metagenomic profiling of nine biomes. Nature 452: 629–32.
Dolinšek, J., Lagkouvardos, I., Wanek, W., Wagner, M., and Daims, H. (2013) Interactions of nitrifying bacteria and heterotrophs: identification of a Micavibrio-like putative predator of Nitrospira spp. Appl. Environ. Microbiol. 79: 2027–37.
Dueholm, T.E., Andreasen, K.H., and Nielsen, P.H. (2001) Transformation of lipids in activated sludge. Water Sci. Technol. 43: 165–72.
Dunfield, P.F., Tamas, I., Lee, K.C., Morgan, X.C., McDonald, I.R., and Stott, M.B. (2012) Electing a candidate: a speculative history of the bacterial phylum OP10. Environ. Microbiol. 14: 3069–80.
Eikelboom, D. (2000) Process control of activated sludge plants. IWA, London, United Kingdom.
Elser, J. and Bennett, E. (2011) Phosphorus cycle: A broken biogeochemical cycle. Nature 478: 29–31.
Engelbrektson, A., Kunin, V., Wrighton, K.C., Zvenigorodsky, N., Chen, F., Ochman, H., and Hugenholtz, P. (2010) Experimental factors affecting PCR-based estimates of microbial species richness and evenness. ISME J. 4: 642–647.
Eschenhagen, M., Schuppler, M., and Röske, I. (2003) Molecular characterization of the microbial community structure in two activated sludge systems for the advanced treatment of domestic effluents. Water Res. 37: 3224–32.
Evans, T.N. and Seviour, R.J. (2012) Estimating biodiversity of fungi in activated sludge communities using culture-independent methods. Microb. Ecol. 63: 773–86.
Flowers, J.J., Cadkin, T.A., and McMahon, K.D. (2013) Seasonal bacterial community dynamics in a full-scale enhanced biological phosphorus removal plant. Water Res. 47: 7019–31.
Flowers, J.J., He, S., Malfatti, S., Del Rio, T.G., Tringe, S.G., Hugenholtz, P., and McMahon, K.D. (2013) Comparative genomics of two “Candidatus Accumulibacter” clades performing biological phosphorus removal. ISME J. 1–14.
Flowers, J.J., He, S., Yilmaz, S., Noguera, D.R., and McMahon, K.D. (2009) Denitrification capabilities of two biological phosphorus removal sludges dominated by different “Candidatus Accumulibacter” clades. Environ. Microbiol. Rep. 1: 583–588.
Follows, M.J. and Dutkiewicz, S. (2011) Modeling Diverse Communities of Marine Microbes. Ann. Rev. Mar. Sci. 3: 427–451.
Forde, A. and Fitzgerald, G.F. (2003) Molecular organization of exopolysaccharide (EPS) encoding genes on the lactococcal bacteriophage adsorption blocking plasmid, pCI658. Plasmid 49: 130–142.
Fredriksson, N.J., Hermansson, M., and Wilén, B.-M. (2012) Diversity and dynamics of Archaea in an activated sludge wastewater treatment plant. BMC Microbiol. 12: 140.
Frigon, D., Guthrie, R.M., Bachman, G.T., Royer, J., Bailey, B., and Raskin, L. (2006) Long-term analysis of a full-scale activated sludge wastewater treatment system exhibiting seasonal biological foaming. Water Res. 40: 990–1008.
Fuhs, G.W. and Chen, M. (1975) Microbiological basis of phosphate removal in the activated sludge process for the treatment of wastewater. Microb. Ecol. 2: 119–38.
Galand, P.E., Casamayor, E.O., Kirchman, D.L., and Lovejoy, C. (2009) Ecology of the rare microbial biosphere of the Arctic Ocean. PNAS 106: 22427–32.
García Martín, H., Ivanova, N., Kunin, V., Warnecke, F., Barry, K.W., McHardy, A.C., et al. (2006) Metagenomic analysis of two enhanced biological phosphorus removal (EBPR) sludge communities. Nat. Biotechnol. 24: 1263–9.
Van der Gast, C.J., Ager, D., and Lilley, A.K. (2008) Temporal scaling of bacterial taxa is influenced by both stochastic and deterministic ecological factors. Environ. Microbiol. 10: 1411–8.
Gerdes, K., Christensen, S.K., and Løbner-Olesen, A. (2005) Prokaryotic toxin-antitoxin stress response loci. Nat. Rev. Microbiol. 3: 371–82.
Gibbons, S.M., Caporaso, J.G., Pirrung, M., Field, D., Knight, R., and Gilbert, J.A. (2013) Evidence for a persistent microbial seed bank throughout the global ocean. PNAS 110: 4651–4655.
Ginige, M.P., Keller, J., and Blackall, L.L. (2005) Investigation of an Acetate-Fed Denitrifying Microbial Community by Stable Isotope Probing, Full-Cycle rRNA Analysis, and Fluorescent In Situ Hybridization-Microautoradiography. Appl. Environ. Microbiol. 71: 8683–8691.
Goel, R.K., Sanhueza, P., and Noguera, D.R. (2005) Evidence of Dechloromonas Sp. Participating in Enhanced Biological Phosphorus Removal (EBPR) in a Bench-Scale Aerated Anoxic Reactor. Water Environment Federation 78th Annual Technical Exhibition and Conference, Water Environment Federation, Washington DC, 3864–3871.
Gong, J., Dong, J., Liu, X., and Massana, R. (2013) Extremely high copy numbers and polymorphisms of the rDNA operon estimated from single cell analysis of oligotrich and peritrich ciliates. Protist 164: 369–79.
Grady, C., Daigger, G., Love, N., and Filipe, C. (2011) Biological Wastewater Treatment. CRC Press, Boca Raton, Florida.
Graninger, M., Kneidinger, B., Bruno, K., Scheberl, A., Messner, P., Graninger, M., et al. (2002) Homologs of the Rml Enzymes from Salmonella enterica Are Responsible for dTDP-β-L-Rhamnose Biosynthesis in the Gram-Positive Thermophile Aneurinibacillus thermoaerophilus DSM 10155. Appl. Environ. Microbiol. 68: 3708–3715.
Gray, N.D., Miskin, I.P., Kornilova, O., Curtis, T.P., and Head, I.M. (2002) Occurrence and activity of Archaea in aerated activated sludge wastewater treatment plants. Environ. Microbiol. 4: 158–68.
Gu, A.Z., Saunders, a, Neethling, J.B., Stensel, H.D., and Blackall, L.L. (2008) Functionally Relevant Microorganisms to Enhanced Biological Phosphorus Removal Performance at Full-Scale Wastewater Treatment Plants in the United States. Water Environ. Res. 80: 688–698. Gu, X., Lee, S.G., and Bar-Peled, M. (2010) Biosynthesis of UDP-xylose and UDP-arabinose in Sinorhizobium meliloti 1021: first characterization of a bacterial UDP-xylose synthase, and UDP-xylose 4-epimerase. Microbiology 157: 260–269. Hall, E.R., Monti, A., and Mohn, W.W. (2010) A comparison of bacterial populations in enhanced biological phosphorus removal processes using membrane filtration or gravity sedimentation for solids-liquid separation. Water Res. 44: 2703–14. Hanlon, G.W., Denyer, S.P., Olliff, C.J., Ibrahim, L.J., and Ibrahim, L.J. (2001) Reduction in Exopolysaccharide Viscosity as an Aid to Bacteriophage Penetration through Pseudomonas aeruginosa Biofilms. Appl. Environ. Microbiol. 67: 2746–2753. Hanson, N.W., Konwar, K.M., Wu, S.-J., and Hallam, S.J. (2014) MetaPathways v2.0: A master-worker model for environmental Pathway/Genome Database construction on grids and clouds. 2014 IEEE Conf. Comput. Intell. Bioinforma. Comput. Biol. 1–7.  83 He, S., Gall, D.L., and McMahon, K.D. (2007) “Candidatus Accumulibacter” population structure in enhanced biological phosphorus removal sludges as revealed by polyphosphate kinase genes. Appl. Environ. Microbiol. 73: 5865–74. He, S. and McMahon, K.D. (2011a) “Candidatus Accumulibacter” gene expression in response to dynamic EBPR conditions. ISME J. 5: 329–40. He, S. and McMahon, K.D. (2011b) Microbiology of “Candidatus Accumulibacter” in activated sludge. Microb. Biotechnol. 4: 603–19. Henriques, I.D.S. and Love, N.G. (2007) The role of extracellular polymeric substances in the toxicity response of activated sludge bacteria to chemical toxins. Water Res. 41: 4177–85. Hesselmann, R.P.X., Werlen, C., Hahn, D., van der Meer, J.R., and Zehnder, A.J.B. 
(1999) Enrichment, Phylogenetic Analysis and Detection of a Bacterium That Performs Enhanced Biological Phosphate Removal in Activated Sludge. Syst. Appl. Microbiol. 22: 454–465. Hesselsoe, M., Fu, S., Schloter, M., Bodrossy, L., Iversen, N., Roslev, P., et al. (2009) Isotope array analysis of Rhodocyclales uncovers functional redundancy and versatility in an activated sludge. ISME J. 3: 1349–1364. Hooper, a B., Vannelli, T., Bergmann, D.J., and Arciero, D.M. (1997) Enzymology of the oxidation of ammonia to nitrite by bacteria. Antonie Van Leeuwenhoek 71: 59–67. Horvath, P. and Barrangou, R. (2010) CRISPR/Cas, the immune system of bacteria and archaea. Science. 327: 167–70. Hu, M., Wang, X., Wen, X., and Xia, Y. (2012) Microbial community structures in different wastewater treatment plants as revealed by 454-pyrosequencing analysis. Bioresour. Technol. 117: 72–9. Hugenholtz, P., Goebel, B.M., and Pace, N.R. (1998) Impact of Culture-Independent Studies on the Emerging Phylogenetic View of Bacterial Diversity. J. Bacteriol. 180: 4765 – 4774. Hugoni, M., Taib, N., Debroas, D., Domaizon, I., Jouan Dufournel, I., Bronner, G., et al. (2013) Structure of the rare archaeal biosphere and seasonal dynamics of active ecotypes in surface coastal waters. Proc. Natl. Acad. Sci. U. S. A. 110: 6004–09. Hunt, D.E., Lin, Y., Church, M.J., Karl, D.M., Tringe, S.G., Izzo, L.K., and Johnson, Z.I. (2013) Relationship between abundance and specific activity of bacterioplankton in open ocean surface waters. Appl. Environ. Microbiol. 79: 177–84. Hurwitz, B.L., Hallam, S.J., and Sullivan, M.B. (2013) Metabolic reprogramming by viruses in the sunlit and dark ocean. Genome Biol. 14: Huson, D.H., Mitra, S., Ruscheweyh, H.-J., Weber, N., and Schuster, S.C. (2011) Integrative analysis of environmental sequences using MEGAN4. Genome Res. 21: 1552–60. Hyatt, D., Chen, G.-L., Locascio, P.F., Land, M.L., Larimer, F.W., and Hauser, L.J. 
(2010) Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11: 119.  84 Ishige, K. and Noguchi, T. (2000) Inorganic polyphosphate kinase and adenylate kinase participate in the polyphosphate:AMP phosphotransferase activity of Escherichia coli. Proc. Natl. Acad. Sci. U. S. A. 97: 14168–71. Jeon, C.K. and Park, J.M. (2000) Enhanced biological phosphorus removal in a sequencing batch reactor supplied with glucose as a sole carbon source. Water Res. 34: 2160–2170. Johnson, B.R. and Daigger, G.T. (2009) Integrated nutrient removal design for very low phosphorus levels. Water Sci. Technol. 60: 2455–2462. Johnson, D.R., Goldschmidt, F., Lilja, E.E., and Ackermann, M. (2012) Metabolic specialization and the assembly of microbial communities. ISME J. 6: 1985–91. Jones, S.E. and Lennon, J.T. (2010) Dormancy contributes to the maintenance of microbial diversity. Proc. Natl. Acad. Sci. U. S. A. 107: 5881–6. Juretschko, S., Timmermann, G., Schmid, M., Schleifer, K., Pommerening-Röser, A., and Wagner, M. (1998) Combined Molecular and Conventional Analyses of Nitrifying Bacterium Diversity in Activated Sludge : Nitrosococcus mobilis and Nitrospira-Like Bacteria as Dominant Populations. Appl. Environ. Microbiol. 64: 3042–3051. Kang, D. and Noguera, D.R. (2014) Candidatus Accumulibacter phosphatis : Elusive Bacterium Responsible for Enhanced Biological Phosphorus Removal. J. Environ. Eng. 140: 2–10. Kim, B.-C., Kim, S., Shin, T., Kim, H., and Sang, B.-I. (2013a) Comparison of the bacterial communities in anaerobic, anoxic, and oxic chambers of a pilot A(2)O process using pyrosequencing analysis. Curr. Microbiol. 66: 555–65. Kim, J.M., Lee, H.J., Lee, D.S., and Jeon, C.O. (2013b) Characterization of the denitrification-associated phosphorus uptake properties of “Candidatus Accumulibacter phosphatis” clades in sludge subjected to enhanced biological phosphorus removal. Appl. Environ. Microbiol. 79: 1969–79. 
Kim, T.-S., Jeong, J.-Y., Wells, G.F., and Park, H.-D. (2013c) General and rare bacterial taxa demonstrating different temporal dynamic patterns in an activated sludge bioreactor. Appl. Microbiol. Biotechnol. 97: 1755–65. Kindaichi, T., Nierychlo, M., Kragelund, C., Nielsen, J.L., and Nielsen, P.H. (2013) High and stable substrate specificities of microorganisms in enhanced biological phosphorus removal plants. Environ. Microbiol. 15: 1821–31. Kong, Y., Nielsen, J.L., and Nielsen, P.H. (2005) Identity and Ecophysiology of Uncultured Actinobacterial Polyphosphate-Accumulating Organisms in Full-Scale Enhanced Biological Phosphorus Removal Plants. Appl. Environ. Microbiol. 71: 4076 – 4085. Kong, Y., Nielsen, J.L., and Nielsen, P.H. (2004) Microautoradiographic Study of Rhodocyclus-Related Polyphosphate-Accumulating Bacteria in Full-Scale Enhanced Biological Phosphorus Removal Plants. Appl. Environ. Microbiol. 70: 5383–5390. Kong, Y., Xia, Y., Nielsen, J.L., and Nielsen, P.H. (2007) Structure and function of the microbial community in a full-scale enhanced biological phosphorus removal plant. Microbiology 153: 4061–4073.  85 Kong, Y., Xia, Y., and Nielsen, P.H. (2008) Activity and identity of fermenting microorganisms in full-scale biological nutrient removing wastewater treatment plants. Environ. Microbiol. 10: 2008–19. Kragelund, C., Levantesi, C., Borger, A., Thelen, K., Eikelboom, D., Tandoi, V., et al. (2007) Identity, abundance and ecophysiology of filamentous Chloroflexi species present in activated sludge treatment plants. FEMS Microbiol. Ecol. 59: 671–82. Kragelund, C., Remesova, Z., Nielsen, J.L., Thomsen, T.R., Eales, K., Seviour, R., et al. (2007) Ecophysiology of mycolic acid-containing Actinobacteria (Mycolata) in activated sludge foams. FEMS Microbiol. Ecol. 61: 174–84. Kristiansen, R., Nguyen, H.T.T., Saunders, A.M., Nielsen, J.L., Wimmer, R., Le, V.Q., et al. 
(2012) A metabolic model for members of the genus Tetrasphaera involved in enhanced biological phosphorus removal. ISME J. 1–12. Kunin, V., Engelbrektson, A., Ochman, H., and Hugenholtz, P. (2010) Wrinkles in the rare biosphere : pyrosequencing errors. Environ. Microbiol. 12: 118–123. Kunin, V., He, S., Warnecke, F., Peterson, S.B., Martin, H.G., Haynes, M., et al. (2008) A bacterial metapopulation adapts locally to phage predation despite global dispersal A bacterial metapopulation adapts locally to phage predation despite global dispersal. Genome Res. 18: 293–297. Labrie, S.J., Samson, J.E., and Moineau, S. (2010) Bacteriophage resistance mechanisms. Nat. Rev. Microbiol. 8: 317–27. Legendre, P., Oksanen, J., and ter Braak, C.J.F. (2011) Testing the significance of canonical axes in redundancy analysis. Methods Ecol. Evol. 2: 269–277. Legendre P, Legendre L. (1998) Numerical Ecology. Amsterdam, the Netherlands: Elsevier Science, BV. Lennon, J.T. and Jones, S.E. (2011) Microbial seed banks: the ecological and evolutionary implications of dormancy. Nat. Rev. Microbiol. 9: 119–30. Lepp, P. and Schmidt, T. (1998) Nucleic acid content of synechococcus spp. during growth in continuous light and light/dark cycles. Arch. Microbiol. 170: 201–7. Louie, T.M., Mah, T.J., Oldham, W., and Ramey, W.D. (2000) Use of metabolic inhibitors and gas chromatography / mass spectrometry to study poly-b-hydroxyalkanoates metabolism involving cryptic nutrients in enhanced biological phosphorus removal systems. Wat. Res. 34: 1507–1514. Lücker, S., Wagner, M., Maixner, F., Pelletier, E., Koch, H., Vacherie, B., et al. (2010) A Nitrospira metagenome illuminates the physiology and evolution of globally important nitrite-oxidizing bacteria. Proc. Natl. Acad. Sci. U. S. A. 107: 13479–84. Madonii, P., Davoli, D., and Chierici, E. (1993) Comparative Analysis of the Activated Sludge Microfauna in Several Sewage Treatment Works. Wat. Res. 27: 1485–1491. 
Margulies, M., Egholm, M., Altman, W.E., Attiya, S., Bader, J.S., Bemben, L. a, et al. (2005) Genome sequencing in microfabricated high-density picolitre reactors. Nature 437: 376–80.  86 Martin-Fernandez, J.A., Barcelo-Vidal, C., and Pawlowsky-Glahn, V. (2003) Dealing With Zeros and Missing Values in Compositional Data Sets Using Nonparametric. Math. Geol. 35: 253 – 278. Martinez-Garcia, M., Brazel, D.M., Swan, B.K., Arnosti, C., Chain, P.S.G., Reitenga, K.G., et al. (2012) Capturing single cell genomes of active polysaccharide degraders: an unexpected contribution of Verrucomicrobia. PLoS One 7: e35314. Mason, M.G., Shepherd, M., Nicholls, P., Dobbin, P.S., Dodsworth, K.S., Poole, R.K., and Cooper, C.E. (2009) Cytochrome bd confers nitric oxide resistance to Escherichia coli. Nat. Chem. Biol. 5: 94–6. Maszenan, a M., Seviour, R.J., Patel, B.K., Schumann, P., Burghardt, J., Tokiwa, Y., and Stratton, H.M. (2000) Three isolates of novel polyphosphate-accumulating gram-positive cocci, obtained from activated sludge, belong to a new genus, Tetrasphaera gen. nov., and description of two new species, Tetrasphaera japonica sp. nov. and Tetrasphaera australiensis sp. no. Int. J. Syst. Evol. Microbiol. 50: 593–603. McIlroy, S. and Seviour, R.J. (2009) Elucidating further phylogenetic diversity among the Defluviicoccus-related glycogen-accumulating organisms in activated sludge. Environ. Microbiol. Rep. 1: 563–8. McIlroy, S.J., Albertsen, M., Andresen, E.K., Saunders, A.M., Kristiansen, R., Stokholm-Bjerregaard, M., et al. (2014) “Candidatus Competibacter”-lineage genomes retrieved from metagenomes reveal functional metabolic diversity. ISME J. 8: 613–24. McIlroy, Simon, J., Kristiansen, R., Albertsen, M., Michael Karst, S., Rossetti, S., Lund Nielsen, J., et al. (2013) Metabolic model for the filamentous “Candidatus Microthrix parvicella” based on genomic and metagenomic analyses. ISME J. 1–12. McMahon, K.D. and Read, E.K. 
(2013) Microbial Contributions to Phosphorus Cycling in Eutrophic Lakes and Wastewater. Annu. Rev. Microbiol. 67: 199–219. McMahon, K.D., Shaomei, H., and Oehmen, A. (2010) The microbiology of phosphorus removal. In Microbial Ecology of Activated Sludge (eds. Seviour, R.J., and Nielsen, P.H.). IWA, London, United Kingdom.  McMahon, K.D., Yilmaz, S., He, S., Gall, D.L., Jenkins, D., and Keasling, J.D. (2007) Polyphosphate kinase genes from full-scale activated sludge plants. Appl. Microbiol. Biotechnol. 77: 167–73. Meyer, F., Paarmann, D., D’Souza, M., Olson, R., Glass, E.M., Kubal, M., et al. (2008) The metagenomics RAST server - a public resource for the automatic phylogenetic and functional analysis of metagenomes. BMC Bioinformatics 9: 386. Mielczarek, A.T., Nguyen, H.T.T., Nielsen, J.L., and Nielsen, P.H. (2013) Population dynamics of bacteria involved in enhanced biological phosphorus removal in Danish wastewater treatment plants. Water Res. 47: 1529–44. Mino, T., Tsuzuki, Y. and Matsuo, T. (1987) Effect of phosphorus accumulation on acetate metabolism in the biological phosphorus removal process. In: Biological Phosphate Removal from Wastewaters (Ramadori, R., Ed.), pp. 27-38. Pergamon Press, Oxford.  87 Mino, T., van Loosdrecht, M.C.M., and Heijnen, J.J. (1998) Microbiology and Biochemistry of the Enhanced Biological Phosphorus Removal Process. Water Res. 32: 3193–3207. Mino, T. and Satoh, H. (2006) Wastewater genomics. Nat. Biotechnol. 24: 1229–1230. Miura, Y., Watanabe, Y., and Okabe, S. (2007) Significance of Chloroflexi in performance of submerged membrane bioreactors (MBR) treating municipal wastewater. Environ. Sci. Technol. 41: 7787–94. Morales-Belpaire, I. and Gerin, P. a (2007) Factors affecting the fate of active proteins introduced in wastewater sludges: investigation with green fluorescent protein. Water Res. 41: 1723–33. Moreno, A.M., Matz, C., Kjelleberg, S., and Manefield, M. 
(2010) Identification of ciliate grazers of autotrophic bacteria in ammonia-oxidizing activated sludge by RNA stable isotope probing. Appl. Environ. Microbiol. 76: 2203–11. Morgenroth, E., Kommedal, R., and Harremoës, P. (2002) Processes and modeling of hydrolysis of particulate organic matter in aerobic wastewater treatment--a review. Water Sci. Technol. 45: 25–40. Neufeld, J.D., Chen, Y., Dumont, M.G., and Murrell, J.C. (2008) Marine methylotrophs revealed by stable-isotope probing, multiple displacement amplification and metagenomics. Environ. Microbiol. 10: 1526–35. Nguyen, H.T.T., Le, V.Q., Hansen, A.A., Nielsen, J.L., and Nielsen, P.H. (2011) High diversity and abundance of putative polyphosphate-accumulating Tetrasphaera-related bacteria in activated sludge systems. FEMS Microbiol. Ecol. 76: 256–67. Nicholls, A.H.A., Osborn, D.W., and Nicholls, H.A. (1979) Bacterial stress: prerequisite for biological removal of phosphorus. J. WPCF 51: 557–569. Nielsen, J.L., Nguyen, H., Meyer, R.L., and Nielsen, P.H. (2012) Identification of glucose-fermenting bacteria in a full-scale enhanced biological phosphorus removal plant by stable isotope probing. Microbiology 158: 1818–25. Nielsen, P.H., Kragelund, C., Seviour, R.J., and Nielsen, J.L. (2009) Identity and ecophysiology of filamentous bacteria in activated sludge. FEMS Microbiol. Rev. 33: 969–98. Nielsen, P.H., Mielczarek, A.T., Kragelund, C., Nielsen, J.L., Saunders, A.M., Kong, Y., et al. (2010) A conceptual ecosystem model of microbial communities in enhanced biological phosphorus removal plants. Water Res. 44: 5070–5088. Nielsen, P.H., Roslev, P., Dueholm, T.E., and Nielsen, J.L. (2002) Microthrix parvicella, a specialized lipid consumer in anaerobic-aerobic activated sludge plants. Wat. Sci. Tech. 46: 73–80. Nielsen, P.H., Saunders, A.M., Hansen, A.A., Larsen, P., and Nielsen, J.L. 
(2012) Microbial communities involved in enhanced biological phosphorus removal from wastewater — a model system in environmental biotechnology. Curr. Opin. Biotechnol. 23: 452–459. Nobu, M.K., Tamaki, H., Kubota, K., and Liu, W.-T. (2014) Metagenomic characterization of “Candidatus Defluviicoccus tetraformis strain TFO71,” a tetrad-forming organism, predominant in  88 an anaerobic-aerobic membrane bioreactor with deteriorated biological phosphorus removal. Environ. Microbiol. doi:10.111: 1–13. Nogueira, R. and Melo, L.F. (2006) Competition Between Nitrospira spp . and Nitrobacter spp . in Nitrite-Oxidizing Bioreactors. Biotechnol. Bioeng. 95: 169–175. Oehmen, A., Lemos, P.C., Carvalho, G., Yuan, Z., Keller, J., Blackall, L.L., and Reis, M. a M. (2007) Advances in enhanced biological phosphorus removal: from micro to macro scale. Water Res. 41: 2271–300. Oldham, W.K. (1986) Excess biological phosphorus removal in the activated sludge process using primary sludge fermentation. Can. J. Civ. Eng. Oldham, W.K. (1985) Full Scale Optimization of Biological Phosphorus Removal at Kelowna, Canada. Wat. Sci. Tech. 17: 243–257. Oldham, W.K. and Stevens, G.M. (1984) Initial operating experiences of a nutrient removal process (Modified Bardenpho) at Kelowna, British Columbia. Can. J. Civ. Eng. 11: 474–479. Olson, T.C. and Hooper, A.B. (1983) Energy coupling in the bacterial oxidation of small molecules : an extracytoplasmic dehydrogenase in Nitrosomonas. FEMS Microbiol. Lett. 19: 47–50. Orihel, D.M., Bird, D.F., Brylinsky, M., Chen, H., Donald, D.B., Huang, D.Y., et al. (2012) High microcystin concentrations occur only at low nitrogen-to-phosphorus ratios in nutrient-rich Canadian lakes. Can. J. Fish. Aquat. Sci. 69: 1457–1462. Osaka, T., Yoshie, S., Tsuneda, S., Hirata, A., Iwami, N., and Inamori, Y. (2006) Identification of acetate- or methanol-assimilating bacteria under nitrate-reducing conditions by stable-isotope probing. Microb. Ecol. 52: 253–66. 
Otawa, K., Lee, S.H., Yamazoe, A., Onuki, M., Satoh, H., and Mino, T. (2007) Abundance, diversity, and dynamics of viruses on microorganisms in activated sludge processes. Microb. Ecol. 53: 143–52. Park, H.-D., Wells, G.F., Bae, H., Criddle, C.S., and Francis, C. a (2006) Occurrence of ammonia-oxidizing archaea in wastewater treatment plant bioreactors. Appl. Environ. Microbiol. 72: 5643–7. Pedrós-Alió, C. (2006) Marine microbial diversity: can it be determined? Trends Microbiol. 14: 257–63. Pedrós-Alió, C. (2012) The rare bacterial biosphere. Ann. Rev. Mar. Sci. 4: 449–66. Pereira, H., Lemos, P.C., Reis, M.A.M., Cresp, J.P.S.G., Carrond, M.J.T., and Santos, H. (1996) Model for carbon metabolism in biological phosphorus removal processes based on in vivo 13C-NMR labelling experiments. Wat. Res. 30: 2128–2138. Pester, M., Bittner, N., Deevong, P., Wagner, M., and Loy, A. (2010) A “rare biosphere” microorganism contributes to sulfate reduction in a peatland. ISME J. 4: 1591–602. Petropoulos, P. and Gilbride, K.A. (2005) Nitrification in activated sludge batch reactors is linked to protozoan grazing of the bacterial population. Can. J. Civ. Eng. 799: 791–799.  89 Pinto, A.J. and Raskin, L. (2012) PCR biases distort bacterial and archaeal community structure in pyrosequencing datasets. PLoS One 7: e43093. Polz, M.F., Alm, E.J., and Hanage, W.P. (2013) Horizontal gene transfer and the evolution of bacterial and archaeal population structure. Trends Genet. 29: 170–5. Pradeep Ram, A.S. and Sime-Ngando, T. (2008) Functional responses of prokaryotes and viruses to grazer effects and nutrient additions in freshwater microcosms. ISME J. 2: 498–509. Purkhold, U., Pommerening-Röser, A., Juretschko, S., Schmid, M.C., Koops, H., and Wagner, M. (2000) Phylogeny of All Recognized Species of Ammonia Oxidizers Based on Comparative 16S rRNA and amoA Sequence Analysis : Implications for Molecular Diversity Surveys. Appl. Environ. Microbiol. 66: 5368–5382. 
Quast, C., Pruesse, E., Yilmaz, P., Gerken, J., Schweer, T., Yarza, P., et al. (2013) The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res. 41: D590–6. Rabinowitz, B. and Oldham, W.K. (1986) Excess biological phosphorus removal in the activated sludge process using primary sludge fermentation. Can. J. Civ. Eng. 13: 345–351. Ramette, A. (2007) Multivariate analyses in microbial ecology. FEMS Microbiol. Ecol. 62: 142–60. Rappé, M.S. and Giovannoni, S.J. (2003) The uncultured microbial majority. Annu. Rev. Microbiol. 57: 369–94. Rinke, C., Schwientek, P., Sczyrba, A., Ivanova, N.N., Anderson, I.J., Cheng, J.-F., et al. (2013) Insights into the phylogeny and coding potential of microbial dark matter. Nature 499: 431–7. Rodríguez-Blanco, A., Ghiglione, J.-F., Catala, P., Casamayor, E.O., and Lebaron, P. (2009) Spatial comparison of total vs. active bacterial populations by coupling genetic fingerprinting and clone library analyses in the NW Mediterranean Sea. FEMS Microbiol. Ecol. 67: 30–42. Rossetti, S., Tomei, M.C., Nielsen, P.H., and Tandoi, V. (2005) “Microthrix parvicella”, a filamentous bacterium causing bulking and foaming in activated sludge systems: a review of current knowledge. FEMS Microbiol. Rev. 29: 49–64. Rotthauwe, J. and Witzel, K. (1997) The Ammonia Monooxygenase Structural Gene amoA as a Functional Marker : Molecular Fine-scale Analysis of Natural Ammonia-Oxidizing Populations. Appl. Environ. Microbiol. 63: 4704–4712. Russell, J.B. and Cook, G.M. (1995) Energetics of bacterial growth: balance of anabolic and catabolic reactions. Microbiol. Rev. 59: 48–62. Saunders, A.M., Larsen, P., and Nielsen, P.H. (2013) Comparison of nutrient-removing microbial communities in activated sludge from full-scale MBRs and conventional plants. Wat. Sci. Tech. 68: 366–71. Sentchilo, V., Mayer, A.P., Guy, L., Miyazaki, R., Green Tringe, S., Barry, K., et al. 
(2013) Community-wide plasmid gene mobilization and selection. ISME J. 7: 1173–1186.  90 Seviour, R.J., Mino, T., and Onuki, M. (2003) The microbiology of biological phosphorus removal in activated sludge systems. FEMS Microbiol. Rev. 27: 99 – 127. Seviour, R.J., and Nielsen, P.H. (2010) Microbial Ecology of Activated Sludge. IWA, London, United Kingdom. Shade, A. and Handelsman, J. (2012) Beyond the Venn diagram: the hunt for a core microbiome. Environ. Microbiol. 14: 4–12. Shade, A., Hogan, C.S., Klimowicz, A.K., Linske, M., McManus, P.S., and Handelsman, J. (2012) Culturing captures members of the soil rare biosphere. Environ. Microbiol. 14: 2247–52. Shapiro, B.J. and Polz, M.F. (2014) Ordering microbial diversity into ecologically and genetically cohesive units. Trends Microbiol. 22: 235–247. Sharon, I. and Banfield, J.F. (2013) Microbiology. Genomes from metagenomics. Science (80-. ). 342: 1057–8. Sharp, J.H., Yoshiyama, K., Parker, A.E., Schwartz, M.C., Curless, S.E., Beauregard, A.Y., et al. (2009) A Biogeochemical View of Estuarine Eutrophication: Seasonal and Spatial Trends and Correlations in the Delaware Estuary. Estuaries and Coasts 32: 1023–1043. Sherr, E.B. and Sherr, B.F. (2002) Significance of predation by protists in aquatic microbial food webs. Antonie Van Leeuwenhoek 81: 293–308. Silva, A.F., Carvalho, G., Oehmen, A., Lousada-Ferreira, M., van Nieuwenhuijzen, A., Reis, M. a M., and Crespo, M.T.B. (2012) Microbial population analysis of nutrient removal-related organisms in membrane bioreactors. Appl. Microbiol. Biotechnol. 93: 2171–80. Skennerton, C.T., Imelfort, M., and Tyson, G.W. (2013) Crass: identification and reconstruction of CRISPR from unassembled metagenomic data. Nucleic Acids Res. 41: e105. Soddell, J. a, Stainsby, F.M., Eales, K.L., Seviour, R.J., and Goodfellow, M. (2006) Gordonia defluvii sp. nov., an actinomycete isolated from activated sludge foam. Int. J. Syst. Evol. Microbiol. 56: 2265–9. 
Soddell, J.A., Seviour, R.J., Blackall, L.L., and Hugenholtz, P. (1998) New Foam-Forming Nocardioforms Found in Actiaved Sludge. Wat. Sci. Tech. 37: 495–502. Sogin, M.L., Morrison, H.G., Huber, J. a, Mark Welch, D., Huse, S.M., Neal, P.R., et al. (2006) Microbial diversity in the deep sea and the underexplored “rare biosphere”. Proc. Natl. Acad. Sci. U. S. A. 103: 12115–20. Smolders, G., van der Meij, J., van Loosdrecht, M., and Heijnen, J. (1994a) Model of the Anaerobic Metabolism of the Biological Phosphorus Removal Process: Stoichiometry and pH Influence. Biotechnol. Bioeng. 43: 461-470. Smolders, G., van der Meij, J., van Loosdrecht, M., & Heijnen, J. (1994b) Stoichiometric Model of the Aerobic Metabolism of the Biological Phosphorus Removal Process. Biotechnol. Bioeng. 44: 837-848.  91 Smolders, G., van der Meij, J., van Loosdrecht, M., & Heijnen, J. (1995) A Structured Metabolic Model for Anaerobic and Aerobic Stoichiometry and Kinetics of the Biological Phosphorus Removal Process . Biotechnol. Bioeng. 47: 277-287. Spieck, E., Ehrich, S., Aamand, J., and Bock, E. (1998) Isolation and immunocytochemical location of the nitrite-oxidizing system in nitrospira moscoviensis. Arch. Microbiol. 169: 225–30. Spieck, E., Hartwig, C., McCormack, I., Maixner, F., Wagner, M., Lipski, A., and Daims, H. (2006) Selective enrichment and molecular characterization of a previously uncultured Nitrospira-like bacterium from activated sludge. Environ. Microbiol. 8: 405–15. Stahl, D. a and de la Torre, J.R. (2012) Physiology and diversity of ammonia-oxidizing archaea. Annu. Rev. Microbiol. 66: 83–101. Sutherland, I. (1995) Polysaccharide lyases. FEMS Microbiol. Rev. 16: 323–347. Szczepanowski, R., Linke, B., Krahn, I., Gartemann, K.-H., Gützkow, T., Eichler, W., et al. (2009) Detection of 140 clinically relevant antibiotic-resistance genes in the plasmid metagenome of wastewater treatment plant bacteria showing reduced susceptibility to selected antibiotics. 
Microbiology 155: 2306–19.
Taroncher-Oldenburg, G., Griner, E.M., Francis, C.A., and Ward, B.B. (2003) Oligonucleotide Microarray for the Study of Functional Gene Diversity in the Nitrogen Cycle in the Environment. Appl. Environ. Microbiol. 69: 1159–1171.
Thomsen, T.R., Kong, Y., and Nielsen, P.H. (2007) Ecophysiology of abundant denitrifying bacteria in activated sludge. FEMS Microbiol. Ecol. 60: 370–82.
Tsukioka, Y., Yamashita, Y., Oho, T., Nakano, Y., and Koga, T. (1997) Biological function of the dTDP-rhamnose synthesis pathway in Streptococcus mutans. J. Bacteriol. 179: 1126–1134.
Tyson, G.W., Chapman, J., Hugenholtz, P., Allen, E.E., Ram, R.J., Richardson, P.M., et al. (2004) Community structure and metabolism through reconstruction of microbial genomes from the environment. Nature 428: 37–43.
Velicer, G.J. and Vos, M. (2009) Sociobiology of the myxobacteria. Annu. Rev. Microbiol. 63: 599–623.
Vollertsen, J., Petersen, G., and Borregaard, V.R. (2006) Hydrolysis and fermentation of activated sludge to enhance biological phosphorus removal. Water Sci. Technol. 53: 55.
Wan, C.-Y., De Wever, H., Diels, L., Thoeye, C., Liang, J.-B., and Huang, L.-N. (2011) Biodiversity and population dynamics of microorganisms in a full-scale membrane bioreactor for municipal wastewater treatment. Water Res. 45: 1129–38.
Warren, A., Salvado, H., Curds, C.R., and Roberts, D.M. (2010) Protozoa in activated sludge processes. In Microbial Ecology of Activated Sludge (eds. Seviour, R.J., and Nielsen, P.H.). IWA, London, United Kingdom.
Wentzel, M.C., Lotter, L.H., Loewenthal, R.E., and Marais, G. (2000) Metabolic behaviour of Acinetobacter spp. in enhanced biological phosphorus removal - a biochemical model. Water SA 12: 209.
Wexler, M., Richardson, D.J., and Bond, P.L. (2009) Radiolabelled proteomics to determine differential functioning of Accumulibacter during the anaerobic and aerobic phases of a bioreactor operating for. Environ. Microbiol. 11: 3029–3044.
Wilhelm, L., Besemer, K., Fasching, C., Urich, T., Singer, G.A., Quince, C., and Battin, T.J. (2014) Rare but active taxa contribute to community dynamics of benthic biofilms in glacier-fed streams. Environ. Microbiol. 1–11.
Wilmes, P., Andersson, A.F., Lefsrud, M.G., Wexler, M., Shah, M., Zhang, B., et al. (2008) Community proteogenomics highlights microbial strain-variant protein expression within activated sludge performing enhanced biological phosphorus removal. ISME J. 2: 853–64.
Wingender, J., Neu, T.R., and Flemming, H.-C. (1999) Microbial extracellular polymeric substances: characterization, structure, and function. 1st edition. Springer, Berlin, Heidelberg.
Wong, M.-T., Mino, T., Seviour, R.J., Onuki, M., and Liu, W.-T. (2005) In situ identification and characterization of the microbial community structure of full-scale enhanced biological phosphorous removal plants in Japan. Water Res. 39: 2901–14.
Wong, M.-T., Tan, F.M., Ng, W.J., and Liu, W.-T. (2004) Identification and occurrence of tetrad-forming Alphaproteobacteria in anaerobic-aerobic activated sludge processes. Microbiology 150: 3741–8.
Wright, J.J., Konwar, K.M., and Hallam, S.J. (2012) Microbial ecology of expanding oxygen minimum zones. Nat. Rev. Microbiol. 10: 381–94.
Xia, Y., Kong, Y., and Nielsen, P.H. (2007) In situ detection of protein-hydrolysing microorganisms in activated sludge. FEMS Microbiol. Ecol. 60: 156–65.
Xia, Y., Kong, Y., and Nielsen, P.H. (2008) In situ detection of starch-hydrolyzing microorganisms in activated sludge. FEMS Microbiol. Ecol. 66: 462–71.
Xia, Y., Kong, Y., and Thomsen, T.R. (2008) Identification and Ecophysiological Characterization of Epiphytic Protein-Hydrolyzing Saprospiraceae (“Candidatus Epiflobacter” spp.) in Activated Sludge. Appl. Environ. Microbiol. 74: 2229–2238.
Yoon, D.-N., Park, S.-J., Kim, S.-J., Jeon, C.O., Chae, J.-C., and Rhee, S.-K. (2010) Isolation, characterization, and abundance of filamentous members of Caldilineae in activated sludge. J. Microbiol. 48: 275–83.
Zhang, J., Liu, Z., Wang, S., and Jiang, P. (2002) Characterization of a bioflocculant produced by the marine myxobacterium Nannocystis sp. NU-2. Appl. Microbiol. Biotechnol. 59: 517–22.
Zhang, T., Shao, M.-F., and Ye, L. (2012) 454 Pyrosequencing Reveals Bacterial Diversity of Activated Sludge From 14 Sewage Treatment Plants. ISME J. 6: 1137–47.
Zhang, T., Ye, L., Tong, A.H.Y., Shao, M.-F., and Lok, S. (2011) Ammonia-oxidizing archaea and ammonia-oxidizing bacteria in six full-scale wastewater treatment bioreactors. Appl. Microbiol. Biotechnol. 91: 1215–25.
Zhang, T., Zhang, X.-X., and Ye, L. (2011) Plasmid metagenome reveals high levels of antibiotic resistance genes and mobile genetic elements in activated sludge. PLoS One 6: e26041.
Zhou, Y., Liang, Y., Lynch, K.H., Dennis, J.J., and Wishart, D.S. (2011) PHAST: a fast phage search tool. Nucleic Acids Res. 39: W347–52.
Zilles, J.L., Peccia, J., Kim, M., Hung, C., and Noguera, D.R. (2002) Involvement of Rhodocyclus-Related Organisms in Phosphorus Removal in Full-Scale Wastewater Treatment Plants. Appl. Environ. Microbiol. 68: 2763–2769.
Appendix A – Chapter 2 supplementary material

Table A1 Sampling and sequencing statistics

                        DNA                             RNA
Sample                  Total reads  Filtered reads(a)  Total reads  Filtered reads(a)
Day 1 (January 31, 2013)
Anaerobic, replicate 1  11970        11731              11570        11435
Anaerobic, replicate 2  12171        11962              12001        11805
Anaerobic, replicate 3  10262        10052              9771         9652
Anoxic, replicate 1     10807        10626              11035        10893
Anoxic, replicate 2     6080         5965               14432        14272
Anoxic, replicate 3     10813        10603              10677        10528
Aerobic, replicate 1    9363         9198               11551        11422
Aerobic, replicate 2    -            -                  11661        11517
Aerobic, replicate 3    8027         7887               14786        14591
Day 19 (February 18, 2013)
Anaerobic, replicate 1  12874        8271               12262        12123
Anaerobic, replicate 2  1588         10974              12599        12408
Anaerobic, replicate 3  4427         10931              9295         9136
Anoxic, replicate 1     4766         9453               10280        10132
Anoxic, replicate 2     2993         8343               7810         7698
Anoxic, replicate 3     5719         9232               9682         9546
Aerobic, replicate 1    1829         10044              8623         8500
Aerobic, replicate 2    543          11521              8715         8596
Aerobic, replicate 3    1527         8973               11697        11519
Day 34 (March 05, 2013)
Anaerobic, replicate 1  8680         8305               13623        13412
Anaerobic, replicate 2  11509        10174              8763         8635
Anaerobic, replicate 3  11530        10057              13138        12861
Anoxic, replicate 1     9704         8345               8323         8190
Anoxic, replicate 2     8780         9974               9319         9158
Anoxic, replicate 3     9696         10380              9493         9253
Aerobic, replicate 1    10410        11071              10022        9805
Aerobic, replicate 2    11657        10564              12007        11775
Aerobic, replicate 3    9423         9341               9815         9649
Day 48 (March 19, 2013)
Anaerobic, replicate 1  8755         12513              12288        12066
Anaerobic, replicate 2  10784        1530               11252        11088
Anaerobic, replicate 3  10572        4264               10231        10065
Anoxic, replicate 1     8772         4595               11303        11094
Anoxic, replicate 2     10265        2698               11560        11368
Anoxic, replicate 3     10602        5566               3201         3153
Aerobic, replicate 1    11260        1776               11742        11554
Aerobic, replicate 2    10673        520                12423        12240
Aerobic, replicate 3    9469         1481               11387        11270
Day 65 (April 05, 2013)
Anaerobic, replicate 1  13358        12874              11925        11637
Anaerobic, replicate 2  10609        9916               12921        12639
Anaerobic, replicate 3  4603         4381               9028         8787
Anoxic, replicate 1     3457         3352               10977        10748
Anoxic, replicate 2     5056         4859               8936         8753
Anoxic, replicate 3     5404         5210               8762         8567
Aerobic, replicate 1    9545         9226               15612        15333
Aerobic, replicate 2    7720         7441               10641        10470
Aerobic, replicate 3    7173         6893               11773        11538
Day 79 (April 19, 2013)
Anaerobic, replicate 1  8573         8459               11596        11289
Anaerobic, replicate 2  8836         8732               12279        11969
Anaerobic, replicate 3  9848         9738               11125        10839
Anoxic, replicate 1     9037         8924               11409        11100
Anoxic, replicate 2     3372         3328               10646        10288
Anoxic, replicate 3     9133         9027               12142        11730
Aerobic, replicate 1    10582        10454              12574        12272
Aerobic, replicate 2    8487         8362               -            -
Aerobic, replicate 3    10420        10287              24867        24282
Day 98 (May 08, 2013)
Anaerobic, replicate 1  6691         6585               12135        11843
Anaerobic, replicate 2  11117        10923              11405        11041
Anaerobic, replicate 3  8929         8752               11717        11446
Anoxic, replicate 1     9855         9671               11636        11264
Anoxic, replicate 2     15029        14731              12271        11849
Anoxic, replicate 3     9297         9121               11683        11446
Aerobic, replicate 1    12622        12329              19446        18885
Aerobic, replicate 2    22141        21733              -            -
Aerobic, replicate 3    10512        10332              10763        10443
Day 112 (May 22, 2013)
Anaerobic, replicate 1  9131         8938               13108        12814
Anaerobic, replicate 2  10019        9798               11319        11078
Anaerobic, replicate 3  10094        9907               10925        10666
Anoxic, replicate 1     7506         7363               9813         9573
Anoxic, replicate 2     10870        10663              11569        11279
Anoxic, replicate 3     6043         5951               8990         8817
Aerobic, replicate 1    10405        10201              11992        11637
Aerobic, replicate 2    10313        10119              8925         8689
Aerobic, replicate 3    9434         9266               11008        10846
(a) Singletons removed.
Table A2 OTU richness and diversity estimates

           rRNA pool                                  rDNA pool
Sample     OTUs(a)  Shannon(b)  Simpson(b)  Chao1(c)  OTUs  Shannon  Simpson  Chao1
Anaerobic
  Day 1    2929     4.63        13.29       3286.66   3557  5.17     19.53    4029.03
  Day 19   3667     5.34        47.37       4177.63   5297  6.07     38.76    6175.46
  Day 34   3937     5.56        49.37       4509.88   2739  5.73     38.44    3269.21
  Day 48   4040     5.47        42.79       4480.33   5196  6.11     44.17    6198.78
  Day 65   4269     5.77        53.11       5041.57   4789  6.69     94.95    5737.37
  Day 79   4596     5.84        54.09       5220.22   2952  5.64     59.31    3301.23
  Day 98   4430     5.92        64.92       5067.12   3422  5.87     62.55    3783.41
  Day 112  4114     5.44        27.98       4545.44   3583  5.85     58.92    4041.99
Anoxic
  Day 1    2748     3.95        5.69        3084.18   2862  4.71     10.69    3161.52
  Day 19   3092     5.30        46.60       3320.17   4733  6.23     63.88    5329.31
  Day 34   3249     5.44        42.80       3709.45   2358  6.12     74.65    2873.56
  Day 48   3386     5.39        44.73       3814.49   4451  5.82     36.83    5011.12
  Day 65   3616     5.63        42.10       4104.28   2448  6.25     69.61    2860.08
  Day 79   4679     5.83        51.83       5303.43   2642  5.67     64.22    2961.01
  Day 98   4627     5.92        63.56       5270.51   3808  5.80     61.13    4303.06
  Day 112  3802     5.42        30.71       4426.81   3144  5.85     63.53    3588.08
Aerobic
  Day 1    3036     4.37        10.47       3507.68   2167  5.02     18.63    2693.75
  Day 19   3123     5.22        44.88       3625.90   4596  6.10     61.01    5441.68
  Day 34   3728     5.39        37.64       4182.73   1161  5.81     60.00    1389.04
  Day 48   3757     5.51        50.83       4297.09   3372  5.49     36.63    3784.72
  Day 65   4015     5.59        44.87       4446.16   2857  6.21     65.46    3392.51
  Day 79   4552     5.74        58.31       6239.53   3147  5.69     61.60    3464.46
  Day 98   4526     6.04        68.24       6183.16   4473  5.84     63.40    5062.06
  Day 112  4017     5.56        37.41       4487.16   3595  5.28     53.46    4130.54
(a) Number of OTUs in sample.
(b) Shannon and Simpson diversity indices consider both OTU (97%) richness and evenness. Inverse Simpson values are reported.
(c) Chao1 index for OTU richness.
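The richness and diversity estimates in Table A2 are all functions of a per-sample vector of OTU read counts. As a minimal sketch, assuming the standard definitions (natural-log Shannon index, inverse Simpson index, and the bias-corrected Chao1 estimator), they could be computed as follows. The function names and the example counts are hypothetical, and the software actually used to produce Table A2 is not stated in this appendix.

```python
import math
from collections import Counter


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln(p_i)) over OTUs with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def inverse_simpson(counts):
    """Inverse Simpson index 1 / sum(p_i^2)."""
    total = sum(counts)
    return 1.0 / sum((c / total) ** 2 for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1 richness: S_obs + F1*(F1-1) / (2*(F2+1)).

    F1 and F2 are the numbers of OTUs observed exactly once and twice.
    """
    observed = sum(1 for c in counts if c > 0)
    freq = Counter(counts)
    f1, f2 = freq[1], freq[2]
    return observed + f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical reads per OTU for one sample (not data from the thesis).
example_counts = [10, 10, 5, 2, 2, 1, 1, 1]
print(shannon(example_counts), inverse_simpson(example_counts), chao1(example_counts))
```

For the hypothetical counts above, chao1 returns 9.0: eight observed OTUs plus 3 × 2 / (2 × 3) = 1 for the three singletons and two doubletons. A perfectly even community of four OTUs gives an inverse Simpson value of 4 and a Shannon index of ln 4, which is why both indices are said to reflect richness and evenness together.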
Table A3 Abundance and activity of select EBPR taxa

Taxonomy by functional group        %rDNA(a)  Range       %rRNA(a)  Range       SSU rRNA:rDNA  Range           Higher classification
                                                                                ratio(a)
Hydrolyzers
  Candidatus Microthrix parvicella  11.80     8.15–15.22  3.49      0.94–4.57   0.43           0.14–0.87       Actinobacteria
  Gordonia                          8.40      0.74–29.22  2.90      0.99–5.89   0.84           0.26–1.36       Corynebacteriales
  Chitinophagaceae                  7.02      2.31–14.77  1.28      0.70–1.67   0.41           0.10–0.78       Sphingobacteriales
  Saprospiraceae                    5.80      2.12–11.24  1.10      0.42–2.97   0.26           0.10–0.51       Sphingobacteriales
  Flexibacter                       4.82      1.20–9.74   0.65      0.27–1.21   0.35           0.07–1.18       Cytophagales
  NS9 marine group                  3.52      0.37–6.52   0.74      0.08–1.30   0.29           0.19–0.51       Flavobacteriales
  env.OPS 17                        1.48      0.46–3.15   0.31      0.07–0.94   0.51           0.10–2.41       Sphingobacteriales
  Mycobacterium                     1.31      0.33–3.17   0.24      0.16–0.32   0.37           0.08–0.70       Corynebacteriales
  Persicobacter                     1.13      0.01–3.57   1.24      0.01–3.01   2.08           0.30–5.59       Cytophagales
  PHOS-HE51                         1.06      0.08–4.72   0.16      0.02–0.39   0.35           0.10–0.82       Sphingobacteriales
  Flavobacterium                    0.84      0.52–1.22   0.26      0.12–0.61   0.50           0.20–1.75       Flavobacteriales
  Chryseobacterium                  0.77      0.01–4.81   0.08      0.02–0.17   1.11           0.05–3.19       Flavobacteriales
  Thermomicrobia                    0.62      0.25–0.95   0.11      0.06–0.18   0.33           0.06–0.68       Chloroflexi
  Caldilinea                        0.59      0.28–1.15   0.23      0.05–0.43   0.54           0.18–1.11       Chloroflexi-Caldilineae
  Anaerolineae                      0.33      0.11–0.67   0.12      0.02–0.23   0.54           0.22–1.39       Chloroflexi
  Candidate division TM7            0.19      0.05–0.78   0.01      0.00–0.01   0.11           0.01–0.43       TM7
Fermenters
  Lachnospiraceae                   0.52      0.33–0.82   0.86      0.22–1.43   2.35           0.77–3.67       Clostridiales
  Christensenellaceae               0.46      0.16–0.94   0.30      0.13–0.45   1.14           0.46–3.31       Clostridiales
  Ruminococcaceae                   0.46      0.36–0.56   0.38      0.17–0.56   1.22           0.56–3.10       Clostridiales
  Streptococcus                     0.38      0.23–0.55   5.42      1.7–12.03   20.90          4.81–46.43      Lactobacillales
  Paludibacter                      0.29      0.14–0.44   0.12      0.06–0.18   0.58           0.36–1.51       Bacteroidales
  Lactococcus                       0.16      0.06–0.22   4.65      2.97–8.47   49.28          17.51–112.03    Lactobacillales
PAO/GAO
  Propionivibrio                    1.38      0.46–2.63   1.21      0.59–1.72   1.97           0.53–8.05       Rhodocyclaceae
  Tetrasphaera                      0.11      0.08–0.14   0.02      0.00–0.04   0.25           0.03–0.50       Actinobacteria
  Candidatus Accumulibacter sp.     0.06      0.02–0.14   0.83      0.36–1.38   31.30          6.25–99.31      Rhodocyclaceae
  Defluviicoccus                    0.06      0.03–0.10   0.07      0.02–0.11   1.64           0.69–3.75       Alphaproteobacteria
AOB/NOB
  Nitrospira                        0.90      0.57–1.38   0.51      0.25–0.84   0.74           0.35–0.98       Nitrospirales
  Candidatus Nitrotoga              0.04      ND–0.09     0.36      0.17–0.63   58.21          1.98–226.67     Nitrosomonadales
  Nitrosomonas                      0.01      ND–0.02     0.05      0.03–0.09   8.86           1.94–33.00      Nitrosomonadales
Denitrifiers
  Thauera                           2.60      0.28–4.89   1.56      0.28–3.29   0.91           0.20–1.92       Rhodocyclaceae
  Acidovorax                        2.27      0.61–3.63   0.62      0.26–1.05   0.56           0.22–2.12       Comamonadaceae
  Dechloromonas                     1.44      0.97–2.09   7.39      4.38–12.6   8.81           3.55–29.61      Rhodocyclaceae
  Zoogloea                          0.40      0.10–0.63   0.19      0.13–0.23   0.97           0.34–2.41       Rhodocyclaceae
  Sphaerotilus                      0.38      0.12–0.89   0.70      0.26–1.06   3.79           0.82–11.64      Comamonadaceae
  Rhodobacter                       0.30      0.08–0.65   0.11      0.06–0.15   0.82           0.20–1.96       Rhodobacteraceae
  Ancalomicrobium                   0.24      0.10–0.34   0.13      0.07–0.18   0.81           0.41–1.09       Hyphomicrobiaceae
  Azospira                          0.18      0.11–0.28   0.08      0.06–0.13   0.68           0.27–1.40       Rhodocyclaceae
  Variovorax                        0.18      0.08–0.40   0.16      0.09–0.22   1.75           0.24–3.50       Comamonadaceae
  Hyphomicrobium                    0.17      0.05–0.37   0.10      0.04–0.19   0.99           0.34–1.65       Hyphomicrobiaceae
Grazers
  Peritrichia                       0.20      ND–0.73     10.50     0.87–19.85  412.97         18.54–1224.67   Ciliophora
  Suctoria                          0.20      ND–0.81     1.80      0.25–5.87   96.44          8.27–197.75     Ciliophora
  Rotifera                          0.09      ND–0.27     6.69      0.82–38.70  96.84          6.17–282.40     Metazoa
  Euglenida                         0.04      ND–0.11     0.87      0.05–2.32   35.52          11.73–58.71     Excavata
  Vannella                          0.004     ND–0.01     0.83      0.09–3.96   262.17         14.78–652.83    Amoebozoa
  Aspidisca                         ND        ND          1.61      0.26–4.76   1446.34        256.00–4395.00  Ciliophora
Other
  Planctomycetes                    3.78      1.87–6.56   1.78      0.87–3.02   0.66           0.34–1.09       Bacteria
  Hydrogenophilaceae                1.48      0.21–3.45   0.59      0.34–0.75   1.42           0.17–4.58       Betaproteobacteria
  LKM11 group                       1.24      0.13–6.63   0.17      0.03–0.51   0.84           0.09–4.67       Fungi
  Acidobacteria                     0.77      0.36–1.00   0.14      0.09–0.24   0.25           0.15–0.35       Bacteria
  Nannocystis                       0.64      0.05–1.51   2.10      0.14–6.0    6.43           1.75–28.78      Myxobacteria
  Prosthecobacter                   0.42      0.19–0.71   0.03      0.01–0.04   0.10           0.06–0.23       Verrucomicrobia
  Armatimonadetes                   0.20      ND–0.70     0.92      0.01–3.08   6.21           1.63–13.23      Bacteria
  Methanosarcinales                 0.13      0.02–0.34   0.37      0.13–0.86   5.17           2.87–3.20       Archaea
  Methanobacteriales                0.03      ND–0.05     0.23      0.06–0.66   13.11          1.87–6.95       Archaea
  Basidiomycota                     0.02      0.01–0.03   0.97      0.12–3.91   75.85          7.46–151.09     Fungi
  Fungi                             0.01      ND–0.02     0.24      0.06–0.46   5.77           0.14–13.84      Fungi
(a) Mean values. ND = not detectable. ND values were imputed to calculate SSU rRNA:rDNA ratios.

Table A4 Indicator OTUs – rDNA abundance

Period 1 = Days 1, 19, and 34; Period 2 = Days 48 and 65; Period 3 = Days 79, 98, and 112.

OTU(a)        Period 1  Period 2  Period 3  p value  Taxonomic classification  rDNA max  rDNA avg  rRNA max  rRNA avg
Abundant
  23654       0.16      0.11      0.73      0.007    Thauera                   1.73%     0.75%     2.50%     1.18%
  32670       0.14      0.10      0.76      0.008    Hydrogenophilaceae        1.38%     0.57%     0.50%     0.32%
  33447       0.55      0.09      0.36      0.003    Dechloromonas             1.08%     0.48%     1.22%     0.77%
Intermediate
  56156       0.14      0.09      0.77      0.006    Flexibacter               0.96%     0.35%     0.06%     0.04%
  4653        0.23      0.07      0.70      0.003    Dechloromonas             0.68%     0.29%     9.95%     5.24%
  28771       0.69      0.06      0.25      0.006    Aeromonas                 0.52%     0.17%     0.10%     0.05%
  61685       0.19      0.02      0.80      0.003    Denitratisoma             0.51%     0.14%     0.07%     0.02%
  59787       0.31      0.08      0.62      0.003    Gemmatimonadaceae         0.45%     0.20%     0.00%     0.00%
  57183       0.73      0.00      0.26      0.005    Filomicrobium             0.39%     0.16%     0.00%     0.00%
  47748       0.25      0.03      0.72      0.003    Polyangiaceae             0.39%     0.12%     0.01%     0.00%
  27989       0.78      0.06      0.16      0.003    Gordonia                  0.32%     0.08%     0.03%     0.02%
  37305       0.66      0.06      0.28      0.004    Enhydrobacter             0.26%     0.12%     0.33%     0.25%
  33208       0.35      0.08      0.57      0.007    Caldilinea                0.26%     0.14%     0.02%     0.01%
  43703       0.76      0.09      0.15      0.007    Solirubrobacterales       0.24%     0.08%     0.05%     0.03%
  28383       0.04      0.00      0.94      0.008    Oceanospirillales         0.24%     0.06%     0.14%     0.03%
  62931       0.06      0.02      0.92      0.003    Planctomycetes            0.23%     0.04%     0.01%     0.00%
  62243       0.24      0.16      0.60      0.008    MLE1-12                   0.21%     0.09%     0.45%     0.28%
  45955       0.05      0.01      0.93      0.009    Bacteroidales             0.20%     0.04%     0.02%     0.01%
  62405       0.62      0.07      0.31      0.004    Azospira                  0.18%     0.09%     0.04%     0.03%
  14606       0.25      0.04      0.71      0.005    Haliangiaceae             0.18%     0.07%     0.08%     0.05%
  43954       0.11      0.00      0.89      0.005    Phenylobacterium          0.17%     0.03%     0.01%     0.00%
  41360       0.61      0.09      0.30      0.008    Lactococcus               0.16%     0.08%     6.74%     3.54%
  9875        0.83      0.06      0.12      0.008    Alphaproteobacteria       0.14%     0.04%     0.00%     0.00%
  41070       0.10      0.07      0.83      0.009    Chitinophagaceae          0.14%     0.05%     0.10%     0.02%
  22155       0.87      0.00      0.12      0.003    Candidate division TM7    0.13%     0.03%     0.00%     0.00%
  32176       0.14      0.03      0.83      0.003    Hydrogenophilaceae        0.13%     0.06%     0.02%     0.01%
  31872       0.09      0.01      0.88      0.005    Bryobacter                0.13%     0.03%     0.09%     0.04%
  16742       0.92      0.00      0.08      0.003    Gordonia                  0.12%     0.03%     0.05%     0.02%
  52371       0.32      0.08      0.60      0.005    Hyphomonadaceae           0.11%     0.05%     0.01%     0.01%
  25575       0.28      0.08      0.64      0.003    Bacteroidetes             0.11%     0.05%     0.04%     0.02%
  34754       0.68      0.05      0.28      0.004    Rhizobacter               0.10%     0.04%     0.01%     0.01%
(a) Only OTUs with >0.1% rDNA abundance were reported.
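The SSU rRNA:rDNA ratios summarized in Table A3 are quotients of relative abundances in the rRNA and rDNA pools. The sketch below assumes the ratio is computed per sample and then averaged, which is consistent with Table A3 (for example, the mean ratio for Candidatus Microthrix parvicella, 0.43, differs from the quotient of its mean abundances, 3.49/11.80 ≈ 0.30). The imputation applied to non-detectable (ND) values is not described in this appendix, so the nd_floor pseudo-abundance, the function names, and the example values are all hypothetical.

```python
def rrna_rdna_ratio(pct_rrna, pct_rdna, nd_floor=0.001):
    """Ratio of %rRNA to %rDNA for one taxon in one sample.

    A non-detected rDNA abundance (None or 0) is replaced by nd_floor,
    a hypothetical pseudo-abundance; the value used in the thesis is not stated.
    """
    denominator = pct_rdna if pct_rdna else nd_floor
    return pct_rrna / denominator


def mean_ratio(samples, nd_floor=0.001):
    """Average the per-sample ratios over (pct_rrna, pct_rdna) pairs."""
    ratios = [rrna_rdna_ratio(r, d, nd_floor) for r, d in samples]
    return sum(ratios) / len(ratios)


# Hypothetical (%rRNA, %rDNA) pairs for one taxon; None marks a non-detect.
example = [(1.0, 2.0), (3.0, 1.0), (0.5, None)]
print(mean_ratio(example, nd_floor=0.01))
```

Because a non-detect in the rDNA pool places a very small number in the denominator, the choice of imputed value strongly affects the largest ratios, such as those reported for taxa whose rDNA abundance is ND in some samples.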
Table A5 Indicator OTUs – SSU rRNA:rDNA ratio

OTU(a)        Day 1  Days 19, 34  Day 48  Days 65, 79, 98  Day 112  p value  Taxonomic classification  rRNA max  rRNA avg  rDNA max  rDNA avg
Abundant
  14721       0.00   0.03         0.04    0.03             0.90     0.007    Verrucomicrobia           0.01%     0.01%     1.07%     0.32%
Intermediate
  21475       0.04   0.06         0.80    0.01             0.09     0.008    Gordonia                  0.07%     0.03%     0.40%     0.07%
  19222       0.03   0.07         0.69    0.05             0.16     0.009    Pirellula                 0.51%     0.18%     0.25%     0.12%
  469         0.04   0.04         0.66    0.03             0.22     0.006    Afipia                    0.02%     0.01%     0.16%     0.08%
  10095       0.06   0.09         0.68    0.07             0.11     0.007    Neisseriaceae             0.25%     0.18%     0.10%     0.04%
Rare
  33323       0.02   0.10         0.08    0.59             0.21     0.007    Conthreep                 0.11%     0.03%     0.02%     0.00%
  57134       0.00   0.02         0.91    0.00             0.07     0.007    Haplozoon                 0.57%     0.10%     0.00%     0.00%
  2749        0.00   0.02         0.89    0.00             0.09     0.001    Haplozoon                 0.31%     0.06%     0.00%     0.00%
  27860       0.00   0.02         0.89    0.00             0.08     0.003    Haplozoon                 0.12%     0.02%     0.00%     0.00%
(a) Only OTUs with >0.1% rDNA or rRNA abundance were reported.

Appendix B – Chapter 3 supplementary material

Figure B1 UDP-D-xylose biosynthesis pathways in M. parvicella spp. Reactions indicated in bronze; genome ORF indicated in purple; enzyme classification number indicated in blue. Panels: Microthrix parvicella Bio17-1; Microthrix parvicella UBC strain.

Figure B2 dTDP-L-rhamnose biosynthesis pathways in M. parvicella spp. Reactions indicated in bronze; genome ORF indicated in purple; enzyme classification number indicated in blue. Panels: M. parvicella UBC strain; M. parvicella Bio17-1.

Table B1 Marker genes identified in population genome bins

TIGRFAM accession  Bin001         Bin002         Bin003  Bin004  Function
                   M. parvicella  Gordonia spp.  N/A     N/A
Total marker       103            83             81      13      N/A
Unique marker      99             80             72      12      N/A
PGK                1              1              2       1       N/A
Ribosomal_L23      1              1              -       -       ribosomal protein L23
Ribosomal_L5       1              -              1       -       ribosomal protein L5
Ribosomal_L3       1              1              1       -       ribosomal protein L3
Ribosomal_L6       2              2              2       -       ribosomal protein L6
Ribosomal_S17      1              -              -       -       ribosomal protein S17
Ribosomal_S9       1              1              1       -       ribosomal protein S9
Ribosomal_S8       1              1              1       -       ribosomal protein S8
Ribosomal_S11      1              1              -       -       ribosomal protein S11
Ribosomal_S13      1              1              1       -       ribosomal protein S13
Ribosomal_L10      1              1              1       -       ribosomal protein L10
Ribosomal_L4       1              1              1       -       ribosomal protein L4
tRNA-synt_1d       1              2              -       -       tRNA synthetases class I
GrpE               1              2              1       1       protein GrpE
Methyltransf_5     1              1              1       2       methyltransferase
TIGR00001          1              1              1       -       ribosomal protein L35
TIGR00002          1              1              1       -       ribosomal protein S16
TIGR00009          1              1              1       -       ribosomal protein L28
TIGR00012          1              -              -       -       ribosomal protein L29
TIGR00019          -              -              -       -       peptide chain release factor 1
TIGR00029          1              -              -       -       ribosomal protein S20
TIGR00043          1              1              1       -       probable rRNA maturation factor YbeY
TIGR00059          1              -              -       -       ribosomal protein L17
TIGR00060          1              1              -       -       ribosomal protein L18
TIGR00061          1              1              1       -       ribosomal protein L21
TIGR00062          1              1              1       -       ribosomal protein L27
TIGR00064          1              -              1       -       signal recognition particle-docking protein FtsY
TIGR00082          1              1              1       -       ribosome-binding factor A
TIGR00086          1              1              2       -       SsrA-binding protein
TIGR00092          1              1              1       -       GTP-binding protein YchF
TIGR00115          1              1              2       -       trigger factor
TIGR00116          1              1              2       -       translation elongation factor Ts
TIGR00152          1              1              1       1       dephospho-CoA kinase
TIGR00158          1              1              1       -       ribosomal protein L9
TIGR00165          1              1              1       -       ribosomal protein S18
TIGR00166          1              1              1       -       ribosomal protein S6
TIGR00168          1              1              1       -       translation initiation factor IF-3
TIGR00234          1              1              2       -       tyrosine--tRNA ligase
TIGR00337          1              1              1       -       CTP synthase
TIGR00344          1              -              1       -       alanine--tRNA ligase
TIGR00362          1              1              1       -       chromosomal replication initiator protein DnaA
TIGR00389          1              1              1       -       glycine--tRNA ligase
TIGR00392          2              -              -       -       isoleucine--tRNA ligase
TIGR00396          2              1              -       -       leucine--tRNA ligase
TIGR00409          1              1              -       1       proline--tRNA ligase
TIGR00414          1              1              2       -       serine--tRNA ligase
TIGR00418          1              -              1       -       threonine--tRNA ligase
TIGR00420          1              1              1       -       tRNA (5-methylaminomethyl-2-thiouridylate)-methyltransferase
TIGR00422          -              -              -       -       valine--tRNA ligase
TIGR00435          2              1              1       -       cysteine--tRNA ligase
TIGR00436          1              1              1       -       GTP-binding protein Era
TIGR00442          1              -              1       1       histidine--tRNA ligase
TIGR00459          1              -              -       -       aspartate--tRNA ligase
TIGR00460          -              1              1       -       methionyl-tRNA formyltransferase
TIGR00468          1              -              1       -       phenylalanine--tRNA ligase, alpha subunit
TIGR00472          1              -              1       -       phenylalanine--tRNA ligase, beta subunit
TIGR00487          1              1              1       1       translation initiation factor IF-2
TIGR00496          1              -              2       1       ribosome recycling factor
TIGR00575          1              1              2       1       DNA ligase, NAD-dependent
TIGR00631          1              1              -       -       excinuclease ABC subunit B
TIGR00663          1              1              1       -       DNA polymerase III, beta subunit
TIGR00755          -              1              1       -       ribosomal RNA small subunit methyltransferase A
TIGR00810          1              1              -       1       preprotein translocase, SecG subunit
TIGR00855          1              1              1       -       ribosomal protein L7/L12
TIGR00922          1              1              1       -       transcription termination/antitermination factor NusG
TIGR00952          1              1              1       -       ribosomal protein S15
TIGR00959          1              -              1       -       signal recognition particle protein
TIGR00963          1              1              -       -       preprotein translocase, SecA subunit
TIGR00964          1              1              -       -       preprotein translocase, SecE subunit
TIGR00967          1              1              -       -       preprotein translocase, SecY subunit
TIGR00981          1              1              -       -       ribosomal protein S12
TIGR01009          1              1              -       -       ribosomal protein S3
TIGR01011          1              1              1       1       ribosomal protein S2
TIGR01017          1              1              1       -       ribosomal protein S4
TIGR01021          1              1              -       -       ribosomal protein S5
TIGR01024          1              -              1       1       ribosomal protein L19
TIGR01029          1              1              -       -       ribosomal protein S7
TIGR01030          -              -              -       -       ribosomal protein L34
TIGR01031          1              -              1       -       ribosomal protein L32
TIGR01032          1              1              1       -       ribosomal protein L20
TIGR01044          1              1              -       -       ribosomal protein L22
TIGR01049          1              1              1       -       ribosomal protein S10
TIGR01050          1              1              1       -       ribosomal protein S19
TIGR01059          -              1              -       -       DNA gyrase, B subunit
TIGR01063          1              -              -       -       DNA gyrase, A subunit
TIGR01066          1              1              1       -       ribosomal protein L13
TIGR01067          1              1              -       -       ribosomal protein L14
TIGR01071          1              1              1       -       ribosomal protein L15
TIGR01079          1              1              -       -       ribosomal protein L24
TIGR01164          1              1              -       -       ribosomal protein L16
TIGR01169          1              1              1       -       ribosomal protein L1
TIGR01171          1              1              1       -       ribosomal protein L2
TIGR01391          1              -              1       -       DNA primase
TIGR01393          1              1              -       -       elongation factor 4
TIGR01632          1              1              1       -       ribosomal protein L11
TIGR01953          1              1              1       -       transcription termination factor NusA
TIGR02012          1              1              -       -       protein RecA
TIGR02013          1              1              1       -       DNA-directed RNA polymerase, beta subunit
TIGR02027          1              -              -       -       DNA-directed RNA polymerase, alpha subunit
TIGR02191          1              1              1       -       ribonuclease III
TIGR02350          1              -              -       -       chaperone protein DnaK
TIGR02387          1              -              -       -       DNA-directed RNA polymerase, gamma subunit
TIGR02397          1              1              1       -       DNA polymerase III, subunit gamma and tau
TIGR02432          -              1              1       -       tRNA(Ile)-lysidine synthetase
TIGR02729          1              1              1       -       Obg family GTPase CgtA
TIGR03263          -              -              1       -       guanylate kinase
TIGR03594          1              -              -       1       ribosome-associated GTPase EngA

Table B2 Bin001 (Candidatus ‘Microthrix parvicella’) variable genomic regions contig_orf %ID  M. parvicella  RN1 %ID  M. parvicella  Bio17-1 %ID nr  database RefSeq annotation non-redundant (nr) annotation outer membrane biosynthesis (EPS-related)       1_89 100 #N/A 47 hypothetical protein [Caulobacter vibrioides] hypothetical protein HypdeDRAFT_1834  1_90 100 #N/A 34 putative secreted protein [Nocardioidaceae bacterium Broad-1] G-D-S-L family lipolytic protein  1_91 100 #N/A 30 #N/A glycosyl transferase family 2  1_92 100 #N/A no hit #N/A no hit 1_93 100 #N/A no hit #N/A no hit 1_94 100 #N/A no hit #N/A no hit 1_95 100 40.87 59 CoA transferase [Frankia sp. 
EAN1pec] formyl-CoA transferase  1_96 100 54.85 69 hypothetical protein [Phenylobacterium zucineum] hypothetical protein PHZ_c2268  1_97 100 28.34 50 hypothetical protein [Actinopolymorpha alba] Hydroxymethylglutaryl-CoA lyase  1_98 100 90.53 54 hypothetical protein [Candidatus Microthrix parvicella] carotenoid oxygenase  1_99 100 83.05 61 hypothetical protein [Candidatus Microthrix parvicella] acetyl-CoA acetyltransferase  1_100 100 #N/A no hit #N/A no hit 1_101 100 74.84 34 hypothetical protein [Candidatus Microthrix parvicella]  1_102 100 83.09 55 hypothetical protein [Candidatus Microthrix parvicella] enoyl-CoA hydratase  1_103 100 77.66 42 hypothetical protein [Candidatus Microthrix parvicella] short-chain dehydrogenase/reductase SDR  1_104 100 #N/A no hit #N/A no hit 1_105 100 36.84 33 #N/A  periplasmic component of the Tol biopolymer transport system  1_106 100 38.05 29 group 1 glycosyl transferase [Acidothermus cellulolyticus] glycosyltransferase  1_107 100 #N/A 31 hypothetical protein [Salinicoccus albus] serine O-acetyltransferase  1_108 100 #N/A 34 #N/A Alpha-1,4-glucan-protein synthase, UDP-forming  1_109 100 32.89 40 hypothetical protein [Salinispora arenicola] dtdp-4-dehydrorhamnose reductase  1_110 100 36.21 35 3-demethylubiquinone-9 3-methyltransferase [alpha proteobacterium LLX12A] Methyltransferase type 11  1_111 100 37.3 41 glycosyl transferase family 2 [Prevotella loescheii] putative glycosyltransferase  1_112 99.93 #N/A 23 translocation protein TolB [Achromobacter arsenitoxydans] lipid A core-O-antigen ligase-like enyme  1_113 100 #N/A 30 glycosyl transferase [Arenimonas oryziterrae] cell wall biosynthesis glycosyltransferase  1_114 100 #N/A 28 polysaccharide biosynthesis family protein [Lyngbya aestuarii] polysaccharide biosynthesis protein   108 1_115 100 #N/A 26 #N/A exopolysaccharide transport protein  1_116 100 38.66 35 glycosyltransferase, group 1 family protein [delta proteobacterium NaphS2] putative family 2 glycosyltransferase WsfG  
1_117 100 44.28 52 UDP-phosphate galactose phosphotransferase [Actinoplanes globisporus] UDP-phosphate galactose phosphotransferase  1_118 100 #N/A 33 N-formylglutamate amidohydrolase [Clostridium sp. KLE 1755] N-formylglutamate amidohydrolase superfamily  1_119 100 36.08 50 hypothetical protein [Streptomyces canus] hypothetical protein SPAM21_01025  1_120 100 #N/A 65 hypothetical protein [Salinibacterium sp. PAMC 21357] hypothetical protein SPAM21_01030  1_121 100 88.24 53 hypothetical protein [Candidatus Microthrix parvicella] AMP-dependent synthetase and ligase  1_122 100 58.97 33 hypothetical protein [Candidatus Microthrix parvicella] putative hydrolase  1_123 100 67.34 40 hypothetical protein [Candidatus Microthrix parvicella] hypothetical protein  1_124 100 66.17 no hit hypothetical protein [Candidatus Microthrix parvicella] no hit 1_125 100 31.03 53 hypothetical protein [Azoarcus toluclasticus] ThiamineS protein  1_126 100 #N/A 62 glycosyl hydrolase [Cupriavidus taiwanensis] glycosyl hydrolase, bnr repeat  restriction-modification system, partial       2_1 #N/A 100 no hit #N/A no hit 2_2 #N/A 94.93 38 hypothetical protein [Candidatus Microthrix parvicella] 5-methylcytosine restriction system component-like protein  2_3 #N/A #N/A no hit #N/A no hit 2_4 #N/A #N/A 41 hypothetical protein [Kocuria sp. UCD-OTCP] restriction endonuclease  2_5 #N/A #N/A no hit #N/A no hit 2_6 #N/A #N/A no hit hypothetical protein [Dehalobacter sp. 
FTH1] no hit antibiotic biosynthesis         2_26 #N/A #N/A no hit hypothetical protein, partial [Rhodococcus erythropolis] no hit 2_27 #N/A #N/A 45 hypothetical protein [Synechococcus elongatus] hydroxyneurosporene-O-methyltransferase  2_28 #N/A #N/A 57 Tryptophan halogenase PrnA [Streptomyces rimosus]  halogenase  2_29 #N/A #N/A 34 hypothetical protein [SAR406 cluster bacterium SCGC AB-629-J13] Kynurenine 3-monooxygenase  2_30 #N/A #N/A 37 hypothetical protein [Photorhabdus temperata] hypothetical protein plu0999  2_31 #N/A 30.25 40 putative Histidinol-phosphate aminotransferase 2 [Streptomyces aurantiacus] class I and II (histidinol-phosphate) aminotransferase  2_32 #N/A #N/A 37 #N/A protein involved in biosynthesis of mitomycin antibiotics/polyketide  2_33 #N/A #N/A no hit #N/A no hit 2_34 #N/A #N/A no hit #N/A no hit 2_35 #N/A #N/A 34 #N/A transposase of ISGme9, IS481 family  2_36 #N/A #N/A no hit hypothetical protein [candidate division OP9 bacterium SCGC AAA255-E04] no hit  109 antibiotic resistance         3_55 #N/A #N/A no hit conserved hypothetical protein [Candidatus Microthrix parvicella] no hit 3_56 #N/A #N/A 40 Plasmid stabilization system protein [Plesiocystis pacifica] Plasmid stabilization system protein  3_57 98.51 98.51 47 hypothetical protein [Candidatus Microthrix parvicella] cytochrome P450  3_58 91.19 91.19 38 putative Transcriptional regulator [Candidatus Microthrix parvicella] putative TetR family transcriptional regulator  3_59 93.65 93.65 53 catechol 2,3 dioxygenase [Arthrobacter gangotriensis] glyoxalase/bleomycin resistance protein/dioxygenase  3_60 #N/A #N/A 78 cupin [Gordonia rhizosphera] cupin domain-containing protein  3_61 #N/A #N/A no hit #N/A no hit 3_62 #N/A 39.24 56 hypothetical protein [Streptomyces sp. 
TOR3209] bifunctional deaminase-reductase domain protein  3_63 #N/A 46.67 66 hypothetical protein [Amycolatopsis nigrescens] glyoxalase/bleomycin resistance protein/dioxygenase  3_64 #N/A #N/A no hit #N/A no hit 3_65 #N/A #N/A no hit #N/A no hit             3_128 #N/A #N/A 55 transposase [Comamonadaceae] transposase IS4 family protein  3_129 92.13 82.54 35 hypothetical protein [Candidatus Microthrix parvicella] transposase IS4 family protein  3_130 #N/A 50.94 55 integrase [Mycobacterium] transposase  3_131 #N/A #N/A no hit tranposase [Corynebacterium diphtheriae] no hit 3_132 #N/A 26.7 40 hypothetical protein [Streptomyces sp. PVA 94-07] putative aspartate/aromatic aminotransferase  3_133 #N/A 28.81 37 hypothetical protein [Nocardiopsis ganjiahuensis]  Inositol phosphatase/fructose-1,6-bisphosphatase:Inositol monophosphatase 3_134 #N/A #N/A 38 hypothetical protein [Amycolatopsis sp. ATCC 39116] sulfotransferase  3_135 #N/A 25.56 42 adenylylsulfate kinase [Pyrobaculum arsenaticum] putative adenylyl-sulfate kinase (modular protein)  3_136 #N/A 29.2 50 hypothetical protein [Chitiniphilus shinanonensis] putative methyltransferase  3_137 #N/A #N/A 43 hypothetical protein [Chitiniphilus shinanonensis] Phosphoenolpyruvate synthase/pyruvate phosphate dikinase  3_138 #N/A #N/A 47 hypothetical protein [Amycolatopsis sp. ATCC 39116] putative gamma-glutamyl-gamma-aminobutyrate hydrolase  3_139 #N/A #N/A 45 hypothetical protein [Methylotenera mobilis] nucleotidyl transferase  3_140 #N/A 34.84 46 isoleucyl-tRNA synthase [Streptomyces] isoleucyl-tRNA synthetase  3_141 #N/A #N/A 24 #N/A  Proline--tRNA ligase  3_142 #N/A #N/A 46 peptidase U61 LD-carboxypeptidase A [Micromonospora sp. CNB394]  peptidase U61 LD-carboxypeptidase A  3_143 #N/A 58.1 60 leucyl-tRNA synthetase [Frankia sp. 
Iso899] leucyl-tRNA synthetase   110 3_144 #N/A #N/A no hit #N/A no hit 3_145 100 #N/A 57 hypothetical protein [Salinispora pacifica]  hypothetical protein Mspyr1_53990  3_146 99.08 29.59 61 hypothetical protein [Salinispora pacifica] site-specific recombinase XerD  3_147 99.72 29.68 63 hypothetical protein [Salinispora pacifica] site-specific recombinase XerD  3_148 41.24 48 45 integrase [Dietzia sp. UCD-THP] transposase              8_1 100 #N/A  rRNA adenine methyltransferase [Streptomyces violaceusniger] 8_2 96.53 #N/A  hypothetical protein [Gordonia rhizosphera]  8_3 97.48 #N/A  #N/A  8_4 100 #N/A  #N/A  8_5 94.55 #N/A  #N/A  8_6 93.59 #N/A  hypothetical protein [Mycobacterium smegmatis]  8_7 99.73 #N/A  AAA ATPase [Frankia sp. CN3]  8_8 90.15 #N/A  hypothetical protein [Rhodococcus sp. P14]  8_9 88.99 48.08  DNA methylase N-4/N-6 domain protein [Frankia sp. CN3] 8_10 83.41 #N/A  antirestriction protein [Synechococcus sp. PCC 7002]  8_11 #N/A #N/A  #N/A  8_12 #N/A #N/A  hypothetical protein [Nocardia sp. 348MFTsu5.1]  8_13 92.16 #N/A  hypothetical protein [Frankia sp. CN3]  8_14 #N/A #N/A  #N/A  8_15 #N/A #N/A  #N/A  8_16 90.38 #N/A  #N/A  8_17 80.19 #N/A  #N/A  8_18 97.87 34.88  XRE family transcriptional regulator [Rhodococcus sp. P14] 8_19 94.69 #N/A  hypothetical protein [Rhodococcus sp. P14]  8_20 92.64 25.17  recombinase [Rhodococcus sp. P14]  outer membrane biosynthesis (EPS-related)       9_1 100 #N/A no hit #N/A no hit 9_2 33.33 32.6 55 glutamate-1-semialdehyde aminotransferase-like protein [Thauera linaloolentis] putative bifunctional protein Glutamate-1-semialdehyde 2,1-aminomutase/3-deoxy-manno-octulosonate 9_3 83.97 #N/A 26 hypothetical protein [Afipia broomeae] aldo/keto reductase  9_4 #N/A #N/A no hit #N/A no hit  111 9_5 #N/A 30.56 57 hypothetical protein [Pseudomonas veronii] putative polysaccharide biosynthesis protein  9_6 33.33 33.46 70 transposase [Arthrobacter sp. 
161MFSha2.1] integrase catalytic subunit  9_7 40.45 #N/A 74 transposase [Curtobacterium sp. B8]  transposase IS3/IS911 family protein  9_8 49.51 #N/A 56 N-acetylneuraminate synthase [Salisaeta longa] N-acetylneuraminate synthase  9_9 24.15 #N/A 30 hypothetical protein [Prochlorothrix hollandica]  glycosyltransferase family 28 protein  9_10 90.91 34.27 50 DegT/DnrJ/EryC1/StrS aminotransferase [Pseudomonas putida] DegT/DnrJ/EryC1/StrS aminotransferase  9_11 99.08 34.68 66 N-acetyl glucosamine/N-acetyl galactosamine epimerase [Amycolatopsis benzoatilytica] polysaccharide biosynthesis protein CapD  9_12 100 43.16 32 hypothetical protein [Sphingomonas wittichii] conserved hypothetical protein  9_13 100 32.56 40 ABC transporter, ATP-binding/permease protein [Tetrasphaera elongata] ABC transporter  9_14 100 30.17 42 type 11 methyltransferase [Natrialba asiatica] type 11 methyltransferase  9_15 100 #N/A 28 hypothetical protein [Paenibacillus fonticola]  glycosyl transferase, group 1 family protein  9_16 99.63 #N/A 32 hypothetical protein [Actinoplanes globisporus] Phytanoyl-CoA dioxygenase (PhyH)              10_69 #N/A #N/A 38 #N/A  transcriptional regulator, putative ATPase, winged helix family  10_70 #N/A 26.82 30 transcriptional regulator, partial [Corynebacterium-like bacterium B27] serine/threonine protein kinase  10_71 #N/A #N/A no hit #N/A no hit 10_72 #N/A #N/A 53 hypothetical protein [Nostoc sp. 
PCC 7120] hypothetical protein all7065  10_73 #N/A #N/A 45 hypothetical protein [Singularimonas variicoloris] XRE family transcriptional regulator 10_74 #N/A 33.54 48 serine kinase [Bradyrhizobium japonicum] HipA domain-containing protein  10_75 #N/A 33.99 40 putative uncharacterized protein [Bacteroides intestinalis CAG:315]  hypothetical protein MLP_42460  10_76 42.03 #N/A 40 hypothetical protein [Kocuria rhizophila]  hypothetical protein MMAR_4827  10_77 #N/A #N/A no hit #N/A no hit 10_78 #N/A #N/A 33 hypothetical protein [Ilumatobacter coccineus] glyoxalase/bleomycin resistance protein/dioxygenase              19_59 100 #N/A no hits #N/A no hits 19_60 100 #N/A no hits #N/A no hits 19_61 100 37.82 46 hypothetical protein [Amycolatopsis decaplanina] hypothetical protein Rv3179  19_62 #N/A #N/A 40 #N/A outer membrane autotransporter barrel domain 19_63 #N/A #N/A no hit #N/A no hit 19_64 #N/A 38.41 45 #N/A excisionase family DNA binding domain-containing protein   112 19_65 #N/A #N/A 39 hypothetical protein [Thermus islandicus] hypothetical protein Caur_3718  19_66 #N/A 100  transcriptional regulator [Methylobacter marinus]  19_67 #N/A 99.47 35 filamentation induced by cAMP protein Fic [Frankia sp. 
EUN1f] filamentation induced by cAMP protein Fic
19_68 #N/A 98.84  hypothetical protein [Candidatus Microthrix parvicella]
19_69 #N/A 87.61  hypothetical protein [Candidatus Microthrix parvicella]
19_70 #N/A 88.37  hypothetical protein [Candidatus Microthrix parvicella]
19_71 #N/A #N/A  #N/A

outer membrane protein (EPS-related)
22_45 #N/A 96.98 50 hypothetical protein [Candidatus Microthrix parvicella] LPXTG-motif cell wall anchor domain protein
22_46 #N/A 100 40 hypothetical protein [Candidatus Microthrix parvicella] heme oxygenase
22_47 69.78 100 42 hypothetical protein [Candidatus Microthrix parvicella] PilT protein domain protein
22_48 84.78 98.91 54 hypothetical protein [Candidatus Microthrix parvicella]  prevent-host-death family protein
22_49 95 99.69 43 hypothetical protein [Candidatus Microthrix parvicella]  exodeoxyribonuclease V alpha chain
22_50 38.35 99.56 37 hypothetical protein [Candidatus Microthrix parvicella] exodeoxyribonuclease V subunit beta
22_51 35.03 99.5 36 hypothetical protein [Candidatus Microthrix parvicella] exodeoxyribonuclease V subunit gamma
22_52 #N/A 99.38 50 hypothetical protein [Candidatus Microthrix parvicella] putative exonuclease SbcC
22_53 32 100 40 hypothetical protein [Candidatus Microthrix parvicella]  putative exonuclease
22_54 #N/A 98.43 no hit hypothetical protein [Candidatus Microthrix parvicella] no hit
22_55 #N/A 98.33 39 hypothetical protein [Candidatus Microthrix parvicella]  putative endoribonuclease
22_56 #N/A 98.89 56 hypothetical protein [Candidatus Microthrix parvicella] beta-lactamase domain-containing protein
22_57 99.63 99.88 33 hypothetical protein [Candidatus Microthrix parvicella] outer membrane adhesin like protein
22_58 #N/A 100 42 hypothetical protein [Candidatus Microthrix parvicella]

restriction-modification system
31_35 31.82 81.17 55 hypothetical protein [Candidatus Microthrix parvicella] transposase of ISAar4, IS3 family, IS3 group, orfB
31_36 #N/A 89.58 60 hypothetical protein [Candidatus Microthrix parvicella]  transposase IS3/IS911 family protein
31_37 #N/A 65.85 no hit #N/A no hit
31_38 #N/A 51.54 48 hypothetical protein [Candidatus Microthrix parvicella] hypothetical protein SZN_33891
31_39 #N/A 27.1 35 transposase [Bordetella petrii] integrase catalytic subunit
31_40 #N/A 30.65 48 hypothetical protein [Alicyclobacillus acidoterrestris] IstB ATP binding domain-containing protein
31_41 #N/A #N/A no hit #N/A no hit
31_42 70.53 72.18 47 hypothetical protein [Candidatus Microthrix parvicella] Type II restriction enzyme, methylase subunit
31_43 #N/A #N/A  #N/A no hit

32_1 #N/A #N/A no hit #N/A no hit
32_2 #N/A 31.4 38 hypothetical protein [Frankia sp. CN3] hypothetical protein AN3325.2
32_3 #N/A #N/A 35 integrase [Xanthomonas vesicatoria]  integrase, catalytic region
32_4 #N/A 35.03 70 hypothetical protein [Amycolatopsis methanolica] transposase
32_5 88.37 90.7 80 resolvase [Candidatus Microthrix parvicella] invertase/recombinase-like protein
32_6 #N/A 97.37 no hit #N/A no hit
32_7 #N/A 100 57 hypothetical protein [Candidatus Microthrix parvicella] putative acyl-CoA dehydrogenase FADE16
32_8 #N/A 99.63 62 hypothetical protein [Candidatus Microthrix parvicella] ABC-type phosphate/phosphonate transport system, periplasmic component

restriction-modification system
34_39 #N/A 100 29 hypothetical protein [Candidatus Microthrix parvicella] EcoKI restriction-modification system protein HsdS
34_40 32.86 100 36 hypothetical protein [Candidatus Microthrix parvicella] hypothetical protein Noca_4767
34_41 68.33 100 71 hypothetical protein [Candidatus Microthrix parvicella]  phage Gp37Gp68 family protein
34_42 #N/A 96.72 68 hypothetical protein [Candidatus Microthrix parvicella]  type I restriction modification system, methyltransferase subunit
34_43 #N/A 100 41 hypothetical protein [Candidatus Microthrix parvicella] hypothetical protein Rhom172_2840
34_44 #N/A 98.9 60 hypothetical protein
[Candidatus Microthrix parvicella] ATPase

outer membrane biosynthesis (EPS-related)
39_20 89.08 28.99 no hit #N/A no hit
39_21 89.36 #N/A no hit #N/A no hit
39_22 89.55 30.88 no hit #N/A no hit
39_23 84.32 #N/A no hit #N/A no hit
39_24 90.2 91.62 61 hypothetical protein [Candidatus Microthrix parvicella] integrase catalytic subunit
39_25 99.75 100 53 hypothetical protein [Candidatus Microthrix parvicella] phage integrase
39_26 46.79 99.68 57 hypothetical protein [Candidatus Microthrix parvicella] integrase
39_27 59.16 100 65 putative integrase/recombinase y4rC [Candidatus Microthrix parvicella] tyrosine recombinase XerC
39_28 100 94.24 56 putative Integrase/transposase (fragment) [Candidatus Microthrix parvicella] Integrase catalytic region
39_29 100 91.58 no hit #N/A no hit
39_30 98.45 97.67  hypothetical protein [Candidatus Microthrix parvicella] transposition helper protein
39_31 100 #N/A no hit #N/A no hit
39_32 100 #N/A 54 hypothetical protein [Candidatus Microthrix parvicella] conserved hypothetical protein
39_33 100 34.21 39 alkyl hydroperoxide reductase [Caldithrix abyssi] alkyl hydroperoxide reductase/ Thiol specific antioxidant/ Mal allergen
39_34 100 #N/A 43 phytanoyl-CoA dioxygenase [Parvibaculum lavamentivorans] phytanoyl-CoA dioxygenase
39_35 100 #N/A 35 phytanoyl-CoA dioxygenase [Streptomyces clavuligerus] phytanoyl-CoA dioxygenase
39_36 100 #N/A 29 hypothetical protein [Saccharopolyspora spinosa] SnoK-like protein
39_37 100 #N/A no hit #N/A no hit
39_38 100 85.76 44 hypothetical protein [Candidatus Microthrix parvicella]  putative acyl-CoA N-acyltransferase
39_39 100 #N/A no hit #N/A no hit
39_40 100 51.56 42 hypothetical protein [Cesiribacter andamanensis] D-alanine export protein
39_41 100 98.19 33 hypothetical protein [Candidatus Microthrix parvicella] transposase of ISAar12, IS1380 family
39_42 100 42.8 58 O-acyltransferase [Riemerella anatipestifer] acyltransferase
39_43 99.73 45.9 46 hypothetical protein [Candidatus Microthrix parvicella] Lipopolysaccharide biosynthesis protein
39_44 99.64 #N/A 30 uncharacterized protein [Clostridium sp. CAG:411]  hypothetical protein
39_45 99.38 #N/A 37 #N/A hypothetical protein MAXJ12_18373

restriction-modification system
48_1 96.97 #N/A 56 transposase [Nakamurella multipartita] transposase
48_2 29.19 #N/A 27 #N/A hypothetical protein
48_3 #N/A #N/A no hit #N/A no hit
48_4 #N/A #N/A 24 #N/A hypothetical protein GORBP_083_00290
48_5 #N/A #N/A 44 hypothetical protein [Streptomyces sp. ScaeMP-e10] DNA restriction-modification system protein

restriction-modification system
57_17 #N/A #N/A 34 #N/A restriction endonuclease
57_18 #N/A #N/A 37 phage integrase family protein [Mycobacterium parascrofulaceum] phage integrase family protein
57_19 #N/A 29.44 43 putative transposase [Arthrobacter nicotinovorans]  integrase family protein
57_20 #N/A #N/A 35 hypothetical protein [Frankia sp. Iso899] hypothetical protein HMPREF1020_00117
57_21 #N/A #N/A no hit #N/A no hit

toxin/antibiotic resistance
69_1 #N/A 83.33 40 putative integrase/recombinase y4rA [Candidatus Microthrix parvicella] putative integrase/recombinase
69_2 #N/A 56.25 no hit hypothetical protein [Candidatus Microthrix parvicella] no hit
69_3 #N/A #N/A 66 toxic anion resistance protein [Frankia sp. EAN1pec] toxic anion resistance family protein
69_4 #N/A #N/A 41 #N/A hypothetical protein Krad_3315
69_5 #N/A #N/A 66 hypothetical protein [Streptomyces sp. MspMP-M5] glyoxalase/bleomycin resistance/dioxygenase
69_6 #N/A #N/A no hit #N/A no hit
69_7 #N/A #N/A 28 hypothetical protein [Frankia sp. CcI3]  Haloacid dehalogenase domain protein hydrolase type 3
69_8 #N/A 31.52 37 calcium-translocating P-type ATPase PMCA-type [Bacteroides sp. CAG:462] Plasma membrane calcium-transporting ATPase
69_9 #N/A #N/A 54 hypothetical protein [Actinomadura atramentaria] conserved hypothetical protein
69_10 #N/A #N/A 33 phosphoribosyltransferase [Paenibacillus sp.
ICGEB2008] hypothetical protein PPE_00945
69_11 #N/A #N/A 41 ATP/GTP-binding protein [Actinomadura atramentaria] ATP/GTP-binding protein
69_12 #N/A #N/A 55 tellurium resistance protein TerA [Deinococcus maricopensis] tellurium resistance protein TerA
69_13 #N/A #N/A 72 chemical-damaging agent resistance protein C [Pasteurella pneumotropica] tellurium resistance protein TerD
69_14 #N/A #N/A 71 chemical-damaging agent resistance protein C [Pseudomonas fragi] tellurium resistance protein TerD
69_15 #N/A #N/A 63 stress protein [Actinomadura atramentaria] stress protein

outer membrane/cell wall biosynthesis
95_1 #N/A #N/A 31 capsular polysaccharide biosynthesis protein [Fusobacterium sp. CAG:815]  capsular polysaccharide biosynthesis protein
95_2 #N/A #N/A 32 #N/A putative aminoglycoside phosphotransferase
95_3 #N/A 28.79 32 hypothetical protein [Bacillus sp. L1(2012)]  nucleoside-diphosphate-sugar transferase
95_4 #N/A 62.02 63 valyl-tRNA synthetase [Thermobifida fusca] valS gene product
95_5 #N/A #N/A no hit #N/A no hit
95_6 #N/A 33.98 26 phage integrase domain protein, partial [Bacillus nealsonii]  integrase family protein

108_1 #N/A 100  #N/A
108_2 #N/A 100  hypothetical protein [candidate division EM 19 bacterium JGI 0000001-G10]
108_3 #N/A 100  hypothetical protein [Candidatus Microthrix parvicella]
108_4 #N/A 100  hypothetical protein [Candidatus Microthrix parvicella]
108_5 #N/A #N/A  #N/A
108_6 #N/A 99.63  hypothetical protein [Candidatus Microthrix parvicella]
108_7 #N/A #N/A  #N/A
108_8 100 #N/A  hypothetical protein [Mycobacterium sp. 360MFTsu5.1]
108_9 100 #N/A  #N/A
108_10 100 #N/A  methylenetetrahydrofolate reductase [Nonomuraea coxensis]
108_11 100 #N/A  hypothetical protein, partial [Gordonia paraffinivorans]
108_12 100 #N/A  hypothetical protein [Gordonia paraffinivorans]
108_13 100 #N/A  hypothetical protein [Vibrio nigripulchritudo]
108_14 100 #N/A  hypothetical protein [Vibrio nigripulchritudo]
108_15 #N/A #N/A  #N/A

polysaccharide biosynthesis (likely cell wall-related)
130_1 #N/A 31.34 41 alcohol dehydrogenase [Fulvimarina pelagi] Nucleotidyl transferase
130_2 #N/A 33.82 31 glycosyl transferase family 2 [Chthoniobacter flavus] glycosyl transferase family protein
130_3 #N/A #N/A 37 metallophosphatase [Scytonema hofmanni] metallophosphoesterase
130_4 #N/A #N/A 43 hypothetical protein [Actinomadura atramentaria] putative N-acetylglucosamine-6-phosphate deacetylase
130_5 #N/A #N/A 34 hypothetical protein [Candidatus Poribacteria sp. WGA-4E] predicted protein
130_6 #N/A 36.68 36 hypothetical protein [Streptomyces sp. CNT372] glycosyl transferase family protein
130_7 #N/A 28.9 29 epimerase [Desulfotomaculum carboxydivorans] NAD-dependent epimerase/dehydratase
130_8 29.11 28.85 30 coenzyme PQQ biosynthesis protein E [Pyrobaculum aerophilum] family 2 glycosyl transferase
130_9 #N/A 37.89 34 hypothetical protein [Verrucomicrobium spinosum] glycosyltransferase, group 2 family protein
130_10 #N/A 26.11 31 hypothetical protein [Methanosarcina barkeri] pyrroloquinoline quinone biosynthesis protein E
130_11 #N/A #N/A 30 glycosyltransferase family protein [delta proteobacterium NaphS2] glycosyl transferase family 2
130_12 #N/A #N/A 34 #N/A glycosyltransferase

321_1 #N/A 33.61 48 hypothetical protein [Amycolatopsis balhimycina] transposase
321_2 #N/A 53.33 51 hypothetical protein [Candidatus Microthrix parvicella]  transposase
321_3 #N/A 34.65 51 hypothetical protein [Nocardioides sp.
CF8]  putative transposase
321_4 #N/A #N/A no hit #N/A no hit
321_5 #N/A 30.24 59 transposase [Corynebacterineae] putative transposase
321_6 #N/A 29.07 67 integrase family protein [Rhodococcus wratislaviensis]  integrase family protein
321_7 #N/A 30.16 44 hypothetical protein [Gordonia polyisoprenivorans]  transposase IS3/IS911

Table B3 Bin002 (Gordonia spp.) variable regions

contig_orf  %ID G. terrae  %ID G. rhizosphera  %ID G. amarae  %ID nr  RefSeq database annotation  Non-redundant (nr) database annotation

6_14 #N/A #N/A #N/A 88 hypothetical protein [Mycobacterium marinum] hypothetical protein GOALK_093_00260
6_15 #N/A #N/A #N/A 46 hypothetical protein [Gordonia alkanivorans]  ref|YP_005585358.1|
6_16 #N/A #N/A #N/A na #N/A no hit

10_1 #N/A #N/A #N/A 49 hypothetical protein [Candidatus Microthrix parvicella]  oxidoreductase domain protein
10_2 #N/A 29.69 #N/A 31 hypothetical protein [Saccharomonospora saliphila] cupin superfamily protein
10_3 #N/A #N/A #N/A  #N/A no hit
10_4 #N/A #N/A #N/A  #N/A no hit
10_5 #N/A #N/A #N/A  hypothetical protein [Candidatus Microthrix parvicella] no hit
10_6 #N/A #N/A #N/A 37 hypothetical protein [Streptomyces vitaminophilus]  cupin 4 family protein
10_7 #N/A #N/A #N/A 35 #N/A cupin 4
10_8 #N/A #N/A #N/A 36 #N/A hypothetical protein MC7420_2498
10_9 #N/A #N/A #N/A 30 #N/A  cupin 4 family protein
10_10 #N/A #N/A #N/A 28 cupin superfamily protein [Janibacter hoylei] cupin
10_11 #N/A #N/A #N/A 28 hypothetical protein [Candidatus Microthrix parvicella] hypothetical protein SGM_5272
10_12 #N/A #N/A #N/A 27 #N/A hypothetical protein P9303_19051
10_13 #N/A #N/A #N/A 55 carbamoyl transferase [Hahella chejuensis] carbamoyl transferase
10_14 44.67 35.8 #N/A 58 hypothetical protein [Micromonospora sp. CNB394] ISMsm2, transposase

toxin-antitoxin systems (toxin)
11_7 #N/A 49.64 56.52  Hypothetical protein [Methylobacterium mesophilicum] PilT protein-like
11_8 #N/A #N/A 62.86 88 hypothetical protein [Thiomonas sp. FB-6] prevent-host-death family protein
11_9 #N/A #N/A #N/A 76 hypothetical protein [Nocardioides sp. Iso805N]  hypothetical protein Gbro_3426
11_10 #N/A #N/A #N/A 45 hypothetical protein [Nocardioides sp. Iso805N] PIN domain protein

Fatty-acid metabolism
20_6 #N/A #N/A 30.79 44 hypothetical protein [Cellulomonas sp. JC225] secretory lipase
20_7 29.05 #N/A 43.48 52 putative serine/threonine protein kinase [Gordonia hirsuta] putative serine/threonine protein kinase
20_8 #N/A #N/A #N/A 35 putative lipase [Gordonia hirsuta] secreted protein
20_9 #N/A #N/A 46.88 47 hypothetical protein [Gordonia amarae] hypothetical protein GOAMR_13_00090
20_10 #N/A #N/A 37.91 38 hypothetical protein [Gordonia amarae] hypothetical protein GOAMR_13_00090

Fatty-acid metabolism
21_1 #N/A #N/A #N/A  #N/A no hit
21_2 37.04 36.77 #N/A 69 hypothetical protein [Gordonia soli] Acetyl-CoA acetyltransferase
21_3 62.37 #N/A #N/A 69 hypothetical protein [Gordonia terrae] hypothetical protein ROP_17870
21_4 32.14 #N/A #N/A 72 hypothetical protein [Gordonia terrae] hypothetical protein GOTRE_175_01880
21_5 84.77 #N/A #N/A 87 Acetyl-CoA acetyltransferase [Gordonia terrae] Acetyl-CoA acetyltransferase
21_6 55.16 57.44 55.35 76 aldehyde dehydrogenase [Rhodococcus ruber]  putative aldehyde dehydrogenase
21_7 31.62 #N/A 28.45 67 CAIB/BAIF family protein [Rhodococcus sp.
EsD8]  formyl-coenzyme A transferase
21_8 82.12 54.17 53.65 83 acyl-CoA dehydrogenase [Gordonia terrae] acyl-CoA dehydrogenase
21_9 56.98 #N/A #N/A 57 putative acetyltransferase [Gordonia amicalis] putative acetyltransferase
21_10 79.92 #N/A 49.43 82 Enoyl-CoA hydratase / carnithine racemase [Gordonia terrae] Enoyl-CoA hydratase / carnithine racemase
21_11 69.58 63.2 #N/A 74 enoyl-CoA hydratase [Rhodococcus rhodochrous] enoyl-CoA hydratase

35_1 #N/A #N/A 94.05 50 hypothetical protein [Gordonia amarae]  virion core protein (lumpy skin disease virus)-like protein
35_2 #N/A #N/A 89.23 91 hypothetical protein [Gordonia amarae] hypothetical protein GOAMR_28_00130
35_3 30.4 30.83 87.74 88 hypothetical protein [Gordonia amarae] hypothetical protein GOAMR_28_00140
35_4 #N/A #N/A 94.42 66 short-chain dehydrogenase [Frankia sp. CN3] short-chain dehydrogenase/reductase SDR
35_5 29.56 #N/A 84.3 43 hypothetical protein [Gordonia amarae] AMP-dependent synthetase and ligase
35_6 #N/A 46 67.22 68 putative LuxR family transcriptional regulator [Gordonia amarae]  putative LuxR family transcriptional regulator
35_7 28.32 27.8 #N/A 51 #N/A putative LuxR family transcriptional regulator
35_8 #N/A #N/A 65.17 66 putative LuxR family transcriptional regulator [Gordonia amarae] putative LuxR family transcriptional regulator

Fatty-acid metabolism/ toxin-antitoxin
36_1 #N/A #N/A 83.12 84 hypothetical protein [Acetobacter pasteurianus] hypothetical protein GOAMR_43_00440
36_2 #N/A 24.19 86.9 87 hypothetical protein [Nocardia sp. BMG111209] putative major facilitator superfamily transporter
36_3 #N/A #N/A 93.95 64 lipase [Nocardia farcinica] putative lipase
36_4 #N/A #N/A 91.61 92 hypothetical protein [Gordonia hirsuta] hypothetical protein GOAMR_43_00470
36_5 #N/A #N/A #N/A 47 hypothetical protein [Gordonia bronchialis] hypothetical protein Gbro_0376
36_6 28.8 44.34 #N/A 45 UDP-glucuronosyltransferase [Gordonia rhizosphera] UDP-glucuronosyl/UDP-glucosyltransferase
36_7 #N/A 52.02 #N/A 55 TetR family transcriptional regulator [Gordonia rhizosphera] TetR family transcriptional regulator
36_8 33.94 42.86 27.94 86 putative RutC family protein YjgH [Streptomyces aurantiacus] endoribonuclease L-PSP
36_9 #N/A #N/A 78.68 50 hypothetical protein [Gordonia amarae] DNA polymerase beta domain-containing protein
36_10 #N/A #N/A #N/A 64 twitching motility protein PilT [Actinomyces massiliensis] PIN domain-containing protein
36_11 #N/A 36.23 #N/A 75 prevent-host-death protein [Arthrobacter sp. PAO19] prevent-host-death family protein

37_1 #N/A #N/A #N/A 77 sulfate transporter [Gordonia rhizosphera] putative sulfate transporter
37_2 36.63 36.9 30.98 78 NADPH:quinone reductase [Gordonia rhizosphera] putative oxidoreductase
37_3 #N/A 61.36 #N/A 88 AcrR family transcriptional regulator [Gordonia rhizosphera] putative TetR family transcriptional regulator
37_4 71.78 69.53 #N/A 93 cyclic diguanylate phosphodiesterase [Gordonia rubripertincta] hypothetical protein GOAMR_07_00140
37_5 89.15 90.93 #N/A 98 putative aldehyde dehydrogenase [Gordonia soli] putative aldehyde dehydrogenase
37_6 86.76 85.4 #N/A 94 hypothetical protein [Gordonia] hypothetical protein GOAMR_07_00160
37_7 #N/A #N/A #N/A 88 hypothetical protein [Gordonia soli] hypothetical protein GOAMR_07_00170
37_8 29.37 31.22 30.17 86 alpha/beta hydrolase [Gordonia polyisoprenivorans] putative carboxylesterase
37_9 #N/A #N/A #N/A 93 glutamate synthase large subunit [Gordonia soli] glutamate synthase large subunit
37_10 #N/A #N/A #N/A 93 glutamate synthase [Gordonia] glutamate synthase large subunit

38_1 55.37 #N/A #N/A 61 hypothetical protein [Nocardia sp. 348MFTsu5.1] isoniazid inducible protein iniC
38_2 49.57 #N/A #N/A 52 hypothetical protein [Nocardia sp. 348MFTsu5.1] isoniazid inducible protein IniA
38_3 #N/A #N/A #N/A  hypothetical protein [Gordonia effusa] no hit
38_4 #N/A #N/A #N/A 43 hypothetical protein [Nocardia sp. 348MFTsu5.1] hypothetical protein GOEFS_014_00250
38_5 #N/A #N/A #N/A 45 #N/A hypothetical protein MPHLEI_10590
38_6 33.66 #N/A #N/A 35 hypothetical protein, partial [Rhodococcus sp. JVH1] conserved hypothetical proline and threonine rich protein
38_7 #N/A #N/A #N/A 39 putative LuxR family transcriptional regulator [Gordonia effusa] putative LuxR family transcriptional regulator

Bacteriophage
41_1 #N/A #N/A #N/A 29 #N/A  minor tail protein
41_2 #N/A #N/A #N/A 50 hypothetical protein [Gordonia sihwensis] gp23
41_3 #N/A #N/A #N/A 51 hypothetical protein [Gordonia sihwensis]  phage tail tape measure protein, TP901 family, core region
41_4 #N/A #N/A #N/A  #N/A
41_5 #N/A #N/A #N/A  #N/A no hit
41_6 #N/A #N/A #N/A  hypothetical protein [Nocardia farcinica]
41_7 #N/A #N/A #N/A  hypothetical protein [Gordonia sihwensis]
41_8 #N/A #N/A #N/A  hypothetical protein [Gordonia sihwensis]
41_9 #N/A #N/A #N/A  #N/A no hit
41_10 #N/A #N/A #N/A  #N/A no hit
41_11 #N/A #N/A #N/A  hypothetical protein [Gordonia sihwensis]
41_12 #N/A #N/A #N/A  #N/A no hit
41_13 #N/A #N/A #N/A  hypothetical protein [Gordonia sihwensis]

50_1 #N/A #N/A #N/A 96 ABC transporter [Gordonia namibiensis] putative UvrA-like protein
50_2 #N/A #N/A #N/A  #N/A no hit
50_3 #N/A #N/A #N/A  #N/A no hit
50_4 #N/A #N/A #N/A  #N/A no hit
50_5 #N/A #N/A #N/A  #N/A no hit
50_6 #N/A #N/A 55.34 56 hypothetical protein [Gordonia amarae] hypothetical protein GOAMR_38_00120
50_7 42.24 #N/A 72.27 73 hypothetical protein [Gordonia amarae] hypothetical protein GOAMR_38_00110

51_1 #N/A #N/A 65.27  hypothetical protein [Gordonia aichiensis]
51_2 #N/A #N/A #N/A  #N/A no hit
51_3 #N/A #N/A #N/A  #N/A no hit
51_4 #N/A #N/A #N/A 61 hypothetical protein [Gordonia malaquae] XRE family transcriptional regulator
51_5 #N/A #N/A #N/A 67 hypothetical protein [Mycobacterium tuberculosis complex] hypothetical protein MLP_05470
51_6 33.57 43.28 #N/A 69 putative helicase [Gordonia aichiensis] putative helicase
51_7 #N/A #N/A #N/A 72 putative Xre family DNA-binding protein [Gordonia aichiensis] putative Xre family DNA binding protein
51_8 25.73 78.42 94.33 95 glycerol kinase [Gordonia soli] glycerol kinase

Hydrogen metabolism
52_1 #N/A 57.97 #N/A 62 hypothetical protein [Kineosphaera limosa] NHL repeat containing protein
52_2 #N/A 45.98 #N/A 54 NiFe hydrogenase maturation protein [Nocardioides sp. CF8]
52_3 #N/A 30.26 #N/A 62 hydrogenase [Nocardia sp. 348MFTsu5.1] hydrogenase assembly chaperone HypC/HupF
52_4 #N/A 79.34 #N/A 81 hydrogenase formation protein HypD [Actinoplanes globisporus] hydrogenase maturation protein HypD
52_5 #N/A 66.38 #N/A 70 hydrogenase [Mycobacterium vaccae] hydrogenase maturation protein HypE
52_6 55.72 #N/A #N/A 55 hypothetical protein [Nocardia sp. 348MFTsu5.1] TetR family transcriptional regulator
52_7 56.88 #N/A #N/A 60 hydrolase [Gordonia sp. KTR9] putative hydrolase
52_8 70.65 66.3 82.61 83 hypothetical protein [Gordonia sp.
KTR9] hypothetical protein GOAMR_20_01660
52_9 #N/A #N/A 81.19 81 amine oxidase [Gordonia polyisoprenivorans] putative flavin-containing amine oxidase
52_10 #N/A 34 #N/A 60 transcriptional regulator [Herbaspirillum frisingense]  MarR family transcriptional regulator
52_11 #N/A #N/A #N/A  #N/A no hit

carbohydrate metabolism
53_1 #N/A #N/A 80.56 81 hypothetical protein [Nocardia farcinica] hypothetical protein GOAMR_20_01870
53_2 #N/A 44.09 87.1 88 hypothetical protein [Gordonia amarae] hypothetical protein GOAMR_20_01880
53_3 #N/A #N/A 77.05 55 hypothetical protein [Gordonia hirsuta] 16S RNA G1207 methylase RsmC
53_4 #N/A 31.93 #N/A 54 Trehalose and maltose hydrolases [Thermoanaerobacter] glycoside hydrolase
53_5 #N/A #N/A #N/A 65 PfkB domain protein [Gillisia limnaea]  PfkB family protein carbohydrate kinase
53_6 50.75 24.57 27.37 72 putative sugar transporter [Nocardia asteroides] sugar transporter
53_7 #N/A 29.7 #N/A 51 ROK-family transcriptional regulator [Rhodococcus sp. EsD8] putative NagC family transcriptional regulator
53_8 #N/A #N/A 88.54 89 putative aminotransferase [Gordonia aichiensis] putative aminotransferase

60_1 #N/A #N/A #N/A  #N/A
60_2 #N/A #N/A #N/A 40 hypothetical protein [Gordonia neofelifaecis] hypothetical protein SCNU_02285
60_3 #N/A #N/A #N/A 30 hypothetical protein [Gordonia amarae] virulence-associated E family protein
60_4 #N/A #N/A #N/A  #N/A no hit
60_5 #N/A #N/A #N/A  #N/A no hit
60_6 #N/A #N/A 27.2  hypothetical protein [Propionibacterium sp. HGH0353]  site-specific recombinase, DNA invertase Pin
60_7 #N/A #N/A 27.69 33 #N/A  conserved membrane protein of unknown function
60_8 35.34 #N/A 38.92 41 hypothetical protein [Actinoplanes globisporus] MerR family transcriptional regulator
60_9 #N/A #N/A 50.72 88 hypothetical protein [Streptomyces sp. HGB0020] putative ArsR family transcriptional regulator

polysaccharide biosynthesis
65_2 #N/A 26.93 24.88  polysaccharide biosynthesis protein [Rhodococcus ruber]
65_3 #N/A #N/A 32.93  glycosyl transferase family 1 [Mycobacterium gilvum] saxobsidens DD2]
65_4 #N/A #N/A #N/A  hypothetical protein [Mycobacterium gilvum]
65_5 #N/A #N/A 30.15  polymerase [Mycobacterium vanbaalenii]
65_6 #N/A #N/A #N/A  glycosyl transferase family 1 [Mycobacterium fortuitum]
65_7 #N/A #N/A #N/A  hypothetical protein [Smaragdicoccus niigatensis]

71_1 32.54 40.5 #N/A 98 3-phosphoglycerate dehydrogenase [Gordonia namibiensis] D-3-phosphoglycerate dehydrogenase
71_2 68.49 66.22 #N/A 85 peptidase S1 family protein [Gordonia amarae] peptidase S1 family protein
71_3 #N/A #N/A #N/A 33 hypothetical protein [Gordonia amarae] hypothetical protein GPOL_c30100
71_4 #N/A #N/A #N/A 48 hypothetical protein [Gordonia amarae] hypothetical protein GOAMR_09_00320
71_5 #N/A #N/A #N/A 71 hypothetical protein [Gordonia amarae] hypothetical protein GOAMR_09_00330
71_6 #N/A #N/A #N/A 72 hypothetical protein [Gordonia amarae] hypothetical protein GOAMR_09_00340
71_7 28.39 37.67 28.77 93 phosphoenolpyruvate synthase [Corynebacterium terpenotabidum] phosphoenolpyruvate synthase
71_8 #N/A 47.52 #N/A 86 putative phosphotransferase [Gordonia amarae] putative phosphotransferase

96_1 #N/A #N/A #N/A 40 hypothetical protein [Nonomuraea coxensis] integrase, catalytic region
96_2 #N/A #N/A #N/A 51 #N/A bacterial stress protein
96_3 48.48 48.48 28.91 60 MerR family transcriptional regulator [Rhodococcus] MerR family transcriptional regulator
96_4 60.1 #N/A #N/A 83 Cyanate permease [Gordonia terrae]  putative major facilitator superfamily transporter
96_5 61.32 59.58 #N/A 63 beta-lactamase [Rhodococcus] beta-lactamase
96_6 #N/A #N/A #N/A 45 hypothetical protein [Gordonia amarae] hypothetical protein GOAMR_12_00400
96_7 45.27 41.5 #N/A 82 MarR family transcriptional
regulator [Glaciibacter superstes] putative MarR family transcriptional regulator
96_8 #N/A #N/A #N/A 62 DoxX family protein [Micromonospora sp. ATCC 39149] DoxX family protein
96_9 48.84 48.97 #N/A 82 NmrA family protein, partial [Leifsonia aquatica] putative NAD(P)H--quinone oxidoreductase

97_1 #N/A #N/A #N/A 37 #N/A hypothetical protein FrCN3DRAFT_8038
97_2 #N/A #N/A #N/A 54 putative PnuC family transporter [Gordonia malaquae] nicotinamide mononucleotide transporter PnuC
97_3 #N/A #N/A #N/A 45 hypothetical protein [Gordonia malaquae] cytidylyltransferase
97_4 #N/A 38.35 42.93 42 putative hydrolase [Gordonia malaquae] NUDIX hydrolase
97_5 36.05 36.56 37.21 73 DGPFAETKE family protein [Nocardiopsis halotolerans] DGPFAETKE family protein
97_6 32.66 42.43 37.26 68 RNA polymerase sigma24 factor [Frankia sp. BCU110501] sigma-70 region 2 domain-containing protein
97_7 #N/A #N/A #N/A 63 beta-lactamase [Streptomyces bottropensis] hypothetical protein PAI11_08780
97_8 53.82 34.83 35.56 60 AraC family transcriptional regulator [Saccharomonospora cyanea] AraC family transcriptional regulator
97_9 36.67 35.24 #N/A 47 hypothetical protein [Gordonia hirsuta] peptidase S51 dipeptidase E

Antibiotic biosynthesis
98_1 33.33 #N/A #N/A 45 #N/A hypothetical protein GOEFS_014_00330
98_2 #N/A #N/A 27.66 40 hypothetical protein [Gordonia amarae] HipA domain-containing protein
98_3 #N/A #N/A #N/A 80 hypothetical protein [Gordonia amarae] hypothetical protein GOAMR_19_00060
98_4 #N/A #N/A #N/A 66 antibiotic biosynthesis monooxygenase [Marinobacter lipolyticus] Antibiotic biosynthesis monooxygenase
98_5 65.58 65.28 #N/A 91 3-oxoacyl-ACP synthase [Gordonia polyisoprenivorans] 3-oxoacyl-
98_6 #N/A 52.27 #N/A 55 LpqP protein [Mycobacterium marinum] LpqP protein
98_7 32.64 46.9 #N/A 90 putative carboxylesterase [Gordonia amarae] putative carboxylesterase
98_8 32.65 #N/A #N/A 92 hypothetical protein [Mycobacterium tuberculosis complex] putative TetR family transcriptional regulator
98_9 27.27 #N/A #N/A 69 pyridoxamine 5'-phosphate oxidase [Ornithinimicrobium pekingense] pyridoxamine 5'-phosphate oxidase-like FMN-binding prot

99_1 #N/A #N/A #N/A  #N/A putative transposase
99_2 #N/A #N/A #N/A 42 prevent-host-death protein [Serinicoccus marinus] toxin-antitoxin system, antitoxin component, PHD family
99_3 #N/A #N/A #N/A 70 #N/A hypothetical protein GOAMR_19_01360
99_4 69.66 71.13 #N/A 84 NAD-dependent deacetylase [Gordonia polyisoprenivorans] NAD-dependent deacetylase
99_5 67.26 68.87 #N/A 92 cytosine deaminase [Gordonia rhizosphera] putative amidohydrolase
99_6 40 45.81 #N/A 73 hypothetical protein [Pseudoclavibacter faecalis] protein CrcB homolog
99_7 50.88 53.49 #N/A 81 hypothetical protein [Pseudoclavibacter faecalis] protein CrcB homolog
99_8 54.47 44.57 50.61 93 3-hydroxyacyl-CoA dehydrogenase [Rhodococcus wratislaviensis]  putative 3-hydroxyacyl-CoA dehydrogenase
99_9 43.56 82.98 43.87 90 acyl-CoA dehydrogenase [Rhodococcus sp.
DK17] acyl-CoA dehydrogenase, short-chain specific
99_10 63.35 62.11 #N/A 90 TetR family transcriptional regulator [Mycobacterium vaccae] putative TetR family transcriptional regulator
99_11 #N/A #N/A #N/A 99 amidohydrolase [Mycobacterium] putative decarboxylase

108_1 #N/A #N/A #N/A  #N/A no hit
108_2 #N/A #N/A #N/A  #N/A no hit
108_3 #N/A #N/A #N/A  #N/A no hit
108_4 #N/A #N/A #N/A 53 hypothetical protein [Gordonia araii] hypothetical protein GOARA_013_00350
108_5 #N/A #N/A #N/A 38 N-acetylmuramoyl-L-alanine amidase [Corynebacterium terpenotabidum] lysozyme
108_6 #N/A #N/A #N/A 26 hypothetical protein [Gordonia polyisoprenivorans]  TPR repeat protein
108_7 #N/A #N/A #N/A  #N/A no hit

Bacteriophage
116_1 #N/A #N/A #N/A 50 hypothetical protein [Streptomyces aurantiacus] gp14
116_2 #N/A #N/A #N/A 42 #N/A gp104
116_3 #N/A #N/A #N/A 49 #N/A gp15
116_4 #N/A #N/A #N/A 58 #N/A gp14
116_5 #N/A #N/A #N/A  #N/A
116_6 #N/A #N/A #N/A  hypothetical protein [Mycobacterium abscessus]
116_7 #N/A #N/A #N/A  Bacteriophage protein [Mycobacterium abscessus]
116_8 #N/A #N/A #N/A  #N/A
116_9 #N/A #N/A #N/A  #N/A
116_10 #N/A #N/A #N/A  hypothetical protein [Dietzia alimentaria]

132_1 #N/A #N/A 44.98 94 monooxygenase [Rhodococcus sp. DK17] putative FMNH2-dependent monooxygenase
132_2 #N/A #N/A #N/A 75 putative FMNH2-dependent monooxygenase [Gordonia amarae] putative FMNH2-dependent monooxygenase
132_3 #N/A #N/A #N/A 92 #N/A putative acetyl-CoA acyltransferase
132_4 50 66.12 #N/A 68 acetyl-CoA acetyltransferase [Nocardia sp. 348MFTsu5.1] acyl-CoA dehydrogenase, C-terminal domain protein
132_5 30.43 29.35 #N/A 81 putative oxidoreductase [Gordonia soli]  2-deoxy-D-gluconate 3-dehydrogenase
132_6 34.57 55.71 56.41 49 putative oxidoreductase [Gordonia soli] serine/threonine protein kinase
132_7 #N/A 50 #N/A 50 putative LuxR family transcriptional regulator [Gordonia soli]  transcriptional regulator, LuxR family

Aromatic compound degradation
133_1 33.68 74.91 33.33 75 AMP-dependent synthetase [Gordonia rhizosphera]  putative acyl-CoA synthetase
133_2 47.69 45.81 #N/A 88 putative 4-hydroxy-2-oxovalerate aldolase [Gordonia aichiensis] 4-hydroxy-2-oxovalerate aldolase CmtG
133_3 #N/A #N/A #N/A 80 acetaldehyde dehydrogenase CmtH [Gordonia aichiensis] acetaldehyde dehydrogenase CmtH
133_4 40.57 37.74 #N/A 71 2-hydroxypenta-2,4-dienoate hydratase CmtF [Gordonia aichiensis] 2-hydroxypenta-2,4-dienoate hydratase CmtF
133_5 26.89 #N/A #N/A 75 p-cumate dioxygenase small subunit [Gordonia polyisoprenivorans] p-cumate dioxygenase small subunit
133_6 #N/A #N/A 32.41 75 putative oxidoreductase [Gordonia polyisoprenivorans]  putative oxidoreductase
133_7 #N/A #N/A #N/A 50 hypothetical protein [Streptomyces prunicolor] putative hydrolase

cytochrome
159_1 #N/A #N/A #N/A 93 Mce family protein [Gordonia amarae] Mce family protein
159_2 26.75 #N/A 25.74 83 Mce family protein [Gordonia amarae] Mce family protein
159_3 51.71 #N/A 53.41 96 hypothetical protein [Nocardia asteroides] YrbE family protein
159_4 #N/A #N/A #N/A  hypothetical protein [Nocardia asteroides] YrbE family protein
159_5 #N/A #N/A #N/A  YrbE family protein [Nocardia asteroides] YrbE family protein
159_6 41.33 #N/A #N/A  short-chain dehydrogenase [Mycobacterium fortuitum] putative oxidoreductase
159_7 #N/A 29.59 #N/A 93 cytochrome P450 [Mycobacterium sp.
360MFTsu5.1] putative cytochrome P450                165_1 24.54 #N/A 94.32 35 dihydropyrimidinase [Gordonia amarae] hydantoin racemase  165_2 #N/A #N/A #N/A 47 Asp/Glu racemase [Jannaschia sp. CCS1] ATP-dependent protease FtsH  165_3 #N/A #N/A 80.46 31 hypothetical protein [Gordonia amarae] hypothetical protein Xcel_2882  165_4 #N/A #N/A #N/A 31 hypothetical protein [Sporichthya polymorpha] hypothetical protein Xcel_2882  165_5 #N/A #N/A #N/A 53 hypothetical protein [Sporichthya polymorpha] hypothetical protein  165_6 33.12 30.38 #N/A 43 hypothetical protein [Dermabacter sp. HFH0086]  ribosomal protein N-acetylase  165_7 #N/A #N/A #N/A 42 hypothetical protein [Gordonia malaquae] ATP-binding protein   125 Bacteriophage             166_1 #N/A #N/A #N/A  #N/A no hit 166_2 #N/A #N/A #N/A  hypothetical protein [Niabella aurantiaca] no hit 166_3 #N/A #N/A #N/A  #N/A no hit 166_4 #N/A #N/A #N/A  #N/A no hit 166_5 #N/A #N/A #N/A 33 phage tail tape measure protein [Rhodospirillum centenum] phage tail tape measure protein, TP901 family  166_6 #N/A #N/A #N/A  #N/A no hit 166_7 #N/A #N/A #N/A  #N/A no hit 166_8 #N/A #N/A #N/A  #N/A no hit 166_9 #N/A #N/A #N/A  #N/A no hit Aromatic compound degradation         167_1 #N/A #N/A #N/A 55 #N/A 2-ketocyclohexanecarboxyl-CoA hydrolase  167_2 #N/A 38.99 31.3 80 acetyl-CoA acetyltransferase [Smaragdicoccus niigatensis] acetyl-CoA acetyltransferase  167_3 #N/A 36.84 #N/A  hypothetical protein [Mycobacterium sp. VKM Ac-1815D]  167_4 #N/A 66.26 #N/A 79 acetyl-CoA acetyltransferase [Mycobacterium thermoresistibile] acetyl-CoA acetyltransferase  167_5 #N/A #N/A #N/A  hypothetical protein [Mycobacterium abscessus]  167_6 28.57 28.09 #N/A 70 dioxygenase [Mycobacterium indicus pranii] hydroxylase beta subunit, benzoate 1,2-dioxygenase  167_7 #N/A #N/A #N/A 80 dioxygenase [Mycobacterium sp. VKM Ac-1815D] dioxygenase large subunit  167_8 #N/A #N/A #N/A 74 dioxygenase [Mycobacterium sp. 
VKM Ac-1815D] benzoate 1,2-dioxygenase, large subunit                168_1 #N/A #N/A #N/A  #N/A no hit 168_2 #N/A #N/A #N/A  #N/A no hit 168_3 #N/A #N/A #N/A  #N/A no hit 168_4 #N/A #N/A #N/A  #N/A no hit 168_5 #N/A #N/A #N/A  #N/A no hit 168_6 #N/A #N/A #N/A  #N/A no hit 168_7 #N/A #N/A #N/A  #N/A no hit 168_8 #N/A #N/A #N/A  #N/A no hit 168_9 #N/A #N/A #N/A 39 ParB-like protein [Bifidobacterium breve] Chromosome partitioning protein parB  168_10 #N/A #N/A #N/A  #N/A no hit 168_11 #N/A #N/A #N/A  hypothetical protein [Rhodococcus rhodnii] no hit Stress response              126 177_1 83.92 87.21 #N/A 95 UDP-galactopyranose mutase [Gordonia soli] UDP-galactopyranose mutase  177_2 65.73 72.31 #N/A 66 stage II sporulation protein SpoIID [Gordonia sp. KTR9] SpoIID/LytB domain-containing protein  177_3 #N/A #N/A #N/A  #N/A  177_4 #N/A #N/A #N/A  #N/A  177_5 58.71 55.83 #N/A 62 putative N-acetylmuramoyl-L-alanine amidase [Gordonia terrae] putative N-acetylmuramoyl-L-alanine amidase  177_6 #N/A #N/A #N/A  hypothetical protein [Gordonia amarae]  hypothetical protein GOAMR_20_00230  177_7 #N/A #N/A #N/A 93 type I restriction-modification enzyme, R subunit [Mycobacterium avium]  type I restriction-modification system restriction subunit  177_8 #N/A #N/A #N/A  #N/A  178_1 #N/A #N/A 71.73 72 antibiotic transporter [Gordonia namibiensis]  putative ABC transporter permease protein  178_2 #N/A #N/A 78.59 69 IclR family transcriptional regulator [Gordonia namibiensis] putative ABC transporter ATP-binding protein  178_3 35.23 #N/A #N/A 74 putative non-ribosomal peptide synthetase [Gordonia alkanivorans] putative non-ribosomal peptide synthetase                179_1 #N/A #N/A #N/A  #N/A  179_2 #N/A #N/A #N/A  hypothetical protein [Amycolatopsis nigrescens]  179_3 #N/A #N/A #N/A  #N/A  179_4 #N/A 25.56 #N/A 62 methylmalonyl-CoA carboxyltransferase 12S subunit [Rhodococcus opacus] carboxyl transferase  179_5 43.85 #N/A #N/A 80 Short-chain dehydrogenase/reductase SDR [Rhodococcus sp. 
EsD8]  short-chain dehydrogenase/reductase SDR  179_6 #N/A #N/A #N/A 82 acyl-CoA dehydrogenase [Mycobacterium abscessus] acyl-CoA dehydrogenase  Toxin-antitoxin system           242_1 43.4 #N/A 83.19 84 hypothetical protein [Microbacterium sp. 292MF] putative methylated-DNA-cysteine methyltransferase  242_2 50.1 50.89 81.23 82 hypothetical protein [Microbacterium maritypicum] putative methylated-DNA-cysteine methyltransferase 242_3 #N/A #N/A #N/A  #N/A - 242_4 #N/A #N/A 42.74  integrase [Mycobacterium abscessus]  integrase  242_5 #N/A #N/A #N/A  #N/A - 242_6 #N/A #N/A #N/A  antitoxin HicB [Propionibacterium]  Fe-S-cluster redox enzyme  242_7 #N/A #N/A #N/A 68 toxin HicA [Actinomyces odontolyticus] response regulator, CheY-like receiver domain  242_8 #N/A #N/A #N/A 59 hypothetical protein [Rhodococcus erythropolis]  Fic protein  Toxin-antitoxin system           253_2 #N/A #N/A #N/A 38 hypothetical protein [Promicromonospora sukumoe] hypothetical protein Snas_5449  253_3 #N/A #N/A #N/A  #N/A no hit 253_4 #N/A #N/A #N/A  #N/A no hit 253_5 #N/A #N/A #N/A 78 hypothetical protein [Methylosinus sp. LW4] Ribbon-helix-helix protein, copG family   127 253_6 #N/A #N/A #N/A  #N/A no hit 253_7 #N/A #N/A #N/A 64 hypothetical protein [Streptomyces sp. 
HPH0547]  hypothetical protein FraEuI1c_6647  253_8 #N/A #N/A #N/A  #N/A no hit 253_9 #N/A 37.07 #N/A 57 transcriptional modulator of MazE/toxin, MazF [Cyanothece] putative PemK-like protein  253_10 #N/A #N/A #N/A 51 antitoxin [Patulibacter americanus]  ChpI  253_11 #N/A #N/A #N/A  #N/A no hit 253_12 #N/A #N/A #N/A 76 #N/A helix-turn-helix domain protein  253_13 58.11 #N/A #N/A 80 rifampin ADP-ribosyl transferase [Brevibacterium casei] rifampin ADP-ribosyl transferase  253_14 #N/A #N/A #N/A  #N/A no hit               254_1 81.79 80.18 #N/A 94 amino acid adenylation protein [Gordonia polyisoprenivorans] putative non-ribosomal peptide synthetase  254_2 57.41 #N/A #N/A 60 hypothetical protein [Actinomadura flavalba]  alpha/beta hydrolase fold protein  254_3 48.37 #N/A #N/A 57 hypothetical protein [Actinomadura flavalba] transcriptional regulator  254_4 #N/A 29.69 #N/A 57 filamentation induced by cAMP protein fic [Rhodococcus qingshengii] filamentation induced by cAMP protein Fic                255_1 83.13 37.5 #N/A 85 putative oxidoreductase [Gordonia aichiensis]  short-chain dehydrogenase/reductase SDR  255_2 70.79 71.19 #N/A 93 hypothetical protein [Gordonia soli] hypothetical protein GOAMR_03_01340  255_3 73.26 71.17 #N/A 92 enoyl-CoA hydratase [Rhodococcus erythropolis]  putative enoyl-CoA isomerase  255_4 70.32 32.41 #N/A 84 hypothetical protein [Gordonia paraffinivorans]  hypothetical protein GOAMR_03_01370  255_5 #N/A 40.35 #N/A 79 hypothetical protein [Gordonia amarae] hypothetical protein GOAMR_03_01380  255_6 #N/A 75.84 #N/A 93 heat shock protein 90 [Gordonia rhizosphera] chaperone protein HtpG                256_1 #N/A #N/A #N/A 53 conserved hypothetical protein [Bradyrhizobium sp. STM 3843] hypothetical protein  256_2 #N/A #N/A #N/A 82 twitching motility protein PilT [Mycobacterium sp. 141]  PilT protein domain-containing protein  256_3 #N/A #N/A #N/A 82 hypothetical protein [Mycobacterium sp. 
141] transcription regulator of the Arc/MetJ class  256_4 52.89 52.51 #N/A 62 putative Fis family transcriptional regulator [Gordonia malaquae] helix-turn-helix domain-containing protein  256_5 40.16 41.79 46.44 85 putative aldehyde dehydrogenase [Gordonia hirsuta]  aldehyde dehydrogenase  256_6 #N/A #N/A #N/A 85 oxidoreductase, Rxyl_3153 family protein [Rhodococcus wratislaviensis] zinc-containing alcohol dehydrogenase                257_1 31.84 26 27.09 65 hypothetical protein [Nocardia sp. BMG111209] putative dioxygenase  257_2 #N/A #N/A #N/A 72 hypothetical protein [Gordonia amarae]  hypothetical protein GOAMR_10_00090  257_3 #N/A #N/A #N/A 51 hypothetical protein [Sorangium cellulosum] hypothetical protein O3I_17803   128 257_4 47.79 79.1 93.31 94 acyl-CoA synthetase [Gordonia rhizosphera] putative fatty-acid--CoA ligase  257_5 #N/A #N/A #N/A 64 hypothetical protein [Nocardiopsis lucentensis] hypothetical protein MLP_16900  257_6 #N/A #N/A #N/A 53 hypothetical protein [Nocardiopsis halotolerans] PilT protein domain-containing protein  257_7 #N/A #N/A #N/A 66 #N/A hypothetical protein MAP4268c   129 Table B4 Prophage regions from metagenome Annotation e value Sequence Region 1   PHAGE_Megavi_chiliensis_NC_016072: collagen-like protein 4.00E-11 MARPRRPKPKESGRRTKKKVSILNTESIEYVDW KDVNLLRRFQSDRAKIRARRVTGNNTQQQRQVAVAIR PHAGE_Rhodoc_REQ3_NC_016654: single stranded DNA binding protein 7.00E-30 MATNTVTIIGNVTRDPELRFTPSGQAVANFGVAV NRRWQNRQTNEWEEATSFFDIVAWAQLGENVSESCP 30S ribosomal protein S6 [Ilumatobacter coccineus YM16-304] gi|470180472|ref|YP_007566516.1| 3.00E-25 MNRAYELMVIIDADVADAENKVVVDRVEELIGAAGGELSSTDRWGRRKFAYLINHKAEGYYVVFEFTADP PHAGE_Ostreo_2_NC_014789: putative 3-methyl-2-oxobutanoate hydroxymethyltransferase 6.00E-47 MSDRPTVPQIRARKVRDGAEPLVMITAHDAPTARIADAGGVDMILVGDSLAMVALGYEDTLQVTIDDMVH hypothetical N/A MNPTVRDPSSSDTDCQNCGSTGLPTEPVQRVYCSPESPSDLDAATIDNEIEVWCAACVANYPHLAVKPG PHAGE_Prochl_P_SSM2_NC_006883: phage tail fiber-like protein 1.00E-09 
MSSQPPARRSLFWRSRRFWFAAVVVVVLGAGGVWRLLDQVDLPEEITNPLANTSLICDASVPVGTCSIDN hypothetical protein [Frankia symbiont of Datisca glomerata] gi|336180323|ref|YP_004585698.1| 5.00E-21 MVMAGAGGGVMPGVRDAAGSGSLSRGVGCRTVEQLFARLQGRTSPSGEVDYRLTRRSTLRSLAAGEVDRA PHAGE_Microm_MpV1_NC_014767: hypothetical protein 3.00E-05 VSSLDNRTVLHRTRRHPSAGNLGPRPLLLAPGTPPPGLPPTSFPHGLSQQTNFVEFGPDLDHSTHQRLLG PHAGE_Mycoba_Myrna_NC_011273: gp28 1.00E-57 VIPPRLATLLDQLAPLAERFAGRGHRLYLVGGSVRDLLLGSDRLPDDLDFTTEASPEAIKAALDGWVDAL PHAGE_Rhodoc_E3_NC_021347: putative histone deacetylase protein 9.00E-09 MTVLVVRSESSVRHDTGQWHPERAARLTATTAALSDPELDGALRFVEARQATDDELHMVHTPEHVARIRV hypothetical N/A MVLVVNLGWDVVGEPVVPGEDPGFEDSALSGLPTCSRVAVGAI GCN5-like N-acetyltransferase [Acidimicrobium ferrooxidans DSM 10331] gi|256370849|ref|YP_003108673.1| 3.00E-25 MTAQLAVDARGAHPSVAGVGRCVADATERGFDRILTAALHRDDLFPFVHHGFEPVEELVVLAHDLIEVPV hypothetical protein Isova_3007 [Isoptericola variabilis 225] gi|334338427|ref|YP_004543579.1| 8.00E-14 MTTRQVHRLVVGMMMLLLTTIGLPVGTGGAVGADGAVATSRGAAPQETLRILHTTTFVPADGTFTFTVDT PHAGE_Pandor_salinus_NC_022098: serine/threonine kinase motif-containing 2.00E-20 MPGNATDSPSPSADYMPGAMLGNRYRLERKVGTGGMAQVWEASDLVLDRRVAVKILHPHLATDTSVERFR putative RNA polymerase ECF subfamily sigma factor [Ilumatobacter coccineus YM16-304] gi|470180484|ref|YP 2.00E-34 MTSSRRAGATDDELVAWAQGGDRLAIEVLLRRHYDRLYAVCRGVVGLGDADDATQATMMGIVGGLARFDG serine/threonine protein kinase [Haliangium ochraceum DSM 14365] gi|262193765|ref|YP_003264974.1| 1.00E-05 VSDEPLRPGALGPPPQPEPSVREAAIASSLTRFDELIVDGRLTEGNEPHSRSDGSMPSSAPVDELAQVRE hypothetical protein YM304_42790 [Ilumatobacter coccineus YM16-304] gi|470180487|ref|YP_007566531.1| 2.00E-05 MATSPAATHDPPSAPAGPSGTQSHGDAHWADQVADLIVDTVDRVRDRTVTPAHALAKYVVYGAVIAVLIL PHAGE_Bacill_G_NC_023719: gp344 5.00E-65 MSTIHRKVLIIGSGPAGLTAGIYTSRAQLEPLLVEGEPSSTSDQPGGQLMLTTEVENFPGFPDQVQGPDL PHAGE_Achrom_JWAlpha_NC_023556: hypothetical protein 3.00E-10 MSAAITHLTNSSFSEEVTGSDLPVLVDFWAEWCGPCKTIAPVLEELAQEHGDKLRIAKVDVDSEQALALR 
PHAGE_Salmon_SSU5_JQ965645: putative ParB-like nuclease domain-containing protein 1.00E-09 VLEVAVDDIRANPFQPRVDFDPESLGGLAASIAALGVLQPLLVRPAAQGTYHLIAGERRWRAARQAGLAT  130 PHAGE_Natria_PhiCh1_NC_004084: putative plasmid partitioning protein Soj 5.00E-26 VTSKTKKIKKPKSDVDVKGDVAKPTQTMTTRVVAIANQKGGVGKTTTTVNLGAALAERDLRVLVIDLDPQ PHAGE_Pandor_dulcis_NC_021858: pif1-like helicase 6.00E-05 MEWIELTGKTIDEARDLALEKLGVHESEAEVEVLEHPSVSMFGRVKSMARIRARVAPVAAPAKEERRRRG PHAGE_Megavi_chiliensis_NC_016072: hypothetical protein 7.00E-05 MPDLLGPFDPLFEFFGAIVAGIYAVIPSFGVAIVLFTFLVMVVTTPLTVKSTKSMLQMQRLQPELKQLQA hypothetical N/A VTRNRVRRRLRHLMADAERRGTLVTKDYLIVGGPAISNLSFDELSTHLNNALTSAERSVQGTRRHSPG hypothetical protein PFREUD_24220 [Propionibacterium freudenreichii subsp. shermanii CIRM-BIA1] gi|297627 3.00E-10 VKRTYQPKTRRRARRHGFRHRMAERSGRAVVKARRRKGRARLSA hypothetical N/A MDELSELSVRRCPVDGVRRVGPVASLPARPQASGKGG PHAGE_Bacill_Pony_NC_022770: replication initiator protein 2.00E-05 MEDEATSVWEAVARGVTHQVSSVVWRTTFSEVRAVDYDGATLTIVAPSLVLRDRIDNRFRPLLMGVISDL hypothetical N/A VCEPLDPGDDTTVIPTGRLPRTSYLGIIMDHRWIQLVRSRKGFG PHAGE_Mycoba_DS6A_NC_023744: gp34 9.00E-34 VKFRCERDVLADAVGSAGRATSGRGGALPVLAGLRLRLDGDHLEITGSDLDLTVTAEIEVAGGDDGVAVI Region 2   PHAGE_Rhodoc_ReqiPoco6_NC_023694:gp090; PP_02770; phage(gi593774803) 6.00E-82 MSISLRNDFVVDLLMAAGSDEFLCKAARVSTQGSASIDSEESYGLLNFLMKNRHGSPFEHGMMTFRIEAP PHAGE_Cyanop_PP_NC_022751: hypothetical protein; PP_02771; phage(gi557307645) 2.00E-07 MFAMAAVLHEGACKYGANNWRGITIEDHLNHLIMHAYAYLSGDRSDEHLSHIMCRAMFAQAVEITEQEKQ hypothetical; PP_02772 N/A MNAINHFNNTAVIIRGQDNQPQMELFDLVPDINSVRAEDIPDSHESNLVSHARRELEMIGEEPEWVEGYL PHAGE_Rhodoc_ReqiPoco6_NC_023694:gp078; PP_02773; phage(gi593774791) 2.00E-32 MANEEKTFKVEGAELIYKNFAGEKSAFNATGKREVSVVLSPEFAETLLADGWNVRQTKPDEDGEFRYYIT hypothetical; PP_02774 N/A MSEELHNVVITDTQVRCSCGYEFDTDLPPEAHKLGYIHAQMQLASKVDNQQTPDPDGY PHAGE_Rhodoc_ReqiPoco6_NC_023694:gp072; PP_02775; phage(gi593774785) 0 MVRGQSFYAIWDETKGLWSTDEYDVQRLVDEDLHRYASELHEKKGVNYTVLNLKSFDTKIWTTFRSYMRH 
hypothetical; PP_02776 N/A MALGRNKFTLTDREFAIIERSIDLRIKHLVGEINRTLPEVGLDNPHIRDMDRTLYECRELSKKLSAIKKA hypothetical; PP_02777 N/A LTSNIFLSHPLLINEILTELIPKLNLGLIMLGVSQWEQSTFTVG PHAGE_Rhodoc_ReqiPoco6_NC_023694:gp070; PP_02778; phage(gi593774783) 6.00E-18 METTEYIDDVQVSPTTIGILVGIASFGVGLTIGYFVGKRNKDVVYRTIPAVNSGAITSVVYDYSYSEDVV PHAGE_Rhodoc_ReqiPoco6_NC_023694:gp069; PP_02779; phage(gi593774782) 1.00E-60 MKYIPEQLARNLSRQILVARKHSPRVLFVAGIAGVVTSTVLACKATLKLEAELDEMQTQINNVKELKTEH PHAGE_Rhodoc_ReqiPoco6_NC_023694:gp065; PP_02780; phage(gi593774778) 4.00E-17 MLKRTITFTDYNGVRHTEDHYFHLSKVDLVRLEVAGEKSFAEYLQDIVKTEDRKGLIEIFEKLIQLSYGK hypothetical; PP_02781 N/A MENDTNTTVETNPYLESMKEGFAKGAATAVATVVVTQLATLLIEKSVGAVRNHRNKIADENTDQ PHAGE_Rhodoc_ReqiPoco6_NC_023694:gp054; PP_02782; phage(gi593774767) 3.00E-46 MTIKTVLHKTEKSLRDNSPVILTAIGVSGTLSTAYLAGKASYEIGYAFYDDMPTRDFFKLNWKKYIPAAV  131 hypothetical; PP_02783 N/A MKQILDAISYVGGLAVGVTLTAVAGVIVLVIAREVYEQLTKSDD hypothetical; PP_02784 N/A MESHTEPKKTFAGRIVSFAKNDIVQMTALMAATAVVSAGLAREITLRQAFTFEQMDYILKDKDLQKALID hypothetical; PP_02785 N/A MFNRAIQVKMVNTKKQEPQEPVASDSYFEKKAEVVSREIDGVMRKVGMLATGYVVVDTLRQVLVARANRF PHAGE_Rhodoc_ReqiPoco6_NC_023694:gp041; PP_02786; phage(gi593774754) 2.00E-08 MFGNFRKKEDPKLSAAIDAIYEEMTTHGPDSPEYPNMLGYLERLTELQAPKRHNRVSPDQMAVVLGNLLG PHAGE_Strept_Dp_1_NC_015274: holin; PP_02787; phage(gi327198372) 3.00E-07 MEQNGTVDVQGFQLSNRTYNKLKAFVTVILPAFSSAYYGLAELWDFPNVAAVIGTTAIITTLFGTLLGIS hypothetical protein [butyrate-producing bacterium SM4/1] gi|479183015|ref|YP_0078101 5.00E-09 MGIKFIEQKIIYKDEYEDFRRYLYEPYKELGGNGTADRIMAEIAKLPIRGYHTSDSNRIHLRREDKSNGT hypothetical protein [butyrate-producing bacterium SM4/1] gi|479183020|ref|YP_0078101 5.00E-14 MLDILKFNSGVPTRLENGEVVNGIVSKTWVERYRDPGEFTFTAKESSDLLSKLPVGTLISHMQTSEVMVV PHAGE_Rhodoc_ReqiPoco6_NC_023694:gp037; PP_02790; phage(gi593774750) 7.00E-12 MNLTSLDLYSNNVFVARFDCEPDGTSPFLLVDESGLGAETIVQRYVMESASGEQFYDLTVPSRTISLQMI PHAGE_Rhodoc_ReqiPoco6_NC_023694:tape measure protein; PP_02791; phage(gi593774749) 0 
MPSVDNKIVSIEFDNNSFERKVAETMASLDKLKASLAFNDANKSFADLDSSVKKINFGSMASAVDGISTK hypothetical; PP_02792 N/A MKDGQLTANGTAVGGTETDSGLTTGSFNVISGRKYRIDVFAMILGSTAGNIASCKLTNASNTVLAQNNVL PHAGE_Rhodoc_ReqiPoco6_NC_023694:head protein; PP_02793; phage(gi593774744) 5.00E-41 MTKLTWDAPTEKTYETGLDRGVLYLQDGTAVVWNGLTAVTESSSRSTTPLYFDGKKFKDVVNLDQPKSKL hypothetical; PP_02794 N/A MNSDKIYVYRQDAELPDLGVAWYDRDGNLIDFSNGYSFTVKLVSQKDKTVALTKTAGIVGSNSKPNIIIG PHAGE_Caulob_CcrRogue_NC_019408:putative lectin-like domain protein; PP_02795; phage 2.00E-16 MHGGGYDTTIGSDHSVGLAYKEIYPAGSVTCPTWNQVSNGPDAWVALTLAFKPEPEAPWEGPEVYYSNED PHAGE_Rhodoc_ReqiPoco6_NC_023694:gp030; PP_02796; phage(gi593774743) 1.00E-40 MLKLIIKGDEVYDESTGQFGTVNDTILELEHSLLSVSKWESKFEKPFLANTEKTVDEIMDYIRFMIITPD PHAGE_Rhodoc_ReqiPoco6_NC_023694:head protein; PP_02797; phage(gi593774742) 1.00E-85 MTILTWDQSGQRLYETGVDRGVLYIPNAVSGLYDNGVAWNGLVSVTESPSGAESSAQYADNSKYLNLVSA PHAGE_Rhodoc_ReqiPoco6_NC_023694:prohead protease; PP_02798; phage(gi593774739) 0 MKAADFSGWATKAGLKCTDGRTITPDAFKDQNGVTVPLVWQHGHNDVENVLGHAVLEHRPEGVYAYCYFN hypothetical; PP_02799 N/A LILQLKENARIRQAEVREKLDKIISQLASKSKNARSKAEREKAKQEIESIRDELKTALEGAREAYAKLKD PHAGE_Rhodoc_ReqiPoco6_NC_023694:portal protein; PP_02800; phage(gi593774737) 1.00E-134 LAILNQVKRAINAFRSNEQSTQRYVTNADIGPGSSIRPHTTHSRHFNERSIVTSIYTRISVDVASVAIRH hypothetical; PP_02801 N/A MTARINADFKNSKGEKVSIDFANAVLSKAVSKNQREAYFKVGAVWVAAMLAGKYIGNARMSR PHAGE_Rhodoc_ReqiPoco6_NC_023694:gp020; PP_02802; phage(gi593774733) 2.00E-06 MSDVAVEEFLEHFGTKGMKWGVRNSRPTSGSSGKSEKPKHSKKKIAVGIAVGVGAIAVGVILAKNHKVKV PHAGE_Rhodoc_ReqiPoco6_NC_023694:gp019; PP_02803; phage(gi593774732) 1.00E-09 MSDISVEDFLAHYGVKGMKWGVRNESKSSNVTLGPPAGVVMRKDGSILIKPGANLQRLVRSNGESLPMKD PHAGE_Rhodoc_ReqiPoco6_NC_023694:TerL; PP_02804; phage(gi593774729) 0 MTLSNTATPKYYAEFRAAVLRGEIPVNEEISLEMNRIDDLIADPDIYYDDKAVEGFISYCENELTLTDGG  132 PHAGE_Rhodoc_ReqiPoco6_NC_023694:gp022; PP_02805; phage(gi593774735) 3.00E-05 MDTMSDHEEAREEFLEHYGVKGMKWGVRKKPSSSKRSLVTNAKKMSDSDLKAAVERLRLEREYVNINKDL 
PHAGE_Rhodoc_ReqiPoco6_NC_023694:gp014; PP_02806; phage(gi593774727) 2.00E-32 MESSILKSTKKVLGLSEDYIAFDLDIMTHINAAFSILNQLGVGPSTGFTIEDEQAQWSSFSSDAAVVNLV PHAGE_Thermo_THSA_485A_NC_018264:glycoside hydrolase family 25; PP_02807; phage(gi397912616) 1.00E-06 MSDTLYPVFYGTRLVTFDVLEATFSSKCHPEFWRRMKNFLLHQGGKFGIGGGWRAVGAQPDLSGFAPEGK hypothetical; PP_02808 N/A VARFMVIGDSISEARAVTTLGDRWQDRLAKMLRTKFPCVGV PHAGE_Rhodoc_ReqiPoco6_NC_023694:gp011; PP_02809; phage(gi593774724) 3.00E-15 VHHINPMSVDDLIRHEEWVLNPEYLITTTHDTHNAIHYGDQSLLKKPFTPRQMGDTKLW PHAGE_Rhodoc_ReqiPoco6_NC_023694:gp010; PP_02810; phage(gi593774723) 5.00E-17 MVEGTASAQVITHFLKLGTEREKLERERLRQEIILGQAKTDQIASAERVEKLYSRALQAMRQYQGQEIDD Region 3   PHAGE_Staphy_80alpha_NC_009526: tape measure protein; PP_03288; phage(gi148717898) 9.00E-08 VRAVFEARVAGAQKGLRDLAGDADKAGAKVDATAKSLKDLSSVTAKPKIDLAIEDAQRRLTAVTKELGEL hypothetical; PP_03289 N/A MSQWDDLDDHDRAWALGVGLADAEAEAEANAATCPSCGGLKAECQDPDNQHAYVVTTGRCYRTRALMEAQ hypothetical; PP_03290 N/A MRTRVLNHFHTFHPEAQAELNRLAAEEARLTLALARRSESEPEPKPKRRMSEPPAPVDDIAVELEKVRAA hypothetical; PP_03291 N/A MAQTAVAGVVAAGRVPTWIIPQASIATDPTPGTYSIPLTALTGGTTVKADCHMDAGDLSVSRSAQTRERQ hypothetical; PP_03292 N/A MPQNPAEFTRVSTPAGHFSVPSALVEAAGSEWKVLKQDAADSNGLPYPPKLREQSAAATPQAEASASADN hypothetical; PP_03293 N/A VHTEIKAALIAAGVYAVDGPADDLPSDGGVVRQAAVLWPSPGSHTYTRVSGSSSGRVDRVLITCVGATTF PHAGE_Strept_20617_NC_023503: phage protein; PP_03294; phage(gi588295123) 1.00E-06 VSAGAEFSRFARALRIAAGGLESDGRKAVDRVAQGALRTAQAHAGVDSGDLRSSLRVSSRGGELRAAVET hypothetical; PP_03295 N/A VIGPAIDAALPVMHANAESMMLDRCTIERATSTWDEAAQKTVTTWAPVITESVCDVDDGAASGRSIVTDE PHAGE_Mycoba_Brujita_NC_011291: gp9; PP_03296; phage(gi206599553) 5.00E-06 VTVDGSGCAVQMLPTLRLVYLVSITNDGAAVSNPEWSAAGFVRGSWTCRLRGVTATMRHGFEDWPADLLG hypothetical; PP_03297 N/A MSARKSAETNPQPKAVDVPDEVTPDVPKPPVKVPSIMGSTFAERAAANKAVAKSRTEAKG PHAGE_Mycoba_Butters_NC_021061: major capsid protein; PP_03298; phage(gi479336468) 3.00E-77 MSNARQRLEAAVKSLQEFSDQLDAADAPLSGEDMSNLKSRMEEIKDLKGQVEAEAEAAGALKDAKAFMAA 
PHAGE_Mycoba_HufflyPuff_NC_022981: capsid maturation protease; PP_03299; phage(gi563398893) 1.00E-24 VKKSYAFAAVKSLDSENPNGEFEVVLSAATVDRDGEVIEARAFEPLPESIPFHAFHDFHDPIGRGVPFYD PHAGE_Mycoba_Murphy_NC_021305: portal protein; PP_03300; phage(gi508179175) 2.00E-61 VFVSNGSLVTKTPLLAGSPTYFPKMSTAGLIYPTAYSQMYRGQLWINILVNKLAKAQARLPFPVYERDEL hypothetical; PP_03301 N/A VMFSNLVLKNRWRDRVVVTLKSGESFAGVLWSNDSRALVIRNASALGAGENRTDLSLDGEVIVLMADVAY PHAGE_Mycoba_Butters_NC_021061: terminase; PP_03302; phage(gi479336464) 2.00E-52 VTEYAPLFRHRPPERWTNGDLAAKVGIDLGLPPDDEQRELLDMIYAEKAPDRPAAFEVCVVGPRQNIKTS  133 PHAGE_Mycoba_Butters_NC_021061: hypothetical protein; PP_03303; phage(gi479336463) 8.00E-17 MSEVVCVCGKAFAAKSNRARYCSDRCRKRAQRGGGEVVELPAKVGQVEGSSLAQAGPVETATVDALKAAD Region 4   attL                                                 N/A TTGGGTGTCGGTC PHAGE_Arthro_vB_ArS_ArV2_NC_022972: putative endodeoxyribonuclease; PP_03315; phage(gi563398 6.00E-07 MKPQRLTIPAPAPWINANARDHWTKKGRLTRSWRSASAAWARHQKLRPVTQPVVIVATVVKTNSRRFDVE hypothetical; PP_03316                                                                N/A MIATRPECGTGYGIRYHLAKDEPFCDLCTDHVMDRRLVRETRTLTGSTTTQERTLRQAIHALAQMLDEHD hypothetical; PP_03317                                                                N/A MSSTTRAQAEALATFVRQLRPDWDHPGIVHAIGRCQREAVSEIAVALIRLAENGQAKTPALLPEPGRHWK PHAGE_Strept_VWB_NC_005345: hypothetical protein VWBp15; PP_03318; phage(gi41057231)      3.00E-08 VDDTLHSHPKTRRAGLAAMGLWTVCGSYCMAYKTNGFVPEWFVAGFQSGRKLAADLVRAGKWEDAVKDNE PHAGE_Pseudo_F116_NC_006552: DNA adenine methyltransferase; PP_03319; phage(gi56692911) 6.00E-45 VKPPFSYYGGKMTVGPEIARILPAHKHYVEPFAGSLAVLLAKEPSHAETVNDLDGDIVTFWRVLRDRADD PHAGE_Arthro_vB_ArS_ArV2_NC_022972: hypothetical protein; PP_03320; phage(gi563398174)      2.00E-08 VSLADWTCATCTTEGRRGSCCPGRCYCGHDTCHAFASWTPRPVLNVTDISKPKGKKGSAWAEREESTWID PHAGE_Mycoba_BigNuz_NC_023692: gp55; PP_03321; phage(gi593774685)                     8.00E-133 VSVYEGHGVTLHHGDCLDVLRSLPDCSVDSVVCDPPYALGFMGREWDTFGMDVGRGAQARSQRRAEVTPT 
hypothetical; PP_03322                                                                N/A MNTNRPALPVHVVNRCKFAALTFGNVRAAREYEQRTGRHASACGNCGKWHA hypothetical; PP_03323                                                                N/A VARLMPNQPKTPIRSVRIPDEEWRAAQARAAERGETVTDLIRRALRRYAK hypothetical; PP_03324                                                                N/A MSDPTPIKALDWHGFLRILAAHSPSKLLSANTIHADACEGVQIDPKRLGALFKAAADAGYIRLVGVENAA PHAGE_Mycoba_32HC_NC_023602: DnaQ; PP_03325; phage(gi589893377)                       7.00E-21 VSAPLVFLDTETDGIHPGRRVWEVAMIRRDYDGDSVKQMETHFFVGLDLRDSDPFGLRVGGFWDRHPAGR PHAGE_Arthro_vB_ArS_ArV2_NC_022972: hypothetical protein; PP_03326; phage(gi563398164)      4.00E-31 MDLTDSIAPRSDQMNAEDLLTGPRTFTVTEVRKGSSAEQPVSIYLAEFPSDRPFKPSKTVRRLIVSAWGK PHAGE_Mycoba_Charlie_NC_023729: gp47; PP_03327; phage(gi593779210)                    5.00E-42 MTITEPGIVTDLDERTYHADRGSLSHSGSKNLHDSPARFRWLLDNRVEKDSFDVGTLAHKLILRSTDNRI hypothetical; PP_03328                                                                N/A VSYFDLPEVQEARADLRDALDAAWQDARDHGHDDHRPTRAETDTDQWHAAWDDDEGHTWADMVRGVAVDS hypothetical; PP_03329                                                                N/A MTAAAIVPAQCPNCGRLTVDNRQHCPDMPASQWARVCLAMTCAKCGTGYHTHTTKETQS hypothetical; PP_03330                                                                N/A VAGRGDIANTRHVLTIGPVGSAGRHVTIHRHYDGPDSTTWGHLIIAGCWRGIADQLDHRIHTKKGHGWDE hypothetical; PP_03331                                                                N/A MSDAEIITQARLQLMHGTDPTDVADYILDALTERAWRGPSRDHDEVTA hypothetical; PP_03332                                                                N/A VGEADPMTLHTRVAVTSGDVTPEAVFAQCRSIIGADKHQVIEGDPLAMAPGQGLPALMWVESSNGEPETC hypothetical; PP_03333                                                                N/A MNIVETSDQIGALLILTATLVPVVAWAVWVAIDVLWERRTR hypothetical; PP_03334                                                                N/A 
MNDTTTIVWADLATEHHCDTCTCAPPIPDLNAALQAAATLDVRGAYSVRIDLDGHITLQGDAPNMLNLIG  134 hypothetical; PP_03335                                                                N/A LGRPAAFERFGGALLFLRSVVAAGGDECAGDDDGHEREGGE attR                                                                                  N/A TTGGGTGTCGGTC hypothetical; PP_03336                                                                N/A MSNIPPLLTTSEVAKSCGDVAVKTVTRWVESGQLAYAQKLGGLRGAYLFDPAEVARFKKSRERQVTS PHAGE_Mycoba_32HC_NC_023602: integrase; PP_03337; phage(gi589893371)                  2.00E-23 MSQDLAAAYLAHLEAEHAPANTIAARARVLRSVGSAGTATREDIEAWWATRRDLSPATRSNDLANLRAFY Region 5   attL                                                                                  N/A CAGCAGGCCGATGCC PHAGE_Bacill_G_NC_023719: gp245; PP_05619; phage(gi593777701)                         3.00E-22 MTTPQISAPDSTAAPDTPAHGRPLVGFEHCELRYPNGTHALSDVNLTVREGEFVSVVGPSGCGKSTLLRL hypothetical; PP_05620                                                                N/A MTTHTDQAAAPKDPVEPDPGAPSTTTDIAVLAQKASAASGF PHAGE_Liston_phiHSIC_NC_006953: hypothetical protein LPPPVgp44; PP_05621; phage(gi62362410) 2.00E-43 MTPEDGTSASARANDDFADFDAGYRFGQYSDDDFSAADFAPSTPEAPAQDGPLPPPFPLDGIDLLSPPGF hypothetical; PP_05622                                                                N/A MDRLAPAPFASRQIKRLVAALWGRMSPAERQAFKEWITKQ hypothetical; PP_05623                                                                N/A VSEKTGLQVGTIANIRDGKTQNPTYYALKRLSDYFEVNP PHAGE_Liston_phiHSIC_NC_006953: putative helicase subunit; PP_05624; phage(gi62362409)  4.00E-62 MRALAALAKEPNMSLIATAAKPADRPVLITLCGDSGMGKTSLAASFPKPIFIRAEDGMQAIPANNRPDAF PHAGE_Liston_phiHSIC_NC_006953: putative helicase subunit; PP_05625; phage(gi62362409)     3.00E-12 VGFLRLVTFTKGDDGERKKAISTGDRELVCHAVASNISKNRYGLTEALPFAAGENPLFAVIPALGAKHSI PHAGE_Liston_phiHSIC_NC_006953: hypothetical protein LPPPVgp42; PP_05626; phage(gi62362408) 9.00E-20 MAGFWNLSDGEDAAKTGAEYEIPGGNMDPIPAGSSVLAMIDEAKWDHTQNDAEEYISLRWTVLAPEEYKN 
PHAGE_Erwini_vB_EamM_Y2_NC_019504: tail fiber; PP_05627; phage(gi422934766)           2.00E-13 MEQRSEEWFAARKGRVTASMVGAILGVSPNLSRAGAMRRMVRDAHGAEPEFTGNIATQYGERNEDGAVDE PHAGE_Vibrio_pYD21_A_NC_020846: hypothetical protein; PP_05628; phage(gi472340491)      2.00E-05 LTKISKAGAVSYAKAVAELLPGTDLEKWRGKPSTYWMLK PHAGE_Liston_phiHSIC_NC_006953: putative helicase; PP_05629; phage(gi62362405)        2.00E-132 MGQKRDAQKKKDAEMTLRPYQQAAVDAAVEWMRKSLAPACIEAATGAGKSHIIAEIARQIHHQTSKRVLC PHAGE_Synech_S_MbCM6_NC_019444: hypothetical protein; PP_05630; phage(gi418487471)      1.00E-11 MATRAFAASVRQRRPITTDQVTITTAAANPSGATLPTAVAGDRVIIANRGANPVNIYPATGAAIGALAAN hypothetical; PP_05631                                                                N/A MLALAENTQAVRAMVEAMKAQNTHFADNNEMFKALGPVLSDLRHDGADSKAHLAAIRDALNRGR PHAGE_Acinet_AP22_NC_017984: putative endolysin/autolysin; PP_05632; phage(gi388570824) 4.00E-23 MTMKTSDAGLFALALHEGIVPAPYRDSVGVWTYGIGHTLGAGYPDPAKMLRGMPSNLDAALRDVFDLFRR hypothetical protein KVU_1650 [Ketogulonicigenium vulgare WSH-001] gi|385234143|ref|YP_00579 PP_05633 3.00E-06  LGRIVFQLKGRDMDWAPYARIAARYIIGGVGGTAVGDAVLNDPDLMNILTIAISGAAAALTEYLYALAKR hypothetical; PP_05634                                                                N/A MIAFIWKLILGGLWRPLLAVLGAAGLYVKGRADAKAKADSRALDATVKGQEAARKGRAEAVEKLRQGKTP PHAGE_Roseob_1_NC_015466: hypothetical protein RDJLphi1_gp31; PP_05635; phage(gi331028085) 1.00E-09 MATLVPRLAHSAALILLALPAQAQTPCTGLPDALAALAARYDEAPRVSGLMANGQLLIVTASEAGGFTVL hypothetical; PP_05636                                                                N/A MSDLIERLERWGRDEGLQNHFIGSRSARKDCAEAATALAEAEAEIARLKDVLEIVAAGPILGEPMARWIN  135 hypothetical; PP_05637                                                                N/A LTSRVAIKRALEAAGFVFLRGGWARKEAAPRLQDKIDRAVKDAAESVERIKNVGTHENHVGTHEK hypothetical; PP_05638                                                                N/A MTIAEKIQMMRDAGLRRTAARIWQNPNGTWSHHKDAQREWYGLDGCETDLRFFEEWEA PHAGE_Rhodob_RcapNL_NC_020489: phage 
integrase/recombinase; PP_05639; phage(gi461474991) | 1.00E-15 | VMREAPYYGWTIFTRTRSGKSIGGDVSAAAKLAGVKKTAHGLRKTRATVLAEGGATASQIAAWTGHKTLA
PHAGE_Rhizob_RR1_A_NC_021560: hypothetical protein; PP_05640; phage(gi514231508) | 9.00E-13 | LIQAIAELRRDGPEDATWTASLVADGNPYQSSLAGAVATTLNAALSGDLIPKADADLAVALMVEKAADVV
PHAGE_Rhodoc_REQ2_NC_016652: hypothetical protein; PP_05641; phage(gi372449849) | 3.00E-06 | LRSGDTCPGRPRGSASATSATTRCHLIAAAPDLARALLDARAEHAASLIRINAEAVEAVAQARADAQAAV
cytochrome bd quinol oxidase subunit 1 [Intrasporangium calvum DSM 43043] gi|317123393|ref|Y05.1|; PP_05642 | 6.00E-78 | MYRRSARLGAIILLLGGVAVTISGDLQSRVMTQVQPMKMAAAEALYDTSPEGKGASFSIISVGTPDGQHE
cytochrome d oxidase cyd, subunit II [Kytococcus sedentarius DSM 20547] gi|256824081|ref|YP_.1|; PP_05643 | 3.00E-90 | MELTTVWFILIAVLWIGYFVLEGFDFGVGILFPVLGRDDPDLGSNDLAETGEIRRRVMLSTVGPVWDGNE
cytochrome d ubiquinol oxidase subunit II [Nocardiopsis dassonvillei subsp. dassonvillei DSMi|297559180|ref|YP_003678154.1|; PP_05644 | 1.00E-23 | MSVGSLFVALFPDVMPSTTDPAFSLTTINASSTDYTLKIMTWVAVVFTPIVIGYQGWSYWTFRKRVSGHH
PHAGE_Plankt_PaV_LD_NC_016564: ABC transporter; PP_05645; phage(gi371496158) | 5.00E-13 | MGPIDPSLLRALPGARSRVARLAGMGVISGVLALGQAIAVAASVTAIVRGSSLAMPLAVLGAVLVLRGLV
attR | N/A | CAGCAGGCCGATGCC

Table B5 Summary of spacer sequences

GID | DR consensus | # DR Variants | Ave. DR Length | Spacer count | Ave. SP Length | Ave. SP Coverage | # of Flankers | Ave. FL Length | # of Reads | Coverage
G4 | GTGCACCCCGGCAGCCCGCCGGGGTGGGAGTCTCAAC | 1 | 37 | 45 | 35 | 1 | 6 | 37 | 26 | 1:25,2:10,3:6,4:1,5:2,6:1
G9 | CTCTCCGTCGGCGTTCGTCGACGGCCTCATTGAAGC | 1 | 36 | 36 | 36 | 1 | 3 | 35 | 41 | 1:30,2:6
G10 | CTTCCCCCGGCCATCAGGCCGGGGCTCCATTGCGGC | 1 | 36 | 33 | 37 | 3 | 4 | 42 | 52 | 1:22,2:1,4:1,7:2,8:2,9:2,10:1,11:1,12:1
G12 | AGGAGGGGCTTGCGGTGTTTGTTCAGG | 1 | 27 | 3 | 40 | 1 | 0 | 0 | 2 | 1:3
G15 | CGGTTCACCTCCACGTGCGTGGAGACAAC | 1 | 29 | 23 | 32 | 1 | 0 | 0 | 4 | 1:23
G16 | ATTCACTGCCGTGTAGGCAGCTCAGAA | 1 | 27 | 10 | 33 | 1 | 0 | 0 | 3 | 1:4,2:6
G19 | GGCTCCCCCGCACACGCGGGGATCGACCC | 1 | 29 | 6 | 32 | 1 | 0 | 0 | 2 | 1:6
G20 | CCTGCCAAGAAAGCGCCGGCAAAGAAGGCACCGGTTAAAAAGGC | 1 | 44 | 3 | 18 | 1 | 1 | 31 | 19 | 1:1,2:2
G23 | GTTGCACTCAGGCTTTGCCCTGAGTGGGGATTGAAAC | 1 | 37 | 5 | 35 | 1 | 1 | 38 | 1 | 1:5
G27 | CCAGCATTCCCGGCCTAGTGTCGGGCTCCGTTGAAGCGG | 1 | 39 | 3 | 31 | 1 | 0 | 0 | 1 | 1:3
G28 | AGCCTACCAATGGGAAGTCGGTAGGGAAACCACGGCGCGCG | 1 | 41 | 3 | 25 | 1 | 0 | 0 | 1 | 1:3
G29 | GAGTGTAGCTATCCGGGGTGAGAGAGGGAGCTACAAC | 1 | 37 | 3 | 30 | 1 | 0 | 0 | 1 | 1:3
G33 | CTTATAATTGCACCAGTTTGGGATTGAAAC | 1 | 30 | 31 | 36 | 1 | 2 | 39 | 9 | 1:31
G35 | GTAGCGCCCGTCCTTAGTGACGGGCGAGGATTGAAAC | 1 | 37 | 22 | 34 | 1 | 0 | 0 | 5 | 1:22
G37 | GTCGCGCGCCCTTCACGGGGCGCGCGTGGATTGAAAC | 1 | 37 | 10 | 34 | 1 | 0 | 0 | 1 | 1:10
G41 | GCTCTCCGCGCCCGCGCGGGCGCGGCCTCGTTGAAGC | 1 | 37 | 4 | 35 | 1 | 1 | 38 | 4 | 1:4
G48 | GATCCCGCCCTCACCCGCACGGGCCGCA | 1 | 28 | 3 | 33 | 2 | 1 | 37 | 4 | 2:1,3:2
G50 | GTTCGCCATCGCATAGATGGTTTAGAAAA | 1 | 29 | 9 | 31 | 1 | 0 | 0 | 1 | 1:9
G53 | CGGGCTCGCCCGTCAGCGATGACGGGCGCGGATTGAAAC | 1 | 39 | 5 | 34 | 1 | 0 | 0 | 1 | 1:5
G56 | AGTTCTCGTCCCCTCGCGGGGTTTTGGGTCTGACGAC | 1 | 37 | 15 | 37 | 1 | 1 | 42 | 9 | 1:14,2:1
G58 | CGCGTTCCCCGCAGGCGCGGGGATGAACCG | 1 | 30 | 4 | 32 | 1 | 0 | 0 | 4 | 1:4
G59 | GGGGTCGCCCCTCGTGATCACGAGGGGCGTGGATTGAAAC | 1 | 40 | 9 | 32 | 1 | 1 | 37 | 2 | 1:9
G60 | CGTCGCGCCCCTCACGGGGCGCGCGGATTGAAACTA | 1 | 36 | 4 | 32 | 1 | 0 | 0 | 1 | 1:4
G63 | GACACGCTCCCCGGCGACGGGGAGCGAGGATTGAAACCAC | 1 | 40 | 4 | 31 | 1 | 0 | 0 | 1 | 1:4
G67 | ATTTCCGCGACTGAAAGGTCGCGGCCTCATTGAAGC | 1 | 36 | 9 | 36 | 1 | 0 | 0 | 1 | 1:9
G70 | CATGTGCTCAACGCCTTTCGGCATCAACGAATGATTCAC | 1 | 39 | 5 | 33 | 1 | 0 | 0 | 1 | 1:5
G71 | GGGGGAGGCCAGGAGGCGGCCGTATCGTGA | 1 | 30 | 3 | 31 | 1 | 0 | 0 | 1 | 1:2,2:1
G72 | CTGTTGCACCCGCCTCTCGGGGCGGGTGGGGATTGAAACCA | 1 | 41 | 5 | 31 | 1 | 2 | 33 | 1 | 1:5
G73 | GTTGATAGCAATAATTCAAAGATACATTCTAAAAGCTATTCACAAC | 1 | 46 | 6 | 30 | 1 | 1 | 32 | 1 | 1:6
G74 | CCTTCAATGAGGCCGAGGCACGAGGCCTCGGAAAAC | 1 | 36 | 8 | 35 | 1 | 2 | 37 | 2 | 1:8
G76 | GTCACAAAGGAGGTTCCGCTCACGCGGATTGAAACA | 1 | 36 | 3 | 36 | 1 | 0 | 0 | 1 | 1:3
G83 | GTTTTTGCGCGTCTGCTCGAAAACCACAA | 1 | 29 | 3 | 28 | 1 | 0 | 0 | 1 | 1:3
G84 | ACAAGGCGTTGTAGACCCCGACGGGAGGAGGGATGAATTCGC | 1 | 42 | 3 | 36 | 1 | 0 | 0 | 1 | 1:3
G85 | GTCGCCGCCTTCACCGGCGGCGCGGATTGAAAC | 1 | 33 | 12 | 33 | 1 | 1 | 102 | 3 | 1:11,2:1
G87 | CTTTCAGTCTCCGCTCTTTCGGAGTAGGTGAGGAATC | 1 | 37 | 5 | 35 | 1 | 0 | 0 | 1 | 1:5
G90 | ATCGCGACTTGCGTCGCTCCTACG | 1 | 24 | 3 | 46 | 1 | 0 | 0 | 2 | 1:3
G91 | CCGCCGCCGGTGCAGGCGATGATGCTCAATGCCTGCGGTGCAGCGCC | 1 | 47 | 3 | 23 | 1 | 1 | 28 | 1 | 1:3
G98 | CGTGGTCCCCGCGTGGGCGGGGATGAGCCGCC | 1 | 32 | 3 | 29 | 1 | 0 | 0 | 2 | 1:3
G100 | GCGCGAGGTCGCGTTGTGACGCGACCAATTGAAACTA | 1 | 37 | 5 | 33 | 1 | 0 | 0 | 1 | 1:5
G101 | GCTTCAATTCGGCCACGGCGTTGATGCCGTGGAAAC | 1 | 36 | 5 | 36 | 1 | 0 | 0 | 1 | 1:5

DR, direct repeats; SP, spacers; FL, flankers

