UBC Theses and Dissertations

The use of high-throughput amplicon deep sequencing to explore aquatic virus communities Tian, Xi 2015

The Use of High-Throughput Amplicon Deep Sequencing to Explore Aquatic Virus Communities

by Xi Tian

B.Sc., The University of British Columbia, 2011

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES (Bioinformatics)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

March 2015

© Xi Tian, 2015

Abstract

Viruses are the most abundant biological entities in aquatic ecosystems. Each milliliter of marine or fresh water typically contains between one and ten million viruses. Aquatic viruses influence microbial diversity, mortality and evolution, which in turn affect biogeochemical cycles and energy fluxes in marine ecosystems. As most aquatic microbes have not been cultured, the viruses which infect them cannot be cultured; hence, non-culture based approaches are needed to ascertain changes in the composition and diversity of virus communities.

This research involves using PCR amplicons and high-throughput sequencing to uncover unknown diversity in marine and freshwater viruses and determine its temporal and spatial variation. Differences in the taxonomic profiles of viruses in the families Phycodnaviridae, Myoviridae, and Podoviridae across marine locations were assessed using 454 pyrosequencing. Temporal and spatial changes in the taxonomic profiles of viruses in the family Myoviridae were assessed in a stream using Illumina sequencing.

The results show that high-throughput sequencing of marker genes is a robust method for exploring viral diversity; it revealed many previously unknown Operational Taxonomic Units (OTUs). Furthermore, distributions of OTUs within virus families differed markedly among samples, indicating that the virus distributions were spatially dynamic. Moreover, the variation of OTUs within the freshwater Myoviridae communities suggested that some OTUs could be used as indicators of agricultural runoff.

Preface

The presented work would not have been possible without the contributions made by collaborators and colleagues. In Chapter 2, Curtis Suttle, Amy Chan, Caroline Chenard, Jessica Clasen, Jessica Labonte, Clemens Pauz, Jerome Payet and Danielle Winget conceived and designed the study and produced the data. Chris Payne and Larysa Pakhomova assisted with sample collection and preparation. Special thanks to the Vancouver Canadian Coast Guard Station, Victoria Whale Watching Company and the crew of the John Strickland, who made the sample collection possible. I built the bioinformatics pipeline for the analysis of the sequencing data; Danielle Winget and I analyzed the data and wrote the manuscript with input from Curtis Suttle.

In Chapter 3, the study was conceived by Patrick Tang, Judith Isaac-Renton, Natalie Prystajecky, Fiona Brinkman and Curtis Suttle. Miguel Uyaguari Diaz, Natalie Prystajecky, Tyler Nelson, Kirby Cronin, Sarah Tan, Matthew Croxen and I did the sample collection. Miguel Uyaguari Diaz and others at the BC Centre for Disease Control (BCCDC) prepared the samples for sequencing. I built the bioinformatics pipeline and analyzed the data with input from members of the Watershed Discovery Team and members of the Suttle Laboratory. I led the interpretation of the data and writing of the manuscript with input from Miguel Uyaguari Diaz, Cheryl Chow, Julia Gustavsen, and Curtis Suttle.

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
List of Abbreviations
Acknowledgements
Dedication
Chapter 1: Introduction
  1.1 Aquatic viruses and their importance
    1.1.1 Viral effects on microbial mortality and diversity
    1.1.2 Viral effects on biogeochemical cycling
  1.2 Assessing viral diversity
    1.2.1 Transmission electron microscopy (TEM)
    1.2.2 DNA-DNA hybridization
    1.2.3 Pulsed-field gel electrophoresis (PFGE)
    1.2.4 PCR and DNA sequencing
      1.2.4.1 Denaturing gradient gel electrophoresis (DGGE)
      1.2.4.2 Restriction fragment length polymorphism (RFLP)
      1.2.4.3 DNA sequencing
        1.2.4.3.1 Shotgun and targeted amplicon sequencing
  1.3 Spatial and temporal diversity in aquatic viruses
    1.3.1 Factors for viral diversity
    1.3.2 Spatial and temporal diversity in the marine environments
    1.3.3 Spatial and temporal diversity in the freshwater environments
  1.4 Research objectives and outline of this thesis
Chapter 2: Sequencing of Marker-gene Fragments for Three Viral Families Reveals Pronounced Differences among Marine Environmental Samples
  2.1 Introduction
  2.2 Materials & methods
    2.2.1 Sample collection & virus concentration
    2.2.2 DNA extraction
    2.2.3 PCR amplification of virus genes
    2.2.4 Sequencing
    2.2.5 Bioinformatic analyses
  2.3 Results and discussion
    2.3.1 Bioinformatic pipeline
    2.3.2 Phycodnaviridae
    2.3.3 Myoviridae
    2.3.4 Podoviridae
  2.4 Conclusions
Chapter 3: Temporal and Spatial Variation of Bacteriophage Composition in an Agriculturally-influenced Freshwater Watershed
  3.1 Introduction
  3.2 Material and methods
    3.2.1 Environmental data collection
    3.2.2 Sample collection
    3.2.3 Concentration of viral particles
    3.2.4 Nucleic acid extraction
    3.2.5 Amplification of g23 encoding the viral major capsid protein
    3.2.6 Sequencing and bioinformatics analysis
    3.2.7 OTU analysis
  3.3 Results and discussion
  3.4 Summary
Chapter 4: Conclusion
Bibliography

List of Tables

Table 2-1 GOB sampling details
Table 2-2 Sequence retention
Table 2-3 Bins tracking
Table 3-1 Environmental data

List of Figures

Figure 2-1 Rarefaction curves for the three amplified gene products.
Figure 2-2 Shannon's diversity indices for marker sequences.
Figure 2-3 Top 50 OTUs from the Phycodnaviridae.
Figure 2-4 Phycodnaviridae phylogenetic tree and heat map showing relative abundance.
Figure 2-5 Frequency histograms of the 50 most abundant Myoviridae OTUs.
Figure 2-6 Myoviridae phylogenetic tree and heat map showing relative abundance.
Figure 2-7 Top 50 OTUs from the Podoviridae are shown as stacked histograms.
Figure 2-8 Podoviridae phylogenetic tree and heat map showing relative abundance.
Figure 3-1 An illustration of the sampling sites at the watershed.
Figure 3-2 Shannon's diversity indices of the 3 sample sites over time.
Figure 3-3 Thirteen months of environmental data are graphed using a principal coordinate analysis (PCA) and visualized by sample location.
Figure 3-4 Principal coordinate analysis of the environmental data visualized by seasons.
Figure 3-5 OTUs that represented more than 5% of any sample are visualized in a heatmap.
Figure 3-6 Phylogenetic analysis of g23 sequences from representative isolates and 25 OTUs from this study.

List of Abbreviations

BLAST  Basic Local Alignment Search Tool
DGGE   denaturing gradient gel electrophoresis
DNA    deoxyribonucleic acid
DOM    dissolved organic matter
ml     milliliter
OTU    operational taxonomic unit
PCR    polymerase chain reaction
PFGE   pulsed-field gel electrophoresis
RNA    ribonucleic acid
TEM    transmission electron microscopy
RFLP   restriction fragment length polymorphism
UV     ultraviolet

Acknowledgements

First and foremost I would like to thank my supervisor, Dr. Curtis Suttle, who guided me throughout the years of grad school.

I would also like to express my appreciation to my thesis committee members, Dr. Fiona Brinkman and Dr. Steven Hallam, for their time and effort in reviewing my work. Special thanks to Dr. Patrick Tang for reading my thesis and attending my thesis defense, and to Dr. Paul Pavlidis for serving as the examination chair.

Finally, I would like to thank the members of the watershed project, including people from the BC Centre for Disease Control and Simon Fraser University. It was a pleasure working with all of you. Special thanks to the past and present members of the Suttle lab, especially Danielle Winget, who has taught me many things since I first started as a graduate student.

Dedication

For my family, and those who have always supported me.

Chapter 1: Introduction

Viruses are small infectious agents containing DNA or RNA encapsulated in a protein coat. They are most widely known for causing diseases ranging from the common cold to the deadly Ebola, but they are also the most common biological entities on Earth. In aquatic systems, which cover about 71% of the Earth's surface and about 96% of which are oceans, there are typically about 10 million free virus particles per mL, representing about 10^30 viruses in total; if stretched end-to-end they would traverse about 200 million light years (Suttle, 2007).

1.1 Aquatic viruses and their importance

Aquatic viruses influence microbial diversity, mortality and evolution, which in turn affect biogeochemical cycles and energy fluxes in marine ecosystems (Fuhrman 1999; Wilhelm and Suttle 1999; Suttle 2005, 2007; Weinbauer 2004; Wommack and Colwell 2000). They infect aquatic life from microbes to whales, and are also vectors of horizontal gene transfer (reviewed by Chibani-Chennoufi et al. 2004; Weinbauer 2004).

1.1.1 Viral effects on microbial mortality and diversity

Viruses can affect microbial diversity because viral infections are density-dependent and host-specific (Ackermann 2007; Fuhrman 1999; Weinbauer 2004; Suttle 2007). The most abundant microbial taxa have the highest contact rates with virus particles and hence the highest probability of contacting an infectious virus.
This leads to the concept of "kill the winner" (Thingstad et al., 1997; Thingstad, 1998), in which the most rapidly growing bacteria are prevented from dominating the community because of viral lysis. Lysis of the host bacteria releases more viruses, which infect the remaining host cells; thereby, microbial diversity is maintained even if environmental factors favor dominance by only a few groups (Rodriguez-Valera et al., 2009). Studies have shown that viral abundance and diversity are directly correlated with host species' abundance and diversity (Fuhrman & Schwalbach, 2003). Consequently, knowledge of viral diversity provides insights into host diversity.

1.1.2 Viral effects on biogeochemical cycling

Viral infection also affects biogeochemical cycling. The products of cell lysis by viruses are released to the pool of dissolved organic matter (DOM) (Fuhrman, 1999; Middelboe & Lyck, 2002; Wilhelm & Suttle, 1999b), which short circuits the flow of organic matter and shunts organic carbon from cells into bacteria. It is estimated that 5% to 26% of the organic carbon fixed by photosynthesis passes through the viral shunt (Wilhelm & Suttle, 1999b). In turn, the nutrients released by viral lysis provide substrates for the growth of heterotrophic bacteria (Fuhrman, 1999; Gobler et al., 1997; Wilhelm & Suttle, 1999b), and indirectly result in regeneration of nutrients such as iron (Poorvin, Rinta-Kanto, Hutchins, & Wilhelm, 2004) and ammonium (Shelford, Middelboe, Møller, & Suttle, 2012; Weinbauer et al., 2011). Estimates suggest that viral lysis can increase ambient dissolved organic carbon (DOC) by 30%, as well as the concentrations of other nutrients (Gobler et al., 1997), emphasizing the important role of viruses in biogeochemical cycling.

1.2 Assessing viral diversity

1.2.1 Transmission electron microscopy (TEM)

Morphological characterization of viruses requires TEM (Weinbauer, 2004;
Wommack & Colwell, 2000), and this was the approach first used to examine the abundance and diversity of marine viruses (Bergh, Borsheim, Bratbak, & Heldal, 1989; Proctor & Fuhrman, 1990; Torrella & Morita, 1979). Many of the particles had head and tail structures consistent with bacteriophages (Bergh et al., 1989; Proctor & Fuhrman, 1990; Torrella & Morita, 1979), although most viruses do not have tails (Brum, Schenck, & Sullivan, 2013). TEM was also used to look at the percentage of visibly infected cells, which indicated that the proportion of the bacterial community lysed by viruses varies widely (Proctor & Fuhrman, 1990; Weinbauer, Winter, & Hofle, 2002). TEM also showed that different families of viruses can infect the dominant group of primary producers in the ocean (Sullivan, Waterbury, & Chisholm, 2003; Suttle & Chan, 1990; Waterbury & Valois, 1993). However, TEM often under-estimates abundance in comparison to other techniques, and morphological differences do not reflect the underlying genetic diversity.

1.2.2 DNA-DNA hybridization

DNA-DNA hybridization can be used to estimate sequence similarity. In DNA-DNA hybridization, a virus isolate is labeled radioactively or fluorescently and probed against other viral isolates on a Southern blot. This method is useful for comparing the genetic similarity among viruses (Jarvis 1984; Ogunseitan et al. 1992). However, it does not allow researchers to discover new viruses, to distinguish among distantly related viruses or among very similar viruses (Cottrell et al. 1995), or to uncover any sequence data that would reveal a gene or genome of interest. In the era of low-cost sequencing, this approach has largely been abandoned.

1.2.3 Pulsed-field gel electrophoresis (PFGE)

PFGE can be used to estimate the range of viral genome sizes in natural water samples by separating and visualizing different sized genomes on a gel.
A snapshot of the community is taken based on band intensity and the generated patterns. This technique has been used to reveal a wide range of genome sizes (between 9 kb and 700 kb), although most were between 70 kb and 80 kb (Wommack et al. 1999; Wommack and Colwell 2000). PFGE is a very low resolution approach for estimating viral diversity, as different viruses can have similar genome sizes (Riemann & Middelboe, 2002; Steward, Montiel, & Azam, 2000; Wommack, Ravel, Hill, & Chun, 1999), and the number of viruses needed to resolve a band is typically in the millions. In addition, only dsDNA viruses are typically resolved on PFGE gels. Consequently, the results from PFGE are qualitative and do not reveal anything about the genetic diversity of viral communities.

1.2.4 PCR and DNA sequencing

PCR combined with sequencing can be used to examine the diversity and dynamics of viruses; however, because viruses do not have a common genetic marker, diversity comparisons are only possible within specific groups of viruses. Examples of such markers include DNA polymerase for the Phycodnaviridae and the Podoviridae, the major capsid protein for T4-like Myoviridae, and the RNA-dependent RNA polymerase (RdRp) gene for the Marnaviridae.

1.2.4.1 Denaturing gradient gel electrophoresis (DGGE)

PCR amplification of marker genes combined with DGGE, in which a DNA sample is electrophoresed across a gel with an increasing concentration of a denaturant to separate DNA of the same size that differs in sequence, has been used to compare viral communities. Comparing banding patterns has allowed changes in viral communities to be resolved across time and space in aquatic environments (e.g. Short and Suttle 1999; Adriaenssens and Cowan 2014).

1.2.4.2 Restriction fragment length polymorphism (RFLP)

Another approach is to cut the amplicons with restriction enzymes to create fragments of different sizes depending on sequence, and to analyze the RFLPs by capillary electrophoresis. The resulting fingerprints vary depending on the relative abundance of amplicon sequences. This technique has been used to follow temporal and spatial changes in marine cyanophage communities (Marston & Sallee, 2003; Pagarete et al., 2013).

1.2.4.3 DNA sequencing

A very high resolution approach for comparing the genetic composition of specific groups of viruses across time and space is high-throughput sequencing of PCR amplicons, which can yield billions of sequences at a manageable cost. This approach has uncovered enormous viral diversity that was unknown from cultured isolates.

The Sanger sequencing method requires cloning and was the main sequencing method until the mid-2000s (Sanger, Nicklen, & Coulson, 1977). It produces relatively long sequences (400-900 bp) with a relatively high level of accuracy (99.9%), but at relatively high cost and low throughput.

Subsequently, non-Sanger based high-throughput DNA sequencing technologies emerged in which millions to billions of DNA strands can be sequenced in parallel. Pyrosequencing was one of the earlier non-Sanger methods; it yields about 1 million sequences of 300 to 700 bp in a run. It has been used to reveal the diversity of prasinoviruses off the African coast (Clerissi et al. 2014) and the high temporal and spatial diversity of marine RNA viruses in coastal British Columbia waters (Gustavsen et al. 2014). One of the downsides of pyrosequencing is that it is prone to error in determining the correct number of bases when there is a series of the same nucleotide (Balzer, Malde, & Jonassen, 2011).

Illumina sequencing is a high-throughput technology that produces up to 3 billion bases in a run.
Illumina sequencing produces more realistic estimates of 16S rRNA bacterial community richness and evenness than 454 pyrosequencing (Logares et al., 2013).

1.2.4.3.1 Shotgun and targeted amplicon sequencing

Shotgun and targeted amplicon sequencing are two approaches that can be used to estimate viral diversity in natural samples. Each method has its own advantages and disadvantages.

An advantage of the shotgun approach is that the sequences are recovered "randomly" in proportion to their occurrence in the environmental sample, which allows researchers to assemble contiguous sequences, and potentially whole genomes, from short fragments of nucleic-acid sequences (Kristensen, Mushegian, Dolja, & Koonin, 2012). In addition, since the nucleic acids are sequenced randomly, this approach can also be used to predict the environmental coding potential for proteins, metabolic pathways, and the community metabolic potential (Schoenfeld, Liles, Wommack, Polson, & Mead, 2010). The first marine shotgun metagenomic surveys revealed that almost all viral sequences had no hits to sequences in databases (Breitbart et al. 2002; Angly et al. 2006). The lack of representative sequences in databases is one of the weaknesses of analyzing shotgun-sequenced viral communities, as sequences without any recognizable similarity to those in databases are often discarded. As well, the enormous diversity of aquatic viral communities means that the sequencing depth has to be enormous to achieve any meaningful assembly.

PCR amplicon sequencing targets a specific gene of interest in an attempt to infer the diversity and evolutionary relationships within a specific taxonomic group, or related group of sequences. For example, the 16S rRNA and 18S rRNA genes have been used extensively as markers to explore the diversity of prokaryotes and eukaryotes, respectively (Field, Olsen, & Lane, 1988; Weisburg, Barns, Pelletier, & Lane, 1991).
Given that viruses do not have universally conserved genetic markers, genes encoding DNA polymerase and structural proteins are often used as viral markers (Clasen & Suttle, 2009; Filee et al., 2005; Labonte, Reid, & Suttle, 2009).

A disadvantage of targeted amplicon sequencing is that it only provides information on the diversity and evolutionary relationships of a single marker gene, and the inferences that can be made about the diversity and evolution of the rest of the genome are limited. Moreover, PCR amplification of DNA is often biased, so that the relative abundance of a specific sequence in the original sample may not be reflected by its relative abundance in the amplicon library. However, because the same region of a gene is sequenced in much more depth than in shotgun sequencing, amplicon sequencing is well suited to diversity analysis. Also, any previously unknown sequences can be assigned to Operational Taxonomic Units (OTUs) by defining a similarity threshold. The targeted sequencing approach has led to the discovery of many previously unknown phylogenetic groups of DNA and RNA viruses (Clasen & Suttle, 2009; Comeau & Krisch, 2008; Culley, Lang, & Suttle, 2003; Filee et al., 2005; Labonte et al., 2009).

1.3 Spatial and temporal diversity in aquatic viruses

1.3.1 Factors for viral diversity

Research has shown that viral diversity varies seasonally and spatially in marine and freshwater environments (Weinbauer, 2004), although the factors affecting viral diversity are poorly understood. Diversity is affected by both biotic and abiotic conditions, including the quality and availability of the host, and selective grazing (Jürgens, Pernthaler, Schalla, & Amann, 1999). In addition, genetic exchanges between viruses and hosts affect viral diversity (Moineau, Pandian, & Klaenhammer, 1994). Abiotic factors such as light, UV, pH and temperature could also contribute to changes in viral diversity in the environment.

1.3.2 Spatial and temporal diversity in the marine environments

Comprehensive data on the temporal and spatial distribution of viruses in marine, freshwater and soil environments are not available. Research on marine environments suggests that either a few groups dominate or viruses are evenly distributed (Angly et al., 2006; Breitbart et al., 2002; Suttle, 2007; Weinbauer, 2004). Furthermore, viral composition varies greatly across locations. For example, one study found that in a few samples 72% of viral Operational Taxonomic Units (OTUs) were shared between at least two sites, whereas in other samples only 2 OTUs were shared (Marston et al., 2013). In addition, viruses from different geographic locations can be genetically similar, while viruses from the same environment can be very different (Breitbart & Rohwer, 2005; Short & Suttle, 2002; Wilson et al., 1999).

Viral diversity exhibits seasonal patterns. A study on Phycodnaviridae using DGGE showed that some OTUs were consistently present throughout the season, while others varied (Short & Suttle, 2003). Some studies have shown that some OTUs are consistently present even when the community appears to be dynamic (Djikeng, Kuzmickas, Anderson, & Spiro, 2009; Rodriguez-Brito et al., 2010). Other studies have observed cyclical patterns (Chow & Fuhrman, 2012; Clasen et al., 2013; Marston et al., 2013).

1.3.3 Spatial and temporal diversity in the freshwater environments

The diversity and distribution of viruses in freshwater also vary temporally and spatially. Studies have observed monthly changes in lakes and rivers, usually a sharp increase in abundance in a particular season (Bettarel, Sime-Ngando, Amblard, Carrias, & Portelli, 2003; Brum, Steward, Jiang, & Jellison, 2005; Filippini, Buesing, & Gessner, 2008). Previous work suggested that seasonal peaks in viral abundance could be caused by phytoplankton blooms (Wommack & Colwell, 2000). Filippini et al.
(2008) confirmed that there is a seasonal change in viral diversity in Swiss lakes and further reported that horizontal spatial variation was weak, possibly due to water exchange.

1.4 Research objectives and outline of this thesis

As agents of mortality, viruses influence microbial diversity and nutrient cycles, but most are not culturable; hence, non-culture based approaches are needed to ascertain changes in the composition and diversity of virus communities.

The major questions I address in this thesis are as follows:

1) What is the difference in the taxonomic profiles of the viral families Phycodnaviridae, Myoviridae, and Podoviridae among three coastal locations in southern British Columbia? The null hypothesis is that the viral communities do not differ among locations.

2) What is the difference in the taxonomic profiles of the viral family Myoviridae over the span of a year at three locations in a freshwater watershed that differ in agricultural impact? The null hypothesis is that the viral communities do not differ with time or among locations.

The research involves using PCR amplicons and high-throughput sequencing to uncover unknown diversity in marine and freshwater viruses and determine its temporal and spatial variation. Differences in the taxonomic profiles of viruses in the families Phycodnaviridae, Myoviridae, and Podoviridae across marine locations were assessed using 454 pyrosequencing; these results are discussed in Chapter 2. Temporal and spatial changes in the taxonomic profiles of viruses in the family Myoviridae in the freshwater system were assessed using Illumina sequencing; these results are discussed in Chapter 3.

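The taxonomic profiles referred to in the two questions above are, in essence, tables of OTU counts per sample, where an OTU groups amplicon reads that exceed a chosen similarity threshold (section 1.2.4.3.1). The following is a minimal sketch of that idea and is not the pipeline used in this thesis. It assumes reads that are already aligned to the same length, uses a simple greedy "first matching seed" rule, and summarizes the resulting profile with Shannon's diversity index, which is the diversity measure reported in Figures 2-2 and 3-2. The function names, the toy reads and the 0.85 threshold are all hypothetical choices made for illustration.

```python
from collections import Counter
from math import log


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length, pre-aligned reads."""
    if len(a) != len(b):
        raise ValueError("reads must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otus(reads, threshold=0.97):
    """Assign each read to the first OTU whose seed it matches at >= threshold.

    A read that matches no existing seed becomes the seed of a new OTU.
    Returns a list of OTU indices, one per read (hypothetical helper).
    """
    seeds = []
    labels = []
    for read in reads:
        for i, seed in enumerate(seeds):
            if identity(read, seed) >= threshold:
                labels.append(i)
                break
        else:
            # No seed was similar enough: this read founds a new OTU.
            seeds.append(read)
            labels.append(len(seeds) - 1)
    return labels


def shannon(counts):
    """Shannon's diversity index H' = -sum(p_i * ln p_i) over non-zero counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum((c / total) * log(c / total) for c in counts if c > 0)


# Toy 10-base "amplicons"; a 0.85 threshold tolerates one mismatch in ten.
reads = ["ACGTACGTAC", "ACGTACGTAT", "TTTTGGGGCC", "ACGTACGTAC"]
labels = greedy_otus(reads, threshold=0.85)
profile = Counter(labels)  # OTU index -> number of reads in this sample
print(labels)
print(round(shannon(profile.values()), 3))
```

In this toy sample the first, second and fourth reads fall into one OTU and the third read founds a second OTU. Greedy seeding depends on read order and ignores insertions and deletions, so it should be read only as an illustration of what "defining a similarity threshold" means, not as a substitute for an alignment-based clustering tool.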
Chapter 2: Sequencing of Marker-gene Fragments for Three Viral Families Reveals Pronounced Differences among Marine Environmental Samples

2.1 Introduction

High-throughput sequencing technology has revolutionized our understanding and examination of microbial diversity by providing massive sequencing capacity at an ever-decreasing cost. Where the majority of microbes were once unculturable and thus unknown (Amann, Ludwig, & Schleifer, 1995), we can now thoroughly, efficiently, and almost routinely examine human and environmental microbiomes (e.g. DeLong et al., 2006; Venter et al., 2004). This era of microbial ecology is built on community species profiles produced by massively-parallel sequencing of short, PCR-amplified regions of the ribosomal RNA gene. From January to December 2014, over 230 scholarly articles featuring 16S rDNA deep sequencing or pyrosequencing were cataloged in PubMed alone.

Although the picture of microbial diversity is ever sharpening, our understanding of viral diversity remains fuzzy. Viruses are ubiquitous "nanomachines" (Hemminga et al., 2010; Suttle, 2007) that represent a genetic reservoir equaling, if not exceeding, that found in prokaryotes. Through infection and host cell lysis, viruses influence microbial community structure (Fuhrman & Schwalbach, 2003; Jacquet et al., 2002; Martinez, Schroeder, Larsen, Bratbak, & Wilson, 2007; Middelboe et al., 2001; Tarutani, Nagasaki, & Yamaguchi, 2000; Winter, Herndl, & Weinbauer, 2004) and biogeochemical cycling (Bratbak, Heldal, Thingstad, Riemann, & Haslund, 1992; Shelford et al., 2012; Suttle, 2005; Wilhelm & Suttle, 1999a), and mediate horizontal gene transfer (Brussow, Canchaya, & Hardt, 2004; Huang, Wilhelm, Jiao, & Chen, 2010; Jiang & Paul, 1998). However, these abundant marine denizens are arguably also the least well characterized, both physically and genetically.

Culture-based examinations of virion structure, viral infection cycles, genome sequence and transcription patterns offer the most complete pictures of virus diversity (Wang & Chen, 2008; Wilson et al., 2005), but are limited to those viruses which infect culturable hosts and are themselves amenable to isolation and culture.  Because viruses lack a common genetic marker, complete species inventories of virus communities cannot be performed as they are for prokaryotes and eukaryotes (Edwards & Rohwer, 2005).  Whole virus community diversity has been assessed with DNA-DNA hybridization (Wommack, Ravel, Hill, & Colwell, 1999), pulsed-field gel electrophoresis (PFGE) (e.g. Sandaa & Larsen, 2006; Steward et al., 2000; Wommack, Ravel, Hill, Chun, & Colwell, 1999), and randomly amplified polymorphic DNA (RAPD) PCR (Wells & Deming, 2006; Winget & Wommack, 2008; Wommack, Ravel, Hill, Chun, et al., 1999) for effective comparisons of community richness across temporal, spatial, and depth gradients.  Unfortunately, each of these techniques fails to capture full community diversity, either because of detection limits (hybridization, PFGE) or because it interrogates only a subset of viruses (RAPD-PCR).  None are reliably quantitative.

Advances in sequencing technology offer a possible solution to the difficulties of assessing viral diversity.  By shotgun sequencing viral DNA purified directly from the environment, metagenomic sequencing bypasses the need for a common viral genetic marker, isolation, or cultivation.  Previous marine viral metagenomes revealed the vast amount of genetic diversity hiding within the virosphere (Angly et al., 2006; Bench et al., 2007; Breitbart et al., 2002; Williamson et al., 2008), with estimates of thousands of viral species per liter of seawater (Angly et al., 2006; Bench et al., 2007; Breitbart et al., 2002). However, compared to microbial metagenomes, few large contigs (>5kb) and even fewer full viral genomes are usually assembled from viral metagenomes.
The majority of viral sequences obtained lack any homologue in known sequence space (Bench et al., 2007; Rodriguez-Brito et al., 2010; Schoenfeld et al., 2008; Williamson et al., 2008), with notable exceptions in low complexity environments (Andersson & Banfield, 2008; Garcia-Heredia et al., 2012).  Estimates of virus richness and evenness can be based on the contig spectra obtained by assembling metagenome reads (Breitbart et al., 2002), but such estimates are highly dependent on assumptions of average genome size, rates of genome evolution, and viral diversity within a strain or genus (Schoenfeld et al., 2008), leaving them as intriguing but inconclusive clues to viral diversity.

Thus, marine viral ecologists often focus on cataloging viral richness through PCR amplification of genes conserved within viral groups or families. Established techniques exist for assessing the richness of the capsid structural genes of Myoviridae (g23 and g20; g23: Clokie, Millard, & Mann, 2010; Comeau & Krisch, 2008; Filee et al., 2005; Jia, Ishihara, Nakajima, Asakawa, & Kimura, 2007), the DNA polymerases of Podoviridae (Breitbart, Miyake, & Rohwer, 2004; Chen et al., 2009; Huang et al., 2010; Labonte et al., 2009) and of algal viruses, the Phycodnaviridae (Chen, Suttle, & Short, 1996; Clasen & Suttle, 2009; Culley, Asuncion, & Steward, 2009; Gimenes, Zanotto, Suttle, da Cunha, & Mehnert, 2012; Park, Lee, Lee, Kim, & Choi, 2011; Short & Suttle, 1999, 2002, 2003; Short & Short, 2008, 2009), and RNA viruses (Gustavsen et al., 2014).  Surprisingly, PCR amplification of specific viral genes from environmental samples has yet to be combined with high-throughput, short read length next-generation sequencing, as it has in microbial ecology (Comeau, Li, Tremblay, Carmack, & Lovejoy, 2011; Galand, Casamayor, Kirchman, & Lovejoy, 2009).
Sequencing thousands of amplicons from the same PCR reaction should yield a sample robust enough to estimate viral strain richness and evenness, which are necessary to quantify and predict viral diversity.

In order to assess how the diversity of viruses differed among five coastal seawater samples from British Columbia, Canada, PCR amplicons representing two families of bacteriophages (Myoviridae and Podoviridae) and a family of viruses infecting eukaryotic algae (Phycodnaviridae) were deeply sequenced. A validated bioinformatic pipeline for analyzing the data was developed, the amplicons were clustered into operational taxonomic units (OTUs) based on genetic similarity, and their distribution and relative abundances were compared across locations and between depths. The data demonstrate that amplicon deep sequencing is a powerful approach for characterizing diversity within viral families, and that the taxonomic composition within families varies markedly among samples.

2.2 Materials & methods

2.2.1 Sample collection & virus concentration

Water samples (20 to 72 litres) were collected from locations near Vancouver, British Columbia (Table 2-1).  Two samples from English Bay, near Vancouver, and one in the Strait of Juan de Fuca near Victoria were collected from surface waters by bucket in June and July 2006, and are referred to as GOB samples.  Eight Saanich samples were collected by Niskin bottle from 10m and 200m from April 2007 through December 2008 (Table 2-1).  Within 24 hours of collection, Saanich samples were sequentially pre-filtered through 47-mm diameter GF/D (2.7 µm nominal pore size, Whatman) and 0.22 µm Sterivex filters (Millipore; Zaikova et al., 2010), while the GOB samples were sequentially pre-filtered through GC50 (1.2 µm nominal pore size, Micro Filtration Systems) and 0.45 µm pore size HVLP (Durapore, Millipore) filters.
After pre-filtration, all samples were concentrated to < 1 liter by tangential flow filtration (30kD Prep-Scale TFF, Millipore) (Suttle, Chan, & Cottrell, 1991).  Viral concentrates (VCs) were stored at 4℃ in the dark until DNA extraction.

Sample | Date Collected | Location | Depth (m) | Temperature (°C) | Salinity (psu) | Volume Collected (L) | Prefilter | Final VC Volume (mL) | Volume VC Extracted (mL)
GOB1 | 22/06/06 | 49.28N, 123.20W | 1 | 18 | 12 | 41 | GC50-HVLP | 900 | 650
GOB2 | 28/06/06 | 49.32N, 123.25W | 1 | 14 | 23 | 45 | GC50-HVLP | 795 | 650
GOB3 | 18/07/06 | 48.45N, 123.32W | 1 | 8.3 | 33 | 72 | GC50-HVLP | 620 | 420
Saanich Oxic | 24/04/07 | 48.58N, 123.5W | 10 | na | na | unknown | GF/D-Sterivex | 245 | 1.7
Saanich Oxic | 13/02/08 | 48.58N, 123.5W | 10 | 7.5 | 30 | 18 | GF/D-Sterivex | 287 | 7.8
Saanich Oxic | 19/03/08 | 48.58N, 123.5W | 10 | 7.5 | 30 | 18 | GF/D-Sterivex | 168 | 5.9
Saanich Oxic | 09/04/08 | 48.58N, 123.5W | 10 | 7.6 | 30 | 16 | GF/D-Sterivex | 230 | 2.9
Saanich Oxic | 11/06/08 | 48.58N, 123.5W | 10 | 9.9 | 30 | 14 | GF/D-Sterivex | 185 | 1.1
Saanich Oxic | 11/08/08 | 48.58N, 123.5W | 10 | 13 | 29 | 18 | GF/D-Sterivex | 190 | 3.1
Saanich Oxic | 10/12/08 | 48.58N, 123.5W | 10 | 9.1 | 30 | 15 | GF/D-Sterivex | 210 | 11
Saanich Anoxic | 24/04/07 | 48.58N, 123.5W | 200 | na | na | 17 | GF/D-Sterivex | 220 | 10
Saanich Anoxic | 13/02/08 | 48.58N, 123.5W | 200 | 9.4 | 31 | 17 | GF/D-Sterivex | 300 | 8.9
Saanich Anoxic | 19/03/08 | 48.58N, 123.5W | 200 | 9.4 | 31 | 16 | GF/D-Sterivex | 220 | 23
Saanich Anoxic | 09/04/08 | 48.58N, 123.5W | 200 | 9.4 | 31 | 19 | GF/D-Sterivex | 250 | 8.3
Saanich Anoxic | 11/06/08 | 48.58N, 123.5W | 200 | 9.4 | 31 | 18 | GF/D-Sterivex | 235 | 15
Saanich Anoxic | 11/08/08 | 48.58N, 123.5W | 200 | 9.4 | 31 | 19 | GF/D-Sterivex | 160 | 78
Saanich Anoxic | 02/11/08 | 48.58N, 123.5W | 200 | na | na | 16 | GF/D-Sterivex | 186 | 41

Table 2-1 GOB sampling details

2.2.2 DNA extraction

For the 10m and 200m Saanich samples, viral abundances in each VC were assessed by flow cytometry as previously described (Brussaard, 2004), and 1 × 10^9 viruses from each VC were pooled prior to ultracentrifugation and DNA extraction; thus, each of these samples represents a mix of the viruses in Saanich Inlet, BC from spring 2007 to winter 2008, at either the 10m or 200m depth.
Prior to DNA extraction, each of the five samples (3 GOB, 2 Saanich) was further concentrated by ultracentrifugation (124000 x g, 4h, 15℃).  The supernatant was removed, and pelleted viruses were resuspended in 500 µL of 10mM Tris-HCl with 1% SDS at 4℃ overnight.  Pellets were pooled the next day, and tubes were rinsed with an additional 700 µL of 10mM Tris-HCl, 1% SDS to improve virus recovery.  Viral capsids were lysed by addition of Proteinase K to a final concentration of 100 µg ml^-1 and incubation at 55℃ for one hour.  DNA extraction followed a standard phenol:chloroform extraction protocol followed by ethanol precipitation (Short Protocols in Molecular Biology).  DNA was resuspended in DEPC-treated water (Gibco) at 4℃ and stored at -20℃.

2.2.3 PCR amplification of virus genes

A portion of the gene encoding the major capsid protein of Myoviridae (g23) was amplified from DNA extracts with primers MZ1A1bis and MZ1A6 (Filee et al., 2005).  Briefly, PCR reactions contained 1x Taq polymerase reaction buffer, 1.5 mM MgCl2, 0.2mM dNTPs (each), 1 µM each primer, 1U Platinum Taq DNA polymerase (Invitrogen), and 5 µL of template DNA (Filee et al., 2005).  PCR conditions were as follows: i) 94℃ for 90s; ii) 94℃ for 30s; iii) 35℃ for 1min; iv) 72℃ for 45s; v) repeat steps ii-iv for a total of 5 cycles; vi) 94℃ for 30s; vii) 50℃ for 1min; viii) 72℃ for 45s; ix) repeat steps vi-viii for a total of 35 cycles; x) 72℃ for 9min; xi) hold at 4℃.

A portion of the DNA polymerase gene of Phycodnaviridae (AVS) was amplified from DNA extracts with primers AVS1 and AVS2 (Chen & Suttle, 1995).  PCR reactions contained 1x Taq polymerase reaction buffer, 1.5 mM MgCl2, 0.2mM dNTPs (each), 1 µM AVS1, 3 µM AVS2, 1U Platinum Taq DNA polymerase (Invitrogen), and 2 µL of template DNA.  PCR conditions were as follows: i) 94℃ for 90s; ii) 94℃ for 45s; iii) 45℃ for 45s; iv) 72℃ for 45s; v) repeat steps ii-iv for a total of 35 cycles; vi) 72℃ for 10min; vii) hold at 4℃.
The DNA polymerase gene of Podoviridae (PODO) was amplified from DNA extracts with primers PodoF and PodoR2 (Labonte et al., 2009).  Reactions contained 1x Taq polymerase reaction buffer, 1.5 mM MgCl2, 1mM dNTPs (each), 1.2 µM each primer, 2.5U Platinum Taq DNA polymerase (Invitrogen), and 2 µL of template DNA.  PCR conditions were as follows: i) 94℃ for 3min; ii) 94℃ for 30s; iii) 56℃ for 30s; iv) 72℃ for 1min; v) repeat steps ii-iv for a total of 39 cycles; vi) 72℃ for 10min; vii) hold at 4℃.  A second 20-cycle round of PCR amplification was necessary to improve the yield of PCR products (Labonte et al., 2009).  Reaction conditions mirrored those above, with the exception that 1 µL of first-round PCR reaction was used as template.

For each sample-primer combination, four to ten separate PCR amplifications were performed, depending on sample and primer.  All PCR reactions were visualized on 1% agarose gels in 0.5x TBE, stained with SYBR Gold (2.5x final concentration), to confirm that a product of the expected size range was amplified. After visualization, the remaining 45 µL of the 50 µL PCR products were purified using a MinElute PCR Purification Kit (Qiagen) according to the manufacturer's instructions.  To create one DNA extract per sample-primer combination from the several PCR procedural replicates, the replicates for each sample-primer set were combined onto one column during PCR purification, eluted in either autoclaved and 0.22 µm-filtered MilliQ water or DEPC-treated water (Gibco), and stored at -20℃.

2.2.4 Sequencing

For each GOB sample, g23, AVS, and PODO purified PCR products were mixed and sequenced with 454 Titanium chemistry.  Each sequencing library preparation mixture contained 525ng of each viral PCR product, with the exception of PODO in GOB1 (143ng).  For each Saanich sample, 500ng of each purified viral PCR product (g23, AVS, PODO) were combined and subjected to 454 Titanium sequencing.
Five hundred nanograms of purified Microviridae PCR products were combined with the Saanich samples at the time of sequencing submission, and those data are reported elsewhere (Labonte et al., manuscript in review).  All sequencing was performed at the Broad Institute of MIT and Harvard (Cambridge, MA, USA). We specifically requested that samples not be sheared prior to library preparation, to avoid dissociating primers from their sequences.

2.2.5 Bioinformatic analyses

To assign sequences to an OTU, an analysis pipeline was developed and validated using freely available scripts and software. Forward and reverse sequences were kept separate throughout the analysis.  Briefly, sequences in each sample were identified by primer, de-noised to account for sequencing and PCR error, decontaminated of spurious PCR products and chimeras, translated into amino-acid sequences, aligned, trimmed, and clustered at 100% identity to collapse identical sequences.  Then, samples were clustered together for final comparisons at 92% identity for AVS (Clasen & Suttle, 2009) and 95% identity for PODO and g23 sequences (Labonte et al., 2009). After the second round of clustering, singletons were removed from each sample.  A detailed description of these pipeline steps and their validation follows.

Raw 454 reads were subjected to quality control filtering and identified by primer using the split_libraries.py script in the QIIME package v1.4 from Qiime.org (Caporaso et al., 2010).  The script is designed to bin next-generation sequences by barcode; our forward and reverse primer sequences served as suitable barcode identifiers.  Up to three base-pair mismatches in primer identification were allowed, which improved sequence retrieval for the Saanich data in particular.  Split_libraries.py also removes low quality sequences according to user-set variables.
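The primer-based binning with a mismatch tolerance described above can be sketched as follows. This is a minimal illustration, not the QIIME implementation: the function names and the two demonstration primer sequences are hypothetical, and the sketch assumes non-degenerate primers compared by simple Hamming distance at the 5' end of each read.

```python
# Hypothetical demonstration primers; NOT the MZ1A1bis, AVS, or Podo primer sequences.
DEMO_PRIMERS = {"demo_F": "ACGTACGTAC", "demo_R": "TTGGCCAATT"}


def hamming(a, b):
    """Count mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def assign_primer(read, primers, max_mismatches=3):
    """Return the name of the primer that best matches the start of the read.

    A read is assigned only if its best match has at most `max_mismatches`
    mismatches; otherwise None is returned.
    """
    best_name, best_dist = None, max_mismatches + 1
    for name, primer in primers.items():
        prefix = read[:len(primer)]
        if len(prefix) < len(primer):
            continue  # read is too short to contain this primer
        dist = hamming(prefix.upper(), primer.upper())
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name


if __name__ == "__main__":
    # One mismatch against demo_F, so the read is binned as demo_F.
    print(assign_primer("ACGAACGTACGGGG", DEMO_PRIMERS))
```

Raising `max_mismatches` recovers more reads at the cost of a higher chance of assigning a read to the wrong primer, which is the trade-off noted above for the Saanich data.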
For these data, sequences were deemed low quality and removed from analysis if they were < 50bp or >1000bp in length, contained 1 or more N's, contained a homopolymer of >6 bp, or had an average quality score <25.  Reverse primers, when present, were removed, and sequences were scanned at 50bp intervals for quality and truncated to the first base in the first window that had an average quality score <25.

As sequencing and PCR errors artificially inflate numbers of OTUs in a sample (Kunin, Engelbrektson, Ochman, & Hugenholtz, 2010), sequences from the library splitting above were subjected to quality control using the denoiser.py script in QIIME v1.4 (Reeder & Knight, 2010) with the default settings for 454 Titanium data. Essentially, sequences which differed only due to sequence length, sequencing error, or PCR error were merged into one OTU bin.  Primer sequences were removed, and the resulting centroid and singleton bin files were merged into one file for further processing. This step reduced the number of sequences for analysis to 3.5% of the starting number, greatly reducing downstream computational demands.

Spurious PCR or sequencing products were next removed by BLAST comparisons.  First, each OTU bin in a data set was compared to an appropriate gene-specific database of known sequences compiled from GenBank using BLASTx (e-value < 10^-3).  For example, g23 data sets were compared to a database containing only known g23 sequences.  Bins with significant homology to one or more sequences in the gene-specific database were immediately passed to the next step of the pipeline (69% of bins representing 96% of all sequences).  Sequences without a homologue (3882 bins representing 13187 sequences) were further subjected to identification via BLASTx versus GenBank nr.  Bins with a significant homolog in GenBank nr (e-value < 10^-5) were then manually examined to determine if they were a valid sequence for the appropriate primer type.
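The read-level quality criteria given earlier in this section (length, ambiguous bases, homopolymers, mean quality, and window-based truncation) can be sketched as below. This is an illustrative re-statement of the thresholds, not the split_libraries.py code; the function names are hypothetical, and reading "50bp intervals" as consecutive non-overlapping windows is an assumption (a sliding window is obtained by setting `step=1`).

```python
import re

MIN_LEN, MAX_LEN = 50, 1000   # reads outside this length range are removed
MAX_HOMOPOLYMER = 6           # homopolymers longer than this are rejected
MIN_MEAN_QUAL = 25            # minimum average quality score
WINDOW = 50                   # window width in bases


def longest_homopolymer(seq):
    """Length of the longest run of a single repeated base."""
    return max((len(m.group(0)) for m in re.finditer(r"(.)\1*", seq)), default=0)


def passes_filter(seq, quals):
    """Apply the length, N, homopolymer, and mean-quality criteria."""
    if not (MIN_LEN <= len(seq) <= MAX_LEN):
        return False
    if "N" in seq.upper():
        return False
    if longest_homopolymer(seq.upper()) > MAX_HOMOPOLYMER:
        return False
    return sum(quals) / len(quals) >= MIN_MEAN_QUAL


def truncate_at_low_quality_window(seq, quals, window=WINDOW, step=WINDOW):
    """Truncate at the first base of the first window whose mean quality is too low.

    Assumption: windows start every `step` bases; the final window may be shorter.
    """
    for start in range(0, len(seq), step):
        chunk = quals[start:start + window]
        if chunk and sum(chunk) / len(chunk) < MIN_MEAN_QUAL:
            return seq[:start], quals[:start]
    return seq, quals
```

The order in which filtering and truncation are applied in the actual script is not stated in the text, so the two functions are kept separate here.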
Bins judged valid on manual examination were added back to the analysis (12 bins representing 21 sequences). Bins that lacked homology with both the gene-specific databases and GenBank nr were discarded unless they contained more than one sequence.  Returning bins that lacked a homologue but represented more than one sequenced read added 5% of bins (432 bins representing 6716 sequences) back to the analysis. Fifty-eight percent (3874) of the returned sequences were g23 reverse sequences. Overall, decontamination removed 28% of bins (2% of sequences), and greatly improved the efficiency of both de novo chimera removal and sequence alignments.

Chimeric artifacts can be formed during the PCR process and must be removed. During PCR, amplification may terminate prematurely, so that during the next PCR cycle the incomplete strand may anneal to another, similar DNA strand and be completed from this second parent, forming a chimeric PCR product. Chimeric PCR products were next removed using UCHIME (v4.2.40, Edgar, 2010).  Both de novo and reference database chimera detection modes were used to examine each data set.  The reference databases used in chimera detection were those compiled for the gene-specific decontamination script. Bins which were identified by both algorithms (the intersection of the two) were flagged as possible chimeras (39 bins, 0.02% of sequences).  To check for falsely identified chimeric bins, the flagged bins were compared to GenBank nr by BLASTn, and the results manually examined.  A total of 28 true chimeric bins were identified: eight g23 forward, seventeen g23 reverse and three AVS forward bins.  The 11 falsely flagged bins were returned to the appropriate analyses.

The remaining bins were translated using FragGeneScan (v1.16, Rho, Tang, & Ye, 2010), a software tool for identifying amino-acid sequences in short and potentially error-rich sequences. A 1% sequencing error was assumed.
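The decontamination and chimera-flagging rules described above reduce to a small decision procedure, sketched here under stated assumptions. The `Bin` record, its field names, and the outcome labels are hypothetical; the BLAST searches themselves are represented only by boolean results, and the manual examination steps are not automated.

```python
from dataclasses import dataclass


@dataclass
class Bin:
    bin_id: str
    n_reads: int        # number of sequences represented by the bin
    hit_gene_db: bool   # BLASTx hit to the gene-specific database (e-value < 1e-3)
    hit_nr: bool        # BLASTx hit to GenBank nr (e-value < 1e-5)


def triage(b):
    """Classify a bin as 'keep', 'manual_review', or 'discard'."""
    if b.hit_gene_db:
        return "keep"            # passed straight to the next pipeline step
    if b.hit_nr:
        return "manual_review"   # kept only if judged valid for the primer type
    # No homologue in either database: keep only bins with more than one read.
    return "keep" if b.n_reads > 1 else "discard"


def flag_chimeras(de_novo_hits, reference_hits):
    """Flag only bins reported by BOTH detection modes (set intersection)."""
    return set(de_novo_hits) & set(reference_hits)


if __name__ == "__main__":
    print(triage(Bin("bin_1", 1, False, False)))               # discard
    print(flag_chimeras({"bin_2", "bin_3"}, {"bin_3", "bin_4"}))  # {'bin_3'}
```

Taking the intersection rather than the union is the conservative choice: a bin is flagged only when both detection modes agree, and flagged bins were still checked by hand before removal.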
FragGeneScan automatically removed bins of <50bp in length (9% of bins, 12484 sequences).  Bins for which more than one translation was called were also culled (<1% of bins, 845 sequences).

Bins for each sample-primer combination were next aligned, trimmed to a consistent length, and clustered at 100% identity using MAFFT (v7) and USEARCH v6.1.544.  Singleton sequences after 100% clustering were next removed to avoid possible spurious products.  Then, clusters from all samples for a given primer (forward and reverse analyzed separately) were clustered at 92% amino-acid identity for AVS and 95% amino-acid identity for PODO and g23 sequences to create viral OTUs (USEARCH v6.1.544).  Lastly, samples were subsampled to the lowest count across samples by randomly sampling 10,000 times and taking the median value of each iteration. Any OTU that had <1% of the read counts was eliminated from the heatmap and phylogenetic analysis. Heat maps were generated using the gplots package in the statistical software R version 3.1.2. Maximum-likelihood phylogenetic trees were constructed in RAxML v8.1 using the Whelan and Goldman (WAG) protein substitution matrix with the GAMMA model of rate heterogeneity and 1000 bootstraps (Stamatakis, 2014; Whelan & Goldman, 2001). The best-fitting protein substitution matrix (WAG) was selected using ProtTest v3.4 (Abascal, Zardoya, & Posada, 2005; Darriba, Taboada, & Posada, 2011).

2.3 Results and discussion

2.3.1 Bioinformatic pipeline

Because most bioinformatic tools available for short read length amplicon sequences have been designed and validated for use with 16S or 18S rDNA data, not viral sequences, initial analysis presented several hurdles.  First, default settings for some scripts, such as split_libraries.py, denoiser.py, and FragGeneScan, were not appropriate for our sequences.
In general, the default settings for most scripts performed well with viral sequences, with allowances made for specific sequencing platforms and runs (Gustavsen et al., 2014).  For example, preliminary analysis of a control reaction in which one viral RNA-dependent RNA polymerase gene was cloned, PCR amplified, and 454 Titanium sequenced revealed that the default Titanium chemistry settings for denoiser.py (97% identity for binning, default threshold cut-offs) resolved these sequences into one OTU or “bin”.  Increasing the percent identity to 100% incorrectly created 2 OTU bins for this control (Gustavsen et al., 2014).  While many of the scripts used in previous amplicon deep sequencing analyses (e.g. UCHIME) were suitable, custom reference databases were needed for spurious PCR product and chimera detection and removal.

There was also the potential for contamination with spurious PCR products and chimeras.  However, these contaminants represented just 1.9% of the sequences.  After de-replication, the 351870 sequences became 12414 bins.  Across all primer-sample combinations, decontamination removed 3438 bins (6450 sequences), chimera identification removed 39 bins (74 sequences), translation removed 726 bins (12785 sequences), 100% identity clustering collapsed 279 bins together, and singletons represented 3977 bins (3977 sequences), leaving 3955 bins (328584 sequences, 93% of the starting total) for alignment and OTU clustering (Tables 2-2 & 2-3).  Sequencing was heavily skewed in all data sets to g23 reverse sequences, which comprised 63% of sequences.
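The bin and sequence totals quoted above can be checked with simple arithmetic. The short sketch below uses only the numbers reported in the text; the variable names are illustrative.

```python
# Bins: start with the denoiser output and subtract each reported loss.
start_bins = 12414
bins_lost = {
    "decontamination": 3438,
    "chimeras": 39,
    "translation": 726,
    "collapsed_at_100pct_identity": 279,
    "singletons": 3977,
}
remaining_bins = start_bins - sum(bins_lost.values())

# Sequences: collapsing identical bins merges bins but removes no sequences,
# so that step does not appear in the sequence tally.
start_seqs = 351870
seqs_lost = {
    "decontamination": 6450,
    "chimeras": 74,
    "translation": 12785,
    "singletons": 3977,
}
remaining_seqs = start_seqs - sum(seqs_lost.values())

pct_retained = 100 * remaining_seqs / start_seqs
pct_contaminant = 100 * (seqs_lost["decontamination"] + seqs_lost["chimeras"]) / start_seqs

if __name__ == "__main__":
    print(remaining_bins, remaining_seqs)
    print(round(pct_retained), round(pct_contaminant, 1))
```

The results agree with the stated 3955 bins, 328584 sequences, 93% retention, and 1.9% of sequences removed as spurious products or chimeras.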
Table 2-2 Sequence retention

Sample | Primer | Sequences, split libraries | Contaminants removed | Chimeras | Removed at translation | Singletons removed after 100% AA clustering | Sequences removed in alignment & trimming | Sequences retained in full analysis
Saanich Oxic | PODO-F | 395 | 58 | 0 | 13 | 23 | 57 | 244
Saanich Oxic | PODO-R | 704 | 92 | 0 | 11 | 17 | 62 | 522
Saanich Oxic | AVS-F | 12002 | 122 | 0 | 13 | 126 | 37 | 11704
Saanich Oxic | AVS-R | 6195 | 42 | 0 | 404 | 141 | 20 | 5588
Saanich Oxic | G23-F | 9073 | 12 | 7 | 54 | 319 | 25 | 8656
Saanich Oxic | G23-R | 70251 | 126 | 20 | 1151 | 261 | 2513 | 66180
Saanich Anoxic | PODO-F | 1214 | 704 | 0 | 132 | 9 | 350 | 19
Saanich Anoxic | PODO-R | 660 | 413 | 0 | 7 | 5 | 205 | 30
Saanich Anoxic | AVS-F | 6754 | 737 | 0 | 0 | 104 | 321 | 5592
Saanich Anoxic | AVS-R | 4855 | 404 | 0 | 2 | 135 | 109 | 4205
Saanich Anoxic | G23-F | 7169 | 38 | 0 | 1 | 280 | 4 | 6846
Saanich Anoxic | G23-R | 68315 | 1224 | 20 | 1941 | 304 | 1321 | 63505
GOB1 | PODO-F | 505 | 130 | 0 | 12 | 13 | 19 | 331
GOB1 | PODO-R | 586 | 18 | 0 | 1 | 5 | 10 | 552
GOB1 | AVS-F | 2444 | 300 | 0 | 54 | 45 | 133 | 1912
GOB1 | AVS-R | 2351 | 288 | 0 | 77 | 34 | 249 | 1703
GOB1 | G23-F | 5351 | 6 | 11 | 81 | 197 | 25 | 5031
GOB1 | G23-R | 17205 | 37 | 1 | 618 | 291 | 105 | 16153
GOB2 | PODO-F | 5638 | 51 | 0 | 34 | 10 | 45 | 5498
GOB2 | PODO-R | 6176 | 15 | 0 | 33 | 18 | 76 | 6034
GOB2 | AVS-F | 11963 | 610 | 3 | 117 | 65 | 380 | 10788
GOB2 | AVS-R | 5347 | 282 | 0 | 128 | 67 | 296 | 4574
GOB2 | G23-F | 10351 | 14 | 1 | 73 | 304 | 37 | 9922
GOB2 | G23-R | 25165 | 31 | 3 | 686 | 401 | 8 | 24036
GOB3 | PODO-F | 6876 | 161 | 0 | 306 | 38 | 126 | 6245
GOB3 | PODO-R | 4168 | 24 | 0 | 12 | 28 | 296 | 3808
GOB3 | AVS-F | 8831 | 307 | 5 | 135 | 81 | 122 | 8181
GOB3 | AVS-R | 5000 | 166 | 0 | 212 | 77 | 94 | 4451
GOB3 | G23-F | 6919 | 5 | 0 | 66 | 263 | 4 | 6581
GOB3 | G23-R | 39407 | 33 | 3 | 6411 | 316 | 181 | 32463
Grand Total | | 351870 | 6450 | 74 | 12785 | 3977 | | 321354

Table 2-3 Bins tracking

Sample | Primer | Bins, denoiser centroids + singletons | Contaminants removed | Chimeras | Removed at translation | Bins collapsed, 100% AA clustering | Singletons removed after 100% AA clustering | Bins removed or collapsed in alignment, trimming & final clustering | Bins retained in final analysis
Saanich Oxic | PODO-F | 78 | 37 | 0 | 2 | 5 | 23 | 6 | 5
Saanich Oxic | PODO-R | 106 | 59 | 0 | 4 | 3 | 17 | 14 | 9
Saanich Oxic | AVS-F | 325 | 106 | 0 | 4 | 3 | 126 | 49 | 37
Saanich Oxic | AVS-R | 263 | 40 | 0 | 2 | 12 | 141 | 44 | 24
Saanich Oxic | G23-F | 488 | 12 | 7 | 3 | 12 | 319 | 30 | 104
Saanich Oxic | G23-R | 880 | 65 | 6 | 42 | 22 | 261 | 216 | 268
Saanich Anoxic | PODO-F | 186 | 155 | 0 | 5 | 0 | 9 | 15 | 2
Saanich Anoxic | PODO-R | 246 | 226 | 0 | 1 | 1 | 5 | 11 | 2
Saanich Anoxic | AVS-F | 644 | 443 | 0 | 0 | 5 | 104 | 54 | 38
Saanich Anoxic | AVS-R | 443 | 222 | 0 | 2 | 7 | 135 | 48 | 29
Saanich Anoxic | G23-F | 412 | 34 | 0 | 1 | 13 | 280 | 16 | 68
Saanich Anoxic | G23-R | 1011 | 227 | 11 | 26 | 34 | 304 | 92 | 316
GOB1 | PODO-F | 153 | 112 | 0 | 6 | 0 | 13 | 9 | 13
GOB1 | PODO-R | 45 | 18 | 0 | 1 | 0 | 5 | 11 | 10
GOB1 | AVS-F | 359 | 208 | 0 | 31 | 3 | 45 | 27 | 45
GOB1 | AVS-R | 288 | 172 | 0 | 22 | 0 | 34 | 23 | 37
GOB1 | G23-F | 329 | 6 | 3 | 42 | 2 | 197 | 6 | 73
GOB1 | G23-R | 700 | 36 | 1 | 58 | 105 | 291 | 21 | 188
GOB2 | PODO-F | 147 | 46 | 0 | 9 | 0 | 10 | 46 | 36
GOB2 | PODO-R | 125 | 14 | 0 | 4 | 11 | 18 | 55 | 23
GOB2 | AVS-F | 700 | 357 | 3 | 36 | 2 | 65 | 129 | 108
GOB2 | AVS-R | 479 | 207 | 0 | 27 | 1 | 67 | 104 | 73
GOB2 | G23-F | 497 | 11 | 1 | 31 | 2 | 304 | 24 | 123
GOB2 | G23-R | 878 | 31 | 3 | 79 | 4 | 401 | 48 | 312
GOB3 | PODO-F | 253 | 125 | 0 | 8 | 5 | 38 | 38 | 39
GOB3 | PODO-R | 128 | 24 | 0 | 6 | 6 | 28 | 43 | 21
GOB3 | AVS-F | 512 | 259 | 2 | 28 | 3 | 81 | 66 | 73
GOB3 | AVS-R | 354 | 148 | 0 | 27 | 3 | 77 | 56 | 43
GOB3 | G23-F | 483 | 5 | 0 | 48 | 6 | 263 | 25 | 136
GOB3 | G23-R | 902 | 33 | 2 | 171 | 9 | 316 | 72 | 299

Figure 2-1 Rarefaction curves for the three amplified gene products. Viral families Phycodnaviridae, Myoviridae and Podoviridae were represented, as surveyed using the family-specific primers AVS, g23, and PODO, respectively. The sequences from each family approach a plateau, signifying that the sequencing was adequate to capture the viral community within the sample.

Pyrosequencing of marker genes representing three viral families from five sample locations off the coast of British Columbia revealed greater diversity than in previous studies (Clasen & Suttle, 2009; Filee et al., 2005; Labonte et al., 2009). The tail end of the rarefaction curves leveled off for Phycodnaviridae and Podoviridae (Figure 2-1), indicating adequate sequencing to observe almost all of the OTUs amplified from the samples. For myovirus amplicons, the rarefaction curves leveled off for each of the samples; however, the overall curve did not level off. These observed data suggest that the T4-like myoviruses were sampled at enough depth at each sample site, but because the sequences from each sample were so different, each sample added many more OTUs. Phylogenetic analysis of the most abundant OTUs revealed a number of previously unknown evolutionary groups within the Phycodnaviridae and Podoviridae. As well, the calculated richness and diversity among the five locations differed for the marker genes representing the three viral families.
For example, Shannon’s diversity indices showed the most diversity for the Myoviridae at all sample locations, while the Podoviridae were the least diverse in all sites except for GOB1. This is not surprising, as different genetic markers would not be expected to display the same genetic variation. In this case, it is expected that the genetic variation in the gene encoding the major capsid protein of the Myoviridae would be higher than that for DNA polymerase, which was used as the genetic marker for the Phycodnaviridae and Podoviridae. In this study, the primer sets representing each family were chosen based on previous studies showing that these were good genetic markers for the families, and on the availability of comparable data in the literature. Similarly, the forward and reverse primers would not necessarily recover the same genetic variation. In this study, AVS reverse was chosen because the amplified sequence included the region encoding the highly conserved catalytic site of the polymerase, “YGDTDS”; a sequence without this motif is likely dysfunctional and can be discarded. The sequences associated with the forward PODO primer and the reverse primer for g23 were chosen for analysis because, after processing the sequences for quality, these primers recovered more read counts than their counterparts, and hence offered more resolution for downstream analysis. This study is the first comparison of richness and relative abundance of different virus families across marine locations. The details and implications of these observations are discussed below.

Figure 2-2 Shannon’s diversity indices for marker sequences. Representative families include Phycodnaviridae, Myoviridae, and Podoviridae from 5 samples. Overall the capsid gene sequences for the Myoviridae had the highest diversity in all samples while the DNA polymerase gene fragment for the Podoviridae was the least diverse.
[Figure 2-2 plot: y-axis, Shannon's Diversity Indices (0 to 5); x-axis, Phycodnaviridae, Myoviridae, Podoviridae; series, GOB1, GOB2, GOB3, Saanich Anoxic, Saanich Oxic]

2.3.2 Phycodnaviridae

Figure 2-3 Top 50 OTUs from the Phycodnaviridae. Shown as stacked histograms for each of the five samples individually and combined. The total evenness combining all samples was skewed towards two dominant OTUs of similar abundance (1145 and 1113 sequences), followed by four OTUs at approximately half that abundance (694-602 sequences per OTU).

Figure 2-4 Phycodnaviridae phylogenetic tree and heat map showing relative abundance. (Left) A maximum likelihood tree with 1000 bootstraps showing the phylogenetic relationship of the Phycodnaviridae DNA polymerase sequences from the present study compared to those from other studies. (Right) A heatmap showing the relative abundance of the top 1% OTUs and their placement on a phylogenetic tree. Most of the abundant OTUs are shared among most sites. SA = Saanich Anoxic; SO = Saanich Oxic.

After samples were subsampled to the sample with the lowest number of reads, there were 127 Phycodnaviridae OTUs representing 8510 sequences. GOB2 was the richest sample with 73 OTUs, while Saanich 10m was the least rich with 24 OTUs. When all samples were grouped together, 13% of the Phycodnaviridae sequences fell within the first and second most abundant OTUs, while 2.9% of the sequences fell into OTUs with 10 or fewer sequences.  Assuming that few or none of these OTUs are entirely the product of sequencing errors, these “rare” OTUs are numerous and represent significant community richness, but encompass relatively few community members. The two most abundant OTUs had different distributions: the most abundant OTU was present in all samples and dominant in a few, whereas the second most abundant OTU was largely confined to the GOB1 sample (Figure 2-3).
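The subsampling step and the dominance and rarity summaries quoted above can be illustrated with a short sketch. It is not the script used in this study: the function names are hypothetical, and it assumes that "taking the median value" means the per-OTU median count across repeated random draws made without replacement.

```python
import random
from collections import Counter
from statistics import median


def subsample_once(otu_counts, depth, rng):
    """Draw `depth` reads without replacement and return per-OTU counts."""
    pool = [otu for otu, n in otu_counts.items() for _ in range(n)]
    return Counter(rng.sample(pool, depth))


def median_subsample(otu_counts, depth, iterations=10000, seed=0):
    """Per-OTU median count over repeated random subsamples (assumed reading)."""
    rng = random.Random(seed)
    draws = [subsample_once(otu_counts, depth, rng) for _ in range(iterations)]
    return {otu: median(d.get(otu, 0) for d in draws) for otu in otu_counts}


def fraction_in_top(otu_counts, k):
    """Fraction of all sequences that fall in the k most abundant OTUs."""
    values = sorted(otu_counts.values(), reverse=True)
    return sum(values[:k]) / sum(values)


def fraction_in_rare(otu_counts, max_size=10):
    """Fraction of all sequences in OTUs holding `max_size` or fewer sequences."""
    total = sum(otu_counts.values())
    return sum(v for v in otu_counts.values() if 0 < v <= max_size) / total


if __name__ == "__main__":
    # Toy counts for two hypothetical samples; depth is the smaller sample total.
    samples = {"s1": {"otu_a": 50, "otu_b": 30, "otu_c": 20},
               "s2": {"otu_a": 5, "otu_b": 40, "otu_c": 15}}
    depth = min(sum(c.values()) for c in samples.values())
    for name, counts in samples.items():
        print(name, median_subsample(counts, depth, iterations=100))
```

With real read depths, building the full read pool for each of 10,000 iterations is slow; a multivariate hypergeometric draw would be a faster equivalent, but the explicit pool keeps the sketch easy to follow.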
Since the top homologues for the Phycodnaviridae OTUs in the nr database, based on BLAST, were environmental sequences, phylogenetic analysis was used to reveal the closest relatives of the most abundant OTUs. A few OTUs were related to viruses infecting the genera Bathycoccus and Micromonas (Figure 2-4). In particular, about a third of the most abundant OTUs formed a large clade of sequences that branched with, but were distantly related to, viruses infecting Micromonas spp. Since dsDNA viruses infecting eukaryotic phytoplankton are strain specific and have phylogenies that are typically congruent with those of their hosts (Clasen & Suttle, 2009), the new clade likely infects a single species of phytoplankton that remains to be identified.

2.3.3 Myoviridae

Figure 2-5 Frequency histograms of the 50 most abundant Myoviridae OTUs.  Abundances are uneven, with the two dominant OTUs containing 10197 and 8880 sequences, respectively, dropping sharply to 5830 sequences, followed by a general decline in OTU abundance with rank.

Figure 2-6 Myoviridae phylogenetic tree and heat map showing relative abundance. (Left) A rooted maximum likelihood tree with 1000 bootstraps showing the phylogenetic relationship of Myoviridae capsid-gene sequences from this study, as well as other studies. (Right) A heatmap showing the relative abundance of the top 1% OTUs and their placement on a phylogenetic tree. SA = Saanich Anoxic; SO = Saanich Oxic.

In all samples, the g23 sequences from the Myoviridae had the most diversity, with 1233 OTUs representing 80,646 sequences. The most abundant cluster represents 13% of the sequences, with 3.8% of the sequences falling in rare clusters with 10 or fewer sequences. The community composition was highly variable, with the most abundant OTU differing among stations (Figure 2-5). Moreover, the ten most abundant clusters in GOB1 are unique to that location.
The most abundant OTU overall when combining samples was observed only at Saanich, but was far more abundant in the oxic surface waters than in the low oxygen deeper waters (Figure 2-5). Likewise, OTUs shared between GOB samples had very different abundances among samples; that is, OTUs that were abundant at one location were rare or undetectable at the other sites (Figure 2-5). Specifically, two different patterns were observed in the sample-specific rank abundance curves. In GOB1 and GOB2 there were four OTUs that were much more abundant than the others, whereas at the GOB3 and Saanich sites only one OTU was dominant. These very uneven distributions are consistent with recent lytic events in which infection of specific bacterial taxa produces large bursts of phage progeny. An alternative explanation, that the dominant OTUs are phages with wide host ranges, seems less likely, as there are few examples of such phages, and if they existed they might be expected to be abundant across a range of environments.

Although few of the dominant OTUs were common among the samples, many fell within the same phylogenetic groups; however, some, including two OTUs from the Saanich oxic sample that grouped with Synechococcus phage (Figure 2-6), were very distant from the dominant OTUs in other samples. In contrast, GOB3 shared all of its abundant OTUs (13) with GOB2, and 10 of the 13 OTUs are shared with Saanich oxic.  Six of these OTUs closely grouped with marine environmental samples previously collected at geographically proximate sites off the coast of British Columbia (Filee et al., 2005). Although the Saanich oxic and anoxic samples did not share many abundant OTUs (Figure 2-6), the anoxic sample was much more uneven and most of its dominant OTUs were closely related to those in the oxic sample.
Lastly, GOB1 contained one very abundant OTU that fell within freshwater environmental sequences from other studies, but was distantly related to other OTUs, consistent with GOB1 being freshwater influenced (Table 1).

2.3.4 Podoviridae

Figure 2-7 Top 50 OTUs from the Podoviridae are shown as stacked histograms. The three most abundant OTUs were shared by all three GOB samples.

Figure 2-8 Podoviridae phylogenetic tree and heat map showing relative abundance. (Left) A maximum likelihood tree with 1000 bootstraps showing the phylogenetic relationship of the Podoviridae in the GOB samples compared to those from other studies. (Right) A heatmap showing the relative abundance of the top 1% OTUs and their placement on a phylogenetic tree. Our Podoviridae OTUs were phylogenetically most similar to previous environmental samples from the Strait of Georgia, not far from the sampling sites for the present study.

The DNA polymerase genes from members of the Podoviridae showed the least sequence diversity in all samples. After subsampling, 17 OTUs were identified for the forward primer, representing 912 sequences. So few OTUs were observed in the Saanich Inlet samples that they were excluded from further analysis. There was considerable overlap among the Podoviridae sequences: 42% of all sequences were recruited into the most common OTU, while 5.3% of sequences fell into OTUs that contained 10 or fewer sequences (Figure 2-7). In contrast to the patterns for the Myoviridae and Phycodnaviridae, the most abundant OTU at GOB1 and GOB2 was the same, and was the second most abundant cluster at GOB3, while the most abundant OTU in GOB3 was the second most abundant OTU in GOB1 and GOB2. Podoviridae richness was similar among sites, with similar numbers of OTUs in each sample (Figure 2-7).
GOB1 and GOB2 had similar rank abundance curves, with one dominant OTU followed by one to four OTUs with abundances about half that of the most abundant OTU, before abundances per OTU dwindled. GOB3 was the least even of the samples and had two abundant OTUs of just over 100 sequences each, with the rest of the OTUs containing fewer than 15 sequences. The very uneven distribution of sequences is again consistent with recent lytic events affecting relatively few taxa. Phylogenetic analysis of the most abundant Podoviridae DNA polymerase gene OTUs revealed that, although many were closely related to each other, they clustered with environmental sequences for which there are no cultured representatives (Figure 2-8). However, Podoviridae DNA polymerase sequences from an earlier study in the Strait of Georgia (Labonte, Reid, & Suttle, 2009) were closely related and grouped into the same clade as the OTUs from the GOB samples. The similarity of sequences, both across samples in the present study and in comparison with the earlier study, suggests that the podovirus communities are more stable over space and time than those of the phycodnaviruses and myoviruses.

A recent study suggested that there is less overlap in marine microbial community diversity with increasing geographic distance (Winter, Matthews, & Suttle, 2013). Although the sample size in the present study is very small, the results did not support this hypothesis; the richness and phylogenetic groups in GOB2 were more similar to those in GOB3 than to those in GOB1, which was much closer. The Shannon Diversity Indices for the Phycodnaviridae and Myoviridae in the GOB1 sample were also lower than for the other sites, indicating a different viral community structure. These differences likely arose because GOB1 was more influenced by freshwater than the other sites, as indicated by its much lower salinity and higher temperature.
This is also supported by the observation that the most abundant OTU in GOB1 fell within a large cluster of freshwater Myoviridae environmental sequences. Distribution patterns of viral taxa are driven by host distributions and hence by environmental parameters (Clasen & Suttle, 2009; Winter et al., 2013). The results presented here show that the patterns of distribution varied among the three families of viruses, consistent with different distributions of the underlying host communities. In Saanich Inlet, viral richness and evenness in the three viral families varied between the surface (10 m) and bottom (200 m) waters. This is not surprising, as environmental conditions and the host communities vary drastically with depth in Saanich Inlet because of low deep-water oxygen concentrations (Hawley, Brewer, Norbeck, Paša-Tolić, & Hallam, 2014). This is reflected in very few OTUs with similar abundances being shared between depths. However, it is intriguing that these communities share any commonalities. It seems unlikely that many host taxa would persist in both environments, but a number of factors could have led to shared taxa between the surface and deep waters. These include seasonal mixing of the surface and bottom waters, as well as sinking of viruses inside infected cells or attached to particles. Until viruses that are representative of these environmental OTUs can be brought into culture, it will be difficult to unravel the factors that control their distribution and abundance, and to determine whether some OTUs are derived from rare viruses that are “seedbanks” that can become abundant, or whether some are always present, slowly replicating but persistent.

2.4 Conclusions

Pyrosequencing of PCR amplicons targeted to marker genes of viral families that are important players in the sea allowed the richness and relative abundance of OTUs of three families of viruses to be analyzed in depth.
The overall diversity and distribution of taxa within each of the families differed across the five locations surveyed. Based on phylogenetic analysis, some OTUs within the Myoviridae and Podoviridae belong to previously unknown phylogenetic groups. Our results also indicate that the distribution and abundance of viral taxa within a family vary markedly across locations. The numerical dominance of only a few viral taxa within each sample is consistent with the occurrence of recent lytic events affecting only a few taxa, whereas the limited overlap of taxa among samples implies that lytic events tend to be local. Lastly, these results show that high-throughput sequencing of marker genes is a powerful and robust method to quantitatively explore viral diversity.

Chapter 3: Temporal and Spatial Variation of Bacteriophage Composition in an Agriculturally-influenced Freshwater Watershed

3.1 Introduction

Outbreaks of disease from contaminated drinking water, or even from bathing at beaches, are a problem of increasing public health concern. Traditionally, culture-based methods have been used to identify the microbial pathogens that infect people and that can contaminate or be associated with water. However, such methods typically require specialized media, target only a subset of the microbes negatively affecting human health, and can be slow and expensive to apply. As a result, there has been a shift towards DNA-based technologies. The polymerase chain reaction (PCR) amplifies DNA to very high concentrations from a small amount of template DNA that has at least two regions of highly conserved sequence motifs. Most commonly it is used to target the gene that encodes 16S ribosomal RNA (rRNA), although other targets can be used. PCR has become a common method for detecting pathogens in water, and can be done quantitatively (qPCR) to provide high specificity, sensitivity and speed (Toze, 1999).
However, PCR-based assays can only be applied to DNA of known sequence; hence, such assays often target the 16S rRNA gene found in all prokaryotes, or genes that are associated with specific pathogens. If PCR is targeted to the 16S rRNA gene, then there must be a way to screen and select the PCR products for the sequences of interest. Alternatively, if a gene associated with a specific pathogen is used as the target, the assay will have a limited scope for detection. In either case, specific sequence information must be available to detect the pathogen(s) of interest.

An alternative to using PCR targeting specific DNA sequences is to use shotgun high-throughput sequencing. Shotgun sequencing has the advantages that no a priori knowledge of specific sequences is required, and that the sequences recovered are unbiased and should reflect the overall makeup of the community. High-throughput sequencing generates massive amounts of data, which present bioinformatic challenges. As well, being able to map the data back to a specific pathogen still requires sequence information from the organism of interest. Analysis of the data can be particularly challenging because a large proportion of the sequences may not be represented in databases. This is particularly problematic for viruses, which are very poorly represented in sequence databases (Rosario, Nilsson, Lim, Ruan, & Breitbart, 2009). Furthermore, few waterborne disease outbreaks have accompanying information about microbial community composition (Craun et al., 2010), which could be used to help interpret metagenomic datasets. Consequently, high-throughput sequencing has typically been used in environmental studies to help understand the metabolic potential of the microbial community, and to explore the sequence space of the associated assemblage of viruses (Gilmour et al., 2010).
Despite being poorly represented in databases, viruses may still provide a tool for determining whether a particular water sample may be contaminated by pollutants. Viruses are the most abundant biological entity in aquatic ecosystems (Bergh et al., 1989; Suttle, 2005), with 10^6 to 10^7 viruses in each milliliter of seawater or freshwater (Bergh et al., 1989; Demuth, Neve, & Witzel, 1993; Suttle, 2005). Most of these viruses are not pathogens of humans or of taxa of economic consequence, but infect environmental bacteria. Moreover, viruses infecting bacteria (phages) are highly host-specific, many are produced from a single infected bacterium, and more phages are produced from more rapidly growing cells; together, this means that the composition of the phage community reflects that of the “active” underlying bacterial community on which the phages replicate. Consequently, the composition of the phage community can serve as a proxy for the “active” bacterial community in the water being sampled, or in a source that is flowing into the sampled environment.

One of the best-characterized groups of viruses infecting bacteria is the Myoviridae, a family of dsDNA phages. Members of this family have high genetic diversity and occur in habitats from oceans to the Sahara desert (Fancello et al., 2012; Wichels et al., 1998). Many studies have revealed a high degree of spatial and temporal diversity in myoviruses from marine and fresh waters (Chow & Fuhrman, 2012; Comeau, Short, & Suttle, 2004; Filee et al., 2005), with the presence and abundance of particular clades varying by geographic location and season (Chow & Fuhrman, 2012; Filee et al., 2005). Most of these studies have been undertaken by examining changes in the genetic diversity of a subset of myoviruses related to the phage T4, which infects E. coli.
These studies have typically examined changes in the sequence diversity of genes encoding either the vertex portal protein (g20) or the major capsid protein (g23).

In this study, monthly sampling of a watershed system with agricultural influence provided the opportunity to follow changes in the composition of myoviruses over an annual cycle. An upstream “pristine” site, a midstream site with agricultural runoff, and a downstream site provided a natural laboratory to determine the impact of agricultural runoff on the community composition of myoviruses, as well as an opportunity to determine if g23 might be an effective biomarker for agricultural pollution. Combining PCR amplification of g23 with high-throughput Illumina sequencing provided the resolution to examine temporal and spatial changes at each site, and showed that the phage community, as revealed by g23, was very sensitive to agricultural influences, and that a number of operational taxonomic units (OTUs) may be indicators of agricultural runoff.

3.2 Materials and methods

3.2.1 Environmental data collection

Before each sample collection, a YSI Professional Plus handheld multiparameter instrument (YSI Inc., Yellow Springs, OH) was used to make in situ measurements of dissolved oxygen (mg/L), specific conductivity (μS/cm), total dissolved solids (mg/L), salinity (practical salinity units or PSU), pressure (mmHg), and pH. Turbidity (nephelometric turbidity units or NTU) was measured using a VWR turbidity meter (model No. 66120-200, VWR, Radnor, PA), and water flow (m^3/s) was determined in situ using a Swoffer 3000 current meter (Swoffer Instruments, Seattle, WA). Total coliform and Escherichia coli counts were determined using the Colilert-24 testing procedure (IDEXX Laboratories, Westbrook, ME). Chemical analysis included dissolved chloride (mg/L) and ammonia (mg/L), measured using the automated colorimetric (SM-4500-Cl G) and phenate (SM-4500-NH3 G) methods, respectively (APHA 2005).
Additionally, orthophosphate, and nitrite and nitrate, were analyzed as described by Murphy and Riley (1962) and Wood et al. (1967), respectively.

3.2.2 Sample collection

Monthly, from March 2012 to April 2013, 40-L water samples were collected in sterile plastic carboys upstream of, downstream of, and at the site of agricultural influence (Figure 3-1). Samples were pre-filtered on-site using a 105-μm mesh-size polypropylene filter (SpectrumLabs, Rancho Dominguez, CA), and kept at 4 °C for transport to the laboratory for processing and storage within 2 h of the last sample collection.

Figure 3-1 An illustration of the sampling sites at the watershed. The river weaves around various farms as it flows from its relatively pristine source. Samples were taken from upstream, agriculturally-influenced and downstream sites along the river. The sites are a few kilometers apart.

3.2.3 Concentration of viral particles

Bacteria and protists were removed by filtration through a 0.22-μm pore-size membrane filter. The virus-sized material in the filtrate was concentrated from 40 L to ~450 mL using a regenerated cellulose Prep/Scale® tangential flow ultrafiltration cartridge (Millipore Corporation, Billerica, MA) with a 30-kDa molecular-weight cutoff (Suttle et al., 1991). The virus particles were further concentrated by ultracentrifugation (4 h, 121,000 × g, 4 °C), and the pellets were re-suspended in 1X PBS to a final volume of approximately 5 to 6 mL and incubated overnight at 4 °C with constant agitation (180 rpm) on a multipurpose rotator (Thermo Scientific, Waltham, MA).

3.2.4 Nucleic acid extraction

The concentrated viruses were treated with 1X RNAsecure (Life Technologies, Carlsbad, CA) and 5 U of DNase I (Epicentre Biotechnologies, Madison, WI). This reaction was terminated by adding 10 mM EDTA (pH 8.0) and incubating for 15 min at 65 °C. Total nucleic acids were extracted from the viral fraction using the NucliSens easyMAG system (bioMérieux, Craponne, France).
Nucleic acids were further precipitated using 10% 3 M sodium acetate, 2 volumes of 100% ethanol, and 5 μl of linear acrylamide. Samples were stored at -80 °C overnight, and then centrifuged at 17,000 × g for 30 min at 4 °C. Supernatants were carefully discarded. Pellets were washed with ice-cold 70% ethanol, air-dried and re-suspended in 10 mM Tris·Cl, pH 8.5. Concentration, purity, and average size of nucleic acids were assessed with a Qubit dsDNA High Sensitivity assay, a NanoDrop spectrophotometer (NanoDrop Technologies, Inc., Wilmington, DE), and an Agilent High Sensitivity DNA kit (Agilent Technologies, Inc., Santa Clara, CA), respectively.

3.2.5 Amplification of g23 encoding the viral major capsid protein

The major capsid protein gene (g23) was amplified from DNA extracts and controls by PCR with primers MZ1A1bis and MZ1A6 (Filee et al., 2005). Each PCR reaction consisted of 1.5 mM MgCl2, 0.2 mM nucleotides, 0.4 μM of primers, 1.25 U of Hot Start Polymerase (Promega Corporation, Fitchburg, WI), a 1:10 dilution of template DNA, and molecular-grade water in a 50 μl volume. PCR conditions were 94 °C for 1.5 min, followed by 35 cycles of 45 s at 94 °C, 60 s at 50 °C, and 60 s at 72 °C, and a final extension of 5 min at 72 °C. Amplicons were run in duplicate in a 1.5% agarose/0.5X TBE gel stained with 1X GelRed (Biotium, Inc., Hayward, CA). PCR products were purified with a QIAQuick PCR Purification Kit (Qiagen Sciences, Germantown, MD) according to the manufacturer's instructions.

3.2.6 Sequencing and bioinformatics analysis

To characterize the watershed samples, libraries of g23 amplicons were prepared using the NEXTflex ChIP-Seq Kit (BIOO Scientific, Austin, TX) and were gel size-selected as per the manufacturer's instructions. Amplicon libraries were sequenced as 250-bp paired-end reads on a MiSeq platform (Illumina, Inc., San Diego, CA).
The raw reads were processed as follows. Cutadapt was run with default parameters, except that the maximum error rate was set to allow 2 mismatches, to remove any contaminating adapters from the sequenced reads (Martin 2011). The sliding window approach in Trimmomatic v0.30 (a window of four bases with an average quality score of at least 15) was used to trim low-quality regions within sequences (Bolger, Lohse, & Usadel, 2014). Five samples had <9000 sequences remaining after trimming and were removed from further analysis. The remaining sequences were translated into amino acids using FragGeneScan v1.16 with the Illumina 5% error model (Rho, Tang, and Ye 2010). Then, “derep” in USEARCH (v7.0.992) was used to “de-replicate”, or collapse, amino-acid sequences that were 100% identical into one sequence (Edgar, 2010). “Cluster-smallmem” in USEARCH was then used to cluster the sequences at 95% identity (Edgar, 2010). Each sample was subsampled to 10,000 reads, the read count of the smallest sample, using the vegan package in R v3.1.2 (Oksanen et al., 2015; R Core Team, 2014). Random resampling was performed 10,000 times and the median value of all iterations was chosen.

3.2.7 OTU analysis

Potential indicator species were selected using the indicator species analysis package in R (De Cáceres, Legendre, & Moretti, 2010). Sample 79 was removed prior to indicator-species analysis because, although its environmental data were similar to those of other upstream samples, its OTU distribution was most similar to that of the agriculturally influenced sites, which may have introduced bias. The results were further filtered manually to retain only OTUs that had specificity (proportion of negatives that were correctly identified) and sensitivity (proportion of positives that were correctly identified) above 70%. The 25 OTUs meeting these criteria were aligned with full-length gp23 sequences from the UniProt database using MAFFT v7 (Consortium, 2014; Katoh, Misawa, Kuma, & Miyata, 2002; Katoh & Standley, 2013).
A maximum-likelihood tree was generated from the full-length gp23 sequences using RAxML v8 with 1000 bootstrap replicates (Stamatakis, 2014). The shorter sequences of the 25 OTUs selected by the indicator species analysis were then placed onto the full-length gp23 tree using the EPA (evolutionary placement algorithm) in RAxML v8 (Stamatakis, 2014).

Statistical analysis of the data was performed using statistical packages in R v3.1.2. The principal component analysis (PCA) of the environmental data was done using “prcomp” and visualized using ggbiplot. The heatmap used to visualize the distribution of OTUs was created using heatmap.plus, and the phylogenetic tree was drawn in phyloseq (McMurdie & Holmes, 2013).

3.3 Results and discussion

There were 5,167,285 sequences recovered from 39 samples, which were reduced to 1,289,982 unique sequences after quality control and merging of identical sequences. After clustering at 95% identity using USEARCH these were binned into 32,933 OTUs (Edgar, 2010). Singletons were removed, as many appeared to be errors from PCR or sequencing, leaving 59,014 OTUs. The “abundant” OTUs that comprised >5% of the reads made up 3% (200) of the total OTUs and represented 40% of the total sequences. The subsampled and normalized OTU data were used for the subsequent analyses.

Figure 3-2 Shannon’s diversity indices of the 3 sample sites over time. The y axis is the Shannon’s diversity index and the x axis is the month.

The diversity of myoviruses was dynamic across locations and time, with values for Shannon’s Diversity Index varying by as much as six-fold at some locations (Figure 3-2). The changes in diversity within and among locations were complex. For example, although the myoviruses at the downstream site were often the most diverse, this site also had the lowest calculated diversity (in September, as shown by Shannon’s diversity index).
Ultimately, patterns of diversity in myoviruses would be expected to reflect those found in the underlying bacteria serving as potential hosts. However, changes in the patterns of bacterial diversity with respect to eutrophication and productivity are unclear (Fuhrman et al., 2008; Smith, 2007), which is likely reflected in the observation that the changes in diversity among sites did not follow a consistent pattern. For example, in July the agricultural site had by far the lowest diversity, whereas in September and January it was the downstream and upstream sites, respectively, in which diversity was lowest. Temporal differences in diversity were also not consistent, although over the 13-month period diversity tended to be lowest during August, September and October, which suggests that the diversity of the underlying bacterial community was also lower.

Environmental parameters over the 13 months (Table 3-1) were statistically tested for differences among the sample locations. According to the PCA, just under 70% of the variation could be explained, with the first and second principal components explaining 47.6% and 20.9% of the variance, respectively, and the environmental parameters upstream clearly separated from those downstream (Figure 3-3). Differences among the locations were usually visually evident as well, with the agriculturally influenced site being the most turbid and the upstream site the least turbid. Dissolved oxygen was one of the main factors separating the upstream site from the other locations. This agrees with observations at the sampling locations. Seasonal differences were not separated by PCA; when the data were grouped by three-month periods, there was no clear separation of the data (Figure 3-4).
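The Shannon diversity values discussed above were computed on read counts subsampled to a common depth. The following is a minimal Python sketch of that idea, not the code used in this study (which relied on the vegan package in R). The function names, the example counts, the sampling depth and the number of iterations are all hypothetical, and the subsampling is a simple draw without replacement from the standard library rather than vegan's implementation.

```python
import math
import random
import statistics
from collections import Counter


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln p_i) over OTUs with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def rarefy(counts, depth, rng):
    """Draw `depth` reads without replacement and return per-OTU counts."""
    pool = [otu for otu, c in enumerate(counts) for _ in range(c)]
    picked = Counter(rng.sample(pool, depth))
    return [picked.get(otu, 0) for otu in range(len(counts))]


def median_rarefied_shannon(counts, depth, iterations=200, seed=1):
    """Median Shannon index over repeated random subsamples (hypothetical helper)."""
    rng = random.Random(seed)
    values = [shannon(rarefy(counts, depth, rng)) for _ in range(iterations)]
    return statistics.median(values)


# Hypothetical OTU read counts for one sample (illustration only).
example_counts = [500, 300, 150, 40, 10]
print(median_rarefied_shannon(example_counts, depth=200))
```

Taking the median over many random subsamples, as described in section 3.2.6, reduces the chance that a single unlucky draw drives the diversity estimate for a sample.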
Table 3-1 Environmental data

SampleID Site Month Temperature_C TotalRainOnSamplingDay_mm Pressure_mmHg DO_mgPerL SpecificConductivity_uSPerCm TDS_mgPerL Salinity_PSU pH TurbidityAvg_NTU DissolvedChloride_mgPerL Ammonia_mgPerL FlowRate_m3PerSec TotalFecalColiform_per100ml Ecoli_per100ml
g23.012 Influenced Feb-Apr 12.8 12.4 759.4 10.27 302.2 196.3 0.14 6.86 20.2 45 0.603 1.071481 4106 146
g23.013 Downstream Feb-Apr 12.8 12.4 759.4 14.05 270.9 176.2 0.13 7.37 16.095 9.3 0.136 4.039529 12997 4884
g23.015 Upstream May-Jul 8.5 0 758.8 22.81 106.7 69.6 0.05 7.12 0.485 0.54 0.006 0.111126 157.6 54.7
g23.016 Influenced May-Jul 13.5 0 768.5 8.78 289.8 188.5 0.14 6.88 20.5 13 0.449 0.852529 1313 86
g23.023 Upstream May-Jul 8.8 10.2 745.9 12.09 112.7 73.5 0.05 7.06 0.505 1.5 0.005 0.06 261.3 2
g23.024 Influenced May-Jul 12.6 10.2 756.1 4.07 312.4 202.8 0.15 6.89 18.075 13 0.785 0.913255 1467 135
g23.031 Upstream May-Jul 13.2 NA 752.8 28.51 109.4 70.8 0.05 7.44 0.71 1.1 0.0088 0.09 1553.1 547.5
g23.032 Influenced May-Jul 19.7 NA 762.3 15.29 314.1 204.1 0.15 7.09 13.24 12 0.358 0.29 1046 41
g23.033 Downstream May-Jul 19.9 NA 762.5 18.12 311.6 202.8 0.15 7.5 9.335 12 0.0587 3.553986 12033 97
g23.039 Upstream Aug-Oct 14.16 0 752.2 10.2 139 90 0.07 7.33 0.18 0.96 0.0096 0.03 44 16.8
g23.040 Influenced Aug-Oct 20.79 0 762.2 5.4 314 204 0.15 7.12 6.48 14 0.0305 NA 35 10
g23.041 Downstream Aug-Oct 20.99 0 762.7 8.66 160 212 0.16 7.45 3.255 15 0.0234 1.874568 17 31
g23.047 Upstream Aug-Oct 11.8 0 754 12.17 145.5 94.2 0.07 7.67 0.38 1.6 0.005 0.01 727 12.1
g23.048 Influenced Aug-Oct 15.8 0 764.2 1.82 347.2 225.6 0.17 7.79 5.475 15 0.0182 NA 2254 52
g23.049 Downstream Aug-Oct 17.1 0 764.7 8.89 337.8 219.7 0.16 7.73 3.09 15 0.0244 1.8 3130 496
g23.055 Upstream Aug-Oct 10.8 0 757.8 14.34 145 94.2 0.07 7.74 0.475 0.94 0.005 0.02 727 63
g23.056 Influenced Aug-Oct 12.7 0 767.4 2.64 364.2 236.6 0.18 7.19 7.355 17 0.0236 NA 1100 0.5
g23.057 Downstream Aug-Oct 13.9 0 767.4 13.94 332.1 215.8 0.16 7.9 3.915 15 0.0346 NA 1333 135
g23.063 Upstream Nov-Jan 9.4 0.2 760.9 12.7 94 61.1 0.04 6.84 0.75 0.9 0.0087 0.02 3654 0.5
g23.064 Influenced Nov-Jan 11.9 0.2 771.4 1.6 341.9 222.3 0.16 6.58 10.995 14 0.716 0.84 3076 31
g23.065 Downstream Nov-Jan 11.6 0.2 771.3 6.42 304.5 198.3 0.15 7.26 16.265 11 0.294 6.22 12033 598
g23.071 Upstream Nov-Jan 6.9 7.8 750.6 13.48 81 52.7 0.04 6.96 1.66 12 0.302 0.07 117.8 0.5
g23.072 Influenced Nov-Jan 8.5 7.8 760.9 3.46 272.3 176.8 0.13 6.84 19.56 8.7 0.114 1.4 24192 393
g23.073 Downstream Nov-Jan 8.1 7.8 761.5 8.38 218.6 142.3 0.1 7.26 24 1 0.0068 6.93 8664 345
g23.080 Influenced Nov-Jan 6 16 756.9 4.9 274.4 178.1 0.13 6.6 37.35 11 1.16 1.35 NA NA
g23.081 Downstream Nov-Jan 5.5 16 757 10.48 236.1 153.4 0.11 7.04 28.35 9 0.283 20.33 NA NA
g23.087 Upstream Feb-Apr 5.7 4 754.4 12.24 96.3 62.4 0.04 7.52 0.605 1.1 0.0267 0.04 26.2 0.5
g23.088 Influenced Feb-Apr 7.7 4 764.6 2.7 314 204.1 0.15 6.78 18.73 13 0.838 0.71 NA NA
g23.089 Downstream Feb-Apr 7.2 4 764.8 5.05 291.5 189.2 0.14 7.15 45.3 12 0.315 14.75 NA NA
g23.097 Influenced Feb-Apr 7.1 0 766.1 5.16 260.1 169 0.12 6.75 39.5 9.1 0.581 1.41 5172 717
g23.098 Downstream Feb-Apr 6.8 0 765.7 9.66 204.7 133.3 0.1 7.2 46.8 7.2 0.219 18.23 5172 703
g23.104 Upstream Feb-Apr 6.6 1 753.7 13.01 95.7 62.4 0.04 7.24 0.755 1.3 0.0115 0.06 118.7 11
g23.106 Downstream Feb-Apr 9.1 1 764.9 8.62 231.1 150.2 0.11 7.28 39.7 7.8 0.256 12.09 16690 3350
g23.079 Upstream Nov-Jan 4.2 16 746.5 15.56 79.9 52 0.04 6.76 21.5 1.6 0.005 0.08 NA NA

Figure 3-3 Thirteen months of environmental data are graphed using a principal component analysis (PCA) and visualized by sample location. PC1 and PC2 together explain 68.5% of the variance. Based on the PCA, upstream sites are clearly separated from the downstream sites, which largely overlap.

Figure 3-4 Principal component analysis of the environmental data visualized by season. Seasons are not separated based on environmental data.
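The proportion of variance explained by each principal component, as reported for Figure 3-3, can be illustrated with a short Python sketch. This is not the analysis used in this study (which used “prcomp” in R); it is a stand-in that standardizes each variable (an assumption, since the scaling used in the thesis is not stated), builds the correlation matrix, and obtains its eigenvalues with a basic Jacobi rotation scheme. The function names are hypothetical. The example uses only six samples and four variables taken from Table 3-1, so its output will not reproduce the 47.6% and 20.9% reported for the full dataset.

```python
import math


def standardize(rows):
    """Center each column and scale it to unit sample variance."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / (n - 1))
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def covariance(rows):
    """Covariance matrix of already-centered data."""
    n, p = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(p)]
            for i in range(p)]


def jacobi_eigenvalues(matrix, sweeps=50, tol=1e-12):
    """Eigenvalues of a small symmetric matrix via cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle that zeroes the (p, q) entry.
                theta = 0.5 * math.atan2(2 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


def explained_variance(rows):
    """Fraction of total variance carried by each principal component."""
    eigenvalues = jacobi_eigenvalues(covariance(standardize(rows)))
    total = sum(eigenvalues)
    return [e / total for e in eigenvalues]


# Columns: DO (mg/L), specific conductivity (uS/cm), turbidity (NTU),
# dissolved chloride (mg/L); rows are samples g23.015, g23.016, g23.023,
# g23.024, g23.033 and g23.049 from Table 3-1.
samples = [
    [22.81, 106.7, 0.485, 0.54],
    [8.78, 289.8, 20.5, 13.0],
    [12.09, 112.7, 0.505, 1.5],
    [4.07, 312.4, 18.075, 13.0],
    [18.12, 311.6, 9.335, 12.0],
    [8.89, 337.8, 3.09, 15.0],
]
print([round(f, 3) for f in explained_variance(samples)])
```

Standardizing first keeps variables with large numeric ranges, such as coliform counts or conductivity, from dominating the components simply because of their units.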
Whether comparing spatial or temporal data, the OTU distributions in samples from the two downstream sites were very different from those upstream, but the downstream sites were indistinguishable from each other (Figure 3-5). These observations are consistent with others showing that diversity varies spatially and seasonally in aquatic virus communities (Angly et al., 2004; Short & Suttle, 2003; Payet & Suttle, 2014; Filippini, Buesing, & Gessner, 2008). Although OTU distributions varied both spatially and temporally, spatial variation outweighed temporal variation in this study. A few OTUs were abundant in all thirteen months within the upstream samples, but not at the downstream sites. This suggests higher variability in the OTU composition at the downstream sites, which is borne out by the results shown on the heatmap (Figure 3-5).

Since location had a stronger effect than season on viral community composition, indicator species analysis was used to determine if some OTUs could potentially be used as indicator taxa to differentiate the sample sites. Indicator species analysis selected 81 OTUs at alpha <0.05, from which 25 OTUs were selected that had specificity and sensitivity >70%. These OTUs were analyzed to determine if their phylogenetic affiliation was location specific. For example, g23 sequences associated with myoviruses infecting Enterobacteriaceae cluster into a distinct phylogenetic grouping, as do those infecting cyanobacteria (Filee et al., 2005). It is reasonable to speculate, for example, that phage infecting Enterobacteriaceae and cyanobacteria might be more prevalent in an environment with significant agricultural runoff.

Figure 3-5 OTUs that represented more than 5% of any sample are visualized in a heatmap. Both rows and columns of the heatmap are sorted based on the Bray-Curtis similarity of the OTUs. Monthly patterns can also be observed within the sample locations.
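The 70% filter applied to the candidate indicator OTUs can be sketched as follows, using the definitions given in section 3.2.7 (sensitivity as the proportion of positives correctly identified, specificity as the proportion of negatives correctly identified). This is an illustrative Python sketch under those definitions, not the indicator species package used in this study, whose internal statistics may be defined differently. The function names and the presence/absence data are hypothetical.

```python
def sensitivity_specificity(otu_present, in_target_site):
    """Treat OTU presence as a test for membership in the target site group.

    otu_present    -- one boolean per sample: was the OTU detected?
    in_target_site -- one boolean per sample: is the sample from the target site?
    """
    pairs = list(zip(otu_present, in_target_site))
    true_pos = sum(1 for present, target in pairs if present and target)
    false_neg = sum(1 for present, target in pairs if not present and target)
    true_neg = sum(1 for present, target in pairs if not present and not target)
    false_pos = sum(1 for present, target in pairs if present and not target)
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


def passes_filter(otu_present, in_target_site, threshold=0.70):
    """Keep an OTU only if both measures exceed the threshold."""
    sensitivity, specificity = sensitivity_specificity(otu_present, in_target_site)
    return sensitivity > threshold and specificity > threshold


# Hypothetical data: four upstream (target) samples, then four downstream samples.
is_upstream = [True, True, True, True, False, False, False, False]
otu_a = [True, True, True, False, False, False, True, False]   # mostly upstream
otu_b = [True, True, False, False, True, True, False, False]   # no site preference

print(sensitivity_specificity(otu_a, is_upstream), passes_filter(otu_a, is_upstream))
print(sensitivity_specificity(otu_b, is_upstream), passes_filter(otu_b, is_upstream))
```

Requiring both measures to exceed the threshold excludes OTUs that are common everywhere (high sensitivity, low specificity) as well as OTUs that appear in only a few samples of the target site (high specificity, low sensitivity).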
Figure 3-6 Phylogenetic analysis of g23 sequences from representative isolates and 25 OTUs from this study. Sample locations, dates and abundances of OTUs are shown on the maximum likelihood tree that was generated using the evolutionary placement algorithm (EPA) in RAxML v8.1.

From the phylogenetic analysis, the representative sequences from all 25 OTUs grouped within or near cyanophages (Figure 3-6). None of the OTUs fell within the sequences from phage infecting enterobacteria or other pathogens. It is not surprising that the sequences fall within the cyanophages, as their cyanobacterial hosts are widespread in pristine and polluted marine and fresh waters.

Although the downstream sites were clearly affected by agricultural influences based on the environmental data, none of the abundant OTUs were assigned to phage related to viruses that infect bacteria associated with fecal and other contamination. Moreover, phage within a sample site were not more closely related to each other than to phage from other sites, indicating that the different environments do not clearly select for evolutionarily distinct groups of bacteria. Yet, it was encouraging that the composition of the phage communities in the pristine environment was clearly distinct from that downstream of the agricultural inflow, suggesting that marker genes for phage communities can be used as bio-indicators of different environmental conditions.

3.4 Summary

The temporal and spatial distribution of T4-like myoviruses, as indicated by g23 sequences, was observed in a freshwater watershed at three different sample locations over a 13-month period. The environmental data showed clear differences between the relatively pristine upstream site and the agriculturally-influenced downstream sites. This corresponded to differences in the g23 OTU composition among sites, which likely reflect environmental differences selecting for different host communities for the viruses.
Although temporal changes in the OTU composition of the communities were less pronounced than the differences between the upstream and downstream sites, there were monthly variations in the OTU composition at each location. The most abundant OTUs at upstream and downstream sites were related to phage infecting cyanobacteria, whereas none were related to known viruses that infect pathogenic bacteria, such as enterobacteria. Nonetheless, the pronounced and predictable differences between pristine upstream viral communities and agriculturally influenced downstream communities hold promise for the use of g23 sequences as bio-indicators of pollution.

Chapter 4: Conclusion

Viruses are the most abundant biological entities on earth. However, much of their diversity, and how it varies spatially and temporally, is still unknown. Using targeted conserved genes to assess diversity has been a common practice for prokaryotes (16S rRNA gene) and protists (18S rRNA gene) (Field et al., 1988; Weisburg et al., 1991). Since viruses do not share a common conserved gene, various genes encoding conserved proteins, including DNA polymerase and capsid proteins, have been used to examine viral diversity within specific groups of viruses (Chénard & Suttle, 2008; Clasen & Suttle, 2009; Filee et al., 2005; Labonte et al., 2009). The development of next-generation sequencing capable of producing millions of sequences in a single run has enabled these genes to be surveyed to great depth. In this thesis I created new software and modified existing software to build a custom bioinformatics pipeline, tailored to the sequencing technology, for exploring the diversity of viruses in marine and freshwater environments.

In the second chapter, the diversity within three families of viruses, Phycodnaviridae, Myoviridae and Podoviridae, was investigated at five different sampling locations off the coast of British Columbia, Canada. Pyrosequencing was used to produce 351,870 raw reads.
After the data were processed through the bioinformatics pipeline, the results revealed greater diversity in these families than reported in previous studies (Clasen & Suttle, 2009; Filee et al., 2005; Labonte et al., 2009), and uncovered new phylogenetic groups. All families showed spatial variation, but sites that were geographically close did not necessarily have more similar diversity. Perhaps the most important contribution of this study is that it is the first to examine the diversity of three families of viruses within the same samples. The results showed different patterns of richness and evenness among the families across locations, implying spatial differences in the composition of the underlying community of host organisms for each of the families.

In the third chapter, the diversity of a single family of viruses, Myoviridae, was observed over a 13-month period at upstream, agriculturally influenced and downstream sampling sites along a stream. The results showed clear differences in OTU distribution at the upstream site relative to the two downstream sites, demonstrating clear spatial variation among the communities. Within each sample location, seasonal changes in OTU distribution were also present, with some OTUs being abundant for 3 months before declining while other OTUs became more abundant. A few dominant OTUs were found throughout the 13 months at the pristine site, but none persisted at the downstream sites. This shows that the composition of the family Myoviridae varies spatially and temporally in this freshwater watershed system. Indicator species analysis identified several OTUs that were associated with the pristine site, indicating the potential of g23 OTUs to be bio-indicators of environmental conditions.

This research added important knowledge to the poorly known world of viral diversity. For the first time, the taxonomic distribution of three families of viruses was examined across a range of spatially and environmentally separated samples.
As well, this is the largest effort to date to examine the diversity of myoviruses in a freshwater environment, revealing predictable spatial differences and strong seasonal dynamics.

As sequencing technology continues to improve, more sequences of higher quality and greater length should increase the quality of comparable data, allowing more environments and potentially other groups of viruses to be investigated. Combined with higher-resolution sampling of the virus and host communities, these data may reveal environmental drivers affecting the composition of viral communities, and thereby lead to powerful bio-indicators of environmental conditions.

Bibliography

Abascal, F., Zardoya, R., & Posada, D. (2005). ProtTest: Selection of best-fit models of protein evolution. Bioinformatics, 21(9), 2104–2105. doi:10.1093/bioinformatics/bti263
Ackermann, H.-W. (2007). 5500 Phages examined in the electron microscope. Archives of Virology, 152(2), 227–43. doi:10.1007/s00705-006-0849-1
Adriaenssens, E. M., & Cowan, D. A. (2014). Using signature genes as tools to assess environmental viral ecology and diversity. Applied and Environmental Microbiology, 80(15), 4470–80. doi:10.1128/AEM.00878-14
Amann, R. I., Ludwig, W., & Schleifer, K. H. (1995). Phylogenetic identification and in-situ detection of individual microbial cells without cultivation. Microbiol. Rev., 59(1), 143–169.
Andersson, A. F., & Banfield, J. F. (2008). Virus population dynamics and acquired virus resistance in natural microbial communities. Science, 320(5879), 1047–50. Retrieved from http://www.sciencemag.org/content/320/5879/1047.abstract
Angly, F. E., Felts, B., Breitbart, M., Salamon, P., Edwards, R. A., Carlson, C., … Rohwer, F. (2006). The marine viromes of four oceanic regions. PLoS Biol., 4(11), 2121–2131.
Balzer, S., Malde, K., & Jonassen, I. (2011). Systematic exploration of error sources in pyrosequencing flowgram data. Bioinformatics, 27(13), i304–9.
doi:10.1093/bioinformatics/btr251
Bench, S. R., Hanson, T. E., Williamson, K. E., Ghosh, D., Radosovich, M., Wang, K., … Wommack, K. E. (2007). Metagenomic characterization of Chesapeake Bay virioplankton. Applied and Environmental Microbiology, 73(23), 7629–7641. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/17921274
Bergh, O., Borsheim, K. Y., Bratbak, G., & Heldal, M. (1989). High abundance of viruses found in aquatic environments. Nature, 340(6233), 467–468.
Bettarel, Y., Sime-Ngando, T., Amblard, C., Carrias, J.-F., & Portelli, C. (2003). Virioplankton and microbial communities in aquatic systems: a seasonal study in two lakes of differing trophy. Freshw Biol, 48, 80–822.
Bolger, A. M., Lohse, M., & Usadel, B. (2014). Trimmomatic: A flexible trimmer for Illumina sequence data. Bioinformatics, 30(15), 2114–2120. doi:10.1093/bioinformatics/btu170
Bratbak, G., Heldal, M., Thingstad, T. F., Riemann, B., & Haslund, O. H. (1992). Incorporation of viruses into the budget of microbial C-transfer - a first approach. Mar Ecol Prog Ser, 83(2-3), 273–280.
Breitbart, M., Miyake, J. H., & Rohwer, F. (2004). Global distribution of nearly identical phage-encoded DNA sequences. FEMS Microbiology Letters, 236(2), 249–256. doi:10.1016/j.femsle.2004.05.042
Breitbart, M., & Rohwer, F. (2005). Here a virus, there a virus, everywhere the same virus? Trends Microbiol., 13(6), 278–284.
Breitbart, M., Salamon, P., Andresen, B., Mahaffy, J. M., Segall, A. M., Mead, D., … Rohwer, F. (2002). Genomic analysis of uncultured marine viral communities. Proceedings of the National Academy of Sciences of the United States of America, 99(22), 14250–14255. doi:10.1073/pnas.202488399
Brum, J. R., Steward, G. F., Jiang, S. C., & Jellison, R. (2005). Spatial and temporal variability of prokaryotes, viruses, and viral infections of prokaryotes in an alkaline, hypersaline lake. Aquat. Microb.
Ecol., 41(3), 247–260.
Brum, J., Schenck, R., & Sullivan, M. (2013). Global morphological analysis of marine viruses shows minimal regional variation and dominance of non-tailed viruses. The ISME Journal, 7(9), 1738–1751. doi:10.1038/ismej.2013.67
Brussaard, C. P. D. (2004). Optimization of procedures for counting viruses by flow cytometry. Appl. Environ. Microbiol., 70(3), 1506–1513.
Brussow, H., Canchaya, C., & Hardt, W. D. (2004). Phages and the evolution of bacterial pathogens: From genomic rearrangements to lysogenic conversion. Microbiol. Mol. Biol. Rev., 68(3), 560+.
Caporaso, J. G., Kuczynski, J., Stombaugh, J., Bittinger, K., Bushman, F. D., Costello, E. K., … Knight, R. (2010). QIIME allows analysis of high-throughput community sequencing data. Nature Methods, 7(5), 335–336. doi:10.1038/nmeth0510-335
Chen, F., & Suttle, C. A. (1995). Amplification of DNA polymerase gene fragments from viruses infecting microalgae. Appl. Environ. Microbiol., 61(4), 1274–1278.
Chen, F., Suttle, C. A., & Short, S. M. (1996). Genetic diversity in marine algal virus communities as revealed by sequence analysis of DNA polymerase genes. Appl. Environ. Microbiol., 62(8), 2869–2874.
Chen, F., Wang, K., Huang, S., Cai, H., Zhao, M., Jiao, N., & Wommack, K. E. (2009). Diverse and dynamic populations of cyanobacterial podoviruses in the Chesapeake Bay unveiled through DNA polymerase gene sequences. Environ. Microbiol., 11(11), 2884–2892.
Chénard, C., & Suttle, C. A. (2008). Phylogenetic diversity of sequences of cyanophage photosynthetic gene psbA in marine and freshwaters. Applied and Environmental Microbiology, 74(17), 5317–5324. doi:10.1128/AEM.02480-07
Chibani-Chennoufi, S., Bruttin, A., Dillmann, M. L., & Brussow, H. (2004). Phage-host interaction: an ecological perspective. J. Bacteriol., 186(12), 3677–3686.
doi:10.1128/JB.186.12.3677
Chow, C. E. T., & Fuhrman, J. A. (2012). Seasonality and monthly dynamics of marine myovirus communities. Environmental Microbiology, 14, 2171–2183. doi:10.1111/j.1462-2920.2012.02744.x
Clasen, J. L., Hanson, C. A., Ibrahim, Y., Weihe, C., Marston, M. F., & Martiny, J. B. H. (2013). Diversity and temporal dynamics of Southern California coastal marine cyanophage isolates. Aquatic Microbial Ecology, 69, 17–31. doi:10.3354/ame01613
Clasen, J. L., & Suttle, C. A. (2009). Identification of Freshwater Phycodnaviridae and Their Potential Phytoplankton Hosts, Using DNA pol Sequence Fragments and a Genetic-Distance Analysis. Appl. Environ. Microbiol., 75(4), 991–997.
Clerissi, C., Grimsley, N., Ogata, H., Hingamp, P., Poulain, J., & Desdevises, Y. (2014). Unveiling of the diversity of Prasinoviruses (Phycodnaviridae) in marine samples by using high-throughput sequencing analyses of PCR-amplified DNA polymerase and major capsid protein genes. Applied and Environmental Microbiology, 80(10), 3150–60. doi:10.1128/AEM.00123-14
Clokie, M. R. J., Millard, A. D., & Mann, N. H. (2010). T4 genes in the marine ecosystem: studies of the T4-like cyanophages and their role in marine ecology. Virology Journal, 7.
Comeau, A. M., & Krisch, H. M. (2008). The capsid of the T4 phage superfamily: The evolution, diversity, and structure of some of the most prevalent proteins in the biosphere. Molecular Biology and Evolution, 25(7), 1321–1332.
Comeau, A. M., Li, W. K. W., Tremblay, J.-E., Carmack, E. C., & Lovejoy, C. (2011). Arctic Ocean Microbial Community Structure before and after the 2007 Record Sea Ice Minimum. PLoS ONE, 6(11).
Comeau, A. M., Short, S., & Suttle, C. A. (2004). The use of degenerate-primed random amplification of polymorphic DNA (DP-RAPD) for strain-typing and inferring the genetic similarity among closely related viruses. J. Virol. Methods, 118(2), 95–100.
Consortium, T. U. (2014).
UniProt: a hub for protein information. Nucleic Acids Research, 43(October 2014), D204–D212. doi:10.1093/nar/gku989
Cottrell, M. T., & Suttle, C. A. (1995). Dynamics of a lytic virus infecting the photosynthetic marine picoflagellate Micromonas pusilla. Limnol. Oceanogr., 40(4), 730–739.
Craun, G. F., Brunkard, J. M., Yoder, J. S., Roberts, V. A., Carpenter, J., Wade, T., … Roy, S. L. (2010). Causes of outbreaks associated with drinking water in the United States from 1971 to 2006. Clinical Microbiology Reviews, 23(3), 507–528. doi:10.1128/CMR.00077-09
Culley, A. I., Asuncion, B. F., & Steward, G. F. (2009). Detection of inteins among diverse DNA polymerase genes of uncultivated members of the Phycodnaviridae. The ISME Journal, 3(4), 409–18. Retrieved from http://dx.doi.org/10.1038/ismej.2008.120
Culley, A. I., Lang, A. S., & Suttle, C. A. (2003). High diversity of unknown picorna-like viruses in the sea. Nature, 424(6952), 1054–1057.
Darriba, D., Taboada, G. L., & Posada, D. (2011). Supplementary material for “ProtTest 3: fast selection of best-fit models of protein evolution”. doi:10.1093/bioinformatics/btr088
De Cáceres, M., Legendre, P., & Moretti, M. (2010). Improving indicator species analysis by combining groups of sites. Oikos, 119(February), 1674–1684. doi:10.1111/j.1600-0706.2010.18334.x
DeLong, E. F., Preston, C. M., Mincer, T., Rich, V., Hallam, S. J., Frigaard, N. U., … Karl, D. M. (2006). Community genomics among stratified microbial assemblages in the ocean’s interior. Science, 311(5760), 496–503.
Demuth, J., Neve, H., & Witzel, K.-P. (1993). Direct electron microscopy study on the morphological diversity of bacteriophage populations in Lake Plubsee. Applied and Environmental Microbiology, 59(10), 3378–3384.
Djikeng, A., Kuzmickas, R., Anderson, N. G., & Spiro, D. J. (2009).
Metagenomic Analysis of RNA Viruses in a Fresh Water Lake. PLoS ONE, 4(9).
Edgar, R. C. (2010). Search and clustering orders of magnitude faster than BLAST. Bioinformatics, 26(19), 2460–2461. doi:10.1093/bioinformatics/btq461
Edwards, R. A., & Rohwer, F. (2005). Viral metagenomics. Nat. Rev. Microbiol., 3(6), 504–510.
Fancello, L., Trape, S., Robert, C., Boyer, M., Popgeorgiev, N., Raoult, D., & Desnues, C. (2012). Viruses in the desert: a metagenomic survey of viral communities in four perennial ponds of the Mauritanian Sahara. The ISME Journal, 359–369. doi:10.1038/ismej.2012.101
Field, K., Olsen, G., & Lane, D. (1988). Molecular Phylogeny of the Animal Kingdom. Science.
Filee, J., Tetart, F., Suttle, C. A., & Krisch, H. M. (2005). Marine T4-type bacteriophages, a ubiquitous component of the dark matter of the biosphere. Proc. Natl. Acad. Sci. U.S.A., 102(35), 12471–12476.
Filippini, M., Buesing, N., & Gessner, M. O. (2008). Temporal dynamics of freshwater bacterio- and virioplankton along a littoral-pelagic gradient. Freshwater Biology, 53, 1114–1125. doi:10.1111/j.1365-2427.2007.01886.x
Fuhrman, J. A. (1999). Marine viruses and their biogeochemical and ecological effects. Nature, 399(6736), 541–548.
Fuhrman, J. A., & Schwalbach, M. (2003). Viral influence on aquatic bacterial communities. Biol. Bull. (Woods Hole), 204(2), 192–195.
Fuhrman, J. A., Steele, J. A., Schwalbach, M. S., Hewson, I., … Brown, J. H. (2008). A latitudinal diversity gradient in planktonic marine bacteria. Proceedings of the National Academy of Sciences of the United States of America, 105(22), 7774–8. doi:10.1073/pnas.0803070105
Galand, P. E., Casamayor, E. O., Kirchman, D. L., & Lovejoy, C. (2009). Ecology of the rare microbial biosphere of the Arctic Ocean.
Proceedings of the National Academy of Sciences of the United States of America, 106(52), 22427–32. Retrieved from http://www.pnas.org/cgi/content/abstract/106/52/22427
Garcia-Heredia, I., Martin-Cuadrado, A.-B., Mojica, F. J. M., Santos, F., Mira, A., Antón, J., & Rodriguez-Valera, F. (2012). Reconstructing Viral Genomes from the Environment Using Fosmid Clones: The Case of Haloviruses. PLoS ONE, 7(3), e33802. Retrieved from http://dx.plos.org/10.1371/journal.pone.0033802
Gilmour, M. W., Graham, M., Van Domselaar, G., Tyler, S., Kent, H., Trout-Yakel, K. M., … Nadon, C. (2010). High-throughput genome sequencing of two Listeria monocytogenes clinical isolates during a large foodborne outbreak. BMC Genomics, 11, 120. doi:10.1186/1471-2164-11-120
Gimenes, M. V., Zanotto, P. M. de A., Suttle, C. A., da Cunha, H. B., & Mehnert, D. U. (2012). Phylodynamics and movement of Phycodnaviruses among aquatic environments. The ISME Journal, 6(2), 237–247.
Gobler, C. J., Hutchins, D. A., Fisher, N. S., Cosper, E. M., & Sanudo-Wilhelmy, S. A. (1997). Release and bioavailability of C, N, P, Se, and Fe following viral lysis of a marine chrysophyte. Limnol. Oceanogr., 42(7), 1492–1504.
Gustavsen, J. A., Winget, D. M., Tian, X., & Suttle, C. A. (2014). High temporal and spatial diversity in marine RNA viruses implies that they have an important role in mortality and structuring plankton communities. Frontiers in Microbiology, 5(December), 1–13. doi:10.3389/fmicb.2014.00703
Hawley, A. K., Brewer, H. M., Norbeck, A. D., Paša-Tolić, L., & Hallam, S. J. (2014). Metaproteomics reveals differential modes of metabolic coupling among ubiquitous oxygen minimum zone microbes. Proceedings of the National Academy of Sciences, 111(31), 11395–11400. doi:10.1073/pnas.1322132111
Hemminga, M. A., Vos, W. L., Nazarov, P. V., Koehorst, R. B. M., Wolfs, C. J. A. M., Spruijt, R. B., & Stopar, D. (2010). Viruses: incredible nanomachines.
New advances with filamentous phages. European Biophysics Journal, 39(4), 541–50. Retrieved from http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2841255&tool=pmcentrez&rendertype=abstract
Huang, S., Wilhelm, S. W., Jiao, N., & Chen, F. (2010). Ubiquitous cyanobacterial podoviruses in the global oceans unveiled through viral DNA polymerase gene sequences. The ISME Journal, 4(10), 1243–1251.
Jacquet, S., Heldal, M., Iglesias-Rodriguez, D., Larsen, A., Wilson, W., & Bratbak, G. (2002). Flow cytometric analysis of an Emiliana huxleyi bloom terminated by viral infection. Aquat. Microb. Ecol., 27, 111–124.
Jarvis, A. W. (1984). Differentiation of lactic streptococcal phages into phage species by DNA-DNA homology. Appl. Environ. Microbiol., 47(2), 343–349.
Jia, Z., Ishihara, R., Nakajima, Y., Asakawa, S., & Kimura, M. (2007). Molecular characterization of T4-type bacteriophages in a rice field. Environ. Microbiol., 9(4), 1091–1096.
Jiang, S. C., & Paul, J. H. (1998). Significance of lysogeny in the marine environment: Studies with isolates and a model of lysogenic phage production. Microb. Ecol., 35(3), 235–243.
Jürgens, K., Pernthaler, J., Schalla, S., & Amann, R. (1999). Morphological and compositional changes in a planktonic bacterial community in response to enhanced protozoan grazing. Applied and Environmental Microbiology, 65(3), 1241–1250.
Katoh, K., Misawa, K., Kuma, K., & Miyata, T. (2002). MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Research, 30(14), 3059–3066. doi:10.1093/nar/gkf436
Katoh, K., & Standley, D. M. (2013). MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Molecular Biology and Evolution, 30(4), 772–780. doi:10.1093/molbev/mst010
Kristensen, D., Mushegian, A., Dolja, V., & Koonin, E. (2012). New dimensions of the virus world discovered through metagenomics. Changes, 29(1), 997–1003.
doi:10.1016/j.biotechadv.2011.08.021
Kunin, V., Engelbrektson, A., Ochman, H., & Hugenholtz, P. (2010). Wrinkles in the rare biosphere: pyrosequencing errors can lead to artificial inflation of diversity estimates. Environmental Microbiology, 12(1), 118–23. Retrieved from http://www3.interscience.wiley.com/journal/122580005/abstract
Labonte, J. M., Reid, K. E., & Suttle, C. A. (2009). Phylogenetic Analysis Indicates Evolutionary Diversity and Environmental Segregation of Marine Podovirus DNA Polymerase Gene Sequences. Appl. Environ. Microbiol., 75, 3634–3640.
Logares, R., Sunagawa, S., Salazar, G., Cornejo-Castillo, F. M., Ferrera, I., Sarmento, H., … Acinas, S. G. (2013). Metagenomic 16S rDNA Illumina tags are a powerful alternative to amplicon sequencing to explore diversity and structure of microbial communities. Environmental Microbiology, 16, 2659–2671. doi:10.1111/1462-2920.12250
Marston, M. F., & Sallee, J. L. (2003). Genetic diversity and temporal variation in the cyanophage community infecting marine Synechococcus species in Rhode Island’s coastal waters. Appl. Environ. Microbiol., 69(8), 4639–4647. doi:10.1128/AEM.69.8.4639-4647.2003
Marston, M. F., Taylor, S., Sme, N., Parsons, R. J., Noyes, T. J. E., & Martiny, J. B. H. (2013). Marine cyanophages exhibit local and regional biogeography. Environmental Microbiology, 15, 1452–1463. doi:10.1111/1462-2920.12062
Martinez, J. M., Schroeder, D. C., Larsen, A., Bratbak, G., & Wilson, W. H. (2007). Molecular dynamics of Emiliania huxleyi and cooccurring viruses during two separate mesocosm studies. Appl. Environ. Microbiol., 73(2), 554–562.
McMurdie, P. J., & Holmes, S. (2013). Phyloseq: An R Package for Reproducible Interactive Analysis and Graphics of Microbiome Census Data. PLoS ONE, 8(4). doi:10.1371/journal.pone.0061217
Middelboe, M., Hagstrom, A., Blackburn, N., Sinn, B., Fischer, U., Borch, N. H., … Lorenz, M. G. (2001).
Effects of bacteriophages on the population dynamics of four strains of pelagic marine bacteria. Microb. Ecol., 42(3), 395–406.
Middelboe, M., & Lyck, P. G. (2002). Regeneration of dissolved organic matter by viral lysis in marine microbial communities. Aquat. Microb. Ecol., 27(1), 187–194. doi:10.3354/ame027187
Moineau, S., Pandian, S., & Klaenhammer, T. R. (1994). Evolution of a lytic bacteriophage via DNA acquisition from the Lactococcus lactis chromosome. Applied and Environmental Microbiology, 60(6), 1832–1841.
Ogunseitan, O. A., Sayler, G. S., & Miller, R. V. (1992). Application of DNA probes to analysis of bacteriophage distribution patterns in the environment. Appl. Environ. Microbiol., 58(6), 2046–2052.
Oksanen, A. J., Blanchet, F. G., Kindt, R., Legendre, P., Minchin, P. R., O'Hara, R. B., … Wagner, H. (2015). vegan: Community Ecology Package. R package version 2.2-1. http://CRAN.R-project.org/package=vegan.
Pagarete, A., Chow, C. E. T., Johannessen, T., Fuhrman, J. A., Thingstad, T. F., & Sandaa, R. A. (2013). Strong seasonality and interannual recurrence in marine myovirus communities. Applied and Environmental Microbiology, 79(August), 6253–6259. doi:10.1128/AEM.01075-13
Park, Y., Lee, K., Lee, Y. S., Kim, S. W., & Choi, T.-J. (2011). Detection of diverse marine algal viruses in the South Sea regions of Korea by PCR amplification of the DNA polymerase and major capsid protein genes. Virus Research, 159(1), 43–50.
Poorvin, L., Rinta-Kanto, J. M., Hutchins, D. A., & Wilhelm, S. W. (2004). Viral release of iron and its bioavailability to marine plankton. Limnol Oceanogr, 49(5), 1734–1741.
Proctor, L. M., & Fuhrman, J. A. (1990). Viral mortality of marine bacteria and cyanobacteria. Nature, 343, 60–62.
R Core Team. (2014). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/.
Reeder, J., & Knight, R. (2010).
Rapid denoising of pyrosequencing amplicon data: exploiting the rank-abundance distribution. Nature Methods, 7(9), 668–669. doi:10.1038/nmeth0910-668b
Rho, M., Tang, H., & Ye, Y. (2010). FragGeneScan: Predicting genes in short and error-prone reads. Nucleic Acids Research, 38(20), 1–12. doi:10.1093/nar/gkq747
Riemann, L., & Middelboe, M. (2002). Stability of bacterial and viral community compositions in Danish coastal waters as depicted by DNA fingerprinting techniques. Aquat. Microb. Ecol., 27(2), 219–232. doi:10.3354/ame027219
Rodriguez-Brito, B., Li, L., Wegley, L., Furlan, M., Angly, F., Breitbart, M., … Rohwer, F. (2010). Viral and microbial community dynamics in four aquatic environments. The ISME Journal, 4(6), 739–751.
Rodriguez-Valera, F., Martin-Cuadrado, A.-B., Rodriguez-Brito, B., Pasić, L., Thingstad, T. F., Rohwer, F., & Mira, A. (2009). Explaining microbial population genomics through phage predation. Nature Reviews Microbiology, 7(11), 828–36. doi:10.1038/nrmicro2235
Rosario, K., Nilsson, C., Lim, Y. W., Ruan, Y., & Breitbart, M. (2009). Metagenomic analysis of viruses in reclaimed water. Environmental Microbiology, 11(11), 2806–2820.
Sandaa, R. A., & Larsen, A. (2006). Seasonal variations in virus-host populations in Norwegian coastal waters: Focusing on the cyanophage community infecting marine Synechococcus spp. Appl. Environ. Microbiol., 72(7), 4610–4618.
Sanger, F., Nicklen, S., & Coulson, A. R. (1977). DNA Sequencing With Chain-Terminating Inhibitors. Proceedings of the National Academy of Sciences, 74(12), 5463–5467.
Schoenfeld, T., Liles, M., Wommack, K. E., Polson, S. W., & Mead, D. (2010). Functional viral metagenomics and the next generation of molecular tools. Trends in Microbiology, 18(1), 1–19. doi:10.1016/j.tim.2009.10.001
Schoenfeld, T., Patterson, M., Richardson, P. M., Wommack, K. E., Young, M., & Mead, D. (2008). Assembly of viral metagenomes from Yellowstone hot springs. Appl. Environ.
Microbiol., 74(13), 4164–4174.
Shelford, E. J., Middelboe, M., Møller, E. F., & Suttle, C. A. (2012). Virus-driven nitrogen cycling enhances phytoplankton growth. Aquatic Microbial Ecology, 66, 41–46. doi:10.3354/ame01553
Short, S. M., & Short, C. M. (2008). Diversity of algal viruses in various North American freshwater environments. Aquat. Microb. Ecol., 51(1), 13–21.
Short, S. M., & Short, C. M. (2009). Quantitative PCR reveals transient and persistent algal viruses in Lake Ontario, Canada. Environ. Microbiol., 11(10), 2639–2648.
Short, S. M., & Suttle, C. A. (1999). Use of the polymerase chain reaction and denaturing gradient gel electrophoresis to study diversity in natural virus communities. Hydrobiologia, 401, 19–32.
Short, S. M., & Suttle, C. A. (2002). Sequence analysis of marine virus communities reveals that groups of related algal viruses are widely distributed in nature. Appl. Environ. Microbiol., 68(3), 1290–1296.
Short, S. M., & Suttle, C. A. (2003). Temporal dynamics of natural communities of marine algal viruses and eukaryotes. Aquat. Microb. Ecol., 32(2), 107–119.
Smith, V. H. (2007). Microbial diversity-productivity relationships in aquatic ecosystems. FEMS Microbiology Ecology, 62(2), 181–186. doi:10.1111/j.1574-6941.2007.00381.x
Stamatakis, A. (2014). RAxML version 8: A tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics, 30, 1312–1313. doi:10.1093/bioinformatics/btu033
Steward, G. F., Montiel, J. L., & Azam, F. (2000). Genome size distributions indicate variability and similarities among marine viral assemblages from diverse environments. Limnol. Oceanogr., 45(8), 1697–1706. doi:10.4319/lo.2000.45.8.1697
Sullivan, M. B., Waterbury, J. B., & Chisholm, S. W. (2003). Cyanophages infecting the oceanic cyanobacterium Prochlorococcus. Nature, 424(6952), 1047–1051.
Suttle, C. A. (2005). Viruses in the sea.
Nature, 437(7057), 356–361. doi:10.1038/nature04160
Suttle, C. A. (2007). Marine viruses - major players in the global ecosystem. Nature Reviews Microbiology, 5(10), 801–812. doi:10.1038/nrmicro1750
Suttle, C. A., & Chan, A. M. (1990). Marine cyanophages infecting oceanic and coastal strains of Synechococcus: abundance, morphology, cross-infectivity and growth characteristics.
Suttle, C. A., Chan, A. M., & Cottrell, M. T. (1991). Use of Ultrafiltration to Isolate Viruses From Seawater which are Pathogens of Marine Phytoplankton. Appl. Environ. Microbiol., 57(3), 721–726. Retrieved from http://aem.asm.org/content/57/3/721.abstract
Tarutani, K., Nagasaki, K., & Yamaguchi, M. (2000). Viral impacts on total abundance and clonal composition of the harmful bloom-forming phytoplankton Heterosigma akashiwo. Appl. Environ. Microbiol., 66(11), 4916–4920.
Thingstad, T. F. (1998). A theoretical approach to structuring mechanisms in the pelagic food web. Hydrobiologia, 363, 59–72.
Thingstad, T. F., & Lignell, R. (1997). Theoretical models for the control of bacterial growth rate, abundance, diversity and carbon demand. Aquat. Microb. Ecol., 13(1), 19–27.
Torrella, F., & Morita, R. Y. (1979). Evidence by Electron Micrographs for a High Incidence of Bacteriophage Particles in the Waters of Yaquina Bay, 37(4), 774–778.
Toze, S. (1999). PCR and the detection of microbial pathogens in water and wastewater. Water Research, 33(17), 3545–3556. doi:10.1016/S0043-1354(99)00071-8
Venter, J. C., Remington, K., Heidelberg, J. F., Halpern, A. L., Rusch, D., Eisen, J. A., … Smith, H. O. (2004). Environmental genome shotgun sequencing of the Sargasso Sea. Science, 304(5667), 66–74.
Wang, K., & Chen, F. (2008). Prevalence of highly host-specific cyanophages in the estuarine environment.
Environ Microbiol, 10(2), 300–312.
Waterbury, J. B., & Valois, F. W. (1993). Resistance to cooccurring phages enables marine Synechococcus communities to coexist with cyanophages abundant in seawater. Appl. Environ. Microbiol., 59(10), 3393–3399.
Weinbauer, M. G., Bonilla-Findji, O., Chan, A. M., Dolan, J. R., Short, S. M., Šimek, K., … Suttle, C. A. (2011). Synechococcus growth in the ocean may depend on the lysis of heterotrophic bacteria. Journal of Plankton Research, 33, 1465–1476. doi:10.1093/plankt/fbr041
Weinbauer, M. G. (2004). Ecology of prokaryotic viruses. FEMS Microbiol. Rev., 28(2), 127–181. doi:10.1016/j.femsre.2003.08.001
Weinbauer, M. G., Winter, C., & Hofle, M. G. (2002). Reconsidering transmission electron microscopy based estimates of viral infection of bacterio-plankton using conversion factors derived from natural communities. Aquat. Microb. Ecol., 27(2), 103–110.
Weisburg, W. G., Barns, S. M., Pelletier, D. A., & Lane, D. J. (1991). 16S ribosomal DNA amplification for phylogenetic study. J. Bacteriol., 173(2), 697–703.
Wells, L. E., & Deming, J. W. (2006). Significance of bacterivory and viral lysis in bottom waters of Franklin Bay, Canadian Arctic, during winter. Aquat. Microb. Ecol., 43(3), 209–221.
Whelan, S., & Goldman, N. (2001). A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Molecular Biology and Evolution, 18, 691–699. doi:10.1093/oxfordjournals.molbev.a003851
Wichels, A., Biel, S. S., Gelderblom, H. R., Brinkhoff, T., Muyzer, G., & Schutt, C. (1998). Bacteriophage diversity in the North Sea. Appl. Environ. Microbiol., 64(11), 4128–4133.
Wilhelm, S. W., & Suttle, C. A. (1999a). Viruses and nutrient cycles in the sea, 49(10), 781–788.
Wilhelm, S. W., & Suttle, C. A. (1999b).
Viruses and Nutrient Cycles in the Sea aquatic food webs, (October).
Williamson, S. J., Rusch, D. B., Yooseph, S., Halpern, A. L., Heidelberg, K. B., et al. (2008). The Sorcerer II Global Ocean Sampling Expedition: Metagenomic characterization of viruses within aquatic microbial samples. PLoS ONE, 3(1).
Wilson, W. H., Fuller, N. J., Joint, I. R., Mann, N. H., Bell, C. R., Brylinsky, M., & Johnson-Green, P. (1999). Analysis of cyanophage diversity in the marine environment using denaturing gradient gel electrophoresis. Halifax, Canada.
Wilson, W. H., Schroeder, D. C., Allen, M. J., Holden, M. T. G., Parkhill, J., Barrell, B. G., … Ghazal, P. (2005). Complete genome sequence and lytic phase transcription profile of a Coccolithovirus. Science, 309(5737), 1090–1092.
Winget, D. M., & Wommack, K. E. (2008). Randomly amplified polymorphic DNA (RAPD)-PCR as a tool for assessment of marine viral richness. Appl Environ Microbiol, 74, 2612–2618.
Winter, C., Herndl, G. J., & Weinbauer, M. G. (2004). Diel cycles in viral infection of bacterioplankton in the North Sea. Aquat. Microb. Ecol., 35(3), 207–216.
Winter, C., Matthews, B., & Suttle, C. A. (2013). Effects of environmental variation and spatial distance on bacteria, archaea and viruses in sub-polar and arctic waters. The ISME Journal, 7(8), 1507–18. doi:10.1038/ismej.2013.56
Wommack, K. E., & Colwell, R. R. (2000). Virioplankton: Viruses in aquatic ecosystems. Microbiol. Mol. Biol. Rev., 64(1), 69–114. doi:10.1128/MMBR.64.1.69-114.2000
Wommack, K. E., Ravel, J., Hill, R. T., Chun, J., & Colwell, R. R. (1999). Population dynamics of Chesapeake Bay virioplankton: Total-community analysis by pulsed-field gel electrophoresis. Appl Environ Microbiol, 65(1), 231–240.
Wommack, K. E., Ravel, J., Hill, R. T., & Colwell, R. R. (1999). Hybridization analysis of Chesapeake Bay virioplankton. Appl Environ Microbiol, 65(1), 241–250.
Zaikova, E., Walsh, D. A., Stilwell, C. P., Mohn, W. W., Tortell, P. D., & Hallam, S. J. (2010). Microbial community dynamics in a seasonally anoxic fjord: Saanich Inlet, British Columbia. Environmental Microbiology, 12(1), 172–91. Retrieved from http://www.ncbi.nlm.nih.gov/pubmed/19788414
