UBC Theses and Dissertations


Structure and function of tomato ringspot virus RNA1 and RNA2 Rott, Michael E. 1993

Full Text

STRUCTURE AND FUNCTION OF TOMATO RINGSPOT VIRUS RNA1 AND RNA2

by

MICHAEL ERIC ROTT

B.Sc., The University of British Columbia, 1986

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

in

THE FACULTY OF GRADUATE STUDIES
(Department of Plant Science)

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA

February 1993

© Michael Eric Rott, 1993

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

(Signature)

Department of Plant Science
The University of British Columbia
Vancouver, Canada

Abstract

Complementary DNA (cDNA) to tomato ringspot nepovirus (TomRSV) RNA1 and RNA2 was synthesized and cloned. Overlapping cDNA clones corresponding to over 99% of the TomRSV genome were obtained and used to determine the nucleotide sequence of RNA1 and RNA2. The 5' termini of RNA1 and RNA2, which were not present in any cDNA clone analyzed, were determined by using viral RNA as a template in dideoxynucleotide sequencing reactions.
TomRSV RNA1 is 8,214 nucleotides (nt) in length, excluding the 3' poly(A) tail, and contains a single long open reading frame (ORF) which accounts for 80% of the nucleotide sequence and has the capacity to encode a polyprotein of 244 kDa. Comparison of the deduced TomRSV RNA1-encoded polyprotein sequence with those encoded by the RNA1 components of the nepoviruses tomato blackring (TBRV), grapevine chrome mosaic (GCMV) and grapevine fanleaf (GFLV), the B RNA component of the plant comovirus cowpea mosaic virus (CPMV), and the genomic RNAs of the plant potyvirus tobacco etch (TEV) and the animal poliovirus identified a similarly ordered set of conserved amino acid sequence domains in common with these other viruses. The putative functions of the conserved amino acid sequence domains are, in order from the N-terminus, a protease co-factor, an NTP-binding domain, a VPg, a protease, and an RNA-dependent RNA polymerase. Possible cleavage sites have been identified between each of the conserved domains which would release each of the functional domains from the large polyprotein. TomRSV RNA2 is 7,273 nt in length, excluding the 3' poly(A) tail, and contains a single long ORF, accounting for 78% of the nucleotide sequence, with the capacity to encode a polyprotein of 207 kDa. The TomRSV coat protein gene was localized to the 3' end of the ORF by comparison of the TomRSV coat protein amino acid composition to the amino acid composition of all regions of the deduced RNA2-encoded polyprotein sequence and by comparisons with the amino acid sequences of the TBRV, GCMV and GFLV coat proteins, which are encoded by their respective RNA2 components. Sequences potentially involved in viral cell-to-cell movement were also localized on the TomRSV RNA2 polyprotein sequence, N-terminal of the putative coat protein sequence that may also include a set of three tandem repeats of 38 to 53 amino acids.
Extensive nucleotide sequence similarity, accounting for almost 35% of the total TomRSV genomic RNA, was observed between the 3' termini and between the 5' termini of TomRSV RNA1 and RNA2. Eighty-eight percent of the 5' terminal 907 nt of TomRSV RNA1 and RNA2 are identical and include both coding and noncoding sequences, while the 3' terminal 1,533 nt of RNA1 and RNA2 are identical and noncoding. It is possible that the similar sequences at both ends of TomRSV RNA1 and RNA2 are a result of recombination between these two genomic RNA components during viral replication. Examination of potential translation initiation sites on TomRSV RNA identified two in-frame AUG triplets at positions 78 and 441. Expression from both sites was assayed in vitro and in protoplasts, which identified AUG78 as the site of translation initiation. Full-length cDNA clones corresponding to TomRSV RNA1 and RNA2 were constructed and fused to the bacteriophage T7 RNA polymerase promoter. RNA transcripts synthesized from these constructs in vitro using T7 RNA polymerase were tested for infectivity by inoculation onto Chenopodium amaranticolor and assaying for viral symptoms.
Transcripts were found to be noninfectious.

Table of Contents

Abstract
List of Tables
List of Figures
List of Abbreviations
Acknowledgements
Introduction
  1.1 Single-Stranded Positive-Sense RNA Viruses
    1.1.1 General Overview
    1.1.2 Viral Gene Functions
    1.1.3 Viral RNA Regulatory and Recognition Sequences
    1.1.4 Classification
    1.1.5 Evolution
  1.2 Nepoviruses
  1.3 Tomato Ringspot Virus
  1.4 Research Objectives
Materials and Methods
  2.1 TomRSV Propagation and Purification
  2.2 Virion RNA Extraction
    2.2.1 Denaturing Agarose Gel Electrophoresis of RNA
  2.3 Construction of Complementary DNA Clones Corresponding to TomRSV RNA1 and RNA2
    2.3.1 Synthesis of Complementary DNA
    2.3.2 Cloning of Tailed cDNA
    2.3.3 Cloning of Blunt-Ended cDNA
    2.3.4 Polymerase Chain Reaction (PCR) Method
  2.4 Colony Filter Hybridization
  2.5 Northern Blots
    2.5.1 Northern Blots using Diazotized Paper
    2.5.2 Northern Blots using ZetaProbe™ Membrane
  2.6 Southern Blots
  2.7 Nucleic Acid Probes
    2.7.1 Nick-Translated Probes
    2.7.2 Random Primed cDNA Probes
    2.7.3 RNA Probes
  2.8 Sequencing
    2.8.1 Preparation of Subclones for Sequencing
    2.8.2 DNA and RNA Sequencing
    2.8.3 Sequence Analysis
  2.9 In Vitro Translation Analysis
    2.9.1 In Vitro Transcriptions
    2.9.2 Translations
    2.9.3 Sodium Dodecyl Sulfate Polyacrylamide Gel Electrophoresis (SDS-PAGE)
  2.10 Transient Expression of cDNA Constructs in Protoplasts
    2.10.1 Site-Directed Mutagenesis
    2.10.2 Transfection of Protoplasts
    2.10.3 GUS Assays
Results
  3.1 Synthesis and Mapping of cDNA Clones to TomRSV RNA1 and RNA2
  3.2 Nucleotide Sequence Similarity Between RNA1 and RNA2
  3.3 Sequence and Primary Structure of RNA1 and RNA2
  3.4 Nucleotide Sequence Comparisons Between RNA1 and RNA2
  3.5 Nucleotide Sequence Repetition Within RNA2
  3.6 Open Reading Frames
  3.7 Noncoding Regions
    3.7.1 5' Noncoding Region
    3.7.2 3' Noncoding Region
  3.8 Analysis of the RNA2 Coding Regions
    3.8.1 Location of the Putative Coat Protein Coding Region
    3.8.2 Analysis of the Protease Cleavage Site for the Coat Protein Sequence
    3.8.3 5' Terminal Coding Region
  3.9 Amino Acid Sequence Similarity Between TomRSV RNA1 and RNA2 Polyproteins
  3.10 Analysis of the TomRSV RNA1 Coding Region
    3.10.1 RNA-Dependent RNA Polymerase
    3.10.2 Cysteine Protease
    3.10.3 NTP-Binding Domain
    3.10.4 VPg
    3.10.5 Protease Co-Factor
    3.10.6 N-Terminal Region
    3.10.7 Proteolytic Processing of the RNA1-encoded Polyprotein
  3.11 Genomic Organization of TomRSV RNA1 and RNA2
  3.12 Translation Initiation of TomRSV RNA
    3.12.1 Plasmid Constructs
    3.12.2 Site of Initiation In Vitro
    3.12.3 Site of Initiation in Protoplasts
  3.13 Synthesis of Full-Length cDNA Clones to TomRSV RNA
    3.13.1 Synthesis of a Full-Length Clone of TomRSV RNA2
    3.13.2 Synthesis of a Full-Length Clone of TomRSV RNA1
    3.13.3 In Vitro Translation of pMRD14 and pMR10
    3.13.4 Inoculation of RNA Transcripts onto Plants
Discussion
  4.1 Structure and Function of TomRSV
    4.1.1 Structure and Function of TomRSV RNA1
      4.1.1.1 Cysteine Protease
      4.1.1.2 NTP-Binding Domain
      4.1.1.3 RNA-Dependent RNA Polymerase Domain
      4.1.1.4 Protease Co-Factor
      4.1.1.5 VPg
      4.1.1.6 N-Terminal Coding Region
    4.1.2 Structure and Function of TomRSV RNA2
      4.1.2.1 Coat Protein
      4.1.2.2 Movement Protein
      4.1.2.3 N-Terminal Amino Acid Sequences
    4.1.3 Polyprotein Processing
  4.2 Nucleotide Sequence Similarity Between RNA1 and RNA2
  4.3 Amino Acid Sequence Similarity Between TomRSV RNA1 and RNA2
  4.4 RNA Recombination
  4.5 Translation Initiation
  4.6 Full-Length cDNA Clones of RNA1 and RNA2
Bibliography

List of Tables

1. Nucleotide composition of 5' and 3' noncoding regions
2. Amino acid composition analysis of the TomRSV coat protein
3. Spectrophotometric determination of GUS activity for plasmid constructs in triplicate
4. Sizes of selected nepovirus RNA2 components

List of Figures

1.1 Genomic organization of members of the picornavirus-like supergroup
1.2 Genomic organization of members of the Sindbisvirus-like supergroup
3.1 Partial restriction enzyme map of TomRSV cDNA clones
3.2 Northern blot of TomRSV virion RNA probed with sense and antisense RNA transcripts
3.3 Northern blot of TomRSV virion RNA probed with selected TomRSV cDNA clones
3.4 Southern hybridization analysis of TomRSV RNA1 and RNA2 clones
3.5 Diagrammatic representation of hybridization tests shown in Fig. 3.4
3.6 Strategy used to sequence RNA1 and RNA2
3.7 Nucleotide sequence and amino acid translation of TomRSV RNA1
3.8 Nucleotide sequence and amino acid translation of TomRSV RNA2
3.9 Nucleotide sequence alignment of the 3' termini of TomRSV RNA1 and RNA2
3.10 Nucleotide sequence alignment of the 5' termini of TomRSV RNA1 and RNA2
3.11 Alignment of the three tandemly repeated nucleotide sequences within RNA2
3.12 Map of TomRSV RNA1 open reading frames
3.13 Map of TomRSV RNA2 open reading frames
3.14 Diagrammatic representation of the coding and noncoding regions of RNA1 and RNA2
3.15 Nucleotide sequence context for the first four AUG triplets in TomRSV RNA1
3.16 Analysis of the first 800 nt of RNA2 using "TESTCODE"
3.17 Analysis of the first 800 nt of RNA2 by the "absolute positional base preference" method
3.18 Nucleotide sequence similarity between the 3' noncoding regions of TomRSV and CLRV
3.19 Comparison of the TomRSV coat protein amino acid composition and the amino acid composition of the RNA2 polyprotein sequence
3.20 Alignment of the amino acid sequences of the TomRSV, TBRV, GCMV and GFLV coat proteins
3.21 Amino acid sequence alignment of regions N-terminal to the TomRSV and GFLV coat proteins
3.22 Alignment of the amino acid sequences of the three tandem repeats in the TomRSV RNA2 polyprotein sequence
3.23 Amino acid sequence alignment of the N-terminal regions of the TomRSV RNA1 and RNA2 polyproteins
3.24 Amino acid sequence alignment of the RNA-dependent RNA polymerase domains
3.25 Amino acid sequence alignment of the viral cysteine protease domains
3.26 Amino acid sequence alignment of the "A" and "B" sites that form the NTP-binding domain
3.27 Amino acid sequence alignment of the protease co-factor domains
3.28 Amino acid sequence alignment of the N-terminal regions of TomRSV RNA1 and RNA2 polyproteins and the TBRV and GCMV RNA1 polyproteins
3.29 Analysis of the TomRSV RNA1 polyprotein sequence for protease cleavage sites
3.30 Genomic organization of TomRSV RNA1 and RNA2
3.31 Diagram showing the construction of pMRD14A
3.32 Diagram summarizing the construction of p35SGUSO4
3.33 Summary of relevant sequences in mutant constructs used to identify the translation initiation site in TomRSV RNA
3.34 In vitro translation product of synthetic RNA transcripts generated from pMRD14A
3.35 Graph showing relative rates of GUS activity for plasmid constructs in protoplasts
3.36 Construction of a full-length TomRSV RNA2 cDNA clone
3.37 Construction of a full-length TomRSV RNA1 cDNA clone
3.38 In vitro translation of synthetic RNA transcripts corresponding to TomRSV RNA1 and RNA2
4.1 Genomic organization of TomRSV compared to other members of the picornavirus supergroup
4.2 Comparison of the RNA2 components of TomRSV, GFLV, TBRV and GCMV

List of Abbreviations

A            adenosine in the context of nucleotide sequence
A            alanine in the context of amino acid sequence
A260         absorbance at 260 nm
A280         absorbance at 280 nm
AlMV         alfalfa mosaic virus
APT          aminothiophenol
Asn          asparagine
ATP          adenosine-5'-triphosphate
BE           borate EDTA
BMV          brome mosaic virus
BNYVV        beet necrotic yellow vein virus
bp           base pairs
BSA          bovine serum albumin
BRL          Bethesda Research Laboratories
C            cytidine in the context of nucleotide sequence
C            cysteine in the context of amino acid sequence
°C           degrees Celsius
CaMV 35S     cauliflower mosaic virus 35S
cDNA         complementary DNA
Ci           Curie
3Cpro        polio 3C protease
CIP          calf intestinal phosphatase
CI           cylindrical inclusion
CLRV         cherry leafroll virus
como-        comovirus
CPMV         cowpea mosaic virus
C-terminal   carboxy-terminal
Cys          cysteine
dATP         deoxyadenosine triphosphate
dCTP         deoxycytidine triphosphate
ddATP        dideoxyadenosine triphosphate
ddCTP        dideoxycytidine triphosphate
ddGTP        dideoxyguanosine triphosphate
ddTTP        dideoxythymidine triphosphate
dGTP         deoxyguanosine triphosphate
3-D          three dimensional
DI           defective interfering
dITP         deoxyinosine triphosphate
DNA          deoxyribonucleic acid
dTTP         deoxythymidine triphosphate
DTT          dithiothreitol
E            glutamate
E. coli      Escherichia coli
EDTA         ethylenediaminetetraacetic acid
EtBr         ethidium bromide
F            phenylalanine
g            gram
G            guanosine in the context of nucleotide sequence
G            glycine in the context of amino acid sequence
GCMV         grapevine chrome mosaic virus
GFLV         grapevine fanleaf virus
GUS          β-glucuronidase
H            histidine
HSP          heat shock protein
I            isoleucine
K            lysine in the context of amino acid sequence
K            thousand in the context of size
kDa          kilodalton
l            liter
L            leucine
L-agar       Luria agar
LB           Luria broth
µ            micro
m            milli
M            methionine in the context of amino acid sequence
M            molar
min          minute
mol wt.      molecular weight
Mr           relative molecular weight
mRNA         messenger RNA
4-MUG        4-methylumbelliferyl β-D-glucuronic acid
n            nano
NDP          nucleoside diphosphate
nepo-        nepovirus
NOS-ter      nopaline synthase termination sequence
nsP          nonstructural protein
nt           nucleotide
N-terminal   NH2-terminal
NTP          nucleoside triphosphate
oligo        oligonucleotide
ORF          open reading frame
P            proline
PAGE         polyacrylamide gel electrophoresis
PCR          polymerase chain reaction
PEG          polyethylene glycol
Pi           inorganic phosphate
pNPG         p-nitrophenyl β-D-glucuronide
poly(A)      polyadenylate
poty-        potyvirus
PPV          plum pox virus
Q            glutamine
R            arginine
RCMV         red clover mosaic virus
RDRP         RNA-dependent RNA polymerase
RNA          ribonucleic acid
RRV          raspberry ringspot virus
S            serine
SBMV         southern bean mosaic virus
SDS          sodium dodecyl sulfate
Ser          serine
SF           superfamily
SLRV         strawberry latent ringspot virus
ss           single-stranded
SSC          sodium chloride/sodium citrate
T            thymidine
TCA          trichloroacetic acid
TBRV         tomato blackring virus
TEV          tobacco etch virus
TMV          tobacco mosaic virus
TNF          tumor necrosis factor
TomRSV       tomato ringspot virus
Tris         hydroxymethyl aminomethane
tRNA         transfer RNA
TRSV         tobacco ringspot virus
TRV          tobacco rattle virus
TYMV         turnip yellow mosaic virus
U            uridine
UV           ultraviolet
V            valine
VP1          poliovirus coat protein 1
VP2          poliovirus coat protein 2
VP3          poliovirus coat protein 3
VPg          viral genome-linked protein
W            tryptophan
(+)          positive sense
Ω            omega

Acknowledgments

I would first like to express my sincere appreciation to Dr. D'Ann Rochon for her supervision, financial assistance and the opportunity to work in her lab. I would also like to thank Dr. Jack Tremaine for making it possible to begin this study and for critical review of this thesis and other papers. Thanks also to the other members of my committee, Dr. Brian Ellis, Dr. Frank Tufaro, Dr. Joan McPherson and Dr. Caroline Astell, for their helpful advice and suggestions in these studies. I would also like to thank the many people at the Agriculture Canada Vancouver Research Station: both former and present directors, Dr. Marvin Weintraub and Dr. Dean Struble, respectively, for use of facilities and supplies; Dr. Dick Stace-Smith for the TomRSV isolate and assistance in propagation of the virus; Dr. Bob Martin for generously supplying TomRSV virion RNA; and Bill Ronald for computer assistance. Many thanks to Angus Gilchrist and Lawrence Lee for technical assistance in sequencing.

This work was supported in part by NSERC operating grants awarded to J. H. Tremaine and D. M. Rochon.

INTRODUCTION

The modern study of plant viruses began in the late nineteenth century when it was shown that an infectious agent from tobacco could pass through a bacteria-proof filter (Mayer, 1886; Beijerinck, 1898). It was Beijerinck (1898) and Baur (1904) who first used the term "virus" to distinguish diseases produced by filterable agents from those caused by bacteria. Between 1900 and 1935, virus diseases were described on the basis of macroscopic symptoms, cytological anomalies detected by light microscopy, host range and method of transmission. By the mid 1930s, techniques became available for the isolation and chemical characterization of the first plant virus, tobacco mosaic virus (TMV), which was identified as a rod-shaped nucleoprotein (Stanley, 1935). The relative stability and very high concentration of this virus in infected plants were crucial to the success of these early studies. Attention was focused mainly on the protein component of viruses up until the classic experiments of Hershey & Chase (1952), which identified the DNA fraction of a bacterial virus as the infectious component. These experiments were repeated using TMV (Gierer & Schramm, 1956; Fraenkel-Conrat & Williams, 1955; Fraenkel-Conrat, 1956), where it was demonstrated that the naked RNA component was infectious and that the protein served a protective role. Since that time, rapid advances in technology have made it possible to routinely determine nucleic acid sequences and to predict their translation products. This has resulted in an information explosion on the molecular biology of small single-stranded RNA viruses. The compact size of viral RNA genomes makes it practical to determine the entire nucleic acid sequence once the RNA has been reverse transcribed into complementary DNA and cloned.
Open reading frames can then be identified and the nucleotide sequences, as well as the amino acid sequences of the putative translation products, analyzed using a variety of commercially available algorithms. The amino acid sequence of the putative translation products can be further compared to other viral and cellular protein sequences. Sequence similarities to other viral and cellular proteins with known or putative function can suggest a possible function for the newly determined amino acid sequence and provide a focus for further functional studies. Relationships between different viral-encoded amino acid sequences have also been used in the study of viral evolution. In addition, it is now possible to produce infectious synthetic transcripts in vitro from cDNA clones that correspond to the complete viral RNA genome. These techniques, along with a number of other powerful methods to manipulate DNA in vitro, create an enviable situation for molecular virologists to perform controlled experiments in order to determine various aspects of viral function and pathogenesis.

1.1 Single-Stranded Positive-Sense Plant RNA Viruses

Viruses can be broadly classified on the basis of their genomic nucleic acid, which is either DNA or RNA, double-stranded or single-stranded, and of positive or negative polarity. This introduction will focus mainly on positive (+), or messenger sense, single-stranded (ss) RNA viruses that infect plants, with some reference to selected (+)ssRNA viruses that infect animals. Of the many different viruses infecting plants that have been identified, approximately 76% are (+)ssRNA viruses (Zaitlin & Hull, 1987), making them the most successful of the plant viruses.

1.1.1 General Overview

A general outline of the infectious cycle for a typical (+)ssRNA plant virus is given, followed by a brief overview of virus structure and functions and the different stages of the infectious cycle.
The virus particle, in the form of a nucleoprotein complex, enters the cell of the host plant following wounding of the plant at the site of entry. Upon entry into the cell, the viral RNA becomes accessible to host cell ribosomes and is translated. Some of these newly synthesized, virus-specific proteins then function to replicate the viral RNA, with replication and translation then continuing concurrently within the cell, generating multiple copies of the viral RNA and its translation products. Virus particles can then be assembled from the newly synthesized viral structural proteins and (+)ssRNA. The virus can then move to an adjoining cell either in the form of virus particles or as some alternative nucleoprotein complex. Upon entry into the plant vascular system, the virus, in the form of virus particles, is rapidly spread throughout the plant. Virus from infected plant tissue can then be transmitted mechanically from the infected plant to an uninfected host plant via feeding insects, contaminated farm machinery, humans or other animals to continue the cycle.

Virus Architecture. Among plant viruses the mature particle, termed the virion, has one of two common structures, either rod-shaped or spherical; however, some plant viruses are bacilliform or have an outer lipid bilayer envelope (for reviews see Harrison, 1983; Rossmann & Johnson, 1989). Encapsidated within the virus particle is the viral nucleic acid. The shell of the particle, known as the capsid, is composed of many small viral-encoded protein molecules (structural or coat proteins) of usually only one or two different types arranged in a characteristic three dimensional pattern. The coat proteins of rod-shaped viruses are packed into a helical array, surrounding, and tightly associated with, the genomic RNA coiled within. The coat proteins of spherical viruses are arranged with cubic symmetry to form an icosahedron composed of 60 identical subunits.
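The subunit arithmetic for icosahedral capsids can be illustrated with a short sketch. It assumes the Caspar-Klug relation T = h² + hk + k² for the triangulation number T, which is not stated in the text, and it takes the number of coat protein copies in a capsid built from one protein species to be 60T. The function names are illustrative only.

```python
def triangulation_numbers(limit):
    """Return the sorted triangulation numbers T = h*h + h*k + k*k up to limit.

    Assumption: the Caspar-Klug relation for quasi-equivalent icosahedral shells.
    """
    found = set()
    for h in range(limit + 1):
        for k in range(limit + 1):
            t = h * h + h * k + k * k
            if 1 <= t <= limit:
                found.add(t)
    return sorted(found)


def coat_protein_copies(t_number):
    """Copies of a single coat protein species in a capsid of 60 repeating units."""
    return 60 * t_number


if __name__ == "__main__":
    print(triangulation_numbers(7))   # [1, 3, 4, 7]
    print(coat_protein_copies(3))     # 180
```

Under these assumptions, a shell whose 60 repeating units each hold three copies of one coat protein contains 180 copies in total.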
Thus far, the 60 repeating subunits of most spherical capsids associated with plant viruses have been found to consist of three copies of a single coat protein species; however, in the comoviruses there is one copy each of two distinct proteins and in the nepoviruses there is a single copy of one large protein (see Rossmann & Johnson, 1989). The capsid helps to protect the viral RNA from degradation and may also be involved in movement of the virus particle from cell to cell or from host to host (Hull, 1989).

Virion RNA. Encapsidated within the virion particle is the viral RNA genome, which may consist of a single contiguous strand of RNA or may be divided among several RNA molecules. Multi-segmented genomes are often separately encapsidated, requiring entry of all components into the cell for complete infection to occur. At the 5' terminus of the RNA, a variety of structures can be found, including the m7G5'ppp5'Xp3' cap (where X can be any nucleotide A, C, G or U), a di- or triphosphate, or a small covalently linked protein (VPg) (Davies & Hull, 1982). Possible 3' terminal structures include a long polyadenylate tract, a hydroxyl group or a tRNA-like structure that can be specifically aminoacylated (Davies & Hull, 1982). In general, the genomes of RNA viruses are small and must be used very efficiently; therefore the coding and regulatory sequences may overlap. Examples of this are the overlapping open reading frames (ORFs) found among many plant viruses including the furovirus beet necrotic yellow vein virus, the tombusvirus cucumber necrosis, the luteoviruses, and the unclassified maize chlorotic mottle (Rochon & Johnston, 1991 and references therein). The 5' and 3' noncoding regions must also be utilized efficiently.
For example, the 5' terminal noncoding region must provide not only a ribosome binding or entry site and possible translational enhancer sequences, but also cis-acting replication signals that are recognized on the negative strand, as well as encapsidation initiation sites (for review see Matthews, 1991).

Transmission and cell entry. There is no evidence that plant viruses use specific cell receptors to gain entry into the cell, as is the case for animal viruses. Instead, plant virus transmission usually involves some means for introducing the virus through a wound made into a plant cell (for a review on transmission see Mandahar, 1990). Many viruses are (experimentally) mechanically transmitted by the direct rubbing of leaves with a virus inoculum. In nature this mechanism is utilized by a few viruses, including tomato mosaic virus, which can be transmitted by the rubbing of adjacent leaves caused by wind and/or cultural practices (Broadbent, 1976), and Andean potato latent virus, whose normal mode of spread is suggested to be by direct plant contact (Jones & Fribourg, 1977). Many plant viruses are transmitted by a specific vector such as an insect, fungus or nematode which feeds on the infected plant and, as a result, acquires the virus particle, which is then transmitted to another plant following subsequent feeding. Transmission by fungal zoospores can occur in a persistent manner in which the virus is borne internally, as is the case for lettuce big vein virus (Campbell, 1985), or nonpersistently on the surface of the zoospore, as is the case for tobacco necrosis virus (see Mandahar, 1990). Insect vectors include leafhoppers, planthoppers, whiteflies, beetles, mites, mealybugs, thrips and aphids. Of these, aphids are perhaps the most significant (see Harris & Maramorosch, 1977).
Aphid-borne viruses are of two types: noncirculative, which include the carla-, clostero-, cucumo- and potyviruses and alfalfa mosaic virus, and circulative, which include the luteoviruses, pea enation mosaic and carrot mottle virus (see Harris & Maramorosch, 1977). Migratory ectoparasitic nematodes can also transmit viruses (for review see Taylor, 1972). Both nepo- and tobravirus particles are retained in a nonpersistent manner and are adsorbed to various internal parts of the nematode feeding apparatus. In addition, some viruses such as the nepoviruses are also transmitted through seed and pollen (see Mandahar, 1981).

Translation strategies. Upon entry of the virus into the plant cell, the positive sense ssRNA genome is uncoated, recognized by cell ribosomes and translated. For TMV a cotranslational disassembly mechanism has been proposed (Wilson, 1984), which has also been proposed for a number of other (+)ssRNA viruses (for review see Wilson, 1985). Unlike eukaryotic messenger RNAs (mRNAs), viral RNAs are often polycistronic. Since eukaryotic ribosomes efficiently recognize only AUG initiation codons which are 5' proximal (see Kozak, 1989), a number of different translation strategies may be employed in order to express viral genes which are internally located (for reviews on translation strategies see Davies & Hull, 1982; Dougherty & Hiebert, 1985). The following is a brief description of the strategies which are known to exist among plant viruses. (i) Generation of subgenomic mRNAs. 3' co-terminal and colinear subgenomic length mRNAs may be generated during infection. In brome mosaic virus it has been demonstrated that subgenomic mRNA synthesis occurs by internal initiation of transcription on the (-) sense RNA transcript (French & Ahlquist, 1988). (ii) Read-through of "leaky" termination codons. A downstream ORF may be accessed for translation by readthrough of a termination codon, thus producing a larger functional polypeptide. (iii) Proteolytic processing.
A large polyprotein may be translated from a long ORF containing two or more cistrons and then be specifically cleaved to release the mature protein products. (iv) Ribosomal frame-shifting. During translation, ribosomes may bypass a stop codon by shifting reading frames, thereby producing a larger fusion polypeptide containing sequences from an otherwise inaccessible region of the viral genome. (v) Internal initiation. A downstream AUG codon may be accessed by internal ribosome binding to sequences surrounding or preceding an internally located AUG codon. This mechanism has only recently been demonstrated for the plant comovirus, cowpea mosaic virus (Verver et al., 1991). (vi) Leaky ribosomal scanning. Ribosomes may scan past the 5' proximal start site due to an unfavorable context surrounding the AUG codon and instead initiate translation at the next downstream AUG codon. (vii) Segmented genome. Individual mRNAs may be present as separate RNA species which are independently replicated. Many viruses use various combinations of these different translation strategies to express their genomes. It is also likely that these mechanisms exist not only to enable translation of otherwise inaccessible cistrons, but also to regulate the timing and levels of expression of particular viral genes. For example, the existence of a coat protein gene on a subgenomic mRNA ensures that the coat protein is not expressed until sometime after the initial rounds of replication have occurred. Similarly, at a later time during infection, an increase in coat protein subgenomic mRNA synthesis could potentially increase the level of coat protein required to encapsidate a single genomic RNA (e.g.
180 copies of coat protein are required to encapsidate a single genomic RNA for many of the spherical plant viruses).

Viral RNA replication. Replication of (+)ssRNA viruses involves two stages: the synthesis of a negative strand RNA copy using the virus genomic RNA as a template, and the synthesis of progeny virion RNA using the negative strand RNA as a template. The protein composition of the viral replicase and the interaction of the replicase with the viral RNA template are poorly understood and are presently an active area of research (for review see Quadt & Jaspers, 1989). Replication is catalyzed by a viral-encoded RNA-dependent RNA polymerase (RDRP) which, together with other viral and cellular proteins, forms the replicase complex. The replicase complex has been isolated from cucumber mosaic virus infected plants; it consists of two virus-encoded polypeptides (1a and 2a) and one host polypeptide and is capable of catalyzing both stages of replication (Hayes & Buck, 1990). The 2a polypeptide contains sequences characteristic of other known and putative RDRPs (Koonin, 1991), while the 1a polypeptide contains sequences characteristic of nucleic acid helicases (Gorbalenya et al., 1989b), which are thought to unwind the double-stranded RNA intermediate of replication.

Virus movement through the plant. The movement of plant viruses throughout the plant has received a great deal of attention in recent years (for reviews see Atabekov & Taliansky, 1990; Hull, 1989). Two types of movement have been identified: short distance or cell-to-cell movement, which is relatively slow, and long distance movement throughout the plant, which is relatively fast. Long distance movement occurs following entry of the virus into the vascular system of the plant, with values of 8 cm/hour recorded for TMV in tobacco stems (Capoor, 1949). Slow movement is via the parenchyma, which for TMV is estimated at a rate of 8 µm/hour (Uppal, 1934).
Movement of viruses in plants was thought to be genetically passive on the part of the virus; however, a number of specific viral proteins have recently been identified that are involved in viral movement. Two different models have been put forth to describe plant virus cell-to-cell movement. For the tobamovirus TMV, it is thought that cell-to-cell movement occurs as a ribonucleoprotein complex consisting of the 30K protein, genomic RNA and possibly the viral coat protein, through enlarged plasmodesmata (Citovsky et al., 1990; Wolf et al., 1989). For the comovirus CPMV it is thought that virus particles move through virus-specific tubular structures passing from one cell to another, through the cell walls of infected tissue (van Lent et al., 1990; van Lent et al., 1991).

1.1.2 Viral Gene Functions

The function(s) of a number of plant virus gene products have been characterized. Some viral proteins appear to have more than one function and it is likely that, as viral proteins become better characterized, this trend will continue, due in part to the compact nature and limited coding potential of RNA viruses. The functions of common gene products identified to date can be classified into the following groups:

Structural proteins. Structural proteins refer to the capsid proteins that make up the shell of the virion, the matrix proteins of those viruses with an outer lipoprotein membrane, as well as the proteins which make up the outer membrane.
As mentioned previously, a major role for the capsid protein is to protect the viral RNA from ribonuclease attack. Capsid proteins also have a demonstrated role in viral cell-to-cell (van Lent et al., 1991) and long distance movement (see Hull, 1989), in the specificity of interplant transmission via vectors (see Mandahar, 1990) and even a regulatory role in replication (Jaspers, 1985). The 3D structures of the capsids and individual capsid proteins of several icosahedral plant viruses, including southern bean mosaic sobemovirus (Abad-Zapatero et al., 1980), tomato bushy stunt tombusvirus (Harrison et al., 1978), turnip crinkle carmovirus (Hogle et al., 1986) and CPMV comovirus (Stauffacher et al., 1987), have been determined at high resolution by X-ray crystallography. Results from these studies indicate that the capsids and protein subunits of these viruses have a strikingly similar topology based on a 60Tn structure (where T is the triangulation number [1, 3, 4, 7, ..., as required for quasi-equivalence] and n is the number of different coat proteins) and an eight-stranded anti-parallel β-barrel, despite the lack of amino acid sequence similarity between these proteins (Rossmann & Johnson, 1989).

Transmission. As mentioned above, the viral capsid protein is important for the transmission of viruses by their vectors. In addition, the potyviruses encode a specific helper factor that is required for aphid transmission (Pirone, 1981; de Majia et al., 1985).

Proteases. Those viruses that express their genome via the synthesis of a large polyprotein encode the protease(s) involved in the specific cleavage of the polyprotein precursor(s) to release the mature viral protein products (Hellen et al., 1989). Cleavage by viral proteases is highly specific and generally occurs at only a few different dipeptide sequences (for review see Palmenberg, 1990).
For the potyvirus tobacco etch (TEV), a consensus sequence surrounding the cleavage site has been defined for efficient cleavage by the 49K protease (Carrington & Dougherty, 1988).

Movement. The TMV 30K non-structural protein has been demonstrated to potentiate the movement of TMV by binding the viral RNA and thereby enabling its transport through narrow plasmodesmata channels (Citovsky et al., 1990) and/or by modifying plasmodesmata to enable the transport of viral ribonucleoprotein (Wolf et al., 1989). A 58/48K protein encoded by the comovirus cowpea mosaic virus (CPMV) has been demonstrated to facilitate cell-to-cell movement, perhaps by forming tubular structures which penetrate the cell wall and act as channels for the movement of CPMV particles (van Lent et al., 1990). As mentioned above, most viral capsid proteins have been demonstrated to be important in systemic virus movement through the plant vascular system and, in some cases, in efficient cell-to-cell movement.

RNA-dependent RNA polymerase. All plant viruses sequenced to date have been found to encode the RNA-dependent RNA polymerase (RDRP) component of the viral replicase (Koonin, 1991). As mentioned in the preceding section, the RDRP is the catalytic component of the replication complex involved in synthesis of the new RNA molecule. The BMV RDRP has also been shown to be involved in the synthesis of subgenomic RNAs (Miller et al., 1985).

Helicase. Helicase proteins are essential for nucleic acid replication and are another component of the replicase complex (see preceding section). They are presumed to unwind the double-stranded RNA replicative intermediate during RNA synthesis (Gorbalenya et al., 1989b). These proteins were initially identified on the basis of a conserved nucleotide binding motif present in known cellular helicases (Gorbalenya et al., 1985).
Direct experimental evidence for helicase activity among RNA viruses, however, has only been demonstrated for the plum pox potyvirus CI protein (Lain et al., 1990).

VPg. Some viruses have a protein, termed a VPg (viral protein genome linked), covalently linked to the 5' end of the viral RNA. In poliovirus it is thought that the VPg can act as a primer during positive strand RNA synthesis, in the form of VPg-pUpUppp (Crawford & Baltimore, 1983), which can anneal to the 3' terminus of the negative strand. Synthetic poliovirus VPg has also been shown to autocatalytically cleave and become covalently linked to the 5' end of negative sense poliovirus RNA from a replication intermediate (Tobin et al., 1989). In poliovirus, the VPg is removed before translation (Fernandez-Munoz & Darnell, 1976).

1.1.3 Viral RNA Regulatory and Recognition Sequences

Regulatory and recognition sequences in viral RNA genomes are located in the 5' and 3' noncoding regions, as well as between coding sequences and even within coding sequences.

5' terminal structures. As stated in section 1.1.1, many plant virus genomes contain a 5' cap which is similar to that present on almost all cellular eukaryotic mRNAs. Unlike the cap structure found on cellular eukaryotic mRNAs, plant virus RNA genomes contain cap structures in which the first two nucleotides are not methylated (see Matthews, 1991). The function of the cap can be related to the infectivity of the virus, the stability of the viral RNA and the efficiency with which the viral RNA is translated. The cap structure has been associated with increased infectivity in studies using both capped and uncapped synthetic RNA transcripts derived from full-length cDNA clones corresponding to a number of viral RNA genomes.
Uncapped transcripts of TMV and brome mosaic virus (BMV) (Meshi et al., 1986; Janda et al., 1987) were found to be much less infectious than capped transcripts, while transcripts corresponding to RNA4 of alfalfa mosaic virus (AlMV) were completely noninfectious unless capped (Loesch-Fries et al., 1985). It is possible that the cap structure protects the 5' termini of the RNA from exonuclease degradation. This may explain why capped synthetic transcripts of CPMV are more infectious than uncapped transcripts, even though CPMV RNA does not normally have a 5' cap but is linked to a VPg (Vos et al., 1988a). The efficiency with which viral RNA is translated may also be dependent on sequences adjacent to the cap structure. The nucleotides adjacent to the cap structure can influence the binding affinity of cap recognition proteins and of the translation initiation complex to mRNA (Godefroy-Colburn et al., 1985). As mentioned in section 1.1.2, many RNA genomes which are not capped contain a covalently linked protein at the 5' end (VPg). The VPg is thought to act as a primer during RNA synthesis and may, in addition, affect translational efficiency and infectivity. Removal of the VPg by protease digestion decreases the efficiency of southern bean mosaic virus (SBMV) RNA translation in vitro (van Vloten-Doting & Neeleman, 1982). The VPg may also be essential for the infectivity of some viruses, such as nepoviruses, but is not essential for the infectivity of others, as is the case for members of the comovirus group (Mayo et al., 1982).

3' terminal structures. The 3' poly(A) sequence in plant viruses is either encoded by a poly(U) sequence in the negative strand, as for CPMV (Lomonossoff et al., 1985), or is added post-transcriptionally, as is the case for beet necrotic yellow vein virus (BNYVV) (Jupin et al., 1990). Addition of a 3' poly(A) sequence to a number of viral RNAs has been shown to increase their translational stability in vivo (Huez et al., 1983).
The stability of 3' poly(A) tailed mRNA may be mediated by a poly(A) binding protein (Bernstein et al., 1989). A number of plant virus RNAs terminate in tRNA-like endings which are not polyadenylated. It was first shown for turnip yellow mosaic virus (TYMV) that the tRNA-like ending could be aminoacylated (Pinck et al., 1970), and this has since been demonstrated for a number of other viruses. The tRNA-like structure may functionally substitute for a poly(A) tail, as has been shown for TMV (Gallie et al., 1991), and enhance the expression and stability of the RNA. The tRNA-like ending also appears to play a role in the initiation of negative strand synthesis (see below). Furthermore, it was suggested by Rao et al. (1989) that the tRNA-like ending could function as a telomere during the replication of BMV RNA. They showed that changes in the CCA-OH ending could be repaired in vivo. Previously, it was shown that the TYMV 3' tRNA ending was recognized by tRNA nucleotidyltransferase (Litvak et al., 1970).

Sequences involved in translation. Some of the factors that affect the translation of viral RNA have already been discussed and include the 5' cap structure, 3' poly(A) and 3' tRNA-like structure (see above). There are, however, other sequences which can also affect translation. Important determinants for translational efficiency are the nucleotides surrounding the AUG initiation site (Kozak, 1984; Lütcke et al., 1987). Translation enhancer sequences have been identified in the 5' noncoding regions of several plant RNA viruses. A well known example is the TMV omega factor (Ω factor) (Gallie et al., 1987). The Ω factor is highly organized, consisting of multiple repeated elements (Gallie et al., 1988). It has been suggested that translation enhancement produced by the TMV and tobacco etch virus (TEV) 5' sequences is due to the high AU content of these sequences and the resulting reduction in secondary structure, which may facilitate ribosome binding (Sleat et al., 1988; Carrington & Freed, 1990).
As discussed above, viruses can express internally located ORFs by read-through of leaky termination codons. Sequences surrounding the UAG codon of TMV and other viruses that utilize a read-through mechanism are similar to each other and may affect the degree of read-through (Bouzouba et al., 1987; Miller et al., 1988). Translational frame-shifting in the luteovirus potato leafroll is also regulated by sequences surrounding the frameshift site (Prufer et al., 1992).

Sequences involved in RNA synthesis. Recognition sequences (promoters) required for viral RNA replication are most likely located at the 3' ends of both the positive and negative RNA strands, while promoters required for subgenomic RNA synthesis are located internally in the negative strand (Miller et al., 1985). As mentioned above, the 3' tRNA-like endings of BMV and TMV contain promoter signals which are necessary for negative strand RNA synthesis (Ahlquist et al., 1984; Ishikawa et al., 1988). Viruses with multipartite genomes have a high degree of nucleotide sequence similarity between their 5' termini and between their 3' termini, suggesting the existence of conserved replication promoter sequences at these sites. However, the very low sequence complementarity between the 5' and 3' termini of the same viral RNA segment suggests differences in the requirements of the replicase complex for (+) and (-) sense RNA synthesis. Differences between the two promoters may have an effect on the amounts of (+) and (-) strand RNA synthesized during infection, e.g., the (+) sense promoter may be much stronger than the (-) sense promoter. In protoplasts infected with BMV, the ratio of (+) to (-) strands is 100:1 (Marsh et al., 1990). The possibility that a host factor may be involved in promoter recognition was suggested after it was discovered that the 5' terminal region of BMV RNA resembles the internal control regions of tRNA genes (Marsh & Hall, 1987).
The internal subgenomic promoter of BMV RNA3, which gives rise to RNA4 by internal initiation on the negative strand, has been extensively studied (Miller et al., 1985; Marsh et al., 1988; French & Ahlquist, 1988). The internal promoter is completely different from the 3' promoter, which also suggests that additional host factors may be involved in subgenomic RNA synthesis.

Virus assembly. Packaging of viral RNA into virion particles may be initiated by specific interactions between sequences present on the RNA molecule and the viral coat protein during particle assembly. In TMV, packaging of viral RNA is initiated by sequences located within an ORF ca. 1 kb from the 3' end (Zimmern, 1977; Jonard et al., 1977), whereas for the potexviruses the origin of assembly is located within the 5' noncoding region (AbouHaidar & Bancroft, 1978).

1.1.4 Classification

Viruses have traditionally been named on the basis of the host plant from which they were first isolated and the symptoms they produced. It was not until 1966, however, that a standard nomenclature was agreed upon by the International Committee on Taxonomy of Viruses (reviewed in Matthews, 1985a,b). According to these agreements, (+)ssRNA plant viruses were initially characterized according to the presence or absence of an outer membrane envelope, the number of genomic segments and particle morphology. Some 32 different groups or families have been identified based on additional criteria, including the structure and organization of the genomic RNA, the coat protein structure, physicochemical properties of the virus particle, serological relationships, cytopathology and method of transmission. With the rapid accumulation of genomic sequence data from many viruses belonging to the different groups, it has become apparent that many of these viruses are evolutionarily related (for reviews see: Goldbach, 1987; Zimmern, 1986; Strauss & Strauss, 1988; Habili & Symons, 1989).
Surprisingly, sequence comparisons grouped together icosahedral and rod-shaped viruses, viruses with monopartite and multipartite genomes, and even viruses that infect plants with those that infect animals. Based on amino acid sequence comparisons, Goldbach (1987) defined two supergroups, which he called the picornavirus-like and Sindbisvirus [Alphavirus]-like supergroups.

Picornavirus-like supergroup. Members of the picornavirus-like supergroup include the animal picornaviruses and the plant como-, poty- and nepoviruses. These viruses differ structurally, with the picorna-, como- and nepovirus particles having icosahedral symmetry while potyvirus particles are rod-shaped. In addition, the picorna- and potyvirus genomes are monopartite while the como- and nepovirus genomes are bipartite. However, all members of the supergroup have in common a 5' terminal VPg and a 3' poly(A) tail, and express their genomes via the synthesis of a large polyprotein which is then cleaved at specific sites by a protease. Further, the genomic organization of these viruses is very similar and includes a set of genes which are present in a similar order and which code for nonstructural proteins with significant amino acid sequence similarity. The genomic organization and regions of amino acid sequence similarity of the picornavirus polio, comovirus CPMV, nepovirus tomato blackring (TBRV) and potyvirus tobacco etch virus (TEV) are shown in Fig. 1.1. The set of four peptides at the carboxy terminus of the poliovirus RNA and CPMV B RNA-encoded polyproteins appears to be part of a membrane-bound replication complex (Takegami et al., 1983; Goldbach & van Kammen, 1985). The corresponding regions in the nepo- and potyviruses probably have similar functions.
Three of these peptides contain sequences that match those of well characterized domains which have the following (putative) functions: a nucleotide binding site (Gorbalenya et al., 1985), a viral protease (Bazan & Fletterick, 1988), and an RNA-dependent RNA polymerase (Kramer & Argos, 1984). A more complete description can be found in section 4.1.1 of the discussion.

Fig. 1.1 A comparison of the genomic organization of representative members of the nepo-, como-, picorna- and potyvirus groups. The single long ORF on each RNA is represented by an open rectangle. Similar shading indicates amino acid sequence similarity between viruses. Abbreviations: TBRV, tomato blackring virus; CPMV, cowpea mosaic virus; TEV, tobacco etch virus; CP, coat protein; trans, transport protein; NTP, NTP-binding domain; pro, protease; pol, RNA-dependent RNA polymerase; HC, helper component; AAA, polyadenylate. Shaded symbols in the figure mark the protease co-factor, the NTP-binding domain, the protease domain and the RNA-dependent RNA polymerase.

Sindbisvirus-like supergroup. This supergroup includes the animal Sindbis virus, and the plant tobra-, alfalfa mosaic, cucumo-, bromo-, tobamo- and, initially, the carmoviruses (Goldbach, 1987). The carmoviruses have subsequently been included in a separate supergroup (see below). The viruses of the Sindbisvirus-like supergroup are more diverse than those of the picornavirus-like supergroup. The tobamoviruses are rod-shaped while the other members have icosahedral particles. Sindbisvirus and the tobamoviruses are monopartite, the tobraviruses are bipartite and the bromo-, cucumo- and alfalfa mosaic viruses are tripartite.
Alphaviruses, tobamoviruses and tobraviruses utilize read-through of leaky termination codons to express parts of their genome. Sindbis virus also utilizes a polyprotein strategy, and all members of the supergroup produce subgenomic mRNAs. All Sindbis virus-like RNAs are capped at their 5' ends. The genomic organization and regions of amino acid sequence similarity of the bromovirus BMV, tobamovirus TMV, tobravirus tobacco rattle virus (TRV), Sindbisvirus and alfalfa mosaic virus (AlMV) are shown in Fig. 1.2. All of these viruses code for nonstructural proteins with conserved amino acid sequences characteristic of a nucleotide binding domain and an RNA-dependent RNA polymerase, in common with the picornavirus-like supergroup. However, the overall sequence similarity between the picorna- and Sindbis virus-like proteins containing the conserved domains is very low. With an increase in the number of viral sequences available for comparison, additional members belonging to each supergroup, as well as at least one other distinct supergroup, have been identified (Habili & Symons, 1991; Koonin et al., 1991). These comparisons were based solely on the conserved amino acid sequences present in both known and putative RNA-dependent RNA polymerases and helicase proteins (Habili & Symons, 1989) or on the RNA-dependent RNA polymerase domain alone (Koonin et al., 1991). Tentative members of this third supergroup include the plant carmoviruses, tombusviruses, necroviruses, maize chlorotic mottle virus, dianthoviruses and the animal virus hepatitis C.

Fig. 1.2 A comparison of the genomic organization of representative members of the alpha-, tobamo-, tobra- and bromovirus groups and alfalfa mosaic virus. ORFs are represented by open boxes. Similar shading indicates amino acid sequence similarity between viruses. Abbreviations: AlMV, alfalfa mosaic virus; BMV, brome mosaic virus; TMV, tobacco mosaic virus; TRV, tobacco rattle virus; CP, coat protein; trans, cell-to-cell movement function; pol, RNA-dependent RNA polymerase domain; NTP, nucleotide binding domain.

1.1.5 Evolution of (+)ssRNA Viruses

Similarities in amino acid sequence observed between different RNA viruses may be a result of either convergent or divergent evolutionary processes (for reviews see: Goldbach, 1986; Strauss & Strauss, 1988; Zimmern, 1986). It is possible that proteins with functions critical for virus replication could have arisen independently several times but that, because of the nature of their role, only a very limited number of amino acid sequences are possible. If the observed similarities are solely the result of convergent evolution, it is unclear why these conserved genes are found encoded in the same order along the genome or why other similarities in genomic structure and translation strategies are also present among different members of each supergroup. It would seem likely that similarities in amino acid sequence, genomic organization and structure within each supergroup are the result of common ancestry. The differences between viruses within each supergroup could be due to further evolutionary processes (see below). Whether the replicative proteins of (+)ssRNA viruses from the different supergroups diverged from a single progenitor virus, or arose independently from one another, is uncertain.

Origin of viral genes. Although it is clear that viruses within each supergroup are related to one another on the basis of conserved amino acid sequences, the origin of these sequences is speculative.
Each of the three supergroups identified by Habili & Symons (1991) retains a putative helicase and polymerase which are similar within each group. Among prokaryotic and eukaryotic organisms, nucleic acid helicases have been grouped into two large superfamilies, SF1 and SF2 (Gorbalenya & Koonin, 1989a). The putative helicases of the Sindbisvirus-like supergroup appear similar to SF1 helicases, while those of the picornavirus-like supergroup resemble the SF2 type helicases. The third supergroup identified by Habili & Symons (1991) has characteristics of both, but its members are also missing many of the conserved motifs typical of a helicase. Although the polymerase motifs of all three supergroups are distinct (Habili & Symons, 1991; Koonin et al., 1991), they retain a minimal amount of identity with each other, as well as with the putative RNA polymerases of double stranded RNA viruses (Breunn, 1991; Koonin et al., 1989), negative strand RNA viruses (Poch et al., 1989; Delarue et al., 1990), reverse transcriptases (Kramer & Argos, 1984; Poch et al., 1989; Delarue et al., 1990; Xiong & Eickbrush, 1990) and viral and cellular DNA-dependent DNA polymerases (Argos, 1988; Delarue et al., 1990). Icosahedral viruses from all three supergroups have structural proteins with a similar three dimensional folding pattern and orientation within the virion shell, despite the lack of amino acid sequence conservation between the different viral structural proteins (for review see Rossmann & Johnson, 1989). It has also been shown that a mammalian cellular protein, tumor necrosis factor, has a three dimensional structure very similar to that of the icosahedral virus coat proteins (Jones et al., 1989). Known and putative cysteine proteases of the picornavirus-like supergroup and SBMV are similar to the trypsin subclass of cellular serine proteases and distinct from other cellular serine and cysteine proteases (Neurath, 1984; Bazan & Fletterick, 1988; Bazan & Fletterick, 1989).
It is generally accepted that the cellular serine and cysteine proteases have evolved independently (Neurath, 1984). The preceding examples demonstrate possible evolutionary links between viral and cellular proteins which have a similar known or putative function. In particular, the putative viral helicases and proteases show similarities to distinct subgroups of cellular helicases and proteases but not to others. These observations suggest that at least some viral genes may have a cellular origin.

RNA recombination. Despite the conservation of amino acid sequences between many different viruses, most virus groups can still be differentiated on the basis of auxiliary genes which appear to be unique to one or more groups. The presence of additional genes and the shuffling of gene order may be the result of RNA recombination. Recombination involves the exchange of genetic information between RNA molecules and appears to have played an important role during viral RNA evolution (for review see Lai, 1992; Strauss & Strauss, 1988; Zimmern, 1986). It is now widely believed that viral RNA genomes are constructed by modular evolution, with individual genes or groups of genes mixed and matched by the process of RNA recombination. For example, this would explain how a highly conserved set of genes can be linked to two very different structural genes, giving rise to viruses with spherical particles (e.g., the picornaviruses, comoviruses and nepoviruses) and to others with rod-shaped particles (e.g., the potyviruses) (Goldbach, 1987). RNA recombination can be classified into two types (King et al., 1987; King, 1988). Homologous recombination involves two identical or very similar RNA molecules with extensive sequence identity in the region of crossover, such that the recombinant RNA molecule retains the same sequence and structural organization as the parental RNA molecules.
Nonhomologous recombination, which is more common, can occur between any two RNA sequences and does not require identity of sequence at the crossover site. It is generally assumed that recombinant RNA molecules are produced by a copy-choice mechanism in which the viral replicase complex dissociates from the template RNA molecule and then reinitiates on a second RNA molecule, using the partially synthesized RNA strand as a primer (King et al., 1987; King, 1988). In the case of homologous recombination, extensive complementarity between the template and the primer RNA is required. Such complementarity is not necessary for nonhomologous recombination. This is the essential difference between the two types of RNA recombination and is most likely a property of the RNA polymerase, since viruses that undergo nonhomologous RNA recombination rarely, if ever, undergo homologous recombination, and vice versa (King, 1988). It is speculated that nonhomologous RNA recombination is the mechanism for the modular evolution of viral genes from ancestral viruses and possibly also the mechanism for the incorporation of cellular genes into viral genomes (King, 1988).

Other evidence suggests that the ancestors of present day RNA viruses may predate the cellular world and even predate DNA or proteins. A revitalized theory among evolutionary biologists postulates that life may have arisen from an RNA world (Cech & Bass, 1986; Joyce, 1989; Benner et al., 1989). The impetus for this theory stems from the remarkable finding that RNA can have enzymatic properties in addition to its coding and replicative functions. Indeed, many of the findings in this area of RNA catalysis have been made from studies of a small satellite RNA of the plant nepovirus tobacco ringspot (Prody et al., 1986; Buzayan et al., 1986). This theory circumvents the problem of which came first, DNA or protein; information or function. The answer appears to be RNA.
This idea is further strengthened by the fundamental role of RNA in such basic processes as translation (Noller, 1984). It has also been speculated that the tRNA-like structures found at the ends of many RNA viruses (and which can be specifically aminoacylated) are replication signals, fossil relics from an ancient RNA world (Weiner & Maizels, 1987).

1.2 Nepoviruses

The term nepovirus is derived from nematode transmitted polyhedral virus. As the name implies, definitive members of this group were originally classified on the basis of their small spherical particles and the fact that they could be vectored through the soil by nematodes. There are 18 definitive members, of which tobacco ringspot virus (TRSV) is the type member (Harrison & Murant, 1977). TRSV was the best characterized of these viruses in many initial studies, but more recent molecular studies have concentrated on tomato blackring (TBRV), grapevine chrome mosaic (GCMV) and grapevine fanleaf virus (GFLV).

Virus particles. Stace-Smith et al. (1965) discovered the multicomponent nature of the nepovirus group with the purification of three bands in sucrose density gradients from TRSV infected tissue. These were labelled Top (T), Middle (M) and Bottom (B) components. The three components were serologically identical and consisted of small spherical particles ca. 28 nm in diameter. The T particles were empty shells, whereas the B particles contained 42% RNA and were infectious. It was later shown that M particles contained a single RNA species (RNA2) which was noninfectious, while B particles contained two RNAs (RNA1 and RNA2), the larger of which (RNA1) was infectious while the other [which was the same size as the RNA obtained from the M particles (RNA2)] was not (Diener & Schneider, 1966).
It was later postulated by Murant et al. (1972) that for the nepovirus raspberry ringspot (RRV), the M particles contained one molecule of RNA2 and the B particles contained either one molecule of RNA1 or two molecules of RNA2. The structure of the RRV particle was analyzed by Mayo et al. (1971) and was proposed, on the basis of electron microscope studies and polyacrylamide gel electrophoresis, to have an icosahedral structure constructed from multiple copies of a single protein subunit of Mr ca. 54,000. These authors also calculated that each virion contained 57-69 protein subunits by comparing the total weight of the empty shell to that of the protein subunit, factoring in the partial specific volume of the virion shell. This is in agreement with the simple icosahedral T=1 structure proposed for nepoviruses (Rossmann & Johnson, 1989).

Genomic RNA. The nepovirus bipartite, positive sense, single-stranded RNA genome consists of RNA1, with Mr ca. 2.8 x 10^6, and RNA2, which can vary in size from Mr ca. 1.4 x 10^6 to 2.4 x 10^6 (Harrison & Murant, 1977; Murant et al., 1981). Martelli (1975) proposed that the nepoviruses could be divided into 3 subgroups based on the size of their RNA2 components. The first group has small RNA2 molecules with an Mr of 1.4-1.5 x 10^6, two of which are encapsidated into each virion particle. The second group has RNA2 molecules of Mr 1.5-1.6 x 10^6, and the third group has RNA2 molecules of Mr greater than 1.6 x 10^6. Component particles from the second and third groups contain only a single molecule of RNA2. Both RNA components of RRV and TBRV are required for the production of a complete infection in plants (Harrison et al., 1972; Randeles et al., 1977); however, RNA1 can replicate independently of RNA2 in protoplasts (Robinson et al., 1980).
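The two numerical rules used above, the 60Tn subunit count for icosahedral capsids and Martelli's division of the nepoviruses by RNA2 size, can be written out as a short sketch. This is illustrative only: the function names are hypothetical, and the assignment of values falling exactly on the 1.5 x 10^6 and 1.6 x 10^6 boundaries is an assumption, since the ranges as quoted share their end points.

```python
def capsid_subunits(t_number: int, n_coat_proteins: int = 1) -> int:
    """Number of protein subunits in a capsid following the 60Tn relation.

    t_number is the triangulation number T and n_coat_proteins is the
    number of different coat proteins n.
    """
    return 60 * t_number * n_coat_proteins


def martelli_subgroup(rna2_mr: float) -> int:
    """Return the nepovirus subgroup (1, 2 or 3) for a given RNA2 Mr.

    Assumption: boundary values (exactly 1.5e6 or 1.6e6) are assigned to
    the lower subgroup; the quoted ranges do not settle this. No lower
    bound is enforced because the reported sizes are approximate.
    """
    if rna2_mr <= 1.5e6:
        return 1
    if rna2_mr <= 1.6e6:
        return 2
    return 3


if __name__ == "__main__":
    # A T=1 shell built from one coat protein has 60 subunits, which lies
    # inside the 57-69 range estimated for RRV.
    t1_count = capsid_subunits(1)
    print(t1_count, 57 <= t1_count <= 69)

    # Hypothetical RNA2 sizes chosen to fall inside each quoted range.
    for mr in (1.45e6, 1.55e6, 2.4e6):
        print(mr, martelli_subgroup(mr))
```

The same relation with T=3 and a single coat protein gives the 180 copies of coat protein mentioned earlier for many spherical plant viruses.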
Based on the study of pseudo-recombinants of RNA1 and RNA2 derived from different strains of RRV and TBRV, it has been inferred that RNA1 contains the determinants for host range, seed transmission and symptomatology, while RNA2 contains determinants for serological reactivity, nematode transmission and some symptoms as well. Both RNAs contribute to virulence (Harrison et al., 1974; Hanada & Harrison, 1977; Harrison & Murant, 1977b). Viral RNAs of TRSV and TBRV were shown to require a protease sensitive structure for infectivity (Harrison & Barker, 1978). It was later shown from radioiodination studies that the RNAs of the nepoviruses TRSV, TBRV, RRV and strawberry latent ringspot (SLRV) are linked to a small protein (VPg) at their 5' terminus (Mayo, 1982). The presence of a 3' polyadenylate sequence was detected in TRSV, TBRV, RRV, SLRV and tomato ringspot (TomRSV) RNA by the binding of genomic RNA to oligo(dT) cellulose (Mayo et al., 1979). Sequence analysis of the RNA1 and RNA2 components of TBRV (Meyer et al., 1986; Greif et al., 1988), GCMV (Le Gall et al., 1989; Brault et al., 1989) and GFLV (Ritzenthaler et al., 1991; Sergini et al., 1990) indicated that each RNA contains a single long ORF resulting in the translation of a large polyprotein. In vitro and in vivo translation studies on these viruses indicated that the large polyproteins are post-translationally cleaved into smaller distinct polypeptides (Demangeat et al., 1990, 1991, in press). The genomic organization of TBRV is shown in Fig. 1.1. RNA2 codes for the coat protein and putative movement protein (Meyer et al., 1986), while RNA1 codes for the putative helicase, protease, RNA-dependent RNA polymerase and the VPg (Greif et al., 1988).

Replication. The replication of nepoviruses is poorly understood. In TRSV infected cucumber tissue, virus titer increases rapidly during the first 3 days following inoculation and is at a maximum after 5 days (Rezain & Francki, 1973).
Before and during the period of rapid viral RNA synthesis, an increase in virus-specific RNA-dependent RNA polymerase activity and the presence of dsRNA are observed (Rezain & Francki, 1973; Rezain & Francki, 1974). The dsRNA isolated from infected tissue hybridized to all regions of the viral RNA and was presumed to be a replicative intermediate; however, it was heterogeneous and of lower molecular weight than expected. High molecular weight dsRNA was eventually extracted from bean infected with TRSV (Schneider et al., 1974). Assays for RNA-dependent RNA polymerase activity and TRSV-specific dsRNA from cell fractions indicated that replication of TRSV in cucumber occurs in the cytoplasm (Rezain et al., 1976). Incorporation of [14C]leucine into viral coat protein was inhibited by cycloheximide but not chloramphenicol, indicating that translation occurs preferentially on cytoplasmic 80S ribosomes as opposed to 70S chloroplast ribosomes (Rezain et al., 1976). A common cellular abnormality detected by electron microscopy studies of RRV, SLRV and CLRV infected leaf tissue is the presence of inclusion bodies, which are usually found next to the nucleus. These inclusion bodies contain ribosomes, endoplasmic reticulum and many small membrane-bound sacs (Harrison et al., 1974; Roberts & Harrison, 1970; Jones et al., 1972).

1.3 Tomato ringspot virus

Tomato ringspot virus (TomRSV) can be found naturally in North America along the Pacific seaboard from British Columbia to California and in pockets around the Great Lakes region, but has also been disseminated worldwide through infected plant material (for a general review of TomRSV see: Stace-Smith, 1984). Experimentally, TomRSV will infect a wide variety of plant species including more than 35 dicotyledonous and monocotyledonous plant families. TomRSV naturally infects certain ornamentals, woody and semi-woody plants and many weeds.
The most serious disease problems in eastern North America are prunus stem pitting and apple union necrosis and decline. In the western region, peach yellow bud mosaic and red raspberry ringspot are prominent. Widespread infection of perennial plant species and severe decline in productivity of these plants make TomRSV one of the more serious viral disease problems of plants in North America. Chronically infected plants often do not display symptoms, but productivity is reduced. Distinctive shock symptoms, however, are common on the initial infection.

The type strain of TomRSV was originally isolated by Price (1936) from greenhouse tobacco seedlings in the eastern US and named tobacco ringspot virus No. 2. It has subsequently been renamed tomato ringspot. The name "tomato ringspot" was also applied to a virus isolated from tomatoes (Samson & Imle, 1942) which has since been lost. Other strains have been isolated and include the peach yellow bud mosaic strain and the grape yellow vein strain (see Stace-Smith, 1984).

TomRSV can be transmitted by free-living ectoparasitic nematodes that inhabit the soil and feed on roots (Taylor & Robertson, 1975). Initially, Xiphinema americanum was identified as the nematode involved; however, it is now thought that X. americanum is a complex of many species and reclassification of the different Xiphinema spp. is still in process. The spread of TomRSV by nematodes is slow, ca. 2 meters/year in raspberry plantings (Converse & Stace-Smith, 1971). Spread can also occur through seed from infected soybeans, strawberries, raspberries, pelargonium and dandelion as well as through pollen in pelargonium (see Stace-Smith, 1984). TomRSV infected dandelion and other perennial weeds can act as a reservoir for the virus, which is then widely disseminated by infected wind-blown seed.
The subsequent transmission of TomRSV from infected weeds to commercial crops is by nematodes (Stace-Smith, 1984).

TomRSV is found distributed throughout the infected plant but in the cell is restricted to the cytoplasm. Particles that look very much like virus particles have also been observed in membranous tubular structures passing between cell walls that are possibly associated with plasmodesmata (De Zoeten & Gaard, 1969).

TomRSV is a definitive member of the nepovirus group (Stace-Smith, 1984). The TomRSV virion is a small isometric particle, 28 nm in diameter, with an outer shell composed of multiple copies of a single coat protein species with a Mr of 58,000 (see Stace-Smith, 1984; Allen & Dias, 1977). As is typical of other nepoviruses, three components can be purified by sucrose density gradients from TomRSV infected tissue, which for TomRSV have the following sedimentation coefficients (s20,w): 53S (T), 119S (M) and 127S (B) (Schneider et al., 1974; Allen & Dias, 1977). Buoyant densities in CsCl at 25 °C are 1.459 (B) and 1.51 (M) g/cm^3 (Schneider et al., 1974). The T component consists of assembled protein subunits (no RNA) and is noninfectious. The M and B components are difficult to separate; however, infectivity is enhanced when partially purified bands are mixed together (Schneider et al., 1974; Allen & Dias, 1977), suggesting that both the M and B components are required for infectivity. The M and B particles are ca. 41 and 41% RNA, respectively (Allen & Dias, 1977). The viral genome is composed of two distinct RNA species, RNA1 and RNA2, with Mr ca. 2.8 x 10^6 and 2.4 x 10^6, respectively (Murant et al., 1981). Treatment of TomRSV genomic RNA with proteinase K, as well as radioiodination studies, indicates that TomRSV RNA is covalently linked to a small VPg protein of ca. Mr 4000 that is essential for infectivity (Mayo et al., 1982). TomRSV RNA is also polyadenylated, as determined by binding to an oligo(dT) cellulose column (Mayo et al., 1979).

1.4 Research Objectives

A greater understanding of all aspects of disease caused by viral infection is of fundamental importance towards the development of effective new strategies of virus control. A central focus in plant virology is to understand the way in which the biological properties of a virus are determined by the virus genome. The realization of these goals can be approached in a logical and systematic manner with the aid of modern molecular biological techniques. As a prerequisite for further studies of this type, it is first necessary to obtain the complete nucleic acid sequence of the viral genome. Once this has been determined, it is possible to precisely manipulate specific regions of the viral genome, introducing a genetic element to the study, and assay the effects both in vitro and in vivo. A clear understanding of the molecular processes of viral infection will ultimately lead to novel methods designed for the control of viral disease. Basic information acquired from the molecular analysis of plant viral genomes has already been successfully applied in such cases as coat protein mediated resistance, which involves the insertion and expression of the viral coat protein gene in economically important crop plants (Lawson et al., 1990). Plants that express antisense RNA to portions of the viral genome have also been shown to be resistant to infection by the homologous virus (Cuozzo et al., 1988).

This thesis represents a study of the basic molecular properties of the two genomic components of tomato ringspot virus (TomRSV), which constitutes a basis for future studies concerning the molecular function, pathogenesis and ultimately the control of TomRSV infection. The nucleotide sequence of TomRSV RNA1 and RNA2 was determined and the genomic organization predicted by computer analysis. Preliminary experiments on the expression of TomRSV RNA in vitro and in protoplasts were undertaken to determine the site of translation initiation.
Finally, attempts were made to produce synthetic infectious transcripts corresponding to the two nucleic acid components of TomRSV (RNA1 and RNA2) for use in future experiments to determine the function of specific regions of the TomRSV RNA genome.

MATERIALS AND METHODS

There are a number of routine procedures used in molecular biology that form one or more steps in the methods described below. Details of these procedures can be found in the laboratory manual "Molecular Cloning" by Sambrook et al. (1989), and are not discussed here. These procedures include organic extraction of nucleic acids, ethanol precipitations, large and small scale plasmid DNA isolation, spin columns, TCA precipitations, agarose gel electrophoresis, ethidium bromide staining, visualization and photography of such gels, restriction enzyme digestion, subcloning, ligations, transformations and growth of bacterial cultures. Where appropriate, Sambrook et al. (1989) will be referenced throughout the Materials and Methods section of this thesis. The preparation and handling of the various solutions and reagents used for the detailed methods described below are also discussed in this manual.

2.1 TomRSV Propagation and Purification

TomRSV, obtained from Dr. R. Stace-Smith of the Agriculture Canada Research Station, Vancouver B.C., was originally isolated from infected raspberry grown in the British Columbia Lower Mainland area. The virus was propagated in Nicotiana clevelandii, which is a systemic host for TomRSV. Infected leaf tissue was ground in autoclaved 10 mM sodium phosphate buffer (pH 7.2) in a sterilized mortar and pestle, and rub-inoculated onto leaves of ca. 4 week old plants which had been dusted with fine carborundum (Alundum; Norton, Co.). Infected tissue was harvested 9-12 days post inoculation.

TomRSV was purified from infected tissue by a modified procedure obtained from Dr. R. Stace-Smith of the Vancouver Agriculture Canada Research Station (personal communication).
Infected leaf tissue was homogenized in a Waring blender for 2-3 minutes in 2 volumes of buffer (0.05 M Na2HPO4, 0.02 M ascorbic acid, 0.02 M β-mercaptoethanol, pH 8.0) per gram of tissue in a 4 °C cold room. The homogenate was expressed through cheesecloth and centrifuged (8,000 rpm in a Sorvall GSA rotor for 20 min; 4 °C) to remove the large particulate matter. The pellet was discarded and the supernatant stirred at 4 °C for 60-90 min after adjusting to 1% NaCl and 8% polyethylene glycol (PEG) 8,000 (Sigma). This was then centrifuged as above for 30 min. The pellet was resuspended in 0.05 M sodium citrate buffer (pH 7.0) and centrifuged at 10,000 rpm in a Sorvall SS-34 rotor for 15 min at 4 °C (optional). The supernatant was adjusted to a density of 1.5 g/cm^3 with CsCl and centrifuged at 45,000 rpm in a Beckman 70.1 Ti rotor for 16-20 hours at 25 °C. The opalescent band, containing the virus particles, was visualized against a black background and removed using a 20G needle and syringe. The virus was dialyzed against two or three successive changes of two liters of 0.05 M sodium citrate buffer (pH 7.0) over a 24 hour period at 4 °C. The concentration of virus in the purified preparation was determined by measuring the absorbance at 260 nm (the A260 of a 1 mg/ml solution of TomRSV in a 1 cm light path is 10.0 for M and B particles combined). Purity of the virus preparation was determined from the A260/A280 ratio (the A260/A280 of purified M and B particles combined is 1.8). Samples were stored at 4 °C until further use.

2.2 Virion RNA Extraction

Virion RNA was isolated from purified virus by extraction with phenol/chloroform in the presence of sodium dodecyl sulfate (SDS). To 400 µl purified virus (0.5-1.0 mg) was
added 200 µl redistilled phenol, 200 µl chloroform/octanol (24:1), 100 µl 0.5 M Tris-HCl (pH 8.9) and 25 µl 20% SDS. The mixture was vortexed for ca. 1 min and the aqueous and organic phases separated by centrifugation for 2 min in an Eppendorf microcentrifuge (14,000 x g) at 4 °C. The aqueous phase was drawn off and the organic phase reextracted with 200 µl autoclaved cold deionized H2O to recover any viral RNA which may have been trapped in the large interphase material. The resulting aqueous phase was pooled with that from the first extraction and extracted again with 300 µl phenol and 300 µl chloroform/octanol, and once again with 600 µl chloroform/octanol. To the final aqueous phase was added 0.1 volume of 2 M sodium acetate (pH 5.8) and 2.5 volumes of absolute ethanol. The RNA was precipitated at -70 °C for 30 min or in liquid nitrogen for 10 min and then centrifuged in a microfuge for 15 min at 4 °C. The pellet was washed with 70% ethanol, dried by inverting the tube for several min, and resuspended in autoclaved deionized H2O. The quality of the RNA was assessed by electrophoresis through denaturing agarose gels (see section 2.2.1) and the RNA was quantified spectrophotometrically (an A260 of 1.0 corresponds to ca. 40 µg/ml RNA). RNA samples were stored at -70 °C.

2.2.1 Denaturing Agarose Gel Electrophoresis of RNA

RNA was size fractionated by electrophoresis through 1% agarose gels containing methylmercuric hydroxide (MeHgOH) (Bailey and Davidson, 1976). 100 µl 1.0 M MeHgOH (Alfa) was added to a 1% agarose gel solution in BE buffer [10x BE buffer is 400 mM boric acid, 10 mM ethylenediaminetetra-acetic acid (EDTA), pH 7.38], which was then poured into a casting tray. RNA samples, anticipated to contain between 20 and 100 ng RNA per band after electrophoresis, were brought to a volume of 8.5 µl with autoclaved deionized H2O and denatured with 4.0 µl sample mix (50 µl autoclaved deionized H2O, 50 µl 10x BE buffer, 50 µl glycerol, 2.5 µl 1.0 M MeHgOH and 7.5 µl saturated bromophenol blue). These samples were allowed to stand at room temperature for 10 min prior to loading.
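The composition of a loaded sample follows from the sample mix recipe above by simple dilution arithmetic. The short Python sketch below works this through; it assumes that the listed volumes are additive and that glycerol is measured by volume, and the function and variable names are illustrative only.

```python
# Component volumes (µl) of the sample mix, taken from the recipe above.
SAMPLE_MIX_UL = {
    "water": 50.0,
    "10x BE buffer": 50.0,
    "glycerol": 50.0,
    "1.0 M MeHgOH": 2.5,
    "saturated bromophenol blue": 7.5,
}


def loaded_sample_composition(rna_volume_ul=8.5, mix_volume_ul=4.0):
    """Estimate final concentrations in a sample ready for loading.

    Assumes volumes are additive (an approximation for glycerol).
    """
    mix_total_ul = sum(SAMPLE_MIX_UL.values())  # 160 µl of sample mix
    # Fraction of the loaded sample that comes from the sample mix.
    dilution = mix_volume_ul / (rna_volume_ul + mix_volume_ul)
    return {
        # BE buffer strength, in multiples of the 1x working strength.
        "BE_x": 10.0 * SAMPLE_MIX_UL["10x BE buffer"] / mix_total_ul * dilution,
        # MeHgOH concentration in mM (the stock is 1.0 M = 1000 mM).
        "MeHgOH_mM": 1000.0 * SAMPLE_MIX_UL["1.0 M MeHgOH"] / mix_total_ul * dilution,
        # Glycerol as percent (v/v).
        "glycerol_percent": 100.0 * SAMPLE_MIX_UL["glycerol"] / mix_total_ul * dilution,
    }


if __name__ == "__main__":
    for name, value in loaded_sample_composition().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions, the 8.5 µl sample plus 4.0 µl sample mix works out to roughly 1x BE buffer, 5 mM MeHgOH and 10% glycerol, so the loaded sample matches the buffer strength of the running buffer.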
Electrophoresis was carried out at 10 volts/cm for 60 to 120 min in 1x BE buffer. After electrophoresis, the gel was stained in 100 ml deionized H2O containing 0.5 µg/ml ethidium bromide (EtBr) and 10 mM β-mercaptoethanol for 5-10 min and then destained in deionized H2O for 10-15 min. Bands were visualized under ultraviolet light at 320 nm and photographed using a Polaroid Land Camera and Kodak Royal Pan Film.

Note: Since MeHgOH is extremely poisonous and highly volatile, all manipulations involving MeHgOH were carried out in a fume hood while wearing gloves. Any apparatus which came in direct contact with MeHgOH was rinsed with a 0.1 M solution of β-mercaptoethanol after use to neutralize any contamination with MeHgOH.

2.3 Construction of Complementary DNA Clones Corresponding to TomRSV RNA1 and RNA2

This section describes the methods that were used to clone double-stranded DNA copies (cDNA) corresponding to portions of the TomRSV genome. During the course of this work, cDNA cloning was performed on several occasions. Most notably, a number of cDNAs to TomRSV RNA1 and RNA2 were initially synthesized and cloned that corresponded to over 99% of RNA1 and 75% of RNA2. A cloned cDNA corresponding to the 5' terminus of RNA2 was subsequently obtained by priming first-strand cDNA synthesis using a specific oligonucleotide derived from sequence information obtained from an RNA2 cDNA clone downstream of the 5' terminus. A cDNA clone corresponding to the 5' terminus of RNA2 next to the bacteriophage T7 RNA polymerase promoter was synthesized and amplified by the polymerase chain reaction (PCR) and subsequently subcloned into a variety of constructs for use in additional experiments. Details of these procedures are described in the following subsections.

2.3.1 Synthesis of Complementary DNA

Viral RNA which was initially used to synthesize complementary DNA (cDNA) was obtained from Dr. R. Martin of Agriculture Canada.
Further cloning experiments were performed using RNA obtained by the procedure described in section 2.2. Double-stranded DNA (dsDNA) complementary to TomRSV RNA1 and RNA2 was synthesised using the "one tube double-stranded cDNA synthesis method" of Bethesda Research Laboratories (BRL) (D'Alessio et al., 1987). DNA complementary to TomRSV RNA1 and RNA2 was synthesised at either 37 or 42 °C using cloned Moloney murine leukemia virus reverse transcriptase (200 units/µg of RNA) in a 50 µl reaction mixture containing 50 mM Tris-HCl pH 8.3, 75 mM KCl, 10 mM dithiothreitol (DTT), 3 mM MgCl2, 660 µM each of dATP, dCTP, dGTP and dTTP, 50-100 µg/ml oligonucleotide primer and 100-300 µg/ml TomRSV RNA1 and RNA2. To monitor the reaction, a 10 µl aliquot was removed and 5 µCi [α-32P]dATP (3000 Ci/mmole) was added. Incorporation of [α-32P]dATP into newly synthesized cDNA was determined by precipitation with trichloroacetic acid (TCA) onto Whatman filter disks (Sambrook et al., 1989) followed by scintillation counting in Aquasol-2 (DuPont). The first-strand reaction mixture was diluted to a final volume of 320 µl which contained 25 mM Tris-HCl pH 8.3, 100 mM KCl, 5 mM MgCl2, 250 µM each of dATP, dTTP, dCTP and dGTP, 5 mM DTT, and 250 units/ml Escherichia coli DNA polymerase I holoenzyme, and incubated for 2 h at 16 °C. The reaction mixture was treated with phenol/chloroform, the aqueous phase adjusted to 0.2 M sodium acetate (pH 5.4), and precipitated with 2.5 volumes of absolute ethanol at -20 °C overnight. Nucleic acid was pelleted by centrifugation in a microfuge for 15 min at 4 °C and washed with 70% ethanol.

2.3.2 Cloning of Tailed cDNA

This procedure was used to clone double-stranded (ds) DNA obtained from the initial cDNA synthesis experiment.
The dsDNA pellet, obtained from the procedure described in section 2.3.1 above, was suspended in 20 µl of 20 mM Tris-HCl pH 7.8, 10 mM MgCl2, 20 mM KCl, 0.1 mM DTT, 0.1 mM EDTA and treated with RNase H (20 units/ml) at 37 °C for 20 min to remove RNA from the termini of the double-stranded DNA. The mixture was then treated with phenol/chloroform and unincorporated nucleotides removed by centrifugation of the aqueous phase through a Sephadex G-50 (Sigma) spin-column (Sambrook et al., 1989). Synthetic dsDNA (300 ng) obtained from the effluent was tailed with dC residues using terminal deoxynucleotidyl transferase (BRL) to a tail length of approximately 15 to 20 residues and annealed with 100 ng of PstI digested, dG tailed pUC9 (Pharmacia) (Sambrook et al., 1989). Half of the annealing mixture was used to transform competent E. coli DH5α cells (BRL). E. coli cells containing plasmids were selected on L-agar plates containing 100 µg/ml ampicillin. The chromogenic substrate Blue-O-Gal™ (BRL) (40 µg/ml) was also added to the plates to determine if colonies harbored plasmids with inserts. The procedure used for the preparation of competent cells was that of Morrison (1979).

2.3.3 Cloning of Blunt-Ended cDNA

This procedure was used to clone cDNA corresponding to the 5' terminus of RNA2. First-strand cDNA was primed using 50 µg/ml of oligonucleotide #3 (5' TTCTGGTTCCTCTTCC 3'), complementary to nucleotide positions 2,201 to 2,216 of RNA2, and 200 µg/ml TomRSV RNA, followed by second-strand synthesis (see section 2.3.1). Synthetic dsDNA (300 ng), prepared as in section 2.3.1, was treated with mung bean nuclease to ensure that both ends of the dsDNA were blunt (Hammond & D'Alessio, 1986) and ligated into 100 ng of EcoRV digested, calf intestinal phosphatase (CIP) (Boehringer Mannheim) treated Bluescript (Stratagene) (see Sambrook et al., 1989). One half of the annealing mixture was then used to transform competent E.
coli DH5α cells and plated as described in section 2.3.2.

2.3.4 Polymerase Chain Reaction (PCR) Method

This procedure was used to obtain a cDNA clone corresponding to an exact or near exact TomRSV RNA2 5' terminus next to the bacteriophage T7 RNA polymerase promoter. Single-stranded cDNA, corresponding to the 5' terminus of TomRSV RNA2, was synthesised using 50 µg/ml of the specific oligonucleotide #3 (5' TTCTGGTTCCTCTTCC 3'), complementary to nucleotide positions 2,201 to 2,216 of RNA2, and 100 µg/ml of TomRSV RNA by the cDNA synthesis procedure in section 2.3.1. The resulting RNA/DNA hybrid was denatured by boiling for 1 min and the RNA degraded by the addition of 5 µl of 1 mg/ml DNase-free RNase A for 30 min at 42 °C. The mixture was extracted with phenol/chloroform, precipitated with 70% ethanol in the presence of sodium acetate (pH 5.8) at -20 °C for 1 h and centrifuged at 14,000 x g in a microcentrifuge for 15 min. The pellet was washed with 70% ethanol and resuspended in 10 µl of H2O, of which 5 µl was used in the following polymerase chain reaction (PCR) mixture: 10 mM Tris-HCl pH 9.0 (at 25 °C), 50 mM KCl, 1.5 mM MgCl2, 0.01% gelatin (w/v), 0.1% Triton X-100, 20 µM each of dATP, dCTP, dGTP and dTTP, 1 µl Perfect Match (Pharmacia), 2 µg of oligonucleotide #5 (5' TCCCCGCGGTAATACGACTCACTATAG(AT)(AT)AGCGAAAAATCTGGT 3', where the nucleotide positions enclosed by brackets are degenerate for the nucleotides indicated), which contains the T7 RNA polymerase promoter and TomRSV 5' terminal sequences, and 2.5 units of Taq DNA polymerase (Pharmacia). The reaction mixture was placed in an Ericomp thermocycler which was programmed to undergo 20 cycles consisting of 50 °C for 5 min, 74 °C for 2 min and 94 °C for 1 min. The reaction product was purified using GeneClean (BioCan), digested with SmaI and SstI, electrophoresed through a 1% agarose gel and the ca. 400 base pair (bp) fragment excised.
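The bracket notation used for oligonucleotide #5 describes a small pool of primers rather than a single sequence. As an illustration only, the following Python sketch enumerates the members of that pool and derives the viral-sense sequence to which oligonucleotide #3 anneals. The two oligonucleotide sequences are copied from the text above; the helper names are hypothetical and were not part of the original procedure.

```python
from itertools import product

# Sequences as written in the text (5' to 3'); parentheses mark degenerate positions.
OLIGO_5 = "TCCCCGCGGTAATACGACTCACTATAG(AT)(AT)AGCGAAAAATCTGGT"
OLIGO_3 = "TTCTGGTTCCTCTTCC"
T7_PROMOTER = "TAATACGACTCACTATAG"


def expand_degenerate(seq):
    """Return every concrete sequence encoded by a bracketed degenerate oligo."""
    choices = []
    i = 0
    while i < len(seq):
        if seq[i] == "(":
            end = seq.index(")", i)
            choices.append(seq[i + 1:end])  # bases allowed at this position
            i = end + 1
        else:
            choices.append(seq[i])  # fixed position, single choice
            i += 1
    return ["".join(combo) for combo in product(*choices)]


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


if __name__ == "__main__":
    for variant in expand_degenerate(OLIGO_5):
        print(variant, len(variant), T7_PROMOTER in variant)
    # Viral-sense sequence of the region complementary to oligonucleotide #3.
    print(reverse_complement(OLIGO_3))
```

With two positions that each allow A or T, the pool contains four 44-nucleotide primers, all of which carry the T7 promoter sequence immediately upstream of the degenerate positions.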
The DNA was then extracted from the gel slice using GeneClean, and the fragment ligated into SmaI/SstI digested Bluescript. The ligation mixture was used to transform competent E. coli DH5α cells.

2.4 Colony Filter Hybridization

A colony filter hybridization assay (Gergen et al., 1979) was used to select bacterial colonies carrying plasmids with large inserts obtained in section 2.3.2. Colonies were grown overnight at 37 °C on L-agar plates containing 100 µg/ml ampicillin. Autoclaved Whatman 541 filter paper discs, cut to the size of a petri dish, were placed over the colonies for 1 min and then removed with a single quick motion which minimized smearing of the colonies, but allowed the colonies to adhere to the filter. Filters were then processed to denature and fix the DNA to the filter by sequentially placing the filters, colony side up, for 5 min each, on stacks of Whatman number 1 filter paper in a petri dish soaked with the following solutions: two stacks of 0.5 M sodium hydroxide, two stacks of 0.5 M Tris-HCl pH 7.5, and two stacks of 2x SSC (SSC is 150 mM sodium chloride, 15 mM disodium citrate, pH 7.0). Filters were washed twice in 2x SSC (1 min each) and then twice in 95% ethanol (1 min each) before being air dried for 5 to 10 min. Filters were pre-hybridized for 1 h at 42 °C with continuous agitation in hybridization buffer consisting of 50% deionized formamide, 5% sodium dextran sulphate (mol. wt. 500,000), 1 M NaCl, 50 mM Tris-HCl pH 7.5, 0.2% bovine serum albumin (BSA), 0.2% polyvinylpyrrolidone, 0.2% Ficoll, 0.1% sodium pyrophosphate and 250 µg/ml sheared salmon sperm DNA. Probe (see section 2.7.2) was added directly to the pre-hybridization solution containing the filters and incubated at 42 °C for 17 h with continuous agitation. Filters were then washed 3 times with 2x SSC, 0.1% SDS for 15 min each at 50 °C with agitation and then once with 2x SSC for 15 min at 50 °C. Filters were air dried and autoradiographed with the aid of intensifying screens.

2.5 Northern Blots

Northern blots were performed to determine which RNA (RNA1 or RNA2) a particular cDNA clone would hybridize to, and to determine the orientation of cDNA clones relative to viral RNA. Viral RNA was electrophoresed on a 1% denaturing agarose gel (see section 2.2.1) and transferred to either diazotized paper (Alwine et al., 1977) or Zetaprobe™ membrane (Vrati et al., 1987).

2.5.1 Northern Blots using Diazotized Paper

After electrophoresis of the RNA on a denaturing gel (see section 2.2.1), the gel was placed in a freshly prepared solution of 50 mM NaOH, 5 mM β-mercaptoethanol for 30 min with gentle agitation and then neutralized with 2 changes of 200 mM potassium phosphate, 7 mM iodoacetic acid pH 6.5 for 10 min each. The gel was then equilibrated with 2 changes of transfer buffer (0.03 M citrate, 0.038 M Na2HPO4 pH 4.0). A capillary transfer apparatus, essentially as described by Southern (1975), was set up to transfer the RNA from the gel to the diazotized paper. Aminothiophenol paper for diazotization was prepared by Dr. D. M. Rochon (Rochon, 1985). The paper was diazotized by soaking in 100 ml ice cold 1.2 N HCl containing 2.7 ml of 1% sodium nitrite for 30 min at 4 °C, at which point the paper had turned bright yellow, and was then quickly washed with cold deionized H2O and cold transfer buffer and immediately placed on top of the gel. Transfer of RNA was allowed to continue for 16 to 18 h. After transfer, the blotted paper was sealed in a plastic bag with hybridization solution (see section 2.4) and incubated for 2 h at 42 °C. After pre-hybridization, [32P]-labelled probe was added (1-3 ng/ml) (see section 2.7.1) and incubated a further 16-24 h at 42 °C. Following hybridization, the diazotized paper was washed with 2x SSC, 0.1% SDS for 30 min at 62 °C, then 1x SSC, 0.1% SDS for 45 min at 62 °C and finally 1x SSC for 30 min at 62 °C. The washed filter was lightly blotted dry and covered with Saran Wrap.
The filter was then autoradiographed with the aid of two Lightning Plus intensifying screens (Dupont).

2.5.2 Northern Blots using Zetaprobe™ Membrane

After electrophoresis of the RNA through a denaturing gel (see section 2.2.1), the gel was placed in 10 mM β-mercaptoethanol for 15 min to remove mercury and then washed with deionized H2O. The gel was placed on a capillary transfer apparatus and overlaid with the Zetaprobe™ membrane (BioRad) which had been presoaked in deionized H2O. RNA was transferred to the membrane in the presence of 10 mM NaOH for 3-16 h (Vrati et al., 1987). After transfer, the membrane was removed and rinsed briefly in 2x SSC, 0.1% SDS. Prehybridization, hybridization, washing and autoradiography were the same as described in section 2.5.1.

2.6 Southern Blots

Plasmid DNA was digested with the appropriate restriction enzyme(s) and electrophoresed through a 1% TAE (TAE is 40 mM Tris pH 8.2, 1 mM EDTA) agarose gel. Bands were visualized under ultraviolet light (320 nm) after staining with a 0.5 µg/ml solution of EtBr for 10 min followed by destaining in H2O for another 5-10 min. The gel was presoaked in 2 volumes of 0.25 M HCl for 15 min and then placed on a capillary transfer apparatus and overlaid with a precut piece of Zetaprobe (BioRad) membrane which had been presoaked in deionized H2O. DNA was transferred to the membrane overnight in the presence of 0.4 M NaOH. After transfer, the membrane was removed and briefly rinsed in 2x SSC, 0.1% SDS. The filter was sealed in a heat-sealable plastic bag and prehybridized for 1-4 hours in 1.5x SSPE (1x SSPE is 10 mM Na2HPO4 pH 7.7, 180 mM NaCl, 1 mM EDTA), 1.0% SDS, 0.5% BLOTTO (w/v non-fat dry milk), and 0.5 mg/ml sheared salmon sperm DNA at 68 °C with gentle agitation. [32P]-labelled probe (1-3 ng/ml) (see section 2.7.1) was added directly to the prehybridization solution. The contents of the bag were mixed thoroughly and incubated overnight at 68 °C with agitation.
After incubation, the membrane was removed and rinsed in 2x SSC, 0.1% SDS, then washed successively by vigorous agitation at room temperature for 15 min in each of the following solutions: 2x SSC/0.1% SDS, 0.5x SSC/0.1% SDS, 0.1x SSC/0.1% SDS. The membrane was then sandwiched between Saran Wrap and exposed to X-ray film for 0.2-4 hours with the aid of intensifying screens.

2.7 Nucleic Acid Probes

2.7.1 Nick-Translated Probes

Nick-translated probes were prepared essentially as described by Sambrook et al. (1989). Linearized plasmid DNA (50-100 ng) was added to a 50 µl reaction containing 50 mM Tris-HCl pH 7.5, 10 mM MgCl2, 1 mM DTT, 50 µg/ml nuclease-free bovine serum albumin (BSA), 600 µM each of dCTP, dGTP, and dTTP, 30 to 50 µCi [α-32P]dATP (3000 Ci/mmol), 250 pg DNase I (Sigma) and 5 to 15 units of E. coli DNA polymerase I (Pharmacia). The reaction was incubated at room temperature for 20-90 min and monitored by TCA precipitation (Sambrook et al., 1989). At the termination of the reaction, 1 µl of 20% SDS was added and the unincorporated nucleotides removed by passing the reaction mixture through a Sephadex G-50 (Sigma) mini-column (Sambrook et al., 1989). The probe was then adjusted to 0.1 M NaOH and boiled for 5 min to denature the DNA hybrid, then quick cooled on ice before addition to the bag containing the hybridization solution and the prehybridized blot.

2.7.2 Random Primed cDNA Probes

Random primed cDNA probes were made essentially by the method of Taylor et al. (1976). TomRSV RNA1 and RNA2 were individually purified from a 1% low-melting temperature agarose gel (Tautz & Renz, 1983) after electrophoresis at 1.5 volts/cm for 18 h in the presence of MeHgOH (see section 2.2.1). Gel purified RNA (ca. 200 ng) was incubated in a 50 µl reaction consisting of 50 mM Tris-HCl pH 8.3, 8 mM MgCl2, 8 mM DTT, 600 µM each of dCTP, dGTP and dTTP, 25 µCi [α-32P]dATP (ca.
3000 Ci/mmol), 1 mg/ml salmon sperm primer DNA and 18-36 units of avian myeloblastosis virus reverse transcriptase. The reaction was incubated at 37 °C for 30-120 min and the progress of the reaction monitored by TCA precipitable counts. Following cDNA synthesis, the reaction mixture was adjusted to 0.4% SDS and passed over a Sephadex G-50 minicolumn. The DNA/RNA duplex was denatured by adjusting the reaction mixture to 0.1 M NaOH and boiling for 5 min. The reaction mixture was then quick cooled on ice before adding to the hybridization bag containing the blot and hybridization buffer.

2.7.3 RNA Probes

32P-labelled RNA transcripts used as probes in blotting experiments were generated from cDNA inserts cloned behind the bacteriophage T3 or T7 RNA polymerase promoter sites in Bluescript (Stratagene) using T3 or T7 DNA-dependent RNA polymerases. Transcription reactions were carried out in 25 µl containing 40 mM Tris-HCl pH 7.9, 10 mM NaCl, 6 mM MgCl2, 2 mM spermidine, 30 mM DTT, 0.5 mM each of ATP, CTP, and UTP, 50 µM GTP, 50 µCi [α-32P]GTP (3700 Ci/mmol), 25 units of RNasin (Promega), 10 units of T3 or T7 RNA polymerase (BRL) and 500 ng of linearized DNA. After 30 min at 37 °C, 1 unit of RNase-free DNase I (Promega) was added and incubated for a further 15 min to remove template DNA. The probe was phenol-extracted and ethanol-precipitated before adding to hybridization buffer containing the prehybridized blot.

2.8 Sequencing

cDNA clones corresponding to TomRSV RNA1 and RNA2 were used as templates to determine over 99% of the nucleotide sequence by the dideoxynucleotide sequencing method of Sanger et al. (1977).
Viral RNA template was sequenced using a specific oligonucleotide primer to determine additional 5' terminal sequences not present in cDNA clones.

2.8.1 Preparation of Subclones for Sequencing

Clones 035, K6, 2P6, J27, and parts of G82 and B54 (see Results section 3.3) were used to prepare subclones to determine the sequence of TomRSV RNA1 and RNA2. Subclones were constructed by digestion of these clones, or subclones prepared earlier, with an appropriate restriction enzyme(s). Digests were electrophoresed through 1% agarose gels in TAE buffer and the bands visualized under UV light after staining with EtBr. Bands containing the desired restriction fragments were excised from the gel and the DNA recovered from the gel slices using GeneClean™ (BioCan) according to the manufacturer's instructions. Alternatively, fragments less than 400 base pairs (bp) in length were electrophoresed through 1% low gelling temperature agarose gels, the desired band excised and the DNA recovered from the gel slice by the "freeze-squeeze" method (Tautz and Renz, 1983). Occasionally, the ends of the purified dsDNA fragments had 5' or 3' single-stranded extensions created by restriction enzyme digestion which had to be removed prior to cloning. This was accomplished using the single-strand specific nuclease, mung bean nuclease (Hammond and D'Alessio, 1986).

Purified fragments were ligated into the multicloning site of the phagemid Bluescript which had been digested with the appropriate restriction enzyme(s) and treated with calf intestinal phosphatase (Boehringer Mannheim). The ligation mixture was used to transform competent E. coli DH5α cells and the cells plated on LB agar containing ampicillin and Blue-O-Gal. Plates were incubated overnight at 37 °C and white colonies containing the cloned insert were used to inoculate 3 ml of liquid LB containing ampicillin and grown overnight with vigorous shaking at 37 °C.
Plasmid DNA was isolated from these cultures by the alkaline lysis procedure (Sambrook et al., 1989), and used as templates in dideoxynucleotide sequencing reactions.

Large fragments, ligated into the multicloning site of Bluescript, were used to generate further subclones by exonuclease III-generated unidirectional nested deletions according to the conditions suggested by Stratagene (Henikoff, 1984). Plasmid DNA templates used in these experiments were extracted by the alkaline lysis procedure followed by precipitation with polyethylene glycol (PEG) 8000 (Sambrook et al., 1989). Briefly, plasmid DNA was digested with two restriction enzymes which cleave within the multicloning site between the cloned insert and a primer annealing site required for dideoxynucleotide sequencing. The restriction enzymes were chosen such that cleavage closest to the primer binding site resulted in a 3' single-strand extension while cleavage closest to the insert resulted in either a 5' extension or a blunt end. The double-digested plasmid was then incubated with exonuclease III (BRL), which selectively digests one strand of the double-stranded DNA containing either a 5' extension or a blunt end but not an end with a 3' extension. Since, under appropriate conditions, exonuclease III will digest the DNA at a controlled and uniform rate (i.e. ca. 200 bp/min at 30 °C), the removal of aliquots at one-minute intervals resulted in a series of nested deletions. Following exonuclease III digestion, the ends of the plasmid DNA were blunted with mung bean nuclease (BRL) and the plasmid recircularized with T4 DNA ligase (BRL). The ligated DNA was used to transform competent E. coli cells and the transformants selected on L-agar plates containing 100 µg/ml ampicillin. Small-scale plasmid DNA isolations (Sambrook et al., 1989) from colonies selected from each time interval were size-fractionated on 1% agarose gels.
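The expected extent of digestion at each time point follows directly from the rate quoted above. The short Python sketch below is only a planning aid under stated assumptions: it takes the approximate rate of 200 bp/min at 30 °C at face value, and the 1000 bp insert length and the function name `expected_deletions` are hypothetical and chosen for illustration.

```python
# Planning aid: expected extent of exonuclease III digestion per aliquot.
# Assumption: a constant rate of ca. 200 bp/min at 30 °C, as quoted in the text.
RATE_BP_PER_MIN = 200


def expected_deletions(insert_bp, interval_min=1, rate=RATE_BP_PER_MIN):
    """Return (minutes, bp_removed) pairs until the insert is fully digested."""
    points = []
    minutes = interval_min
    # Keep sampling while the previous aliquot had not yet consumed the insert.
    while (minutes - interval_min) * rate < insert_bp:
        removed = min(minutes * rate, insert_bp)
        points.append((minutes, removed))
        minutes += interval_min
    return points


if __name__ == "__main__":
    # Hypothetical 1000 bp insert sampled at one-minute intervals.
    for minutes, removed in expected_deletions(1000):
        print(f"{minutes} min: ~{removed} bp removed")
```

In practice the rate varies with temperature, enzyme lot and template, which is why the deletion sizes were checked on agarose gels rather than assumed.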
Plasmids with overlapping deletions were selected and used directly as templates in sequencing reactions.

2.8.2 DNA and RNA Sequencing

The dideoxynucleotide chain termination method of Sanger et al. (1977) was used to sequence double-stranded plasmid DNA templates with modified T7 DNA polymerase (Sequenase™; U.S. Biochemical Corporation), following a protocol obtained from the supplier and based on the method of Toneguzzo et al. (1988). Plasmid DNA (ca. 2 µg, obtained by the "alkaline lysis mini prep" method; Sambrook et al., 1989) and 10 ng sequencing primer were denatured in a 40 µl volume containing 200 mM NaOH and 0.4 mM EDTA by heating at 85 to 95 °C for 5 min. Reactions were quick-cooled on ice and adjusted to 200 mM ammonium acetate, to which was added 2.5 volumes of absolute ethanol. The DNA was precipitated at -70 °C for 15 min and then centrifuged for 15 min at 14,000 x g in a microcentrifuge. The pellet was washed in 70% ethanol and allowed to air dry by inverting the microfuge tube containing the pellet on a paper towel for several min. Pellets were resuspended in 6 µl of H2O and 1.5 µl of 5x Sequenase™ buffer (5x Sequenase™ buffer is 200 mM Tris-HCl pH 7.5, 100 mM MgCl2 and 250 mM NaCl) and incubated at 37 °C for 15 min to allow annealing of the primer to the denatured DNA template. To this solution was added 1 µl of 100 mM DTT, 2 µl of 1x labelling mix (1x labelling mix is 1.5 µM each of dCTP, dGTP and dTTP), 2 to 5 µCi of [α-32P]dATP (3000 Ci/mmol) and 3 units of Sequenase™ in a total volume of 13 µl. If longer sequences were to be read, then 2 µl of undiluted (5x) labelling mix was added instead of the 1x labelling mix and the amount of [α-32P]dATP increased proportionately.
Reactions were allowed to stand at room temperature for 2 to 5 min and 3.3 µl aliquots from this reaction added to tubes containing 2.5 µl of 80 µM dGTP, 80 µM dCTP, 80 µM dTTP, 80 µM dATP, 50 mM NaCl and 8 µM of either dideoxy (dd) GTP, ddCTP, ddTTP or ddATP and incubated a further 5 to 20 min at 37 °C. During this period, newly synthesised strands were extended further before being terminated by the incorporation of a ddNTP. To each reaction tube was added 4 µl of a mixture containing 95% formamide, 20 mM EDTA, 0.05% bromophenol blue and 0.05% xylene cyanol FF. Sequencing reactions were denatured in this mixture by heating at 85 to 95 °C for 3 min before loading onto the sequencing gel. Sequence ambiguities resulting from compressions were resolved by substituting dITP for dGTP in the sequencing reaction (Tabor and Richardson, 1987). The incorporation of dITP instead of dGTP can reduce the secondary structure that causes band compressions during electrophoresis.

All portions of the RNA1 and RNA2 nucleotide sequence were determined by sequencing both strands of the cloned cDNA at least once, except for the 5' terminal 32 and 28 nucleotides, respectively, which were not present in any cDNA clone analyzed. These nucleotides were sequenced in one direction only using a synthetic oligonucleotide primer (5' GCCTTCGATGGCAACC 3', complementary to nucleotide positions 115-130 of the viral RNA), viral RNA template, and Moloney murine leukemia virus reverse transcriptase in the presence of dideoxynucleotides (Ahlquist et al., 1981).
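A primer described as "complementary to nucleotide positions 115-130" must be the reverse complement of that stretch of viral RNA, and its length must equal the number of positions spanned (16). The Python sketch below checks both points for the 16-nt form of the primer given in Results section 3.3 (5' GCCTTCGATGGCAACC 3'). The target stretch `TARGET_115_130` was transcribed by hand from the RNA1 sequence figure (Fig. 3.7) and should be treated as an assumption; the function names are illustrative only.

```python
# Check that a DNA primer is the reverse complement of its RNA target site.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def primer_for(rna_site):
    """DNA primer (5'->3') that anneals to the given RNA stretch (5'->3')."""
    return reverse_complement(rna_site.upper().replace("U", "T"))


# Assumption: RNA1 positions 115-130 as read from the sequence figure.
TARGET_115_130 = "GGUUGCCAUCGAAGGC"
PRIMER = "GCCTTCGATGGCAACC"

if __name__ == "__main__":
    print(primer_for(TARGET_115_130) == PRIMER)
    print(len(PRIMER) == 130 - 115 + 1)
```

The same check applies to any primer quoted with its annealing coordinates, which makes it a quick way to catch a dropped or transposed base.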
An internal region of the RNA2 sequence was also determined in this manner using a different synthetic oligonucleotide (5' TTCTGGTTCCTCTTCC 3', complementary to nucleotides 2201-2216 of RNA2) in order to confirm the nucleotide sequence determined from the cDNA clones corresponding to this region.

Sequencing reactions were electrophoresed through 6% polyacrylamide wedge gels (0.2 to 0.6 mm gradation) containing 7.7 M urea for 2-7 hours at constant power (50-55 watts) according to the manufacturer's conditions (BioRad). Gels were prewarmed to 45-55 °C by applying 45-55 watts of power through the gel for 30-60 min before the sequencing reaction samples were loaded. Following electrophoresis, the gels were transferred onto Whatman 3MM filter paper, dried under vacuum at 80 °C for 1 hour and then exposed to Kodak X-Omat™ film overnight at room temperature. On average, ca. 250 nucleotides of sequence were obtained from each sequencing reaction using one loading, although in some cases it was possible to obtain as much as 400 nucleotides of sequence from two loadings.

2.8.3 Sequence Analysis

Data from the autoradiographs were digitized directly into the Gene-Master™ (Bio-Rad Laboratories) sequence analysis software package running on a Compaq 386 computer. Sequences were assembled using the "shotgun handler" program, which automatically locates and assembles the overlapping sequences. Sequences were analyzed using a number of programs available through the Gene-Master™ package, programs and databases available through the National Research Council Numeric Database Service (SND), which operates on a Digital Equipment Corporation (DEC) VAX 780 running VMS, and the GeneWorks™ (Intelligenetics) sequence analysis package on a Macintosh IIsi computer. Amino acid sequence alignments were generated using the Needleman-Wunsch algorithm (Needleman & Wunsch, 1970).
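For readers unfamiliar with the method, the following Python sketch shows the dynamic-programming idea behind Needleman-Wunsch global alignment. It is a minimal illustration, not the implementation used in the software packages named above: the scoring values (match +1, mismatch -1, linear gap penalty -1) are assumptions chosen for simplicity, and protein alignments would normally use a substitution matrix instead.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of a and b; returns (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First row and column: aligning a prefix against nothing costs gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Fill the matrix: best of substitution, gap in b, gap in a.
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[len(a)][len(b)], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # Toy strings, not TomRSV data.
    print(needleman_wunsch("ACGT", "AGT"))
```

Because the alignment is global, every residue of both sequences appears in the output, which suits comparisons of whole proteins or domains of similar length.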
Alignments of conserved amino acid sequence domains in the TomRSV polyprotein sequences to those present in other viral proteins were made by eye based on the consensus sequence of these domains. Multiple sequence alignments were made by a progressive alignment procedure which uses the Needleman-Wunsch algorithm iteratively to generate a multiple sequence alignment (Feng & Doolittle, 1987) and further adjusted by eye. Amino acid sequences were compared using the TFASTA program (Pearson & Lipman, 1988) with sequences in the protein sequence libraries of the NBRF, Swiss-Prot and Pseqlp and the nucleotide sequence libraries of NBRF, GenBank and EMBL. The 5' region between two potential AUG initiation codons was analyzed for a possible coding function using the program TESTCODE (Fickett, 1982) with a window size of 67. This was followed by the "absolute positional base preference method" (Staden, 1984), using a window size of 50, to determine the coding frame. The goodness of fit between the amino acid composition of the TomRSV coat protein and the calculated amino acid composition of segments of the TomRSV RNA2-encoded polyprotein sequence was assessed by chi-squared analysis using a computer program kindly written for this purpose by Bill Ronald and Jim Purvis of Agriculture Canada. The TomRSV RNA1 polyprotein was analyzed for hydrophobic regions by the method of Kyte & Doolittle (1982) using the parameter settings recommended by the authors. The TomRSV RNA1 polyprotein was also analyzed for transmembrane spanning domains by the method of Argos et al. (1982) using the parameter settings recommended by the authors.

2.9 In Vitro Translation Analysis

In order to determine which of two in-frame AUG triplets in the 5' region of TomRSV RNA could act as translation initiation sites, RNA transcripts corresponding to the TomRSV RNA2 5' region and a truncated RNA2 open reading frame were translated in vitro.
The sizes of the translation products were then determined and compared with the expected sizes of the products that would be initiated from the two AUG sites. Details of these procedures are described below.

2.9.1 In Vitro Transcriptions

RNA transcripts were synthesised in vitro from plasmid pMRD14A (see Results section 3.12.1) using bacteriophage T7 RNA polymerase (BRL) (plasmid DNA was purified on CsCl density gradients; Sambrook et al., 1989). Reaction conditions were those recommended by the supplier of the enzyme (BRL). After RNA synthesis was completed, the DNA template was degraded by adding 0.2 units of RNase-free DNase I (BRL) and incubating at 37 °C for 15 min. Transcripts were then purified using RNaid™ (BioCan) according to the manufacturer's instructions. The amount of RNA synthesised was determined spectrophotometrically. Integrity of the RNA was determined by agarose gel electrophoresis followed by visualization under UV light after staining with EtBr.

2.9.2 Translations

In vitro translations were performed in the presence of [35S]-methionine (New England Nuclear; specific activity 1100 Ci/mmol) in wheat germ extract translation systems (Promega) and also rabbit reticulocyte lysate cell-free translation systems (Promega) according to the manufacturer's instructions. Reactions were carried out in a 25 µl volume containing 2 µg of synthetic RNA and 5 µCi of [35S]-methionine for 1 to 2 h at room temperature.

2.9.3 Sodium Dodecyl Sulfate Polyacrylamide Gel Electrophoresis (SDS-PAGE)

Translation products were analyzed by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE). Products were separated on mini 15% polyacrylamide gels (0.75 mm thick) using the discontinuous Laemmli buffer system (Laemmli, 1970) as described by BioRad. Gels were then fixed in 3 changes of 30% methanol/10% acetic acid for 30 min each with continuous agitation and then fluorographed with Entensify (NEN) according to the manufacturer's instructions.
Gels were dried at 80 °C under vacuum for 1 h on Whatman 3MM paper and then exposed to X-ray film at -70 °C overnight.

2.10 Transient Expression of cDNA Constructs in Protoplasts

Translation initiation of TomRSV RNA was further assayed in protoplasts. The plasmid construct 35SGUSO4 and its derivatives ΔAUG78, ΔAUG441 and ΔAUG78+441 (see Results section 3.12.1 for details on these constructs) were transfected into protoplasts and the relative levels of expression from the two AUG triplets determined using the reporter gene β-glucuronidase (GUS), which was fused downstream of and in-frame with the two sites.

2.10.1 Site-Directed Mutagenesis

Site-directed in vitro mutagenesis of templates, which were eventually used for translation studies, was performed using a modified version of the Bio-Rad "Muta-Gene Phagemid In Vitro Mutagenesis" procedure based on the method of Kunkel (Kunkel et al., 1987). The EcoRV/ApaI fragment from 35SGUSO4 (see Results, Fig. 3.33) was subcloned into the phagemid Bluescript and used to transform the dut-, ung- E. coli strain BW313. Single-stranded DNA was rescued using the helper phage M13K07, as described in the Bio-Rad manual, and used as a template for in vitro mutagenesis. Mutagenic oligonucleotide-primed DNA synthesis was performed as described in the Bio-Rad manual except that modified T7 DNA polymerase (Sequenase; U.S. Biochemicals) was substituted for T4 DNA polymerase. The following oligonucleotides were used in the mutagenesis reactions: #9 (5' AACAAATGGAGGAATTCAAAAGAAAAGAAA 3'; mutated sites are underlined) and #10 (5' AAGAACAACCTCaAaTGCAGTGGAGG 3'; mutated sites are underlined), complementary to TomRSV RNA2 nucleotide positions 69-94 and 430-455, respectively. Plasmids carrying the desired mutations were identified by restriction enzyme analysis, as each introduced mutation created a unique restriction enzyme recognition site.
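Screening of this kind relies on a recognition sequence being present in the mutant but not in the parent. As an illustration only, the Python sketch below scans a primer for a small panel of palindromic recognition sequences. The panel `SITES` and the function `find_sites` are assumptions made for this example; oligo #9 as printed above contains GAATTC, the EcoRI recognition sequence, but whether EcoRI was the diagnostic enzyme for that mutation is not stated in this section.

```python
# Scan a DNA sequence for a small, illustrative panel of restriction sites.
# All sites listed are palindromic, so scanning one strand is sufficient.
SITES = {
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "BamHI": "GGATCC",
    "PstI": "CTGCAG",
    "EcoRV": "GATATC",
    "KpnI": "GGTACC",
}


def find_sites(dna, sites=SITES):
    """Return {enzyme: [0-based start positions]} for sites found in dna."""
    seq = dna.upper()
    hits = {}
    for enzyme, motif in sites.items():
        positions = [i for i in range(len(seq) - len(motif) + 1)
                     if seq[i:i + len(motif)] == motif]
        if positions:
            hits[enzyme] = positions
    return hits


OLIGO_9 = "AACAAATGGAGGAATTCAAAAGAAAAGAAA"  # as printed in the text

if __name__ == "__main__":
    print(find_sites(OLIGO_9))
```

Comparing the hits for the mutant and parental sequences shows which enzyme, if any, distinguishes the two in a diagnostic digest.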
Plasmid inserts carrying the new restriction enzyme recognition sites were then sequenced to confirm the mutated site, as well as to ensure that no other mutations had been introduced during the mutagenesis procedure.

2.10.2 Transfection of Protoplasts

Plasmid constructs were transfected into Nicotiana plumbaginifolia protoplasts by the method of Vankan et al. (1988). Briefly, N. plumbaginifolia plants were grown in sterile culture and the leaves of young plants used as a source of protoplasts. Leaves were cut into thin strips ca. 1-2 mm and overlaid with W5 media (see Vankan et al., 1988) containing 10 mg/ml cellulase R-10 and 1 mg/ml macerozyme™ R-10 (Yakult Pharmaceutical Ind. Co. Ltd.) and incubated overnight at 25 °C in the dark. Following overnight digestion of the cell walls, protoplasts were passed through a sterile 100 micron screen and further purified by centrifugation on a sucrose cushion for 5 min at 500 x g (see Vankan et al., 1988). Protoplast concentration was determined using a hemocytometer and adjusted to a final concentration of 2 million/ml in MaCa media (see Vankan et al., 1988) immediately before use. Protoplasts (0.6 million) were transfected with 20 to 40 µg of plasmid DNA in the presence of 20% polyethylene glycol (PEG) 4000 (Sigma) in a total volume of 0.6 ml for 1-5 min and then diluted with 4 ml of K3 media (see Vankan et al., 1988). Protoplasts were incubated at 25 °C for 20 h in the dark, pelleted by centrifugation at 500 x g for 10 min and the supernatant discarded. The protoplasts were lysed by repeated freezing in liquid nitrogen and thawing at 37 °C in 1x GUS extraction buffer (50 mM sodium phosphate, pH 7, 10 mM DTT, 1 mM EDTA, 0.1% sodium lauryl sarcosine and 0.1% Triton X-100).

2.10.3 GUS Assays

The lysed extracts were assayed for GUS activity using either fluorometric or spectrophotometric methods as described by Jefferson et al. (1987).
Fluorometric assays were performed using crude extracts in 1 mM 4-methylumbelliferyl β-D-glucuronic acid (4-MUG) (Sigma), incubated at 37 °C for 5 to 30 min and then stopped by the addition of 0.2 M sodium carbonate, pH 9.5. Fluorescence was detected by excitation with UV light (320 nm). Alternatively, GUS activity was determined using the spectrophotometric substrate p-nitrophenyl β-D-glucuronide (pNPG) (Sigma) and 415 nm wavelength light. After cell lysis, the extract was centrifuged at 4 °C for 10 min at 14,000 x g in a microcentrifuge and the total protein concentration in the cleared supernatant determined by the method of Bradford (Bradford, 1976) using the BioRad protein assay according to the manufacturer's instructions. To 40 ng of total protoplast protein was added 0.01% BSA, 0.02% sodium azide and 1 mM pNPG in a total volume of 400 µl. The reactions were incubated at 37 °C and 50 µl aliquots removed at precise intervals (0.5 - 1 h) and stopped with 1 M ammediol.

Each construct was transfected in triplicate using the same batch of protoplasts. Spectrophotometric values from each transfection were plotted against time and the rates of GUS activity determined using the computer program Cricket Graph™ (Computer Associates) on a Macintosh computer. The average rate of GUS activity for each construct was determined from the three replicates along with the standard deviation. The average rate of GUS activity from protoplasts inoculated with plasmid 35SGUSO4 was arbitrarily set to 1 and the average relative rates of GUS activity for the other constructs calculated and plotted.

RESULTS

3.1 Synthesis and Mapping of cDNA Clones to TomRSV RNA1 and RNA2

Double-stranded (ds) complementary DNA (cDNA) was synthesized from a mixture of TomRSV RNA1 and RNA2, tailed with dC residues and annealed to PstI-digested, dG-tailed pUC9. The annealing mixture was used to transform competent E. coli DH5α cells, which were then plated on L-agar containing ampicillin and Blue-O-Gal (BRL).
Several thousand white ampicillin-resistant colonies were obtained. Seven hundred colonies were screened in a colony hybridization test using a randomly primed cDNA probe to gel-purified RNA1. These 700 colonies, plus an additional 300 colonies, were then further screened with an RNA2 probe. Colonies that gave the strongest autoradiographic signals were selected and plasmid DNA was isolated from them. Plasmids were digested with the restriction enzyme PstI and the size of the cloned cDNAs determined by agarose gel electrophoresis. The selected colonies contained plasmids with inserts ranging in size from 1.7 to 5.5 kb. Plasmids from 16 colonies that hybridized with cDNA from RNA1 and 10 that hybridized with cDNA to RNA2 were analyzed by restriction enzyme mapping. Fig. 3.1 shows the restriction maps of eight inserts from these clones: B54, G82, 146, J27, J54, D48, J56 and K6. Two of the 300 colonies that hybridized to RNA2, but were not initially tested for hybridization with RNA1, were later determined by restriction enzyme mapping and further hybridization tests to be derived from RNA1 (see 146 and J27 in Fig. 3.1).

The orientation of the cloned inserts relative to plasmid DNA sequences was determined by restriction enzyme mapping. The 5' to 3' orientation of viral RNA relative to the plasmid inserts was determined by hybridizing [32P]-labelled in vitro transcripts corresponding to the plasmid DNA inserts to Northern-blotted virion RNA (Fig. 3.2). The largest PstI fragment of clone J27 was excised and subcloned into the transcription vector Bluescript and [32P]-labelled RNA transcripts corresponding to both DNA strands were generated from the T3 or T7 promoters using T3 or T7 DNA-dependent RNA polymerases, respectively. Only the transcript originating from one promoter (the T7 promoter) hybridized, indicating that this was the antisense strand to viral RNA. The orientation (5' to 3') of all other clones was then determined by comparison of restriction maps.

[Figure 3.1: restriction maps of clones 25P6, B54, G82, 146 and J27 aligned to RNA1, and of clones 035, K6, J54, D48 and J56 aligned to RNA2.]

Fig. 3.1 Partial restriction enzyme cleavage maps of TomRSV cDNA clones and their locations relative to viral RNA1 and RNA2. Underlined restriction enzyme sites are in a similar sequence context between cDNA clones derived from RNA1 and RNA2. The numbers on the thick RNA1 and RNA2 bars correspond to the approximate size in kilobases from the 3' termini of each RNA. 25P6 and 035 were clones obtained using a specific oligonucleotide primer, whereas the remaining eight clones were obtained by priming with oligo(dT).

[Figure 3.2: Northern blot; lanes labelled T7 and T3, bands labelled RNA1 and RNA2.]

Fig. 3.2 Northern blot of TomRSV virion RNA probed with synthetic sense and antisense RNA synthesized from either the T7 or T3 promoters at either side of a subclone derived from the cDNA clone J27. The blot was washed at high stringency (0.1x SSC, 60 °C) and exposed for 6 hours using intensifying screens.

Clone 035 (see Fig. 3.1), which corresponds to the 5' terminal region of TomRSV RNA2, was derived using a synthetic oligonucleotide primer complementary to RNA2 at a region ca. 400 nucleotides from the 5' end of clone K6 (see Materials & Methods).
An additional clone corresponding to the 5' terminus of TomRSV RNA1 (25P6, see Fig. 3.1) was obtained using the same primer but was the result of non-specific priming. Clone 25P6 was found to hybridize to both TomRSV RNA1 and RNA2 in Northern blots (not shown). However, the restriction enzyme map of 25P6 matched that of the RNA1-specific clone B54 but was distinct from that of the RNA2-specific clone 035. To confirm that 25P6 was derived from RNA1, the region 5' to the HindIII site of B54 was partially sequenced in one direction and found to be identical to the corresponding region obtained from 25P6.

3.2 Nucleotide Sequence Similarity Between RNA1 and RNA2

After the initial cDNA cloning and mapping experiments, comparison of restriction maps of clones corresponding to RNA1 and RNA2 showed that the order and spacing of five 3'-terminal restriction sites (PstI, EcoRV, BamHI, EcoRI and HindIII) are identical (see Fig. 3.1). To determine whether RNA1 and RNA2 share nucleotide sequence similarity at their 3' termini, TomRSV virion RNA was electrophoresed from a single wide well through a denaturing agarose gel and blotted to diazotized paper. Strips cut from the blot were tested for hybridization to nick-translated B54, G82, 146, J54 and J56. The autoradiograph (Fig. 3.3) showed that plasmids G82 from RNA1 and J54 from RNA2 (which do not contain the 3' terminal region of RNA1 or RNA2) hybridized only to their respective viral RNAs. Plasmid B54, corresponding to the extreme 5' end of RNA1, hybridized to RNA1 as expected but also hybridized to RNA2, suggesting that there is sequence similarity between the 5' ends of RNA1 and RNA2 (note: clone 035, which corresponds to the 5' terminus of RNA2, was obtained after these experiments were performed; see section 3.1). Plasmids 146 and J56, which contain the 3'-terminal regions of RNA1 and RNA2, respectively, hybridized to both RNA1 and RNA2.
This indicates extensive sequence identity between the 3' termini of RNA1 and RNA2, which is reflected in the restriction maps in Fig. 3.1.

To determine more precisely the position and extent of sequence similarity between the 3' termini of RNA1 and RNA2, plasmids D48 and J54 from RNA2 were digested with various restriction enzymes, electrophoresed, blotted onto Zeta-Probe membrane (Bio-Rad) and incubated with a nick-translated probe of the entire J27 plasmid (Fig. 3.4). Hybridization results are also presented diagrammatically in Fig. 3.5, which shows that the J27 probe hybridizes to fragments which correspond to the 3' terminus of RNA2. All fragments on the 3' side of the HindIII site, common to RNA1 and RNA2, hybridized to the J27 probe but fragments on the 5' side of the HindIII site did not hybridize. This shows that the sequence similarity extends from the 3' termini to beyond the first EcoRI site (1.4 kb). In control tests no hybridization occurred to the insert of J54 or λ DNA.

Fig. 3.3 Northern blot analysis of RNA1 and RNA2 of TomRSV. TomRSV virion RNA was electrophoresed under denaturing conditions, transferred to diazotized paper and probed with TomRSV cDNA clones B54 (lane A), 146 (lane B), J54 (lane C) and G82 (lane D). In a separate experiment a probe of J56 (lane E) was used. Stained agarose gels indicated that approximately equal masses of RNA1 and RNA2 were present (lane F).

Fig. 3.4 Southern hybridization analysis of TomRSV RNA1 and RNA2 clones. (a) Agarose gel electrophoresis of restriction enzyme-digested plasmid DNA. Plasmid J54 digested with PstI and KpnI (lane A); λ DNA digested with HindIII and EcoRI (lane B); D48 digested with PstI and EcoRV (lane C), EcoRV and BamHI (lane D), BamHI and EcoRI (lane E), EcoRI and HindIII (lane F), HindIII and KpnI (lane G); J27 digested with PstI, EcoRV and HindIII (lane H). Fragments of plasmids from each lane are numbered in order of increasing size. Fragment no. 4 from lanes 1, 3 and 6, no. 3 from lanes 4, 5 and 7 and no. 7 from lane 8 contain pUC9 plasmid sequences. (b) An autoradiograph of a Southern blot of the gel probed with plasmid J27. The blot was washed under stringent conditions (0.06 M Na+, 60 °C). Bands are numbered as in (a). Lane E' represents a longer exposure of lane E to show hybridization to band 1.

Fig. 3.5 A diagrammatic representation of hybridization tests shown in Fig. 3.4. The horizontal bars represent plasmid inserts and their restriction fragments are shaded from light to dark to indicate increasing intensities of hybridization. The fragments are numbered as in Fig. 3.4. Fragments containing pUC9 DNA are not shown. The following restriction sites are indicated: PstI (P), HindIII (H), KpnI (K), EcoRV (EV), BamHI (B), and EcoRI (EI).

3.3 Sequence and Primary Structure of RNA1 and RNA2

Four overlapping cDNA clones, 25P6, B54, G82 and J27, were used to determine the nucleotide sequence of TomRSV RNA1. Clones 25P6 and J27 were completely sequenced, while only the region 5' to the internal PstI site present in clone G82 and the region between the internal SmaI and HindIII sites of B54 were sequenced (refer to Fig. 3.1). The TomRSV RNA2 sequence was determined from the two overlapping cDNA clones, K6 and 035. In all cases, cDNA clones were sequenced in both directions. Technical assistance was gratefully accepted from Lawrence Lee (summer COSEP student) during sequence determination of the region 5' of the first EcoRI site of clone J27 (ca. 1000 nt) and from Angus Gilchrist for sequence determination of the 5' HindIII/PstI fragment from clone G82 (ca. 2500 nt) (refer to Fig. 3.1). Fig. 3.6 shows the sequencing strategy used in the sequence determination of RNA1 and RNA2. The nucleotide sequences of TomRSV RNA1 and RNA2 are shown in Figs.
3.7 and 3.8, respectively.

Two oligonucleotides, oligo#1 (5' GCCTTCGATGGCAACC 3'), complementary to TomRSV RNA1 and RNA2 at nucleotide positions 115-130, and oligo#2 (5' TTCTGGTTCCTCTTCC 3'), complementary to nucleotides 2,201-2,216 of RNA2, were used to confirm and/or extend the sequence obtained by sequencing cDNA clones. The sequencing reactions using oligo#1 yielded 32 and 28 nucleotides at the 5' terminus of RNA1 and RNA2, respectively, which were not present in either of the corresponding cDNA clones 25P6 and 035. These sequencing reactions resulted in two strong stop points which most likely corresponded to nucleotide positions 1 and 2 of TomRSV RNA1 and RNA2. The first two nucleotides of both RNA1 and RNA2 are each denoted N in Figs. 3.7 and 3.8. Oligo#2 was used to prime a sequencing reaction using virion RNA as a template to confirm the presence of 3 tandem repeats present in the overlapping nucleotide sequences of both clones K6 and 035 (see section 3.4 below).

TomRSV RNA1 is 8,214 nucleotides in length excluding the 3' poly(A) tail. The calculated molecular weight of RNA1 is 5.06 x 10^6 daltons [excluding the 3' poly(A) tail and 5' VPg] and compares well with a Mr of 4.80 x 10^6 determined by denaturing gel electrophoresis (Murant et al., 1981). The base composition of RNA1 is 23.86% A, 21.48% C, 24.82% G, and 29.81% U.

[Figure 3.6: sequencing strategy diagram; recoverable labels: RNA-2, clone K6, clone 035.]

Fig. 3.6 Strategy used to sequence RNA1 and RNA2. The long thick lines represent genomic RNA1 and RNA2. The shorter lines represent cDNA clones from which the sequence was determined. The short arrows represent individual subclones and the direction they were sequenced.
The 5' termini of RNA1 and RNA2 were determined directly from virion RNA using an oligonucleotide primer and dideoxynucleotide sequence analysis.

[Fig. 3.7 Nucleotide sequence of TomRSV RNA1 with the deduced amino acid sequence shown above the nucleotide sequence; sequence listing not reproduced.]
F S T L G L I ER A T L R A V Q K K I D A AR E D L M H L34 01 UCACAAGGAAUUUUUUUCUACCUUGGGACUUAUCGAGAGAGCUACCUUGCGUGCUGUGCAGAAGAAAATJUGAUGCCGCGCGUGAGGAIRJUGAUGCAUUUG^1141S G L K P G R SL T EL F V E A Y D W V Y A N G G K L L L V L A A V^11753501 UCUGGUUUGAAACCAGGGCGCUCACUUACAGAAUUGUUCGUUGAAGCGUAUGACUGGGUUUACGCCAACGGUGGUAAGCUCCUUUUAGUGCUUGCUGCCGI L I L F F GS A C I K L M Q A I F C G A A G G T V S M A A V G K^12083601 UAAUUTJUGAUUTJUALTUCUUUGGGUCUGCUUGUAUAAAGUUGAUGCAGGCCAUUUUUUGUGGUGCCGCAGGUGGUACUGUCAGUAUGGCUGCUGUCGGGAAM T V Q S T I P S G S Y A D V Y N A R N M T R V F R P Q S V Q G S^124137 01 AAUGACCGUUCAAUCGACGAUUCCCUCCGGUAGUUAUGCAGACGUGUACAAUGCGCGCAACAUGACACGCGUUUUCCGCCCACAAUCUGUACAGGGUUCUSL A E A Q F N E S H A V N M L V R I D L P D G N I IS A C R FRG^12753801 UCUUUGGCGGAAGCGCAAUUUAAUGAAUCGCACGCUGUGAAUAUGUUAGUGCGAAUUGAUUTJACCUGAUGGCAACAUUAUUUCUGCCUGCAGGUUUCGCGK SL A L TK HQ A L T IPPGAK I HI V ITIDNNGNTK AP3901 GAAAGUCUUUGGCUUUGACUAAACAUCAGGCCUUAACCAUACCGCCAGGUGCUAAAAUCCAUALIUGUAUAUACUGACAACAAUGGAAAUACCAAAGCACC^1308L THF F^P T GPNGEHFLRFFNGTEVCI Y SHP QL S4 001 GCUGACUCAUUUUTJUCCAACCUACUGGACCCAAUGGAGAACAUUUUQUUGAGAUUCUUCAACGGGACAGAGGIJUUGUAUTJUAUUCCCACCCUCAACUUUCA^1341AL PGA PQNIF L K D V E K I WI A I K G C G I K L G R TS^137541 01 GCUUUGCCUGGCGCUCCACAAAAUUAUUUCUUGAAAGAUGUGGAAAAAAUAUCUGGUGACAUAGCCAUUAAAGGUUGUGGCAUCAAGCUAGGUAGAACCAV GEC V G V K D N E P V L NHTIR A V A K V R T T K I T I D N Y^14084201 GCGUUGGCGAGUGUGUUGGUGUUAAGGACAAUGAACCCGUCUUAAAUCACUGGCGCGCUGUCGCGAAGGUCCGCACCACCAAGAUCACUAUCGAUAAUUASEGGDISNDLP TS I I SEIVNSP E D C G A L L V A H L43 01 UUCAGAGGGUGGUGAUUAUUCCAAUGAUCUUCCUACGUCCAUCAUCUCUGAGUACGUAAAUUCACCAGAAGAUUGUGGCGCGCUUUUAGUCGCCCAUCUU^1441EGGIK I I GMFIV A GSSIP V EVDGVQMPR I I SHA SF44 01 GAAGGUGGUUACAAAAUCAUAGGGAUGCACGUGGCGGGAUCCUCUUAUCCUGUCGAGGUUGAUGGAGUGCAGAUGCCAAGAUACAUAUCUCAUGCCUCCU^1475F PDY S SF A P CQSSV IK SL IQ EA GVEER GV SK VG4501 UCUUCCCCGAUUAUUCUUCUUUUGCUCCUUGCCAGUCUAGUGUUAUCAAAUCUCUAAUUCAAGAGGCUGGCGUUGAGGAGCGUGGGGUUUCUAAAGUGGG^1508H I K 
DP A E T P H V G G K T K L E L V D E A F L VP SP V E V K4 601 ACAUAUUAAAGAUCCUGCUGAGACGCCCCAUGUUGGAGGGAAAACUAAGCUUGAAUUGGUUGAUGAAGCCUUCUUGGUGCCAUCACCAGUUGAGGUAAAG^1541I P 5 •I L S K D D P R I PEA ITC G Y D P L G D A M E K FIE PhIL^157547 01 AUUCCCUCCAUUCUGUCUAAAGAUGACCCGCGCAUUCCUGAAGCGUAUAAGGGUUAUGAUCCAUUGGGCGAUGCCAUGGAGAAGUUUUAUGAGCCCAUGUDL^ED V L E S VMA DM YDEF YDCQTT L R I MSDDE V4801 UGGAUCUGGACGAAD^GAUGUCUUGGAGAGCGUUAUGGCAGAUAUGUAUGAUGAAUUCUAUGAUUGCCAAACGACUCUCCGUAUUAUGUCUGAUGACGAAGU^1608I NGSDF GF NI EA VVK G T SEG Y PFVLSRR PGEK G^16414 901 UAUCAAUGGCAGCGAUUUUGGUUUCAAUAUUGAAGCCGUUGUCAAAGGUACUUCUGAAGGCUACCCGUUCGUUUUGAGUCGGCGACCGGGCGAGAAGGGCK AR F LE EL EPQPGDTK PK^L V VGTEVHSA MV AM5001 AAAGCUCGCUUUIRTAGAAGAGCUUGAACCCCAACCAGGUGACACUAAGCCUAAAUAUAAACUGGUUGUGGGCACUGAGGUGCAUUCUGCUAUGGUGGCGA^1675E Q Q A R T E V P L L I G M D V P K DER L K P S K V L E K P K T5101 UGGAACAACAGGCGCGUACUGAAGUUCCUUUGCUUAUUGGUAUGGAUGUUCCGAAGGAUGAGAGACUCAAACCGUCUAAGGUGUUGGAGAAGCCGAAGAC^1708K T F V V L P H H I N L L LRICIVG I LCSSMQVNAHR L A5201 GCGUACAUUCGUUGUUCUCCCAAUGCACUAUAACUUGCUGCUGCGUAAGUAUGUGGGAAUUUUGUGUUCUAGCAUGCAAGUUAAUAGGCAUCGUCUAGCA^1741C A V G T N P Y SR D W T D I IQRL A E K NS V A L N C D 1' SR F^17755301 UGUGCUGUGGGCACUAACCCAUAUUCGCGUGAUUGGACGGACAUTJUAUCAGCGCCUGGCUGAGAAAAAUUCAGUGGCCUIJGAAUUGUGAUUAUAGCCGCUD G L L N Y Q A Y V H I V N F INK L Y N D E H S I V R G N L LM^180854 01 UUGAUGGGCUCCUCAAUUACCAGGCAUAUGUGCAUAUUGUUAAUU1JUAUUAAUAAAUUGUACAACGAUGAACAUUCUAUCGUGCGUGGCAAUCUUUUGAUA My GR W S V CGQR VF EVR AGMP SGCAL T V I I NS L^18415501 GGCUAUGUAUGGUAGGUGGAGUGUGUGUGGGCAGAGAGUUUUCGAAGUCCGCGCUGGCAUGCCCUCUGGGUGUGCGCUCACCGUGAUCAUCAAUUCACUUF NEML IR Y V YR I TVPRPLVNNFKQEVCL IV YGD5601 UUUAACGAAAUGUUGAUCAGGUAUGUIJUAUCGCAUCACCGUACCACGCCCCCUUGUAAAUAAUUUUAAACAGGAGGUGUGUUUGAUUGUIJUAUGGUGAUGD^1875N L I S I K P D T M K Y F N G E Q I K T IL AK Y K V T I T D G S^19085701 AUAAULTUAAUUUCUAUUAAGCCGGACACCAUGAAAUAUUTJUAAUGGUGAGCAAAUCAAAACCAUUCUGGCUAAAUAUAAAGUUACCAUUACUGAUGGCAGD K N S P V L R AK 
P L K Q L D F L K R G F R V E S D G R V L A P5801 UGALTAAGAACUCACCUGUUCUUAGAGCCAAACCCUUGAAACAGCUCGAUUUUTJUGAAGAGAGGUUUCAGGGUUGAAAGUGAUGGGAGGGUGCUUGCCCCU^1941Results^6 5L DI Q AI Y S S I Y Y I N P Q G N I L K S L F INAQVA L R EL^19755901 UUAGAUUUGCAAGCUAUCUAUUCUUCCCUGUAUUAUAUCAAUCCGCAGGGAAAUAUAUUAAAAUCUUUGUUUUUGAAUGCUCAACUCGCUUUGAGAGAGUY I H G D V E Q F T A V R N F Y V N Q IGGNFISL P Q W R H C^20086001 UAUAUCUCCAUGGCGAUGUUGACCAAUUUACUGCUGUCAGGAAUULTUUACGUCAAUCAAAUUGGCGGAAAUUUCUUGAGUCUACCCCAGUCGAGGCACUGA SF H DE5 Y SWP W S P V K FL EvDVPDAK F L Q H K^204161 01 CGCUUCGUUCCAUGAUGAACAAUAUUCUCAGUGGAAGCCGUGGUCUCCCGUUAAAUUCUUGGAGGUAGAUGUGCCCGAUGCUAAAULTUUUACAGCAUAAGA P A T AL S I V A D R L A V A G P G W R N K DP DR Y L L V S LT^20756201 CCGCCAGCUACUGCCCUCUCGALJUGUUGCUGAUAGGCUUCCUGUUGCAGGACCUGGCUGGCCUAAUAAAGAUCCAGACAGGUAUCUCUUAGUGAGCCUUAS L K A N E G G L Y F P V D Y G E G T G55 A T E A S I R A Y R R^21086301 CCAGCUUGAAAGCAAAUGAGC,GUGGAUUAUACUUCCCCGUAGACUACGGGGAGGGUACAGC,GCAGCAAGCCACGGAAGCUUCCAUUAGAGCUUAUCGUAGL K D H R V R H M R D S W N E G K T I V F R C E G P F V S G W A A^214164 01 GCUAAAGGAUCAUCGCGUACGCCAUAUGCGCGAUUCCUGGAAUGAGGGAAAGACAAUCGUGUUCCGAUCCGAAGGUCCCUUUGUUUCAGGAUGGGCAGCUA I SF G T S V G M N A Q DLL INYGIC2 G G A H K K Y L G R Y F^21756501 GCCAUUUCCUUCGGUACUAGCCUUGGAAUGAAUGCCCAAGAUCUCUUAAUCAAUUAUGGUAUACAAGGUGGCGCCCACAAGGAAUAUCUGGCACGCUACUV GAR F K EL ER Y D R P F Q SR I I A S .^21976601 UUGUUGGUGCUCGUUUUAAAGAGCUGGAGAGGUAUGACCGACCUUUUCAGUCUCGUAUAAUUGCGAGCUAAAUCCUCUUUGAGGCGAGUAGCUGCCGUUA6701 GCAGCUUCCAAAAGGUGGCCUCUUAAUUAGCUUUUAAUAGGGGUUAUCCAGCCUUAAGCAAGCUGGCACCGGUCCUGAUGGACUACCAGGAAAGCACCUG6801 GUUUGGAAGAAUUCGAGUAAAAUUCUUAAAUCUUGUUUACUCGUGACUUAUAGUACAUUCAAGAGGAAUGACUCAUGUUUUGUUUAUUUACAUGAUGGCA6901 UAAAGAGUUAACGGCUCAUAUGGUCCUCAUIJACGUUCAAGUGUUGAAGGAUCCAAUAGCCUUGAACUGUGGUGCCAUGUGAGGAAAUCCACCUUAUCUCU7001 GAUUGUCAAAAUAGACUAGUCUAGGAGACGAUAAAUCCUAUGUGC,GUGAGUCCCACUCUGGCGAGACACGCAGUGCCUUULTAUUUGUUUGAGGUUAUCAA7101 
ACAUCAUAUCUUGAGUCUGCAUUUAAAUUCGAAUAAUGUAGUUGUCAUAGCCUACCGAUGAGUCUGCGAGAAAGGUUCCAUGAGGACUUGGGUUGGCUAA7201 CCCCCACUUAAUCUCUUCAUAGAUCAUUCOACAGUGUGUCGAGAAACUAUGGGUUUUGACACCUUAAGGGAAGCGAGAGUUCUCGUAUGGAUAUCACUCU73 01 AAUCAUGGUACUUACCAUGCCAUGUUUAGAGUAAAUCAUCGCCUCGACGGUGUGAUACUUUCCUUAAGUUCUAGUCAAACGAUAGUUUCGUUGAUCGUAU7401 CUGAAGCGUGGAGCGAGUUCGAAACGAACUGAUUACCCGAGGUAGGACGCUAUUGUUCCAGGCGUUUCUUAUGC,GCAUAAGCUGUAAACUUGGUUUCGCA7501 AGCCAUGCAGCACCUCCCGUUACULIGUGUACUUUCUAGC,GGCUCCCGGCCUUCCUUCCGGUACAAUACCUAGUGAAGCAAGUAAULIGCGUUGAGGGAUAA7601 GAGUAGCAUGUUCCUACUUAAGGAAGGAAUAUGUCGUGUUUUCCACACGUUAGUGUUGCAAUGCUGUAAUGGCACUGCAGUGCAC,CAAUC,GUUCCCAGCC7701 ACUUUUUCUCGGAUUCUAAUCGUACGUCACAAUUGUGUGUGUAUCGUUGACGGAGGAGUAGCGAUCCUCUACCACGCGAGUCUGGAAGUGAUUACCAGGG7801 CCUAAGAUGGCCACCACACGGUACGAUUAAAUUUAGCUGUAAUGUAGUGGUAUGUUAAGUUGAGACUAACUUACCCGUACGAGUUAAACUCUAAGAUGGA7901 UGUGUGUUCUGCCAUCULAGAGGAAGUAGAUGUGUUULTUACCAAUCUGAGACGAGCCGUUAAUUCGGUGCUUUAAUACGUCAAUGAUAAUACUCGUGCAG8001 UUCCAGCUGCACGAGUAUGUUGGUACACACACUCUACUCGGAUACCGUCGAGUUACCCUCACAAUAGGGAUUACUCUCUCAAUCUUAACUACUGCAAGGA8101 CGUUGUUUUCGCAGGGUUUUGUUGGUCCGUUUGUGUUUCAAAACGCUGCUUUGCAAULJUUCUUUUUUGUUUUAUUGCUUUCGUAGUGUCGAACUULIGUCC8201 AAGUUCAUAAAAGC poly (A ) 1Fig. 3.7^Nucleotide sequence of TomRSV RNA1. The first two nucleotides are undetermined and are eachrepresented by an N. Also shown is the predicted amino acid sequence encoded by the long ORF beginning at thefirst AUG codon at residue 78 and terminating at the UAA stop codon at position 6669. 
Numbers to the left of the sequence refer to nucleotide sequence position and numbers to the right refer to amino acid sequence position.

[Figure 3.8: sequence listing of TomRSV RNA2 with the predicted amino acid translation of the long ORF]

Fig. 3.8 Nucleotide sequence of TomRSV RNA2. The first two nucleotides are undetermined and are each represented by an N.
Also shown is the predicted amino acid sequence encoded by the long ORF beginning at the first AUG codon at residue 78 and terminating at the UAA stop codon at position 5723. Numbers to the left of the sequence refer to nucleotide sequence position while numbers to the right of the sequence refer to amino acid residue number.

TomRSV RNA2 is 7,273 nucleotides in length excluding the 3' poly(A) tail. The calculated molecular weight of RNA2 is 2.35 × 10^6 daltons [excluding the 3' poly(A) tail and 5' VPg], which is similar to the M_r of 2.4 × 10^6 determined by denaturing gel electrophoresis (Murant et al., 1981). The base composition of RNA2 is 22.55% A, 23.28% C, 24.89% G, and 29.29% U.

3.4 Nucleotide Sequence Comparisons Between RNA1 and RNA2

Computer-assisted comparisons were made between the nucleotide sequences of TomRSV RNA1 and RNA2 using the FASTN algorithm of Lipman and Pearson (1985). These comparisons confirmed that there is extensive nucleotide sequence similarity between the two RNAs at their 3' termini. The 3' terminal 1533 nucleotides of TomRSV RNA1 and RNA2 are identical with the exception of nucleotide positions 7,369, 7,388 and 7,438 (relative to the RNA1 sequence) (Fig. 3.9). In the RNA2 sequence these nucleotides are changed from C to U, U to C and C to A, respectively. In addition to the 3' sequence identity there is also extensive nucleotide sequence identity at the 5' termini. Eighty-eight percent of the 5' terminal 907 nucleotides of RNA1 and RNA2 contain identical nucleotide residues: the first 459 nucleotides are identical at all positions, whereas the next 447 are identical at 75.8% of the nucleotide positions (Fig. 3.10). All of the identical residues at the 3' termini are within the 3' noncoding region whereas identical residues shared at the 5' terminus extend into coding regions.
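The identity figures quoted for the 5' termini can be checked with a short calculation. The Python sketch below is illustrative only: the function name `percent_identity` and the toy aligned strings are assumptions introduced here, not part of the original analysis, which used FASTN. It scores percent identity over aligned columns and then recombines the two segment values reported above (459 fully identical positions, followed by 447 positions at 75.8% identity) into a figure for the 907-nucleotide region.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of alignment columns holding the same residue.

    Both strings must already be aligned (equal length, '-' for gaps).
    A column containing a gap is never counted as a match.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    if not a:
        raise ValueError("alignment is empty")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


# Toy alignment (hypothetical, not thesis data): 5 of 6 columns match.
toy = percent_identity("ACGU-A", "ACGUUA")

# Recombine the segment figures reported in the text.
identical_5prime = 459 + round(0.758 * 447)   # identical positions in both segments
overall_5prime = 100.0 * identical_5prime / 907

print(f"toy alignment: {toy:.1f}% identity")
print(f"5' region: {identical_5prime} identical positions, {overall_5prime:.1f}%")
```

Under these assumptions the combined value rounds to the 88% stated in the text.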
Such extensive nucleotide sequence conservation among the genomic RNAs of multipartite RNA viruses is highly unusual, especially the finding of such extensive similarity at the 5' termini. The possible significance of this finding will be discussed.

[Figure 3.9: alignment of the 3' terminal nucleotide sequences of TomRSV RNA1 (ending at nucleotide 8214) and RNA2 (ending at nucleotide 7273)]

Fig. 3.9 Nucleotide sequence alignment of the 3' termini of TomRSV RNA1 and RNA2. Nucleotide differences are indicated by an asterisk.
Stop codons for the RNA1 and RNA2 long ORFs are underlined. The residues in bold outline are conserved between TomRSV, TBRV and GCMV (see section 3.7.2). Numbers to the right of the sequences indicate nucleotide residue positions in RNA1 and RNA2 as assigned in Figs. 3.7 and 3.8.

[Figure 3.10: alignment of the 5' terminal nucleotide sequences of TomRSV RNA1 (ending at nucleotide 904) and RNA2 (ending at nucleotide 907)]

Fig. 3.10 Nucleotide sequence alignment of the 5' termini of TomRSV RNA1 and RNA2. Identical residues are indicated by an asterisk. Dashes in the sequences are spaces inserted to optimize the alignment. Underlined residues are conserved between TomRSV, TBRV, GFLV and CPMV (see section 3.7.1). Numbers to the right indicate nucleotide residue position in RNA1 and RNA2 as assigned in Figs. 3.7 and 3.8.

3.5 Nucleotide Sequence Repetition Within RNA2

Within the sequence of TomRSV RNA2, beginning at nucleotide position 1,740, there are three tandem repeats (Fig. 3.11). The first two repeats are 159 nucleotides in length and nearly perfect, with only two nucleotide changes of C to U. The third repeat is shorter and more degenerate.
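The comparison of consecutive repeat units described above can be sketched in Python. This is a minimal illustration under stated assumptions: the helper names `tandem_units` and `mismatches` and the toy sequence are hypothetical, and the sketch only handles ungapped units of equal length, so it would apply to the first two 159-nucleotide repeats but not to the shorter, more degenerate third one.

```python
def tandem_units(seq: str, start: int, unit_len: int, copies: int) -> list:
    """Slice `copies` consecutive units of `unit_len` from `seq`.

    `start` is a 1-based coordinate, matching the numbering used in the text.
    """
    begin = start - 1
    return [seq[begin + k * unit_len: begin + (k + 1) * unit_len]
            for k in range(copies)]


def mismatches(unit_a: str, unit_b: str) -> list:
    """Return (offset, residue_a, residue_b) for every differing position."""
    if len(unit_a) != len(unit_b):
        raise ValueError("units must have equal length for an ungapped comparison")
    return [(i, x, y) for i, (x, y) in enumerate(zip(unit_a, unit_b)) if x != y]


# Toy sequence (hypothetical, not thesis data): two 6-nt units with one C-to-U change.
toy = "GG" + "ACGUCA" + "ACGUUA" + "CC"
first, second = tandem_units(toy, start=3, unit_len=6, copies=2)
diffs = mismatches(first, second)
print(diffs)
```

With the coordinates given in the text, a first unit starting at position 1,740 and spanning 159 nucleotides would end at 1,898, and the second would run from 1,899 to 2,057.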
The region containing the repeats fell almost completely within the overlap portion of clones K6 and 035 (see Fig. 3.1) and therefore the presence of the repeats could be confirmed in two independent clones. In addition, part of the repeated sequence was confirmed using oligo#2 as a primer for dideoxynucleotide sequence analysis and viral RNA as template (see Materials & Methods, section 2.8.2).

3.6 Open Reading Frames

The nucleotide sequences of both TomRSV RNA1 and RNA2 were analyzed for open reading frames (ORFs). In both RNA1 and RNA2, a single long ORF is present in the virion sense orientation (Figs. 3.12 and 3.13, respectively). The long ORF present in RNA1 is 6,591 nucleotides in length beginning from the first AUG codon at nucleotide position 78 (AUG78) and terminating at a UAA stop codon at nucleotide position 6,668. This ORF accounts for 80% of the RNA1 nucleotide sequence and would give rise to a polyprotein with a calculated molecular weight (mol. wt.) of 244 kilodaltons (kDa) (Fig. 3.14). The long ORF in RNA2 is 5,646 nucleotides in length and accounts for 78% of the RNA2 sequence. This ORF also begins at the first AUG codon at position 78 (AUG78) and terminates at a UAA stop codon at position 5,723. The predicted translation product would have a mol. wt. of 207 kDa (Fig. 3.14).
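The ORF search described above can be sketched as follows. This is a hedged illustration, not the software used in the thesis: the function names `orfs_in_frame` and `longest_orf` and the toy RNA are assumptions, only the three forward (virion-sense) frames are scanned, and an ORF is taken to run from an AUG to the last nucleotide before the first in-frame UAA, UAG or UGA. The closing lines check the reported coordinates under the assumption that the quoted end positions (6,668 and 5,723) are the last coding nucleotides.

```python
STOPS = {"UAA", "UAG", "UGA"}


def orfs_in_frame(rna: str, frame: int):
    """Yield (start, end) for AUG-initiated ORFs in one forward frame.

    Coordinates are 1-based and inclusive; `end` is the last nucleotide
    before the stop codon. `frame` is 0, 1 or 2.
    """
    start = None
    for i in range(frame, len(rna) - 2, 3):
        codon = rna[i:i + 3]
        if start is None and codon == "AUG":
            start = i                      # 0-based index of the A in AUG
        elif start is not None and codon in STOPS:
            yield (start + 1, i)           # i is 0-based stop start == 1-based last coding nt
            start = None


def longest_orf(rna: str):
    """Return (start, end) of the longest forward-frame ORF, or None."""
    best = None
    for frame in range(3):
        for s, e in orfs_in_frame(rna, frame):
            if best is None or (e - s) > (best[1] - best[0]):
                best = (s, e)
    return best


# Toy RNA (hypothetical): AUG at position 3, three sense codons, then UAA.
toy = "GG" + "AUG" + "GCU" * 3 + "UAA" + "CC"
toy_orf = longest_orf(toy)

# Arithmetic check of the reported ORFs (start, length, genome length).
rna1_end = 78 + 6591 - 1
rna2_end = 78 + 5646 - 1
rna1_fraction = round(100 * 6591 / 8214)
rna2_fraction = round(100 * 5646 / 7273)
print(toy_orf, rna1_end, rna2_end, rna1_fraction, rna2_fraction)
```

Scanning the negative-strand frames as well would require running the same functions on the reverse complement of the sequence.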
[Figure 3.11: alignment of the three tandem repeats in RNA2; the right-hand coordinates run to 1898 (repeat 1), 2057 (repeat 2) and 2157 (repeat 3)]

Fig. 3.11 Alignment of the three tandemly repeated nucleotide sequences within RNA2. Conserved residues in all three sequences are indicated by an asterisk (*) while conserved residues between two of the three sequences are indicated by a caret (^). Gaps introduced in the sequence to maximize the alignment are indicated by a dash. Numbers to the right of the sequence indicate nucleotide residue number.

[Figure 3.12: map of the open reading frames in frames 3, 2, 1, -1, -2 and -3 of RNA1, with a 1 kb scale bar]

Fig. 3.12 Map of TomRSV RNA1 open reading frames. The center line represents RNA1. The filled arrows represent open reading frames in all six possible reading frames. Arrows begin at an AUG triplet and end at the first in-frame UAA, UAG or UGA.

[Figure 3.13: map of the open reading frames in frames 3, 2, 1, -1, -2 and -3 of RNA2, with a 1 kb scale bar]

Fig. 3.13 Map of TomRSV RNA2 open reading frames. The single center line represents RNA2 while the filled arrows represent open reading frames in all six possible reading frames. Arrows begin at an AUG triplet and end at the first in-frame UAA, UGA or UAG.

[Figure 3.14: diagram of RNA1 (VPg, AUG78, AUG441, UAA 6668, poly(A) tail at 8214; 244 kDa polyprotein) and RNA2 (VPg, AUG78, AUG441, UAA 5723, poly(A) tail at 7273; 207 kDa polyprotein)]

Fig. 3.14 Diagrammatic representation of the coding and noncoding regions of RNA1 and RNA2. Both RNAs contain a single long ORF, which accounts for 80% and 78% of the nucleotide sequence, respectively. The RNA1 ORF has the capacity to code for a polyprotein of 244 kDa beginning at the first in-frame AUG at position 78 and terminating at the first in-frame stop codon at position 6668. The RNA2 ORF has the capacity to code for a polyprotein of 207 kDa beginning at AUG78 and terminating at the UAA stop codon at position 5723.

The first in-frame AUG triplets for both RNA1 and RNA2 at position 78 (UUUGAUGUC) are in neither the optimal Kozak (C(G/A)CCAUGG) nor the Lütcke (AACAAUGGC) context for translation initiation in animals or plants, respectively (Kozak, 1986; Lütcke et al., 1987). The next in-frame AUG, which is the fourth AUG from the 5' terminus, occurs at position 441 and is in a favorable Kozak context for the initiation of translation with a G in the -3 and +4 positions (UGCAAUGGA). It is possible that AUG78 and/or AUG441 may act as initiation sites for translation. Between AUG78 and AUG441 are two additional AUG triplets, AUG280 (GGUAAUGUU) and AUG310 (UGCGAUGUU), that are not in-frame with the long ORF (Fig. 3.15). The 5' 800 nucleotides of TomRSV RNA2 were analyzed for coding probability using the program TESTCODE (Fickett, 1982) (Fig. 3.16), which is based on the unequal use of codons found in regions that code for proteins. The region beginning from AUG78 gave a score of between 50-77% for the first 100 nucleotides and then rose to 100% through to the end of the sequence analyzed. Scores between 77-100% indicate that the regions most probably have coding function (Fickett, 1982). Further analysis using the "absolute positional base preference" method (Staden, 1984) (Fig.
3.17), which is basedon the nonrandom use of amino acids in proteins and the structure of the genetic code,suggests that there is only one coding frame in this region beginning at ca. nt 180 to the endof the sequence analyzed (the first 800 nucleotides). AUG78 is the only potential in-frameinitiation site upstream of the coding region indicated by TESTCODE and the absolutepositional base preference method.All other ORFs in both RNA1 and RNA2 are less than 355 nucleotides in lengthexcept for two ORFs in the negative-strand orientation of RNA2 which are 603 and 582nucleotides in length (positive-strand nucleotide positions 956-353 and 3,103-2,521,respectively). A search of the NBRF, Swiss-Prot, Pseqlp, Genbank, and EMBLdatabases failed to detect sequences with significant amino acid sequence similarity to eitherof these two ORFs encoded in the negative strand.Results^7 7Kozak consensusLatcke consensusTomRSV AUG78TomRSV AUG441TomRSV AUG280TomRSV AUG310ACGCCAUGGAAACAAUGGCUUUGAUGUGUGCAAUGGAGGUAAUGUUUGCGAUGUUFig. 3.15 The nucleotide sequence context for the first four AUG triplets in TomRSVRNA1 and RNA2 compared with optimal sequence contexts determined by Kozak (1986)and Liitcke (1987) for animal and plant translation systems, respectively. AUG78 andAUG441 are in-frame with the RNA1 and RNA2 long open reading frames while AUG280and AUG310 are not. The first residue of the AUG codon is assigned position number 1.100%77%40% limosimmiotooleMmowommeamassissaummussinearsatnuswaramolaualMOWINISANIt40414.160110NONIONOMaarrispoiberrosomerwamemodommum01^1^1^I^1^1^1^1^I0^100^200^300^400^500^600^700^800Fig. 3.16 Analysis of the first 800 nt of RNA2 for coding potential using"TESTCODE" (Fickett, 1982). Values between 77 and 100% indicate ahigh probability of coding sequence. Beginning at ca. nucleotide 180, theTESTCODE value increases to 100% to the end of the sequenceanalyzed.00I100^2001^I^1^1^1^I300^400^500^600^700^800101Reading Frame IIIReading Frame IIReading Frame IFig. 
3.17 Graph showing analysis of the first 800 nt of RNA2 by the "absolute positional basepreference method" (Staden, 1984). All three reading frames are shown. The dashed line in eachreading frame indicates coding strand expected mean. The line through reading frame III indicatesthe most likely coding phase.Results^803.7 Noncoding Regions^3.7.1^5' Noncoding RegionThe 77 nucleotide 5' noncoding regions of TomRSV RNA1 and RNA2 are identical. Thissequence was compared to the 5' noncoding regions of the two RNA components of thenepoviruses TBRV (Meyer et al., 1986; Greif et a/.,1988), GCMV (Brault et al., 1989; LeGall et al., 1989), GFLV (Serghini et al., 1990; Ritzenthaler et al., 1991) and the B and Mcomponents of the comovirus CPMV (Lomonossoff & Shanks, 1983; van Wezenbeek etal., 1983). It has been previously reported that TBRV, GCMV, GFLV, and CPMV sharea conserved UGAAAAAU sequence at the 5' terminus (Serghini et al., 1990). TomRSVRNA has a similar sequence (CGAAAAAU) (see Fig. 3.10). This short octanucleotide isthe longest region of sequence identity that could be detected between the 77 nucleotide 5'noncoding region of TomRSV RNA2 and the 114-287 nucleotide 5' noncoding regions ofany of these other viruses. The base composition of the 5' noncoding region of TomRSVRNA2 (assuming AUG78 is the translation initiation site) is similar to those of TBRV,GCMV, GFLV and CPMV with a high U content (44.2%) and a low G+C content(35.1%) (Table 1). If AUG441 is the translation initiation site the U content drops to26.8% and the G+C content increases to 53.1%.^3.7.2^3' Noncoding RegionThe 3' noncoding regions of TomRSV RNA1 and RNA2 are 1,546 and 1,550 nucleotidesin length, respectively, excluding the poly(A) tail. 
As stated in section 3.4, most of this sequence is identical between RNA1 and RNA2, with only 3 nucleotide differences within the 3'-most 1,533 nucleotides (see Fig. 3.9). The regions of sequence identity are preceded by 13 and 17 nucleotides of noncoding sequence (UAAAUCCUCUUUG and UAAGUUGGCUUCCUGAA for RNA1 and RNA2, respectively; the leading UAA of each is the stop codon of the large ORF). The 3' noncoding regions of TomRSV RNA1 and RNA2 are much longer than the 3' noncoding regions of TBRV, GCMV, GFLV, and CPMV, as well as of most other (+)ssRNA viruses. The 3' noncoding regions of TBRV, GCMV, GFLV and CPMV are less than 305 nucleotides.

Table 1. Nucleotide composition analysis of the 5' and 3' noncoding regions of TomRSV RNA2, TBRV and GCMV RNA1 and RNA2, CPMV M and B RNA, and CLRV RNA 3' region.

                  5' noncoding region            3' noncoding region
Viral RNA         nucleotides   % U    % G+C     nucleotides   % U    % G+C
TomRSV RNA2 (a)   77 (b)        44.2   35.1      1,550         31.2   43.4
                  440 (c)       26.8   53.2      110 (d)       46.4   37.3
TBRV RNA2         287           37.8   36.7      304           40.5   33.6
TBRV RNA1         260           35.4   39.6      304           40.5   33.6
GCMV RNA2         217           32.3   40.1      252           41.2   33.7
GCMV RNA1         215           35.3   38.1      241           41.1   33.2
GFLV RNA2         232           44.8   28.4      215           46.5   33.0
CPMV M RNA        114           35.7   35.7      177           48.0   22.0
CPMV B RNA        206           35.9   36.4      85            41.2   31.8
CLRV RNA          -             -      -         1,500         31.2
                                                 145 (e)       44.1

(a) values for TomRSV RNA1 and RNA2 are identical
(b) the 77 nucleotides preceding the first in-frame AUG codon of TomRSV RNA2
(c) the 440 nucleotides preceding the second in-frame AUG codon of TomRSV RNA2 were also analyzed as a comparison
(d) the last 110 nucleotides of the TomRSV RNA2 noncoding sequence were also analyzed as a comparison
(e) the last 145 nucleotides of the CLRV noncoding sequence were also analyzed as a comparison
- sequence not available
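The U and G+C percentages reported in Table 1 are simple base counts expressed as a fraction of the region length. A minimal sketch of that calculation is given below; the function name is hypothetical, and the only input used is the TomRSV 5'-terminal octanucleotide CGAAAAAU quoted in section 3.7.1, chosen purely as a short illustration rather than as one of the regions analyzed in the table.

```python
def base_composition(seq):
    """Return (length, %U, %G+C) for an RNA sequence, rounded to one decimal."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    pct_u = 100 * seq.count("U") / n
    pct_gc = 100 * (seq.count("G") + seq.count("C")) / n
    return n, round(pct_u, 1), round(pct_gc, 1)


# 5'-terminal octanucleotide of TomRSV RNA (section 3.7.1), used only as a
# short worked example: 1 U and 2 G+C in 8 nucleotides.
print(base_composition("CGAAAAAU"))
```

Applied to a full noncoding region, the same function would return the three values listed in one row of Table 1.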
Comparison of the 3' noncoding region of TomRSV RNA with the 3' noncoding regions of the RNA components of GFLV, TBRV and GCMV, the M RNA from CPMV, and poliovirus RNA did not reveal any significant sequence similarity, except for the sequence AAAAGC found immediately preceding the poly(A) tail in RNA1 and RNA2 of TomRSV, TBRV, and GCMV. The conserved blocks of sequence among the 3' noncoding regions of TBRV, GCMV, GFLV and CPMV previously reported by Serghini et al. (1990) could not be detected in the 3' noncoding region of TomRSV RNA. The 3' terminal sequences of the RNA1 and RNA2 components of the nepovirus cherry leaf roll virus (CLRV) have been determined (Scott et al., 1992). Like the TomRSV 3' terminal sequences, the 3' terminal sequences of CLRV RNA1 and RNA2 are identical for 1,500 nucleotides and are noncoding. Despite this overall similarity, only two short regions of nucleotide sequence identity (70.2% of 37 nt and 85.7% of 42 nt) could be detected between the TomRSV and CLRV sequences (Fig. 3.18).

The 3' noncoding regions of TBRV, GCMV, GFLV and CPMV have a high U content (40.5-48.0%). The entire 3' noncoding region of TomRSV RNA2 has a lower U content of 31.2%. However, its extreme 3' end (ca. 110 nucleotides) has a U content of 46.4% (Table 1), which is similar to the high U content in the 3' noncoding regions of these other viruses. Likewise, the CLRV 3' terminal 1,500 nucleotides are 31.2% U, whereas the extreme 3' 145 nucleotides are 44.1% U (Table 1).

                   ***  * * ****** **** * * *******  * *
CLRV          236  GCGCUUGGUUGCCGUAAGCAACCUUCAAAAGGAUGUC
TomRSV RNA1  6682  GCGAGUAGCUGCCGUUAGCAGCUUCCAAAAGGUGGCC

                   * *  * **************** * ****************
CLRV          671  AUGCCCUUUUUAUUUGUUUGAGGGUUUCAAACAUCAUAUCUU
TomRSV RNA1  6972  A-GUGCCUUUUAUUUGUUUGAGGUUAUCAAACAUCAUAUCUU

Fig. 3.18 Regions of nucleotide sequence similarity observed between portions of the 3' noncoding regions of TomRSV and CLRV RNA. The first region was 37 nt in length, of which 26 nt are identical, and the second region was 42 and 41 nt in length for CLRV and TomRSV, respectively, of which 36 are identical. Nucleotide position numbers to the left of the sequence in CLRV are according to Scott et al. (1992). Numbers to the left of the TomRSV sequence refer to positions in RNA1.

3.8 Analysis of the RNA2 Coding Region

The deduced amino acid sequence of the TomRSV RNA2 long ORF was compared, using the TFASTA program of Pearson & Lipman (1988), with sequences available through the following data banks: the protein sequence libraries from NBRF, Swiss-Prot and Pseqlp, and the nucleic acid sequence libraries from NBRF, GenBank and EMBL. The only sequences which showed significant sequence similarity to the TomRSV RNA2 polyprotein sequence were those from the RNA2 components of the nepoviruses TBRV and GCMV, within the coat protein sequences of these viruses. The sequence of GFLV RNA2 subsequently became available (Serghini et al., 1990) and was also used in comparisons. Comparisons were also made with the M RNA-encoded polyprotein of CPMV. The location of the coat protein gene on the RNA2 components of TBRV, GCMV and GFLV (Meyer et al., 1986; Brault et al., 1989; Serghini et al., 1990) suggests that TomRSV may also encode its coat protein on RNA2 (see section 3.8.1 below).

3.8.1 Location of the Putative Coat Protein Coding Region

Tremaine & Stace-Smith (1968) reported the coat protein amino acid composition for a raspberry isolate of TomRSV based on a coat protein with an Mr of 24,000. Later, Allen & Dias (1977) reported that the intact TomRSV coat protein had an Mr of 58,000.
Table 2 shows the relative mole ratio of each amino acid (18 different amino acids) as determined by Tremaine and Stace-Smith (1968), and the rescaled values for the predicted number of each amino acid that would be present in a protein of 58 kDa (total of 519 amino acids). These values were compared with the amino acid composition of the first 519 amino acids of the RNA2 polyprotein sequence, and the differences were summed to obtain a value of chi-squared. This was repeated for the sequence between residues 2-520, and so on until the end of the amino acid sequence at residue 1882 was reached. Chi-squared values were plotted versus the beginning of the corresponding block of 519 amino acids of the RNA2 polyprotein (Fig. 3.19). The critical value of chi-squared at the 0.05 probability level for 17 degrees of freedom is 27.6. Values of chi-squared were less than 27.6 for sequences C-terminal to amino acid residue 1037 and were less than 5.4 for sequences between residues 1235 and 1882 (Fig. 3.19). This suggests that the TomRSV coat protein is encoded C-terminal of amino acid residue 1037 and is possibly encoded at the very C-terminal region of the RNA2 polyprotein.

Table 2. Amino acid composition analysis of the TomRSV coat protein.

Amino acid   Relative mole ratio*   Scaled†   Determined from sequence§
A            14.9                   36        43
C            5.1                    12        10
D + N        16.0                   38        45
E + Q        18.1                   43        49/47
F            14.5                   35        37
G            18.1                   43        53/51
H            5.2                    12        13
I            12.7                   30        28
K            9.7                    23        26
L            24.1                   58        60
M            2.7                    7         7
P            11.5                   28        32
R            10.5                   25        26
S            16.2                   39        38/37
T            14.9                   36        40
V            9.9                    24        26
W            5.2                    12        9/8
Y            7.3                    18        20

* from Tremaine & Stace-Smith (1968).
† relative mole ratio values were rescaled for a coat protein with Mr of ca. 58,000 as determined by Allen & Dias (1977).
§ value to the left of the slash is determined from the potential Q-G cleavage site; the value to the right of the slash is determined for the potential Q-E cleavage site (see section 3.8.2).
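The sliding-window comparison described above can be sketched as follows. This is not the program used for the thesis analysis; it is a minimal illustration that assumes the usual chi-squared statistic (squared difference between observed and expected counts, divided by the expected count, summed over the 18 amino acid classes, with D+N and E+Q pooled as in Table 2). The toy sequence and all function names are hypothetical.

```python
from collections import Counter

# The 18 composition classes of Table 2: N is pooled with D and Q with E.
POOL = {"N": "D", "Q": "E"}


def composition(seq):
    """Count residues per class, pooling N with D and Q with E."""
    return Counter(POOL.get(aa, aa) for aa in seq)


def chi_squared(expected, observed):
    """Sum of (observed - expected)^2 / expected over the expected classes."""
    return sum(
        (observed.get(k, 0) - e) ** 2 / e for k, e in expected.items() if e > 0
    )


def scan(polyprotein, expected, window):
    """Yield (1-based window start, chi-squared) along the polyprotein."""
    for start in range(len(polyprotein) - window + 1):
        block = polyprotein[start:start + window]
        yield start + 1, chi_squared(expected, composition(block))


# Toy data, NOT the TomRSV sequence: the target composition sits at the
# C-terminus of an otherwise unrelated stretch.
target = "AGLLSTVP"
toy_polyprotein = "KKKKKKKKRRRRRRRR" + target
expected = composition(target)
best_start, best_chi2 = min(
    scan(toy_polyprotein, expected, len(target)), key=lambda item: item[1]
)
print(best_start, best_chi2)
```

With a 519-residue window and a 1,882-residue polyprotein, such a scan would evaluate 1,882 - 519 + 1 = 1,364 windows, and 18 classes give the 17 degrees of freedom quoted in the text.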
A comparison of the amino acid composition of the TomRSV coat protein and the C-terminal region of the RNA2-encoded polyprotein is shown in Table 2. The coat protein coding regions for the nepoviruses TBRV, GCMV, and GFLV have also been localized to the C-terminal region of the long RNA2 ORF (Meyer et al., 1986; Brault et al., 1989; Serghini et al., 1990). An alignment of the amino acid sequence at the C-terminal region of the TomRSV RNA2 polyprotein and the coat protein sequences of TBRV, GCMV and GFLV is shown in Fig. 3.20. The putative TomRSV coat protein sequence shares sequence similarity of 21.4%, 22.9%, and 23.4% with those of TBRV, GCMV and GFLV, respectively.

Fig. 3.19 Chi-squared values, obtained by comparing the amino acid composition of the TomRSV coat protein (assuming a protein of 519 amino acids, see section 3.8.1) with the amino acid composition of blocks of 519 amino acids from the sequence of the TomRSV RNA2 polyprotein, were plotted against the first amino acid of each block of 519 amino acids (x-axis: Amino Acid Position). 18 amino acids were compared (chi-squared = 27.6, P = 0.05, indicated by the horizontal line). The program used for this analysis was written by Bill Ronald and Jim Purves of Agriculture Canada.

3.8.2 Analysis of the Protease Cleavage Site for the Coat Protein Sequence

The N-terminal region of the putative coat protein was analyzed for the following potential protease cleavage sites: Q/S, Q/G, Q/M, E/S, E/G, R/A, R/G, and K/A (Palmenberg, 1990; Wellink et al., 1986; Serghini et al., 1990; Brault et al., 1989; Demangeat et al., 1990) [where Q represents glutamine, S serine, G glycine, M methionine, E glutamate, R arginine, A alanine and K lysine]. The sites Q/G and E/G at amino acid positions 1320-1321 and 1325-1326, respectively (Fig. 3.20), are tentatively identified as potential cleavage sites for the TomRSV coat protein based on their location and the previously determined size of the TomRSV coat protein. The Q/G and E/G sites proposed for TomRSV are different from the R/A, R/G and K/A cleavage sites proposed for GCMV, GFLV and TBRV, respectively, but are similar to those of the como-, poty-, and picornaviruses (Hellen et al., 1989). The putative TomRSV coat protein has a calculated size of 62 kDa, which is similar to the Mr of 58,000 determined by SDS-PAGE analysis (Allen & Dias, 1977).

               *  * *  ** * * * * * * * * *  * *
TomRSV  1,320  OC,GSWQELaTEAAYLGKVTCAKDAKGGTLLHTLDIIKECKSQNLLRYKEWQ
TBRV      834  KAGGSYAFGETIELPATVTPGTVLAVFNIFDKIQETNTKVCSKWL
GCMV      810  RAGGEFAFIHTIDLPTAVTEGQVLAKIDIFKKIQDAKSMVCVQWM
GFLV      680  RGLAGRGVIYIPKDCQANRYLGTLNIRDMISDFKGVQYEKWI

               * * * x x * * *  * * * * * * * * *
TomRSV  1,371  RQGFLHGKLRLACFIPTNIFCGHSMMCSLDAFGRYDSNVLGASFPVKLAS
TBRV      880  EQGYVSQNLTAISHLAPNAFSGIAIWYIFDAYGKIPGDV-TTTFELEMAR
GCMV      856  QAGYVNKNLTFISHLAPSQFCGVAIWYIFDAYGKIPSDV-TTSLELEIAR
GFLV      723  TAGLVMPTFKIVIRLPANAFTGLTWVMSFDAYNRITSRI--TASADPVYT

               * x+ * * x * * * ++ * * * * x * * * * *
TomRSV  1,421  LLPTEVISLADGPVVTWTFDIGRLCGHGLYYSEGAYARPKIYFLVLSDND
TBRV      929  SFDPHVQVLRDVSTSTWVIDFHKICGQTLNFSGQGYCVPKIWVIAASTFQ
GCMV      905  SLCPHVHVLRDSKTSVWTIDFHKICGQSLNFSGRGFSKPTLWVIAASTAQ
GFLV      771  LSVPHWLIHHKLGTFCSEIDYGELCGHAMWFKSTTFESPRLHFTCLTGNN

               * * * *
TomRSV  1,471  VPAEADWQFTYQLLFEDHTFSNSFGAVPFITLPHIFNRL-DIGYWRGPTE
TBRV      979  LARSTATKFRLEFYTRGEKLVRGLAEQP-LSYPIEARHLTDLNLMLAPKQ
GCMV      955  LPWSAQVTYRLEALAQGDEIAHGLATRSIVTYPISLEHLKDIEIMLPPRQ
GFLV      821  KELAADWQAVVELYAELEEATSFLGKPTLVFDPGVFNGKFQF-LTCPPIF

               * * * * * * * x+ * *
TomRSV  1,520  IDLTSTPAPNAYRLLFGLSTVISGNMSTLNANQALLRFFQGSNGTLHGRI
TBRV    1,028  IAVGTYAMIT-FPVSLAAKLQSTSGRTAYSYAAGLLSHFLGVGTGIHFVV
GCMV    1,005  MAIGNAGSIN-FPLSFAVQQKSSSGRIAYSYAAGLLSHFLGIGTGIHFKI
GFLV      870  FDLTAVTALRSAGLTLGQVPMVGTTKV-YNLNSTLVSCVLGMGTGVRGRV

               * * * * * * * * * * * *
TomRSV  1,570  KKIGTALTTCSLLLSLRHKDASLTLETAYQRPHYILADGQGAFSLPISTP
TBRV    1,077  RTTSSAFVTSKLRIALW--GTVPETDQLAQMPH-VDVEVNVDASLQIQSP
GCMV    1,054  QCTSSAFVTARLRVALW--GDTITLEQLSQMPH-VDCDVDVVSSLKIQSP
GFLV      919  HICAPIFYSIVLWVVSEWNGTTMDWNELFKYPG-VYVEEDGSFEVKIRSP

TomRSV  1,620  -HAA TSFLEDMLRLEIFAIAGPFSPKDNKAKYQFMCYFDHIELV
TBRV    1,124  FFST ANFGNSGSAFYVSTLCAPMAPETVETGSEYYIQIKGIEAN
GCMV    1,101  FYAT ANFGDSGARFWVTPMSSPMAPETMESKLEYYIQILGIDAD
GFLV      968  YHRTPARLLAGQSQRDMSSLNFYAIAGPIAPSGETAQLPIVVQIDEI-VR

               * * * * x x x * * *
TomRSV  1,663  EGVPRTIAGEQQFNW-CSFR---NFKIDDWKFEWPARLPDILDDKSEVLL
TBRV    1,168  PGLCREINYKQRFAW-CLLECLDNSKASPIKVKIPSRIGNLSSKHVKVTN
GCMV    1,145  PPMCRQINYDQRFAWFTLLRPPDPKLSKILKLTLPMCNIAYKEATVTN
GFLV    1,017  PDLSLPSFEDDYFVW-VDFS---EFTLDKEEIEIGSRFFDFTSNTCRVSM

               * * * * * * * * 1r * * * * * * * x x x+ *
TomRSV  1,709  RQHPLSLLISSTGFFTGRAIFVFQWGLNTTAGNMKGSFSAKLAFGKGVEE
TBRV    1,217  FVNALAILCATTGMHHGNCTIHFSWLWH--PAELGKQL-GRLKFVQGMGI
GCMV    1,195  YVNAFAIMCATTGMHAGKCILHFSWTLN--KGTSFKDLQGHISFYSGMGD
GFLV    1,063  GENPFAAMIACHGLHSGVLDLKLQWSLN----TEFGKSSGSVTITKLVGD

               * * * x * x x * *
TomRSV  1,759  I---EQTSTVQPLVGACEA-RIPVEFKTYTG-YTTSGPPGSMEPYIYVRT
TBRV    1,264  N--NEHIGDTMCYNSLSNTHSVPFQFGSFAGPITSGGKADEAENWIEIQS
GCMV    1,243  STIGEHHGEFHLGGPLSSSLAVPFEFGSFAGPVTSGGTPFTSENWLRVET
GFLV    1,109  KA-MGLDGPSHVFAIQKLEGTTELLVGNFAG-ANPNTRFSLYSRWMAIKL

               * * * * * ** * * * * * * * * *
TomRSV  1,803  TQAKLVDRLSVNVILQEGFSFYGPSVKHFKKEVCTPSATLGTNNPVGAPP
TBRV    1,312  PDFSWVASLHVSIEVHEGFKFYGRS AGPLTIPA
GCMV    1,293  AHWDWLTSLTVDIQVLPGFRFYGRS AGPLTIPS
GFLV    1,157  DQAKSIKVLRVLCKPRPGFSFYCRT SFPV

TomRSV  1,853  ENVDTGGPGGQYAAALQAAQQAGRNPFGRG
TBRV    1,345  TVADVSAVSGS

Fig. 3.20 Alignment of the amino acid sequences of the TomRSV, TBRV, GCMV and GFLV coat proteins. The alignment was generated using the multiple sequence alignment program of Feng & Doolittle (1987). A double asterisk indicates amino acids common to all four sequences. A single asterisk indicates two possible amino acids at that position in all four sequences. Dashes indicate gaps introduced to maximize the alignment. Underlined dipeptides are known or potential cleavage sites. Numbers to the left of each line of sequence refer to amino acid position on each nepovirus RNA2 polyprotein.

3.8.3 5' Terminal Coding Region of RNA2

The coding sequence 5' of the putative coat protein gene has the capacity to encode a 145 kDa polypeptide. This region can be divided into several domains, based on similarities in amino acid sequence and genomic organization with other nepo- and comoviruses. This region of the TomRSV RNA2 polyprotein also contains unique sequences not present in the other nepoviruses or the closely related comoviruses.

Sequence similarity for over 300 amino acids (36.7%) was detected between the regions immediately N-terminal of the TomRSV and GFLV putative coat protein sequences (Fig. 3.21). This was the highest match obtained in comparisons between the entire TomRSV RNA2 polyprotein and those of the other nepoviruses. This region of the TomRSV RNA2 polyprotein could not easily be aligned with the corresponding regions of TBRV and GCMV, which have been identified as putative transport proteins involved in viral cell-to-cell movement (Koonin et al., 1991).
              * ** ** * ** * ** * *
TomRSV   984  ARKRWLS-RKQEDSQEDNIKRY---ADKHGISFEEARAVYKAPKEGVPTQRSILPDVRDAYSARSAGARVRSLFG
GFLV     360  GRAQWISERRQALRRREQANSFEGLAAQTDMTFEQARNAYLGAADMIEQGLPLLPPLRSAY APRGLWR

              * * *** * ** * ***** ** * ** * * **** *****
TomRSV  1059  GSPTTRAQRTEDFVLTSPSAGDASSFSFYFNPVSEQEMAEQERGGNTMLSLDAVEVVIDPVGMPGDDTDLTVMVL
GFLV     435  G-PSTRANYTLDFRLNGIPTG-TNTLEILYNPVSEEEMEEYRDRGMSAVVIDALEIAINPFGMPGNPTDLTVVAT

              * ** ** *** ******* * *** * * *** * * * ******
TomRSV  1134  WCQNSDDQRALICAMSTFVGNGLARAVEYPGLKLLYANCRVRDGRVLKVIVSSTNSTLTHGLPQAQVSIGTLRQH
GFLV     510  YGHERDMTRAFIGSASTFLGNGLARAIFFPGLQ--YSQEEPRRESIIRLYVASTNATVDTDSVLAAISVGTLRQH

              * * ** * ** *** * * * * * * * * * * * ** ** *
TomRSV  1209  LGPGHDRTISGALYASQQQGFNIRATEQGGAVTFAPQGGHVEGIPSANVQMGAGEHLIQAGPMQWRLQRSQSSRF
GFLV     585  VGSMHYRTVASTVHQAQVQGTTLRATMMGNTVVVSPEGSLVTGTPEARVEIGGGSSIRMVGPLQWESVEEPGQTF

              **
TomRSV  1284  VVSGHSRT
GFLV     660  SIRSRSRS

Fig. 3.21 Amino acid sequence alignment of the regions N-terminal to the TomRSV and GFLV coat proteins. Dashes indicate gaps introduced to maximize the alignment. Asterisks above the sequence indicate positions of amino acid identity. The numbers to the left of the sequences indicate amino acid positions of the long polyproteins of TomRSV RNA2 and GFLV RNA2. Underlined residues are conserved among known and putative transport proteins (A. Mushegian, personal communication).

Known and putative transport proteins contain the conserved amino acid residues "LPL", "G" and "D", which are each separated by 19 to 30 amino acids (Koonin et al., 1991). In TomRSV only the P and G components are present, based on alignments by A. Mushegian (personal communication) (see Fig. 3.21). N-terminal to the putative transport region of TomRSV, and separated by ca. 280 amino acids, are three tandemly repeated amino acid sequences. This region consists of two identical amino acid repeats, 53 amino acids in length, and one partial and degenerate repeat of 38 amino acids (Fig. 3.22).
The sequence has a high proline content (12.1%), and the dipeptide L-P is repeated 13 times, while the sequence LPL is repeated three times.

repeat 1  554  SWSSPLPLFANFKVNRGACFLQVLPQRVVLPDECMDLLSLFEDQLPEGPLPSF
               *****************************************************
repeat 2  607  SWSSPLPLFANFKVNRGACFLQVLPQRVVLPDECMDLLSLFEDQLPEGPLPSF
               ********** ************** * *  ** ** *
repeat 3  660  SWSSPLPLFASFKVNRGACFLQVLPARKVVSDEFMDVLE

Fig. 3.22 Alignment of the amino acid sequences of the three tandem repeats near the N-terminal region of the TomRSV RNA2 polyprotein. Asterisks between repeats indicate positions of amino acid identity. Proline residues are underlined, as well as the dipeptides LP and the tripeptides LPL. The numbers at the beginning of each line refer to amino acid position on the large TomRSV RNA2 polyprotein.

3.9 Amino Acid Sequence Similarity Between TomRSV RNA1 and RNA2 Polyproteins

As indicated above (section 3.7.1), the 5' terminal nucleotide sequences of TomRSV RNA1 and RNA2 are very similar. This similarity includes the two potential in-frame initiation sites at AUG78 and AUG441. If translation initiation occurs at AUG78, then the N-terminal regions of the TomRSV RNA1 and RNA2 polyproteins would be identical for the first 132 amino acids, and of the next 145 amino acid residues, 75.3% of the positions would be identical (Fig. 3.23). The second in-frame initiation site at AUG441 occurs shortly before the point where the homology between the RNA1 and RNA2 polyproteins becomes less than perfect.

      **************************************************
RNA1  MSSICFAGGNHARLPSKAAYYRAISDRELDREGRFPCGCLAQYTVQAPPP   50
RNA2  MSSICFAGGNHARLPSKAAYYRAISDRELDREGRFPCGCLAQYTVQAPPP   50

      **************************************************
RNA1  AKTQEKAVGRSADLQKGNVAPLKKQRCDVVVAVSGPPPLELVYPARVGQH  100
RNA2  AKTQEKAVGRSADLQKGNVAPLKKQRCDVVVAVSGPPPLELVYPARVGQH  100

      ***************************  *** ******   * *   **
RNA1  RLDQPSKGPLAVPSAKQTSTAMEVVLSVGEAALTAPWLLCSYKSGVSSPP  150
RNA2  RLDQPSKGPLAVPSAKQTSTAMEVVLSAEEAAITAPWLLRPCK-GEAPPP  149

      ** ******** * **  ******** *********** *  ***** **
RNA1  PPMTQRQQFAAIKRRLVQKGQQIIRELIRARKAAKYAAFAARKKAAAVAA  200
RNA2  PPLTQRQQFAALKKRLAVKGQQIIREHIRARKAAKYAAIAKAKKAAALAA  199

       **  ************ ******    *********************
RNA1  QKARAEAPRLAAQKAAIAKILRDRQLVSLPPPPPPSAARLAAEAELASKS  250
RNA2  VKAAQEAPRLAAQKAAISKILRDRDVAALPPPPPPSAARLAAEAELASKA  249

       ** *****    **** ** *** **
RNA1  ASLQRLKAFHRANRVRPVLNNSFPSPP  277
RNA2  ESLRRLKAFKTFSRVRPALNTSFPPPP  276

Fig. 3.23 Amino acid sequence alignment of the N-terminal regions of the TomRSV RNA1 and RNA2 polyproteins. The sequences shown begin at the first in-frame AUG codon at nucleotide position 78. The M which corresponds to the second in-frame AUG at nucleotide position 441 is underlined. Asterisks indicate positions of amino acid identity. The dash indicates a gap used to maximize the alignment. Numbers to the right of the sequence refer to amino acid residue position.

3.10 Analysis of the TomRSV RNA1 Coding Region

Comparisons were made between the deduced polyprotein sequence of the TomRSV RNA1 long ORF and those of TBRV (Greif et al., 1988), GCMV (Le Gall et al., 1989), and GFLV (Ritzenthaler et al., 1991), as well as the CPMV B RNA (Lomonossoff & Shanks, 1983), the potyvirus tobacco etch virus (TEV) (Allison et al., 1986), and poliovirus RNA (Racaniello & Baltimore, 1981), using the FastN algorithm of Lipman and Pearson (1985). The TomRSV RNA1 polyprotein contains sequences characteristic of a putative viral protease cofactor (Ritzenthaler et al., 1991), an NTP-binding domain (Gorbalenya & Koonin, 1989; Gorbalenya et al., 1989), a viral cysteine protease (Bazan & Fletterick, 1988) and an RNA-dependent RNA polymerase (Kamer & Argos, 1984; Argos, 1988).

3.10.1 RNA-Dependent RNA Polymerase

The C-terminal region of the TomRSV RNA1-encoded polyprotein contains sequences characteristic of known and putative RNA-dependent RNA polymerases (RDRP) (Kamer & Argos, 1984; Argos, 1988) identified from a wide range of RNA viruses that infect vertebrate, plant and insect hosts, including those encoded by nepo-, como-, poty- and picornaviruses. Candresse et al. (1990) defined the RDRP domain consensus sequence as DxxxxD-xy-GxxxTxxxN-xy-nnnnGDDnnn, where x is any amino acid, n is a hydrophobic residue and xy indicates a variable number of amino acids. This domain is present in the TomRSV RNA1 polyprotein sequence beginning at residue 1771. Fig. 3.24 shows an alignment of this region from TomRSV with the corresponding regions from the other nepoviruses TBRV, GCMV and GFLV, as well as the CPMV 87K protein, the TEV NIb and the poliovirus 3D proteins.
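A consensus such as DxxxxD-xy-GxxxTxxxN-xy-nnnnGDDnnn can be located with a regular expression. The sketch below searches only for the invariant core (DxxxxD, GxxxTxxxN, GDD) and ignores the hydrophobic flanks; the spacer ranges are assumptions chosen generously around the spacings shown in Fig. 3.24, and the function name is hypothetical. The test string is built from the three conserved TomRSV blocks of Fig. 3.24, with 'X' placeholders standing in for the 49- and 28-residue spacers, whose actual sequences are not given in the figure.

```python
import re

# Invariant core of the consensus: DxxxxD ... GxxxTxxxN ... GDD.
# The spacer ranges (20-80 and 10-60 residues) are assumptions chosen to
# bracket the spacings implied by Fig. 3.24; they are not from the source.
RDRP_CORE = re.compile(r"D.{4}D.{20,80}?G.{3}T.{3}N.{10,60}?GDD")


def find_rdrp_core(protein, first_residue=1):
    """Return the residue number where the motif starts, or None if absent."""
    match = RDRP_CORE.search(protein)
    return None if match is None else first_residue + match.start()


# Conserved TomRSV blocks from Fig. 3.24; the spacers are 'X' placeholders.
fragment = "NCDYSRFDGLL" + "X" * 49 + "PSGCALTVIINS" + "X" * 28 + "LIVYGDDNLI"
print(find_rdrp_core(fragment, first_residue=1769))
```

Because the fragment is numbered from residue 1769, the first D of the motif falls at residue 1771, in agreement with the position given in the text.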
Overall amino acid sequence similarity between the region of the TomRSV polyprotein containing the RDRP domain and those of the other viruses is as follows: TBRV (37.6% over 476 residues), GCMV (39% over 462 residues), GFLV (38.9% over 691 residues), CPMV (40% over 492 residues), TEV (24.6% over 191 residues), and poliovirus (23.1% over 260 residues).

                 **** ***** *** ** ** * ********
TomRSV     1,769  NCDYSRFDGLL--X49--PSGCALTVIINS--X28--LIVYGDDNLI
TBRV       1,716  NCDYSGFDGLL--X54--PSGFALTVVVNS--X28--LLVYGDDNLI
GCMV       1,701  NCDYSGFDGLL--X54--PSGFALTVVMNS--X28--LLVYGDDNLI
GFLV       1,732  YCDYKAFDGLI--X50--PSGCALTVVLNS--X28--LITYGDDNVF
CPMV       1,433  CCDYSSFDGLL--X51--PSGFPMTVIVNS--X33--LVTYGDDNLI
TEV        2,525  DADGSQFDSSL--X52--NSGQPSTVVDNT--X22--YYVNGDDLLI
Polio      1,979  AFDYTGYDASL--X45--PSGCSGTSIFNS--X24--MIAYGDDVIA
Consensus           D    D            G   T   N            GDD

Fig. 3.24 Amino acid sequence alignment of the RNA-dependent RNA polymerase domains from TomRSV, TBRV, GCMV, GFLV, CPMV, TEV and polio. Xn refers to the number of amino acids separating the conserved domains. Asterisks indicate identical amino acid residues in at least four of the seven sequences. The underlined residues which form the consensus sequence are the most highly conserved among (+)ssRNA viruses. Numbers to the left of the sequence refer to amino acid residue position in the corresponding viral polyproteins. The consensus sequence is as defined by Candresse et al. (1990).

3.10.2 Cysteine Protease

A second conserved amino acid sequence domain is present N-terminal to the putative polymerase region of the TomRSV RNA1 polyprotein and is characteristic of known viral cysteine proteases, including the polio 3C, CPMV 24K and TEV 49K proteases (Argos et al., 1984; Bazan & Fletterick, 1989; Carrington & Dougherty, 1987), and of the putative proteases encoded by the TBRV, GCMV and GFLV RNA1 components (Greif et al., 1988; Le Gall et al., 1989; Ritzenthaler et al., 1991).
The following amino acid residues, H-x70-D-x78-CG-x8-GxxxxxGxHxxG, where x is any amino acid, beginning at position 1283 of the TomRSV RNA1 polyprotein sequence, are in common with known and putative viral cysteine proteases. Fig. 3.25 shows an alignment of this region with similar regions from the nepoviruses TBRV, GCMV and GFLV, as well as the CPMV 24K, TEV 49K and the polio 3C proteases.

                  * * *** *******
TomRSV     1,283  H--X70--D--X73--NSPEDCGALLVAHLEGGYKIIGMHVAG
TBRV       1,270  H--X55--D--X68--SRNDDCGMIILCQIKGKMRVVGMLVAG
GCMV       1,256  H--X55--D--X68--SRNNDCGMLLTCQLSGKMKVVGMLVAG
GFLV       1,284  H--X66--D--X68--AKKYDCGALAVAVIQGIPKVIAMLVSG
CPMV         987  H--X53--D--X67--TIPEDCGSLVIAHIGGKHKIVGVHVAG
TEV        2,083  H--X34--D--X64--TKDGQCGSPLVSTRDG--FIVGIHSAS
Polio      1,607  H--X44--D--X56--TRAGQCGGVITC--TG--KVIGMHVGG
Consensus         H       D            CG        G     G H  G

Fig. 3.25 Amino acid sequence alignment of the viral cysteine protease domains from TomRSV, TBRV, GCMV, GFLV, CPMV, TEV and poliovirus. Xn refers to the number of amino acids separating the conserved domains. Asterisks indicate identical amino acid residues in at least four of the seven sequences. Underlined residues are the most highly conserved among known and putative viral cysteine proteases (Bazan & Fletterick, 1989).

3.10.3 NTP-Binding Domain

The TomRSV RNA1 polyprotein sequence beginning at amino acid residue 791 contains sequences characteristic of purine NTP-binding domains, which are composed of two conserved regions designated the "A" and "B" sites (Walker et al., 1982; Gay & Walker, 1983). The A site contains the consensus sequence (G/A)xx(G)xGKS/T, preceded by a hydrophobic region, where x is any amino acid. The B site also contains a hydrophobic region, followed by the sequence D(E/D). The A and B sites are separated by ca. 40 residues. Figure 3.26 shows an amino acid sequence alignment of this conserved domain with the similar regions from the RNA1-encoded polyproteins of TBRV, GCMV, GFLV, the 58K protein of CPMV, the 2C protein of polio, and the CI protein of TEV. TomRSV, along with the other nepoviruses and CPMV, does not retain the second conserved G in the "A" site.

                  * ** * ****  *****
TomRSV       791  WVYLYGGPROGKSLFAQSFMN--X29--AICCVDDLSSCE
TBRV         776  WIYLFGQRHCGKSNFMATLDN--X28--TFFHVDDLSSVK
GCMV         758  WIYLWGPSHCGKSNFMDVLGM--X28--TIMEIDDLSSIK
GFLV         776  WVYIFGASQSGKTTIANSIII--X30--ACVKVDDFYAIE
CPMV         489  TIFFQGKSRTGKSLIMSQVTK--X29--PFVLMDDFAAVV
TEV         1342  DFLVRGAVGSGKSTGLPYHLS--X68--DFVIIDECHVND
Polio       1250  CLLVHGSPGTGKSVATNLIAR--X27--GVVIMDDLNQNP
Consensus              G  G GKS                    DD
                              T                     E
                  "A" site                    "B" site

Fig. 3.26 Amino acid sequence alignment of the "A" and "B" sites that form the NTP-binding domains of proteins encoded by TomRSV, TBRV, GCMV, GFLV, CPMV, TEV and poliovirus. Xn refers to the number of amino acid residues separating the A and B sites. Highly conserved residues among NTP-binding domains (Walker et al., 1982; Gay & Walker, 1983) are underlined. Asterisks indicate identical amino acid residues in at least three of seven sequences. Numbers to the left of the sequence refer to amino acid residue position in the viral polyprotein.

The CPMV 58K, TEV CI and polio 2C proteins also contain a region of hydrophobic amino acids at their C-termini. The CPMV 58K and polio 2C proteins are membrane associated, and it has been suggested that the hydrophobic regions constitute transmembrane domains (Goldbach & van Kammen, 1985; Takeda et al., 1986; Takegami et al., 1983). The region C-terminal to the NTP-binding domain of the TomRSV RNA1 polyprotein was analyzed for hydrophobic regions (Kyte & Doolittle, 1982). A hydrophobic stretch of amino acids, LLLVLAAVILILFF, was found beginning at amino acid residue 1,168 and is predicted to be a transmembrane domain by the method of Argos et al. (1982).
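The hydrophobic character of the stretch LLLVLAAVILILFF can be illustrated with the Kyte & Doolittle (1982) hydropathy scale. The sketch below averages the scale values over a segment and, optionally, over a sliding window. The window length of 19 is an assumption (the thesis does not state the window used), the function names are hypothetical, and this is not the transmembrane prediction method of Argos et al. (1982).

```python
# Kyte & Doolittle (1982) hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(segment):
    """Average Kyte-Doolittle hydropathy of a peptide segment."""
    return sum(KD[aa] for aa in segment) / len(segment)


def hydropathy_profile(protein, window=19):
    """Return (1-based window start, mean hydropathy) for each window."""
    return [
        (i + 1, mean_hydropathy(protein[i:i + window]))
        for i in range(len(protein) - window + 1)
    ]


# The 14-residue stretch reported at residue 1,168 of the RNA1 polyprotein.
stretch = "LLLVLAAVILILFF"
print(round(mean_hydropathy(stretch), 2))
```

Every residue in this stretch has a positive hydropathy value, which is consistent with the strongly hydrophobic character described in the text.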
It is possible that the TomRSV putative NTP-binding domain and transmembrane domain are located on the same protein following proteolytic processing of the large polyprotein; such a protein would be equivalent to the CPMV 58K, TEV CI and polio 2C proteins (see section 3.10.7 below).

3.10.4  VPg Protein

In polio, CPMV and GFLV, the amino acid sequence for the small 5' VPg is located between the NTP-binding and protease domains (Racaniello & Baltimore, 1988; Goldbach & Rezelman, 1983; Zabel et al., 1984; Pinck et al., 1991). It is difficult to determine whether the VPg of TomRSV is also at this location since VPgs are small and not highly conserved. The location of the TomRSV VPg therefore could not be unequivocally determined from amino acid sequence alignments. However, considering the conservation in genomic organization among these viruses, it is very likely that the TomRSV VPg is also present at this site. It is possible that the TomRSV VPg is located between several predicted protein cleavage sites between the NTP-binding and protease domains (see section 3.10.7).

3.10.5  Protease Co-Factor

Another conserved amino acid sequence, F-x27-W-x11-L-x23-E, is located near the N-terminal region of the TomRSV RNA1 polyprotein beginning at amino acid residue 482. These conserved amino acids are present in the nepo- and comovirus polyprotein sequences (Ritzenthaler et al., 1991). In CPMV, the sequence is present in the 32K protein, which is a co-factor for the 24K protease during cleavage of the CPMV M RNA polyprotein at the Q/M site (Vos et al., 1988). An alignment of the CPMV, TBRV, GCMV, GFLV and TomRSV sequences is shown in Fig. 3.27.

[Fig. 3.27 alignment: TomRSV (from residue 482), TBRV (463), GCMV (462), GFLV (471) and CPMV (192); conserved blocks separated by X13, X11 and X21; consensus F, W, L, L, E.]

Fig. 3.27 Amino acid sequence alignment of the protease co-factor domain from TomRSV, TBRV, GCMV, GFLV and CPMV. Xn refers to the number of amino acids separating the conserved residues. Asterisks indicate identical amino acid residues in at least three of five sequences. The underlined residues are the most highly conserved as determined from these alignments. The second L residue is not conserved in TomRSV, where it is replaced by another hydrophobic residue, F. Numbers to the left of the sequence refer to amino acid residue position in the viral polyprotein.

3.10.6  N-Terminal Region

As indicated in section 3.9, the N-terminal regions of the RNA1 and RNA2 polyproteins are very similar. The regions beginning shortly after AUG441 (amino acid residue 139) of both the RNA1 and RNA2 polyproteins could be aligned with the N-terminal regions of the TBRV and GCMV RNA1 polyproteins (Fig. 3.28) but not with the N-terminal region of the GFLV RNA1 polyprotein sequence. The function, if any, of the polypeptide encoded in this region is unknown.

TomRSV RNA1  138  LLCSYKSGVSSPPPPMTQRQQFAAIKRRLVQKGQQIIREL----IRARKAAKYAAFAARKKAAAV
TomRSV RNA2  138  LLRPCK-GEAPPPPPLTQRQQFAALKKRLAVKGQQIIREH----IRARKAAKYAAIAKAKKAAAL
TBRV RNA1    189  KLNKAKALGEAHRSAVARAQAKAEVLREFEPSPQQIQRALEAQIFADRLSRKYAALTARVRAKRA
GCMV RNA1    189  KLTKANALGAAHRSAVATAQAKAEVLREFEPSPAHIQIAVKAHIFAEKLSRKYADLTAQVRARRA

TomRSV RNA1  200  AAQKARAEAPRLAAQKAAIAKILRDRQLVSLPPPPPPSAARLAAEAELASKSASLQRLKAF
TomRSV RNA2  199  AAVKAAQEAPRLAAQKAAISKILRDRDVAALPPPPPPSAARLAAEAELASKAESLRRLKAF
TBRV RNA1    255  AARELREKELFLETQDLLNAPLLPPMEKVGIERKY-RKVRPTGSNVTSTPKPNVLENLCPF
GCMV RNA1    255  AARDLRAKEIYLEIVDLLGAPLLSIPQQIKIKGKYLR--RSVAAEVEVPHTRNPMAELVPY

Fig. 3.28 Amino acid sequence alignment of the N-terminal regions of TomRSV RNA1 and RNA2 polyproteins and TBRV and GCMV RNA1 polyproteins.
Asterisks indicate conserved residues in all four sequences and a caret indicates identical amino acid residues in three of four sequences. Dashes are spaces inserted into the sequence to maximize the alignment. Numbers to the left of the sequence refer to amino acid residue position in the viral polyprotein. The alignment was generated using the multiple sequence alignment program of Feng & Doolittle (1987).

3.10.7  Proteolytic Processing of the RNA1-encoded Polyprotein

The TomRSV RNA1 polyprotein was analyzed for potential protease cleavage sites. The sequence was scanned for the known cleavage sites Q/S, Q/M, Q/G, E/G and E/S that are utilized during the cleavage of the picorna-, poty- and comovirus polyproteins (Hellen et al., 1989; Palmenberg, 1990; Wellink et al., 1986). The location of these sites was then compared to the known cleavage sites for GFLV and CPMV, as well as the putative sites for the RNA1-encoded polyproteins of the nepoviruses TBRV, GCMV and GFLV (Pinck et al., 1991; Ritzenthaler et al., 1991; Le Gall et al., 1989; Greif et al., 1988). Fig. 3.29 shows the genomic organization and the cleavage sites for the CPMV B RNA polyprotein compared with those proposed for the TomRSV RNA1 polyprotein. A likely cleavage site between the protease and the RDRP domain could occur at the Q/M dipeptide located at amino acid position 1,465-1,466. This site aligns perfectly with the Q/G cleavage site of CPMV. Three potential cleavage sites were found between the NTP-binding and protease domains. It is possible that cleavage at the protease side could occur at either the Q/S site located at position 1,236-1,237 or the Q/G site at position 1,239-1,240. A potential cleavage site at the NTP-binding domain side occurs at the Q/S site at position 1,212-1,213. The region between the two Q/S sites is 24 amino acids in length. This is identical in size to the GFLV VPg protein and only four amino acid residues shorter than the CPMV VPg protein. These proteins are too small to be aligned.
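The dipeptide scan described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually used: the function names and the toy sequence are hypothetical (the toy sequence is not part of the TomRSV polyprotein), and only the dipeptide list and the two Q/S positions are taken from the text.

```python
# Dipeptides used as candidate cleavage sites in the text (P1/P1').
CLEAVAGE_DIPEPTIDES = ("QS", "QM", "QG", "EG", "ES")


def find_cleavage_sites(polyprotein, dipeptides=CLEAVAGE_DIPEPTIDES):
    """Return (P1 position, P1' position, 'X/Y') for each candidate site.

    Positions are 1-based, matching the residue numbering in the text.
    """
    sites = []
    for index in range(len(polyprotein) - 1):
        pair = polyprotein[index:index + 2]
        if pair in dipeptides:
            sites.append((index + 1, index + 2, pair[0] + "/" + pair[1]))
    return sites


def fragment_length(upstream_site, downstream_site):
    """Residues released between two cleavages.

    The fragment runs from the P1' residue of the upstream site
    to the P1 residue of the downstream site, inclusive.
    """
    return downstream_site[0] - upstream_site[1] + 1


# Hypothetical toy sequence, used only to show the scan.
toy = "MAQSLLEGKKQMT"
print(find_cleavage_sites(toy))
# [(3, 4, 'Q/S'), (7, 8, 'E/G'), (11, 12, 'Q/M')]

# The two Q/S sites flanking the putative VPg, as given in the text.
print(fragment_length((1212, 1213, "Q/S"), (1236, 1237, "Q/S")))  # 24
```

A scan of this kind only lists candidates; as in the text, the choice among them still depends on comparison with the known cleavage sites and conserved domains of related viruses.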
A putative cleavage site between the protease co-factor domain and the NTP-binding domain is the Q/G site at position 620-621. This site corresponds closely with the Q/S site of CPMV.

[Fig. 3.29 diagram. TomRSV RNA1 polyprotein, N- to C-terminus: 66K protease co-factor, 67K NTP-binding protein (with transmembrane region), 2.7K VPg, 25K protease, 83K RDRP. CPMV B RNA polyprotein: 32K protease co-factor, 58K NTP-binding protein (with transmembrane region), 4K VPg, 24K protease, 87K RDRP.]

Fig. 3.29 Analysis of the TomRSV RNA1 polyprotein sequence for protease cleavage sites. The TomRSV polyprotein sequence was scanned for the dipeptides E/S, E/G, Q/G, Q/M and Q/S (shown at top), which are common cleavage recognition sites within como-, poty- and picornavirus polyproteins (Hellen et al., 1989; Palmenberg et al., 1990; Wellink et al., 1986). Putative cleavage sites are indicated as thickened lines in the top table as well as in a diagrammatic representation of the TomRSV RNA1 polyprotein sequence. Cleavage of the CPMV B RNA polyprotein is shown as a comparison and was used to help determine likely cleavage sites. The sizes of the cleaved products are indicated along with their known and putative functions. Conserved amino acid sequence domains between TomRSV and CPMV are indicated and helped to limit the location of potential cleavage sites.

3.11  Genomic Organization of TomRSV RNA1 and RNA2

The putative genomic organization of TomRSV RNA1 and RNA2 has been determined based on analysis of the nucleotide sequences of RNA1 and RNA2, the amino acid sequences translated from the large ORFs on RNA1 and RNA2, and comparisons of the nucleotide and amino acid sequences with those of related viruses. A diagrammatic representation of the structure and function of TomRSV sequences is shown in Fig. 3.30.
3.12  Translation Initiation of TomRSV RNA

Results from section 3.6 indicate that AUG78 and/or AUG441 could function as translation initiation sites on RNA1 and RNA2, which would give rise to a 244 kDa and/or 231 kDa translation product from RNA1 and a 207 kDa and/or 194 kDa translation product from RNA2. The resolution of these possible translation products in vitro by SDS-polyacrylamide gel electrophoresis (SDS-PAGE) would be difficult. Therefore, in order to determine whether AUG78 and/or AUG441 could initiate translation in vitro, a cDNA construct (pMRD14Δ, see below) was made which contained the TomRSV 5' RNA sequence, with both potential AUGs present, and a truncated RNA2 ORF. The sizes of the translation products initiated from either AUG78 or AUG441 (41 kDa and 28 kDa, respectively) could then be readily distinguished by SDS-PAGE.

In protoplasts, relative levels of expression from AUG78 and AUG441 were assayed by fusing the 5' TomRSV sequence in-frame with the reporter gene β-glucuronidase (GUS) (plasmid p35SGUSO4, see below). The relative levels of expression from AUG78 and AUG441 were assayed by comparing the relative rates of GUS activity from protoplasts that had been transfected with different versions of p35SGUSO4 in which the potential initiation sites were altered to non-AUG triplets.

[Fig. 3.30 diagram. RNA1 (8214 nt; stop codon UAA6668; poly(A)): protease co-factor, NTP-binding (putative helicase) with transmembrane sequence, VPg, protease and polymerase; putative cleavage sites (Q/S), (Q/G), (Q/S), (Q/G), (Q/M). RNA2 (7272 nt; stop codon UAA5723; poly(A)): tandem repeats, transport protein and coat protein; putative cleavage sites (Q/G, E/G). Both RNAs show AUG441 and an upstream AUG.]

Fig. 3.30 Genomic organization of TomRSV RNA1 and RNA2 showing the 5' and 3' noncoding regions, indicated by the solid lines, and the large polyproteins, indicated by the boxed regions. Above the amino acid sequence are the putative cleavage sites and below the sequence are the putative functions. Conserved amino acid sequence domains are indicated by the small shaded boxes. Regions of similar shading at the N-termini of the RNA1 and RNA2 polyproteins indicate amino acid sequence similarity.

3.12.1  Plasmid Constructs

The clone pMRD14Δ was generated from pMRD14 (which is a full-length clone of TomRSV RNA2 linked to the T7 RNA polymerase promoter; see section 3.13.1) by digesting with HindIII and religating the compatible ends of the large fragment (Fig. 3.31). RNA transcripts, generated from the bacteriophage T7 promoter located upstream of the 5' TomRSV sequences in pMRD14Δ using T7 DNA-dependent RNA polymerase, were translated in vitro in wheat germ extracts and analyzed as described below.

Plasmid p35SGUSO4, whose construction is summarized in Fig. 3.32, was derived via the following intermediates. In two separate sets of restriction enzyme digestions, the 571 bp EcoRI/HaeIII fragment from pMRD14Δ, which contained the T7 RNA polymerase promoter and the 5' region of TomRSV RNA2, and the 2.3 kbp SacI/XbaI fragment containing the TomRSV 3' region were gel purified. Plasmid pAGUS-1 (Skuzeski et al., 1990) was digested with HindIII and the ends were treated with mung bean nuclease. This was then digested with SacI and the 1.8 kbp fragment containing the β-glucuronidase reporter gene (GUS) purified. The 571 bp fragment corresponding to the TomRSV 5' region and T7 RNA polymerase promoter, the 1.8 kbp fragment corresponding to the GUS gene and the 2.3 kbp fragment corresponding to the TomRSV 3' region were ligated into EcoRI/SacI-digested pMR1 (pMR1 is essentially pUC19 with most of the multicloning site removed and replaced with part of the Bluescript multicloning site; see section 3.13.1) to obtain pR2GUSO4 (Fig. 3.32). pR2GUSO4 contains the TomRSV 5' sequence, which includes AUG78 and AUG441, fused in-frame with the GUS ORF followed by the TomRSV 3' noncoding region. The TomRSV/GUS ORF in-frame fusion was confirmed by dideoxynucleotide sequencing at the junction. Plasmid pR2GUSO4 was digested with SacI and XbaI, the ends treated with mung bean nuclease and the large fragment containing the entire insert gel purified.

[Fig. 3.31 diagram: maps of pMRD14 and pMRD14Δ showing the T7 promoter, 5' noncoding region, AUG78, AUG441, HindIII sites, 3' noncoding region, poly(A39) and XbaI site; pMRD14Δ is obtained by HindIII digestion and religation.]

Fig. 3.31 Diagram showing the construction of plasmid pMRD14Δ from pMRD14. pMRD14Δ lacks the large HindIII fragment of pMRD14 that corresponds to most of the TomRSV RNA2 coding sequence. For information concerning pMRD14 see section 3.13.1.

[Fig. 3.32 diagram. (A) Maps of the TomRSV 5' and 3' fragments, pAGUS-1, pR2GUSO4 and p35SGUSO4, showing the T7 promoter, AUG78, AUG441, GUS ORF, CaMV 35S promoter, NOS terminator, poly(A39) and the restriction sites used. (B) Junction between CaMV 35S, T7 and TomRSV 5' sequences. Expected: CATTTCATTTGGAGAGGGTAATACGACTCACTATAGTAAGCG... Found in p35SGUSO4: CATTTCGACTCACTATAGTAAGCG]

Fig. 3.32 (A) Diagram summarizing the construction of clone p35SGUSO4. (B) Comparison of the sequences at the junction between the CaMV 35S promoter and bacteriophage T7 promoter sequences in p35SGUSO4 with what would have been expected from the cloning experiment. Restriction enzyme cleavage sites with an X through them have been destroyed as a result of the cloning procedure.
The plasmid pAGUS-1 was digested with HindIII/SacI, the ends treated with mung bean nuclease, and the fragment containing the vector sequence, the cauliflower mosaic virus 35S promoter (CaMV 35S) and the nopaline synthase termination sequence (NOS ter) was gel purified. The two purified fragments were ligated together to create clone p35SGUSO4 (Fig. 3.32). In plasmid p35SGUSO4, the pR2GUSO4 insert is placed between the CaMV 35S promoter and NOS ter. The junction between the 35S promoter and insert sequence was analyzed by dideoxynucleotide sequencing, which indicated that treatment with mung bean nuclease resulted in excess digestion by the enzyme at both ends prior to ligation. Fig. 3.32 shows the junction sequence for clone p35SGUSO4 compared to what was expected; however, the 5' TATA box of the CaMV 35S promoter was still present (not shown in Fig. 3.32) and the deleted sequences in p35SGUSO4 did not appear to adversely affect CaMV 35S promoter activity.

AUG78 was changed to AAU by in vitro mutagenesis using the mutagenic oligonucleotide #9 (5' AACAAATGGAGGAATTCAAAAGAAAAGAAA 3'; mutation sites are underlined) complementary to nucleotide positions 64-94 as the mutagenic primer (see Materials & Methods section 2.10). The mutation also introduced a new EcoRI site into the construct. Similarly, AUG441 was changed to CUC using the mutagenic oligonucleotide #10 (5' AAGAACAACCTCGAGTGCAGTGGAGG 3'; mutated sites are underlined) complementary to nucleotide positions 430-455, which also introduced the restriction site XhoI. Following the in vitro mutagenesis experiments, the resulting clones were screened for plasmids which contained either the EcoRI or XhoI sites and the mutations were confirmed by dideoxynucleotide sequencing. The entire insert was sequenced to confirm that no other mutations had been introduced, and the insert was subcloned back into p35SGUSO4. The resulting clones contained a mutation at either AUG78 (ΔAUG78), AUG441 (ΔAUG441), or both AUG78 and AUG441 (ΔAUG78+441) (Fig.
3.33).

3.12.2  Site of Initiation In Vitro

Synthetic RNA transcripts were translated in wheat germ extracts and the translation products analyzed by SDS-PAGE followed by fluorography (Fig. 3.34). Transcripts derived from pMRD14Δ resulted in the synthesis of a single predominant band with a Mr of 41,000. Translation initiation at AUG78 would give rise to a polypeptide of 41 kDa, whereas initiation at AUG441 would give rise to a 28 kDa polypeptide. From these results it can be concluded that initiation in vitro occurs preferentially at AUG78. Translations in rabbit reticulocyte lysates gave similar results (not shown).

3.12.3  Site of Initiation in Protoplasts

Equal amounts of the plasmid constructs p35SGUSO4, ΔAUG78, ΔAUG441 and ΔAUG78+441 were transfected into N. plumbaginifolia protoplasts. Transfected protoplasts were incubated for 16-24 h and GUS activity assayed spectrophotometrically or fluorometrically. Table 3 lists GUS activity for three replicates of each construct assayed spectrophotometrically over five hours. From these data, the rate of GUS activity for each replicate was determined, averaged for each construct, and compared relative to p35SGUSO4, which was arbitrarily assigned an average rate of GUS activity of 1 (Fig. 3.35). Protoplasts transfected with p35SGUSO4 and ΔAUG441 had similar levels of GUS activity. Protoplasts transfected with ΔAUG78 and ΔAUG78+441 had similar levels of GUS activity, but these were almost an order of magnitude lower than those of p35SGUSO4 and ΔAUG441. Levels of GUS activity from ΔAUG78 and ΔAUG78+441 were 4-fold above background. These results confirm the in vitro translation results that translation initiation of TomRSV RNA1 and RNA2 begins at AUG78.

                 AUG78       AUG441
p35SGUSO4    --X77--AUG--X360--AUG--
ΔAUG78       --X77--AAU--X360--AUG--
ΔAUG441      --X77--AUG--X360--CUC--
ΔAUG78+441   --X77--AAU--X360--CUC--

Fig. 3.33 Summary of relevant sequences in mutant constructs used to identify the TomRSV AUG codon used for translation initiation in vivo. Xn refers to the number of viral nucleotides before AUG78 and between AUG78 and AUG441.

Fig. 3.34 In vitro translation product of synthetic RNA transcripts generated from the T7 promoter of pMRD14Δ linearized with XbaI. Synthetic transcripts were translated in wheat germ extracts (Promega) in the presence of 35S-methionine and the products separated by electrophoresis through a 15% polyacrylamide gel containing SDS and fluorographed. A single prominent band of 41K was detected. Molecular size markers used were the translation products synthesized from BMV RNA supplied with the translation kit (not shown).

Table 3. Spectrophotometric determination of GUS activity (absorbance at 415 nm) for each plasmid construct in triplicate.

Time (h)          Mock                    p35SGUSO4               ΔAUG441                 ΔAUG78                  ΔAUG78+441
                  1       2       3       1       2       3       1       2       3       1       2       3       1       2       3
0.0               0.001   0.001   0.001   0.001   0.000   0.000   0.000   0.001   0.001   0.000   0.000   0.001   0.001   0.001   0.001
0.5               0.000   0.002   0.001   0.090   0.098   0.154   0.111   0.162   0.145   0.009   0.011   0.011   0.012   0.014   0.013
1.0               0.001   0.001   0.001   0.208   0.220   0.339   0.256   0.383   0.324   0.023   0.025   0.028   0.028   0.033   0.031
2.0               0.002   0.002   0.001   0.482   0.536   0.780   0.577   0.827   0.718   0.055   0.061   0.064   0.066   0.076   0.077
3.0               0.003   0.003   0.003   0.757   0.790   1.095   0.850   1.182   1.035   0.082   0.094   0.103   0.099   0.112   0.105
4.0               0.007   0.005   0.002   1.066   0.980   1.470   1.011   1.448   1.257   0.123   0.140   0.132   0.146   0.157   0.160
5.0               0.011   0.005   0.002   1.295   1.204   1.683   1.423   1.698   1.567   0.149   0.793   0.197   0.191   0.222   0.224
Rate of change(a) 0.0020  0.0008  0.0003  0.2673  0.2473  0.3503  0.2380  0.3499  0.3166  0.02078 0.03811 0.03807 0.03821 0.04366 0.04321
Error             0.0087  0.0088  0.0049  0.0100  0.0100  0.0099  0.0098  0.0099  0.0099  0.0100  0.0099  0.0098  0.0099  0.0098  0.0099

(a) Determined from the slope of A415 values versus time using the computer program Cricket Graph on a Macintosh computer.

[Fig. 3.35 graph: bars for Mock, p35SGUSO4, ΔAUG441, ΔAUG78 and ΔAUG78+441.]

Fig. 3.35 Graph showing the relative rates of GUS activity determined for each of the four plasmid constructs p35SGUSO4, ΔAUG441, ΔAUG78 and ΔAUG78+441 in N. plumbaginifolia protoplasts. Rates of GUS activity are relative to p35SGUSO4, which was arbitrarily given a value of 1.

3.13  Synthesis of Full-Length cDNA Clones to TomRSV RNA

Full-length cDNA clones corresponding to TomRSV RNA1 and RNA2 were constructed downstream of the bacteriophage T7 promoter, from which RNA transcripts corresponding to RNA1 and RNA2 could be synthesized in vitro. Full-length clones were constructed from the cDNA clones used previously to determine the nucleotide sequence of TomRSV RNA1 and RNA2. The 5' and 3' termini of TomRSV RNA were cloned using specific oligonucleotides as primers, such that the 5' terminus was linked directly to the T7 promoter sequence and the 3' terminus ended with a poly(A) sequence followed by a unique restriction enzyme cleavage site (XbaI) not present in either the RNA1 or RNA2 sequence.

3.13.1  Synthesis of a Full-Length Clone of TomRSV RNA2

The 5' terminus of TomRSV RNA2 was cloned downstream of the T7 RNA promoter using synthetic oligonucleotide #5 (5' TCCCCGCGGTAATACGACTCACTATAG(AT)(AT)AGCGAAAAATCTGGT 3', where the nucleotide positions surrounded by brackets are degenerate for the nucleotides indicated), which contains the restriction enzyme site SacII, the T7 promoter sequence and the 5' terminal 17 nucleotides of the viral RNA sequence. Since the first two nucleotides at the 5' terminus of TomRSV RNA could not be determined (see section 3.3), the oligonucleotide was made degenerate at these two sites for either an A or T (the first and second nucleotides of other nepo-, como-, poty- and picornaviruses are often A or U, usually two U residues). The cloning strategy is summarized in Fig. 3.36. First-strand cDNA to TomRSV viral RNA was primed using oligo #2, which is complementary to residues 2,201-2,216 of RNA2.
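The design of oligonucleotide #5 can be checked with a short Python sketch that expands the two degenerate positions and locates the SacII site, the T7 promoter and the viral portion. The oligonucleotide sequence is the one given in the text; the SacII recognition sequence (CCGCGG) and the T7 promoter sequence used here are standard values, and the function name is illustrative only.

```python
from itertools import product

SACII_SITE = "CCGCGG"                # SacII recognition sequence
T7_PROMOTER = "TAATACGACTCACTATAG"   # ends with the first transcribed G

# Oligonucleotide #5 as written in the text; "(AT)" marks a position
# that is degenerate for A or T.
OLIGO_5 = "TCCCCGCGGTAATACGACTCACTATAG(AT)(AT)AGCGAAAAATCTGGT"


def expand_degenerate(oligo):
    """Expand every '(XY)' group into all concrete sequences."""
    choices = []
    position = 0
    while position < len(oligo):
        if oligo[position] == "(":
            close = oligo.index(")", position)
            choices.append(oligo[position + 1:close])
            position = close + 1
        else:
            choices.append(oligo[position])
            position += 1
    return ["".join(combination) for combination in product(*choices)]


variants = expand_degenerate(OLIGO_5)

for variant in variants:
    start = variant.index(T7_PROMOTER) + len(T7_PROMOTER)
    viral_part = variant[start:]
    print(variant, SACII_SITE in variant, len(viral_part))
```

Under these assumptions the oligonucleotide pool contains four sequences, each carrying the SacII site, the T7 promoter and 17 nucleotides corresponding to the viral 5' terminus, beginning with AA, AT, TA or TT.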
The RNA used as a template for cDNA synthesis was from a different viral isolate from that used for the original cloning experiments. The resulting cDNA was amplified using PCR and oligonucleotide primers #2 and #5. The product was digested with SacII/SmaI and the 0.4 kb fragment corresponding to the 5' end of TomRSV RNA together with the T7 promoter was gel purified and cloned into SacII/SmaI-digested Bluescript. Several clones were sequenced and found to contain a number of nucleotide changes compared to the original TomRSV cDNA clones. In order to minimize the number of changes introduced, compared to the original TomRSV sequence, the SacII/TaqI fragment (150 bp) from one of these clones (plasmid p58) was ligated together with the TaqI/SmaI fragment (250 bp) from clone 25P6 (see Fig. 3.1) into SacII/SmaI-digested Bluescript, resulting in p3TA. p3TA contained 6 nucleotide changes compared to the originally sequenced clones (A to G at position 36, C to T at position 40, C to T at position 44, T to C at position 55, C to T at position 98 and C to T at position 113). In addition, p3TA had a T and an A at the first and second positions, respectively.

p23TA19, which contains over half of the 5' region of TomRSV RNA2, was created by ligating together the following three fragments: the large SmaI/KpnI fragment from p3TA, the insert-specific 1.8 kb SmaI/AatII fragment of RNA2 clone 035 (see Fig. 3.1) and the insert-specific AatII/KpnI 2.9 kb fragment of clone K6 (see Fig. 3.1). The cDNA clone in p84 (see Fig. 3.36) corresponds to the 3' terminus of TomRSV RNA and was obtained by priming first-strand cDNA synthesis with oligo #6 (5' CTAGTCTAGATTTTTTTTTTTTTTT 3') together with the same stock of RNA used to obtain the 5' clone p58 (see above). Oligo #6 introduced a unique XbaI site at the 3' end of the cDNA following the poly(A) sequence.
The cDNA was digested with XbaI/EcoRI to yield a 1.3 kb fragment corresponding to the TomRSV RNA 3' terminal sequence, which was cloned into the XbaI/EcoRI sites of Bluescript. The insert in p84 was sequenced and found to contain three nucleotide changes compared to clone K6 (A to G at position 7,114, T to C at position 7,124 and T to C at position 7,189) and had 39 T residues preceding the XbaI site, which would correspond to the poly(A) tail in the transcribed RNA.

[Fig. 3.36 diagram: restriction maps of the fragments from p58, p3TA, p23TA19, p84, K6 and pR2184 assembled into pMRD14, with the primers used:
Oligo #5  5' TCCCCGCGGTAATACGACTCACTATAG(A/T)(A/T)AGCGAAAAATCTGGT 3'  (SacII site, T7 promoter, TomRSV 5' sequence)
Oligo #6  3' TTTTTTTTTTTTTTTAGATCTGATC 5'  (complementary to TomRSV 3' poly(A), XbaI site)]

Fig. 3.36 Construction of a full-length TomRSV RNA2 cDNA clone used for preparation of synthetic RNA inoculum. Full-length cDNA was cloned into pUC19 downstream from a T7 promoter. cDNA inserts corresponding to TomRSV sequence are represented by thick lines. Plasmid inserts in p58 and p84 were derived using oligo #5 and oligo #6, respectively, as primers during cDNA synthesis. The differential origin of the restriction enzyme fragments is indicated by shading. See text for further details.

p184 was generated by ligating the small 370 bp insert-specific PvuII/XbaI fragment from p84 and the 370 bp PstI/PvuII fragment from clone K6 together into XbaI/PstI-digested Bluescript. p184 was digested with PstI/KpnI and ligated together with the 1.8 kbp KpnI/PstI fragment from K6 to obtain pR2184. Finally, the complete TomRSV RNA2 clone (pMRD14) was generated by ligation of the following three fragments: the insert-specific 2.5 kbp KpnI/XbaI fragment from pR2184, the insert-specific 4.7 kbp SacII/KpnI fragment from p23TA19 and plasmid pMR1 digested with SacII/XbaI. Clone pMR1 (not shown) is essentially pUC19 with a different multicloning site and was generated as follows.
Plasmid pUC19 was digested with SacI/HindIII, which cut at either end of the multicloning site, and the plasmid was purified from the small multicloning site fragment. Bluescript was also digested with SacI/HindIII and the multicloning site fragment gel purified. The Bluescript multicloning site was then ligated into pUC19 to produce pMR1. Following each ligation step leading to the construction of pMRD14, the resulting plasmid constructs were extensively analyzed by restriction enzyme digests to make sure that the number, orientation and integrity of cloned inserts were as expected. To obtain RNA transcripts from clone pMRD14, the plasmid was linearized with XbaI, treated with mung bean nuclease (optional) and then incubated with bacteriophage T7 RNA polymerase in the presence of the four ribonucleotides. Resulting transcripts should have a single nonviral G residue at the 5' end followed by a U and an A, and a single nonviral U residue at the 3' end.

3.13.2  Synthesis of a Full-Length Clone of TomRSV RNA1

The strategy used to construct a full-length clone corresponding to RNA1 is summarized in Fig. 3.37. p15SP1, which corresponds to the 5' half of RNA1, was constructed by ligating together the following three fragments: the large SmaI/PstI fragment from p3TA (see previous section), the small insert-specific SmaI/HindIII fragment from B54 (see Fig. 3.1) and the insert-specific HindIII/PstI fragment from G82 (see Fig. 3.1). To construct the full-length cDNA clone to RNA1 (pMR10), the following three fragments were ligated together: the SacII/HpaI fragment containing the vector sequence along with the 3' terminal sequence of TomRSV RNA from pMRD14 (see Fig. 3.36), the insert-specific PstI/HpaI fragment from J27 (see Fig. 3.1) and the complete SacII/PstI insert from p15SP9. Plasmid constructs were tested following each ligation step as for pMRD14 above.
RNA transcripts were generated from pMR10 after linearization with XbaI as described for pMRD14.

3.13.3  In Vitro Translation of pMRD14 and pMR10

In vitro RNA transcripts generated from pMRD14 and pMR10 were tested in wheat germ extracts for the synthesis of large translation products corresponding in size to the RNA1 and RNA2 encoded polyproteins (Fig. 3.38). RNA transcripts from pMRD14, which correspond to TomRSV RNA2, resulted in the synthesis of a ca. 200K translation product, while those from pMR10, which correspond to TomRSV RNA1, resulted in the synthesis of a slightly larger translation product. These results correspond well with the predicted sizes of the TomRSV RNA1 and RNA2 translation products, which are 244 kDa and 207 kDa, respectively.

[Fig. 3.37 diagram: restriction maps (SacII, SmaI, HindIII, PstI, HpaI and XbaI sites) of the fragments assembled into p15SP1 and pMR10, with the T7 promoter and poly(A39) indicated.]

Fig. 3.37 Construction of a full-length TomRSV RNA1 cDNA clone used for the preparation of synthetic RNA inoculum. The full-length cDNA was constructed next to the T7 promoter in pUC19. cDNA inserts are indicated by the thick lines. The differential origin of restriction fragments is indicated by shading (see text for further details).

Fig. 3.38 In vitro translation products of RNA transcripts derived from pMR10 and pMRD14, which correspond to TomRSV RNA1 and RNA2, respectively. 2 µg of synthetic RNA2 was translated in wheat germ extracts in the presence of 35S-methionine and the products analyzed by electrophoresis through 7.5% polyacrylamide gels containing SDS. 14C-methylated protein markers (Amersham) were used as size markers (not shown).
The large amount of smaller translation product is probably due to premature termination of synthesis, as suggested by time course experiments in which the amount of full-size protein increases over time (not shown).

3.13.4  Inoculation of RNA Transcripts onto Plants

Transcripts generated from pMRD14 and pMR10 (5 to 20 µg of each RNA) were rub inoculated onto Chenopodium amaranticolor leaves that had been previously dusted with fine carborundum. Both capped and uncapped transcripts were used. Control plants inoculated with a dilution series of TomRSV viral RNA (1 to 100 ng) gave local lesion symptoms 4-5 days post-inoculation; however, plants inoculated with synthetic TomRSV RNA did not show symptoms even 4 weeks post-inoculation. This suggests that the synthetic transcripts were not infectious. The number of local lesions on control plants inoculated with virion RNA was directly related to the amount of RNA used in the inoculum. As little as 5 ng of virion RNA could be reproducibly detected as 5 +/- 2 lesions on an inoculated leaf. This would suggest that the synthetic transcripts were at least 40,000 times less infectious than wild type RNA. It is possible, however, that the transcripts were infectious but correspond to a mutant of TomRSV which no longer gives local lesions on Chenopodium amaranticolor.

DISCUSSION

4.1  Structure and Function of TomRSV RNA

Fig. 4.1 shows the genomic organization of TomRSV and compares it with the nepovirus tomato blackring (TBRV), the comovirus cowpea mosaic (CPMV), the potyvirus tobacco etch (TEV), and the picornavirus polio. TomRSV has the largest genome of any of these viruses, due in part to the unusually large size of the 3' noncoding regions (discussed in more detail below), and has the potential to encode two large polyproteins of 244 and 207 kDa.
The following is a discussion of the structure, function and possible proteolytic processing of the large polyproteins encoded by TomRSV RNA1 and RNA2.

4.1.1  Structure and Function of TomRSV RNA1

TomRSV RNA1 is 8,214 nucleotides in length [excluding the 3' poly(A) tail], which is similar to the known and estimated sizes of the RNA1 components for the other nepoviruses that have been studied. The structure of TomRSV RNA1 is similar to that proposed for TBRV (see Fig. 4.1) as well as the other sequenced nepoviruses, grapevine chrome mosaic (GCMV) and grapevine fanleaf (GFLV) (not shown). Compared to TBRV (and the other nepoviruses as well), TomRSV RNA1 has a much longer 3' noncoding region: 1,547 nucleotides compared to 304 nucleotides for TBRV. The C-terminal ca. two thirds of the TomRSV RNA1-encoded polyprotein codes for a number of proteins potentially involved in RNA replication, the order of which is conserved among all nepoviruses and the other members of the viral supergroup to which it belongs (see Fig. 4.1). In the N- to C-terminal orientation these sequences include a nucleotide binding domain (NTP-binding), which is likely part of an RNA helicase, a viral cysteine (Cys) protease domain and an RNA-dependent RNA polymerase domain (RDRP). It has also been shown for the picorna- and comoviruses as well as for the nepovirus GFLV that the VPg is located between the helicase and protease genes and, by analogy, it is very likely that TomRSV also codes for its VPg at this location. It has been suggested that this group of proteins, conserved between all members of the viral supergroup, can only function as a unit during viral replication, hence their conservation as a group (Eggen & van Kammen, 1988). The N-terminal amino acid sequences of the nepovirus RNA1 and comovirus B RNA polyproteins contain an additional conserved domain. This region of CPMV encodes the 32K protein, which has been shown to function as a protease co-factor (Vos et al., 1988). Based on amino acid sequence similarities, it is probable that TomRSV, as well as the other nepoviruses, also codes for a protease co-factor at this location. In the corresponding genomic location, the picornaviruses encode the 2A protease. There is additional coding sequence present N-terminal of the putative protease co-factor domain in TomRSV and the other nepoviruses which is not present in CPMV. This sequence is also not present in the picorna- or potyviruses. The function of this region in TomRSV or the other nepoviruses is unknown.

[Fig. 4.1 diagram: genome maps of TomRSV, TBRV, CPMV, polio and TEV, labelled with trans, CP, co-factor, VPg, NTP, pro and pol (plus L and S for CPMV; VP2, VP3 and VP1 for polio; HC-pro for TEV) and 3' AAA.]

Fig. 4.1 The proposed genomic organization of TomRSV compared to the nepovirus TBRV, the comovirus CPMV, the picornavirus polio and the potyvirus TEV. Lines indicate noncoding sequence and the bars represent the polyprotein sequence encoded by the long open reading frames. Vertical lines through the bars indicate known and putative cleavage sites. The conserved amino acid sequence domains are indicated by similarly shaded boxes: protease co-factor, NTP-binding domain, protease domain and RNA-dependent RNA polymerase. Abbreviations are coat protein (CP), transport protein (trans), NTP-binding domain (NTP), protease (pro), RNA-dependent RNA polymerase (pol), helper component (HC) and polyadenylate (AAA).

Cysteine Protease

Essentially all viruses that use a polyprotein translation strategy encode the proteolytic enzymes required for polyprotein maturation.
Cleavage of the CPMV B RNA polyprotein through several successive proteolytic steps has been suggested to play an important regulatory role in viral RNA replication (Peters et al., 1992; Eggen & van Kammen, 1988). The putative TomRSV RNA1-encoded protease was identified by amino acid comparisons with the picornavirus 3C protease (3Cpro) (Palmenberg, 1979), the comovirus CPMV 24K protease (Verver et al., 1987), the potyvirus tobacco etch (TEV) 49K protease (Carrington & Dougherty, 1987), the nepovirus GFLV 24 kDa protease (Margis & Pinck, 1992) as well as the putative proteases encoded by the nepoviruses TBRV and GCMV (Greif et al., 1988; Le Gall et al., 1989), all of which retain a conserved core region thought to be the active site (see Fig. 3.25). Site-directed mutagenesis of the putative protease active site in poliovirus, CPMV and GFLV results in reduced or abolished protease activity (Ivanoff et al., 1986; Dessens & Lomonossoff, 1991; Margis & Pinck, 1992).

Known proteases can be divided into 4 groups depending on whether their active sites include a serine (Ser), cysteine (Cys), aspartic acid (Asp) or the divalent cation zinc (Zn2+) (Neurath, 1984). Early studies compared 3Cpro with the cellular Cys proteases when inhibitor studies identified a Cys residue as a principal catalytic site (Pelham, 1978). The viral Cys residue and an additional histidine (His), which could be aligned with a His residue present at the active site of the cysteine protease, were the only conserved features between the cellular Cys and viral proteases. It was therefore suggested that these two groups of enzymes were unrelated, and that functional similarity was a result of convergent evolution. Similarity was also noted between the cellular serine (Ser) and viral proteases, which was also attributed to convergence.
The concept of convergence was challenged after the completion of the TBRV and GCMV nepovirus sequences, which did not retain the conserved His residue, and the observation that the sobemovirus Ser protease had significant sequence similarity with 3Cpro (Gorbalenya et al., 1989). Two subclasses of Ser proteases have been identified which have the same spatial arrangement of the catalytic triad His/Asp/Ser with very different tertiary structures. They are represented by trypsin, which has a twin β-barrel motif, and subtilisin, which has an α-helix β-barrel structure. These two subgroups have been cited as examples of convergent evolution (Neurath, 1984). Primary and secondary structural patterns have since suggested that the viral Cys proteases are in fact evolutionarily related to the cellular trypsin-like Ser proteases (Bazan & Fletterick, 1988). These patterns include the conservation of the trypsin-like protease catalytic triad, His-57/Asp-102/Ser-195 (Kraut et al., 1977) (where the numbers refer to the position of the amino acids in trypsin), with Cys replacing Ser in the active site, and the conservation of the 12 β strands and the loops that define the active center and binding site of the enzyme.
It is perhaps not surprising, therefore, that replacement of Cys with Ser at the putative protease active site in CPMV results in only reduced protease activity, while a similar amino acid replacement at the putative GFLV protease active site has no effect on protease activity (Dessens & Lomonossoff, 1991; Margis & Pinck, 1992). The conserved His residue, initially identified in 3Cpro from comparisons with the Cys proteases, has since been suggested to be important for recognition of the cleavage site (Bazan & Fletterick, 1988).

The most common cleavage sites for the 3C-like protease of the picorna-, como- and potyviruses include glutamine/methionine (Q/M), glutamine/serine (Q/S), glutamine/glycine (Q/G), glutamate/serine (E/S) and glutamate/glycine (E/G) (for a review see Hellen et al., 1989). The previously identified His residue, however, is not conserved among the nepoviruses TBRV, GCMV and GFLV. This has led Ritzenthaler et al. (1991) to speculate that the unusual cleavage sites determined and proposed for these viruses [arginine/alanine (R/A) and lysine/glycine (K/G)] are a result of this amino acid change. Interestingly, the TomRSV protease domain sequence retains the His residue, which suggests that the TomRSV protease may have more in common with the proteases of the picorna-, como- and potyviruses. The putative cleavage sites determined for the TomRSV polyprotein include Q/G, Q/S and Q/M, sites which are also more common among the picorna-, como- and potyviruses than among the other nepoviruses.

4.1.1.2 NTP-Binding Domain

Hydrolysis of nucleoside triphosphates (ATP and GTP) is the energy source that drives virtually all major biochemical reactions. Many of the proteins involved in this process can be identified by a widely conserved amino acid sequence pattern or NTP-binding site (Walker et al., 1982). The NTP-binding site is composed of two motifs, the N-terminal "A" site and the C-terminal "B" site, separated by 18 to 502 amino acid residues.
The conserved "A" and "B" site residues are: (hydrophobic stretch) (G/A)xx(G)xGKS/T and (hydrophobic stretch) D(E/D), respectively (see also Fig. 3.26). The hydrophobic stretches are defined as at least two hydrophobic residues within the five amino acid positions preceding the "A" site and at least three hydrophobic residues within the five amino acid positions preceding the "B" site. Residues in brackets are not necessarily conserved in all cases (Gorbalenya & Koonin, 1989) and x refers to any amino acid. X-ray studies of some of these proteins indicate that both motifs fold in a beta-sheet - turn - alpha-helix structural unit (La Cour et al., 1985; Fry et al., 1986). The "A" site binds the phosphoryl moiety of ATP or GTP at the central G-rich loop (Walker et al., 1982), whereas the "B" site chelates the Mg2+ of Mg-NTP at the invariant D residue (Gorbalenya et al., 1989). Proteins containing the NTP-binding domain of both cellular and viral origin have been grouped into four main supergroups (Gorbalenya & Koonin, 1989).

It has been suggested that viral proteins with the conserved NTP-binding domain are helicases that unwind duplex DNA or RNA during replication, transcription, recombination and repair. Evidence that a (+)ssRNA virus-encoded protein with an NTP-binding domain does in fact bind NTP comes from studies with the tobamovirus tobacco mosaic virus (TMV) p126 (Evans et al., 1985). More recent studies with the potyvirus plum pox virus (PPV) CI protein, which also contains an NTP-binding domain, showed that this protein has helicase activity which is dependent on the hydrolysis of NTP to NDP and Pi (Lain et al., 1990). Although the potyviruses belong to the same viral supergroup as the picorna-, como- and nepoviruses, the potyvirus NTP-binding domain has been grouped separately from these other viruses (Gorbalenya & Koonin, 1989).
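The "A" and "B" site consensus described in this section can be illustrated with a simple pattern search. The following Python sketch is only an illustration under stated assumptions and is not the procedure used in this study: the function names, the set of residues counted as hydrophobic, the decision to treat the bracketed consensus residues as required, the way the spacing between the two sites is measured, and the toy sequence are all hypothetical.

```python
import re

# Assumption: residues counted as hydrophobic for this sketch only.
HYDROPHOBIC = set("AVLIMFWY")

# "A" site consensus (G/A)xx(G)xGK(S/T); bracketed residues are treated as
# required here, although the text notes they are not always conserved.
A_SITE = re.compile(r"[GA]..G.GK[ST]")
# "B" site consensus D(E/D); a lookahead lets overlapping matches be found.
B_SITE = re.compile(r"(?=D[ED])")


def hydrophobic_count(seq, start, window=5):
    """Count hydrophobic residues in the `window` positions before `start`."""
    return sum(aa in HYDROPHOBIC for aa in seq[max(0, start - window):start])


def find_ntp_binding_sites(seq, min_gap=18, max_gap=502):
    """Return (A-site start, B-site start) pairs that satisfy the consensus.

    The gap is measured from the end of the "A" site to the start of the
    "B" site (an assumption about how the 18 to 502 residue spacing is counted).
    """
    hits = []
    for a in A_SITE.finditer(seq):
        if hydrophobic_count(seq, a.start()) < 2:
            continue
        for b in B_SITE.finditer(seq, a.end()):
            gap = b.start() - a.end()
            if min_gap <= gap <= max_gap and hydrophobic_count(seq, b.start()) >= 3:
                hits.append((a.start(), b.start()))
    return hits


# Hypothetical toy sequence, not taken from any viral protein.
toy = "MKLVV" + "GSPGTGKS" + "Q" * 20 + "LIV" + "DE" + "K"
print(find_ntp_binding_sites(toy))
```

A search of this kind only flags candidate sites; as noted in the text, assigning NTPase or helicase function still requires alignment with known proteins and experimental evidence.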
The potyvirus CI protein has nine amino acid sequence motifs in common with the animal flavi- and pestivirus NTP-domain containing proteins as well as the cellular translation initiation factor eIF-4A-related group of helicase-like proteins. The nepo-, como- and picornavirus NTP-domain containing proteins share seven amino acid domains with the NTP-domain containing large T-antigen of the animal papovavirus SV40, a protein which has DNA-dependent ATPase and DNA and RNA helicase activities (Stahl et al., 1986; Scheffner et al., 1989).

Site-directed mutagenesis of conserved residues in the NTP-binding motif of the poliovirus 2C protein also indicates that this protein has a role in virus replication and proliferation. The only revertants that were obtained in that study were of wild-type sequence, underscoring the importance of these residues (Mirzayan & Wimmer, 1992). The finding that the TomRSV RNA1 polyprotein also contains an NTP-binding domain is strong evidence that this region has NTPase activity. Whether or not this is associated with helicase activity is less certain. Amino acid sequence alignments of the putative helicase proteins from a number of RNA viruses have identified seven helicase motifs, which include the "A" and "B" sites of the NTP-binding domain. These seven motifs are conserved to varying degrees between the different (+)ssRNA viruses, and have been useful in defining the three viral supergroups (Gorbalenya & Koonin, 1989; Habili & Symons, 1989). TomRSV, like the picorna-, como- and other nepoviruses, contains only the two NTP-binding motifs (the "A" and "B" sites), and there is no direct experimental evidence from the picorna-, como- or nepoviruses that the NTP-binding domain has an associated helicase activity. Global similarity between the amino acid sequences surrounding the NTP-binding domain region of TomRSV and the other viruses is very low, making it difficult to align these regions other than at the conserved motifs (see Fig. 3.26).
However, in common with the other putative picorna-, como- and nepovirus NTP-binding proteins, the TomRSV sequence has a very hydrophobic region C-terminal to the NTP-binding domain which is a possible transmembrane spanning sequence. Purification of the picorna- and comovirus replication complexes has been hampered by the fact that these complexes are tightly associated with membranes (Butterworth et al., 1976; Caliguiri & Tamm, 1970; Dorssers et al., 1984). It is possible that this region of the NTP-binding protein is responsible for the attachment of the replication complex to lipid membranes in the picorna-, como- and nepoviruses, including TomRSV.

4.1.1.3 RNA-Dependent RNA Polymerase Domain

RNA-dependent RNA polymerases (RDRP) are encoded by all RNA viruses and were first identified as components of double-stranded and negative-stranded RNA virus particles (for a review see Ishihama & Nagata, 1988). Among these viruses, mRNAs must first be synthesized before the viral genes can be translated, hence the necessity to package a polymerase along with the viral RNA. In the case of retroviruses, virions were shown to contain an RNA-dependent DNA polymerase (RDDP). A protein isolated from cells infected by poliovirus, and which was shown to be encoded at the C-terminus of the poliovirus polyprotein, could act as an RDRP (Lundquist et al., 1974). The RDRP consensus sequence was first defined when the polio RDRP amino acid sequence was aligned with the putative RDRPs from a number of plant, animal and bacterial viruses (Kamer & Argos, 1984). Since then, every RNA virus sequenced, whether single-stranded or double-stranded, of positive or negative polarity, has been found to code for a protein which has an RDRP domain or, in the case of the retroviruses, an RDDP domain, which is very similar.
The ubiquity of the RDRP domain and its fundamental role in viral replication make this sequence, together with the NTP-binding domain, an attractive marker to study the molecular evolution of RNA viruses. Recognition of a third RNA virus supergroup was based largely on the relationships between the different viral RDRP and NTP-binding domains, which also supported the classification of the original two supergroups (Koonin, 1991a; Habili & Symons, 1989). As a result, others have suggested that the NTP and RDRP domains be used for the taxonomic grouping of RNA plant viruses (Candresse et al., 1990). Analysis of the TomRSV RDRP domain described in this thesis (see Fig. 3.24) places it comfortably among the other nepo- and comoviruses and more distant from the picorna- and potyviruses. The size of the putative TomRSV RDRP protein is similar to that of CPMV and smaller than that of TBRV (see Fig. 4.1) (and the other nepoviruses also, which are not shown in Fig. 4.1). The proposed N-terminal cleavage site which would release the putative RDRP protein in TomRSV and TBRV (and the other nepoviruses also) and the determined cleavage site for CPMV (see Fig. 4.1) can be aligned, as can the following amino acid residues, including the RDRP domain. It appears that the C-terminal region of the TomRSV RDRP-containing polyprotein is truncated relative to the other nepoviruses by ca. 10%, and in this regard more closely resembles the RDRP-containing protein of CPMV. It may also be possible that the loss of nonessential sequences within the large TomRSV RNA1 molecule is due to packaging constraints within the virion.

4.1.1.4 Protease Co-Factor

A conserved amino acid sequence domain was identified in the TomRSV RNA1 polyprotein sequence that is also present in the equivalent regions of the TBRV, GCMV and GFLV RNA1-encoded polyproteins and the CPMV B RNA-encoded 32K protein (see Fig. 3.27). This domain may constitute an important functional site for a protease co-factor protein.
Ritzenthaler et al. (1991) identified this domain in GFLV, TBRV and GCMV by comparisons with the CPMV 32K protein. The comovirus CPMV 32K protein has been shown to act as a co-factor for the CPMV 24K protease during processing of the M RNA-encoded polyprotein at the Q/M site (Vos et al., 1988; Peters et al., 1992). This was determined after it was shown that processing of translation products produced from synthetic RNA transcripts corresponding to CPMV B and M RNA in rabbit reticulocyte lysates is abolished if the gene for the 24K protease is disrupted, whereas if the gene for the 32K protein is disrupted, and not that of the 24K protein, then only cleavage at the Q/M site in the CPMV M RNA-encoded polypeptide is affected. Although the 32K protein is required for processing at this site by the 24K protease, it does not alter the normal dipeptide recognition pattern of the 24K protease. It is therefore unclear what function the 32K protein provides for processing by the 24K protease. The 32K protein has also been shown to regulate the processing of the B RNA polyprotein by slowing down the rate of cleavage (Peters et al., 1992). There is, however, no experimental evidence that the amino acid sequences conserved between the CPMV 32K protein and the nepovirus sequences function to assist the 24K protease. Therefore, it is possible that the CPMV 32K protein has an additional, and as yet unspecified, role during viral infection, in common with the equivalent proteins encoded by the nepoviruses.

4.1.1.5 VPg

It has previously been shown that TomRSV has a 5' linked VPg protein with a Mr of 4000 +/- 900 that is essential for infectivity (Mayo et al., 1982) (see sections 1.1.2,1.2,1.3). Among other members of the picornavirus-like supergroup, the gene coding for the VPg is located between the putative helicase and protease genes, making it likely that the TomRSV VPg is also located at this position.
In this study, the putative TomRSV VPg was localized by analyzing the possible cleavage sites between the putative helicase and protease sequences (see section 3.10.7). Due to the small size of the VPg proteins it is difficult to determine conserved amino acid residues; however, identification of the Q/S dipeptide as a possible cleavage site N-terminal of the putative TomRSV VPg would result in a Ser at the N-terminus, similar to the GFLV and CPMV VPg proteins. For GFLV and CPMV it is thought that linkage of the VPg to the viral RNA occurs between the N-terminal Ser hydroxyl group and the genomic RNA 5' phosphate (Zabel et al., 1984; Pinck et al., 1991).

4.1.1.6 RNA1 N-Terminal Coding Region

The N-terminus of the RNA1-encoded polyproteins of TomRSV and the other nepoviruses includes an additional ca. 30K of sequence compared to the comovirus CPMV. Whether TomRSV and the other nepoviruses encode a single large protein that includes this region together with the putative protease co-factor domain, or encode several smaller proteins, is unknown. In vitro translation of TBRV RNA1 identified a 50K product corresponding to the N-terminus, but it was unclear whether it was a premature termination product or a cleavage product. A separate 30K protein could not be detected, but the presence of such a product could also not be ruled out (Demangeat et al., 1990). In addition, there is amino acid sequence identity between TomRSV, TBRV and GCMV at their N-termini (see Fig. 3.28). The significance or possible function of these sequences is not known. Surprisingly, of the N-terminal 277 amino acids of the TomRSV RNA1 polypeptide, corresponding to a polypeptide of ca. 29 kDa, 87% are identical to the equivalent region of the TomRSV RNA2-encoded polyprotein. This is discussed further in section 4.3.

4.1.2 Structure and Function of TomRSV RNA2

TomRSV RNA2 is 7273 nt in length excluding the 3' poly(A) tail.
RNA2 contains a single long ORF of 5723 nucleotides with the capacity to code for a polyprotein of 207K. This long ORF is preceded by a short 5' noncoding sequence of 77 nucleotides and followed by a long 3' noncoding sequence of 1550 nucleotides (see Fig. 3.14). The RNA2 components of several nepoviruses vary in size from just under 4,000 nt to over 7,000 nt, with TomRSV having the largest RNA2 component (see Table 3). TomRSV is the first nepovirus from the third subgroup of nepoviruses (Martelli, 1975) (see section 1.2) for which the nucleotide sequence has been determined. Comparison of the genomic organization of TomRSV RNA2 with those of TBRV, GCMV and GFLV (Fig. 4.2) reveals a number of similarities and differences. The 5' noncoding region of TomRSV RNA2 is shorter (77 nt vs 217 to 232 nt); however, the 3' noncoding region is much longer (1550 nt vs 241 to 301 nt), which adds considerably to the length of TomRSV RNA2. The TomRSV RNA2 coding region is also considerably larger (5723 nt vs 3327 to 4073 nt) than those of the other sequenced nepoviruses. In vitro studies on the processing of TBRV and GCMV RNA2 polyproteins have identified three cleavage products (NH2-50K-46K-59K-COOH and NH2-44K-46K-56K-COOH, respectively); the 46K and 59K TBRV products as well as a 57K product have also been detected in vivo (Demangeat et al., 1991; Demangeat et al., 1992). The 57K protein has been identified as a proteolytic breakdown product of the 59K coat protein and can be detected in virus particles. It is unknown how many cleavage products are produced from the TomRSV RNA2 polyprotein; however, it is proposed that there are at least two and possibly three or four (see Fig. 4.2). The viral coat protein sequence is located at the C-terminus of all four nepovirus RNA2 polyproteins, preceded by sequences thought to be involved in cell-to-cell movement (see Fig. 4.2). The coat protein and putative cell-to-cell movement protein sequences would account for just over half of the potential coding sequence in TomRSV RNA2. Additional 5' coding sequence in TomRSV RNA2 has the capacity to produce a ca. 100K polypeptide, compared to a 44-46K polypeptide from TBRV and GCMV RNA2. The extra TomRSV coding sequence can be accounted for by two unique features. The N-terminal 277 amino acids (29.5 kDa) share extensive similarity (87%) with the N-terminus of the TomRSV RNA1 polyprotein and are probably a result of a duplication from the RNA1 sequence (discussed further in section 4.2). A further 144 amino acids (16.3 kDa) constitute a set of three tandem repeats. The tandem repeats are located between the polyprotein sequences shared between RNA1 and RNA2 and the putative transport region. A possible function in cell-to-cell movement for the tandem repeats is suggested below (see Movement Protein); however, there is no evidence to ascribe a function to the N-terminal sequences of the TomRSV RNA2 polyprotein. By analogy with the nepovirus CLRV, the additional N-terminal sequences of the TomRSV RNA2 polyprotein may code for a separate protease (Ponz, 1987), a factor required for cis replication (Wellink et al., 1992) or perhaps a specific nematode transmission factor analogous to the potyvirus aphid-transmission helper component (Pirone, 1981; Majia et al., 1985).

Table 4. Sizes of the RNA2 components of several nepoviruses.

Virus                        Size of RNA2 (nucleotides)
Tomato blackring             4,662 (det.) [1]
Grapevine chrome mosaic      4,441 (det.) [2]
Grapevine fanleaf            3,774 (det.) [3]
Arabis mosaic                3,900 (est.) [4]
Tobacco ringspot             3,900 (est.) [4]
Tomato ringspot              7,273 (det.)
Cherry leafroll              6,500 (est.) [4]
Myrobalan latent ringspot    6,000 (est.) [4]

[1] Meyer et al. (1986)
[2] Brault et al. (1989)
[3] Serghini et al. (1990)
[4] Murant et al. (1981)

Fig. 4.2 Comparison of the RNA2 components of the nepoviruses TomRSV, GFLV, TBRV and GCMV. The thick lines represent the RNA molecule with a 5' VPg and a 3' poly(A) tail. The bars represent the amino acid sequence translated from the long open reading frame beginning at the first in-frame AUG codon and terminating at the next in-frame stop codon. In both TomRSV and GFLV the second in-frame AUG is also indicated. Identical shading in bars indicates amino acid sequence similarity.

4.1.2.1 Coat Protein

The putative TomRSV coat protein sequence was identified by comparing the deduced RNA2-encoded amino acid sequence with the previously determined amino acid composition of TomRSV coat protein and by amino acid sequence similarity with known nepovirus coat protein sequences (see Table 2, Fig. 3.20). The calculated size of the TomRSV coat protein is 60-62 kDa, depending on which of two possible cleavage sites is used. The size of the TomRSV coat protein determined by SDS-PAGE is 58K (Allen & Dias, 1977). This size discrepancy may be the result of further processing of the C-terminal amino acids of the putative TomRSV coat protein sequence, as has been suggested for the TBRV coat protein (Demangeat et al., 1992). A relatively low (21.4-23.4%) but significant amount of amino acid sequence similarity was detected between the putative TomRSV coat protein and those of the other nepoviruses TBRV, GCMV and GFLV (alignment shown in Fig. 3.20). This low level of homology is not surprising considering that TomRSV is serologically unrelated to any other definitive member of the nepovirus group (Stace-Smith, 1984). Despite the low conservation of amino acid sequence between the nepovirus coat proteins, it is probable that they have similar three dimensional structures.

The picornavirus virion shell is composed of 60 copies each of three major (VP1, VP2 and VP3) and one minor (VP4) coat proteins.
Each of the major proteins is characterized by an eight-stranded anti-parallel β-barrel. The comovirus virion shell is composed of two coat proteins, L which is 42 kDa and S which is 24 kDa (Lomonossoff & Johnson, 1991). The S protein has a single eight-stranded anti-parallel β-barrel domain labelled A, and the L protein has two such domains labelled C and B. The C, B and A domains are equivalent to the picornavirus VP2, VP3 and VP1 proteins, respectively, in their position within the shell and in their genomic locations. It is probable that the TomRSV coat protein is equivalent to the picorna- and comovirus coat proteins, and contains all three of the eight-stranded anti-parallel β-barrel domains within a single large protein. It has been suggested that the three major picornavirus coat proteins, the two comovirus coat proteins and the single nepovirus coat protein probably evolved by gene duplication of a coat protein containing a single eight-stranded anti-parallel β-barrel domain (Rossmann & Johnson, 1989). X-ray studies reveal that other simple icosahedral RNA plant viruses code for a single coat protein species containing a single eight-stranded anti-parallel β-barrel domain. Three copies of this gene are equivalent to the picornavirus VP1+VP2+VP3 or comovirus L+S. Recently, the structure of the cellular protein tumor necrosis factor (TNF) has been determined. This protein has an eight-stranded anti-parallel β-barrel domain that is similar to that of viral coat protein domains (Jones et al., 1989). Furthermore, TNF functions as a trimer with a packing arrangement which is somewhat similar to that found in the viral systems.

4.1.2.2 Movement Protein

The CPMV M RNA-encoded 58/48K movement protein sequence was compared to the equivalent region of the TomRSV RNA2 polyprotein sequence determined in this study, but only limited sequence similarity was observed (not shown).
A high degree of similarity, however, was observed between a portion of the TomRSV sequence and the corresponding region of another nepovirus, GFLV (see Fig. 3.21), but not with sequences in either TBRV or GCMV. Little sequence similarity exists between the TMV 30K movement protein and the putative movement proteins from related viruses, which suggests that movement proteins in general are not highly conserved (Koonin et al., 1991b). Attempts to align the putative movement proteins from many different plant viruses, including the nepoviruses TBRV and GCMV and the comoviruses CPMV and red clover mosaic (RCMV), have identified the conserved motifs "LPL", "G" and "D" (Koonin et al., 1991b), which were also present in the cellular heat shock protein HSP90. Based on this observation, the authors speculated that viral movement proteins may function in a manner similar to chaperonins. Surprisingly, the nepo- and comovirus putative movement proteins could be grouped together with the putative movement proteins of the plant caulimoviruses and closteroviruses, even though these are double-stranded DNA viruses with icosahedral capsids and single-stranded RNA viruses with flexuous rod-shaped particles, respectively.

The TBRV and GCMV polyproteins share extensive amino acid sequence similarity throughout their entire length, the most highly conserved regions being the putative transport protein sequences (Brault et al., 1989). Likewise, the most conserved regions of amino acid sequence between TomRSV and GFLV are also the putative transport protein sequences. However, there is only limited sequence conservation between the TBRV/GCMV and TomRSV/GFLV sequences. It is possible that these two groups of nepoviruses have evolved transport functions with unique mechanisms of action but that still involve the formation of tubular structures. In both TomRSV and GFLV the "LPL" motif is reduced to "P".
In addition, only TomRSV contains the "G" motif and neither TomRSV nor GFLV has the "D" motif (A. Mushegian, personal communication). These differences, along with the absence of surrounding amino acid sequence identity, make the putative nepovirus movement proteins particularly heterogeneous.

N-terminal of the TomRSV putative movement protein sequence are three amino acid repeats of 53, 53 and 38 amino acids (see Fig. 3.22). It is interesting that within this sequence the tripeptide "LPL" is repeated three times while the dipeptide "LP" is repeated a further 10 times. Although this region is separated from the putative TomRSV movement protein sequence by ca. 280 amino acids, it is not known whether it is part of the same or a different mature protein following processing of the RNA2 polyprotein translation product. Nevertheless, the repetition of the LPL or LP sequence is striking and may suggest a role for these regions in TomRSV intercellular movement.

4.1.2.3 N-Terminal Amino Acid Sequence

If the tandem repeats are not involved in intercellular movement, then the function of ca. 50% of the TomRSV RNA2 polyprotein sequence is in question. Possible additional functions within this region can only be speculated upon. TomRSV RNA2 may code for an additional protease that cleaves the RNA2 polyprotein. In vitro translation of purified CLRV RNA2 resulted in proteolytic processing of the translation product in the absence of RNA1 (Ponz et al., 1987). It is possible that another characteristic of nepoviruses with large RNA2 components is that they code for an additional protease that is conserved at the N-terminus of TomRSV RNA2. It may also be possible that TomRSV RNA2 codes for a protein required for cis replication. Recent evidence suggests that the CPMV M RNA polyprotein N-terminal-encoded 58K protein is required for replication of the M RNA by the B RNA-encoded replication complex (Wellink et al., 1992) (see section 4.4).
Finally, TomRSV RNA2 may code for a specific nematode transmission factor analogous to the potyvirus aphid transmission factor (Pirone, 1981; Majia et al., 1985). Studies of pseudo-recombinants of RNA1 and RNA2 from different strains of the nepoviruses raspberry ringspot (RRV) and TBRV demonstrated that RNA2 is the determinant for nematode transmission (Harrison et al., 1974; Hanada & Harrison, 1977; Harrison & Murant, 1977b). Elucidation of the function of the TomRSV RNA2 N-terminal polyprotein will be an exciting future study.

4.1.3 Polyprotein Processing

Processing of the TomRSV RNA1 and RNA2-encoded polyproteins is most likely accomplished by a protease encoded by RNA1 (see Cysteine Protease, section 4.1.1), possibly with the help of a co-factor (see Protease Co-Factor, section 4.1.1) also encoded by RNA1. Viral proteases encoded by the picornaviruses, comoviruses and potyviruses preferentially cleave the viral polyprotein at only a few common dipeptides (Q/S, Q/M, Q/G, E/G, E/S) (Hellen et al., 1989). A number of probable cleavage sites were determined by comparing the TomRSV sequences with related viral sequences and their known and putative cleavage sites. Although it is probable that RNA2 encodes more than two cleavage products, it was only possible to predict the cleavage site that would release a coat protein of 60-62 kDa. Four probable cleavage sites were predicted from the RNA1 polyprotein sequence, resulting in the following products: NH2-66K-67K-2.7K-25K-83K-COOH. Although the cleavage sites lysine/alanine (K/A), arginine/glycine (R/G), cysteine/serine (C/S) and glycine/glutamate (G/E) have been identified in the nepoviruses TBRV, GCMV and GFLV (Demangeat et al., 1990; Brault et al., 1989; Serghini et al., 1990; Pinck et al., 1991), these dipeptide sites were not seriously considered as processing sites of the TomRSV polyproteins.
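As a rough illustration of the reasoning above, the Python sketch below lists candidate cleavage positions at the dipeptides favoured by the picorna-, como- and potyvirus proteases, and checks that the predicted RNA1 products sum to approximately the 244 kDa polyprotein. It is a sketch under stated assumptions only: the function names and the toy peptide are hypothetical, and the actual prediction in this study relied on alignment with related viral sequences, since not every such dipeptide in a polyprotein is a genuine processing site.

```python
# Dipeptides preferentially cleaved by picorna-, como- and potyvirus
# proteases, as listed in the text (Q/S, Q/M, Q/G, E/G, E/S).
PREFERRED_SITES = {"QS", "QM", "QG", "EG", "ES"}


def candidate_cleavage_sites(protein):
    """Return (cut position, dipeptide) for every preferred dipeptide.

    The cut position is the index of the residue that would become the
    N-terminus of the downstream product.
    """
    return [
        (i + 1, protein[i:i + 2])
        for i in range(len(protein) - 1)
        if protein[i:i + 2] in PREFERRED_SITES
    ]


def fragment_lengths(protein, cut_positions):
    """Lengths of the products released by cutting at the given positions."""
    bounds = [0, *sorted(cut_positions), len(protein)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical toy peptide, not a TomRSV sequence.
toy = "AAQSKKEGLLQMPP"
sites = candidate_cleavage_sites(toy)
print(sites)
print(fragment_lengths(toy, [pos for pos, _ in sites]))

# Predicted RNA1 products from the text: NH2-66K-67K-2.7K-25K-83K-COOH.
rna1_products_kda = [66, 67, 2.7, 25, 83]
total_kda = sum(rna1_products_kda)
print(round(total_kda, 1))  # compare with the ca. 244 kDa RNA1 polyprotein
```

The product sizes add up to about 243.7 kDa, in line with the 244 kDa calculated for the RNA1-encoded polyprotein.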
The TomRSV protease domain contains an element in common with poliovirus (and which is also present in the como- and potyviruses) involved in cleavage recognition (Bazan & Fletterick, 1989), that is not conserved among TBRV, GCMV or GFLV (see section 4.1.1). Therefore, it was thought that the cleavage sites processed in the TomRSV sequence would have more in common with the picorna-, poty- and comoviruses than with the other nepoviruses. It must be stressed, however, that identification of these putative cleavage sites is tentative.

4.2 Nucleotide Sequence Similarity between RNA1 and RNA2

There is extensive nucleotide sequence similarity between the 5' termini (88.8% for 907 nt) and between the 3' termini (1533 nt) of TomRSV RNA1 and RNA2. Such extensively duplicated sequences are rare among the small RNA viruses. Including the three internal nucleotide repeats within RNA2, over 35% of the total genomic sequence of TomRSV consists of duplicated sequence. RNA viruses with more than one genomic component often display some nucleotide sequence similarity in their 5' and 3' noncoding regions. These sequences are thought to be important recognition sites for a common replicase, are generally imperfectly repeated and are less than 300 nt in length. For TBRV, GCMV and GFLV, sequence similarity between the 5' noncoding regions of their two RNA components is ca. 80% for over 200 nt. The 3' noncoding regions of TBRV and GCMV are identical between their respective RNA1 and RNA2 components (301 and 241 nt, respectively), while those of GFLV are 80% identical for just over 200 nt. Nucleotide sequence identity between the 5' noncoding regions and between the 3' noncoding regions of the different nepovirus RNAs is limited to only a few nucleotides at the extreme 5' and 3' termini, respectively. Both the 5' and 3' noncoding regions of TBRV, GCMV, GFLV and CPMV are rich in U residues (32.5%-44.8% and 40.5%-48%, respectively; see Table 2).
The high U content may be an important, more general, recognition signal required for replication. The 5' most 77 nt, before the first in-frame AUG codon of TomRSV RNA, have a similarly rich U content (44.2%), whereas the TomRSV 3' noncoding region is relatively low in U content (31.2%). However, the 3' most ca. 110 nt of TomRSV RNA have a U content similar to that of the 3' noncoding regions of the other nepoviruses (44.2%). It is possible, therefore, that the additional 3' noncoding sequence of TomRSV RNA has a function other than replicase recognition. The 3' noncoding region of RNA1 and RNA2 from another nepovirus, cherry leafroll (CLRV), has also been determined. CLRV belongs to the subgroup of nepoviruses having large RNA2 components, as does TomRSV, and like TomRSV, has a very long 3' noncoding region of 1.5 kb which is identical between the RNA1 and RNA2 components (Scott et al., 1992). Like TomRSV, the total CLRV 3' noncoding region is not particularly U rich (31.2%); however, the 3' most ca. 145 nt also have a high U content (44.1%). Despite these gross similarities, only two short regions of sequence identity (see Fig. 3.18) could be detected between the TomRSV and CLRV 3' noncoding regions, and these were outside of the U rich region. These conserved nucleotide sequences may be important for some unspecified role in the virus life cycle. The extensive 3' noncoding sequences, conserved between the two respective RNA components of TomRSV and CLRV, may be characteristic of other nepoviruses with large RNA2 components.
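The U-content comparisons made above (a whole noncoding region versus only its 3' most 110 or 145 nt) are a base-composition calculation over either a full sequence or a terminal window. The following is a minimal sketch of that calculation; the function name and the toy sequence are hypothetical and do not reproduce the TomRSV or CLRV values.

```python
def u_content(seq, window=None):
    """Percentage of U residues in seq.

    If window is given, only the 3' most `window` nucleotides are
    counted. DNA-style input is accepted: T is treated as U.
    """
    seq = seq.upper().replace("T", "U")
    if window is not None:
        if window <= 0:
            raise ValueError("window must be a positive number of nucleotides")
        seq = seq[-window:]
    if not seq:
        raise ValueError("sequence is empty")
    return 100.0 * seq.count("U") / len(seq)


# Hypothetical 20 nt toy region: a U-poor stretch followed by a U-rich 3' end.
toy_region = "GCAGCCAGGA" + "UUAUUGUUUU"
whole = u_content(toy_region)
terminal = u_content(toy_region, window=10)
print(f"whole region: {whole:.1f}% U, 3' most 10 nt: {terminal:.1f}% U")
```

A long region with a modest overall U content can therefore still carry a U-rich terminal segment, which is the pattern described for both TomRSV and CLRV.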
That both TomRSV and CLRV retain such extensive 3' noncoding sequence, perfectly conserved within their two RNA species, suggests an important function for these sequences, which is discussed further below.

4.3 Amino Acid Sequence Similarity between TomRSV RNA1 and RNA2 Encoded Polyproteins

The nucleotide sequence identity shared at the 5' terminus of RNA1 and RNA2 includes a portion of the coding regions of these RNAs, and therefore the N-terminal polypeptides have extensive amino acid sequence similarity. Beginning at the first AUG codon at nucleotide position 78 of both RNA1 and RNA2, 87% of the first 277 amino acids are identical. The first 132 amino acids are identical, followed by another 145 amino acids of which 75.3% of the positions are identical (132 + 0.753 x 145, or about 241 of 277 positions). The N-terminal amino acid sequence beginning at residue 138 can be aligned with the N-terminal regions of the TBRV and GCMV RNA1 polyproteins. This suggests that the sequence identity at the 5' termini of TomRSV RNA1 and RNA2 could be due to a duplication of the TomRSV RNA1 5' sequence at the 5' end of RNA2. Duplication of this sequence could have happened either relatively recently within the TomRSV genomic sequence, or may have been a more ancestral event followed by subsequent loss in some of the descendants (i.e. TBRV, GCMV, GFLV, CPMV etc.) but not others (i.e. TomRSV). Another bipartite virus with duplicated nucleotide sequence that contains coding sequence and is conserved between the two RNAs is the tobravirus tobacco rattle (TRV), in the RNA1 and RNA2 components of its different strains. Varying amounts of nucleotide sequence, duplicated from the 3' termini of the larger RNA1 component to the 3' termini of the smaller RNA2 component (Bergh et al., 1985; Angenent et al., 1986), include potential coding regions. The N-terminal region of the nepovirus RNA1 polyprotein is unique to the nepoviruses and has no known function.

4.4 RNA Recombination

The extensive amount of nucleotide sequence identity between the 5' ends and the 3' ends of TomRSV RNA1 and RNA2, as well as the internal nucleotide repeats within RNA2, could have arisen by two different mechanisms. The two similar sequences could have each arisen independently but, due to strong selection pressure, eventually become identical (convergent evolution). Alternatively, the two sequences are the result of a duplication event. In regard to the former mechanism, it is possible that the replicase evolved to require a more specific recognition sequence, providing a selective pressure towards the convergence of sequence elements at the termini of the two RNA components. Although possible, such a hypothesis seems unlikely. As discussed in section 1.1.3, RNA viruses with multipartite genomes exhibit a certain amount of nucleotide sequence similarity between their 5' termini and between their 3' termini, which are thought to be important for recognition by the viral replicase. In general, there is very little nucleotide sequence complementarity between the 5' and 3' termini within viral RNAs, which suggests that different replicase factors may be involved in recognition of the 5' and 3' termini or that the termini are recognized with different affinities by the replicase (see section 1.1.3). If this were true, then it is difficult to imagine that not only does the TomRSV replicase require a very long and specific recognition sequence at the 3' terminus of the positive strand, but that it also requires another very different, but also long and very specific, recognition sequence at the 3' terminus of the negative strand for replication of TomRSV RNA. Furthermore, inference from strains of TRV, which have extensive nucleotide sequence identity at their 3' termini (see preceding section), indicates that perfect identity is not required for replication (Angenent et al., 1989).
A pseudo-recombinant, made up of RNA1 and RNA2 from two strains of the tobravirus TRV which have several nucleotide differences in their 3' noncoding regions, could replicate efficiently in plants while the two RNAs maintained their individual sequences.

It is more likely that the sequence similarity observed between RNA1 and RNA2 and within RNA2 is the result of duplication of nucleotide sequence from one RNA species to the other and, in the case of RNA2, repeated duplication within internal coding sequence. Duplication of sequence at the 5' ends of RNA1 and RNA2 was already suggested in section 4.3 after it was shown that the predicted N-terminal coding sequences of the TomRSV RNA1 and RNA2 ORFs matched the N-terminal sequence predicted from both TBRV and GCMV RNA1 ORFs. The internal repeats within RNA2 most likely arose from two tandem duplications, while the origin of the 3' terminal sequence which is repeated on RNA1 and RNA2 is unknown. The generation of duplicated sequences would seem to require recombination between RNA segments during replication (for reviews on RNA recombination see King et al., 1987; Lai, 1992). Briefly, for RNA recombination between different molecules, it is believed that during replication the replicase disassociates from the template, together with the partially synthesized nascent complementary strand, and then reassociates with a second RNA template, either at the same or a different location, followed by complete synthesis of the now recombinant complementary RNA molecule. During homologous RNA recombination, reassociation with the second template molecule is mediated by complementarity between the template and the partially synthesized RNA molecule that is complexed with the replicase.

Nonhomologous recombination does not require complementarity between the two RNAs and could explain how the 5' and 3' duplicated sequences have arisen.
Duplication of the sequences between TomRSV RNA1 and RNA2 would require at least two nonhomologous recombination events, once between the 5' ends and once again between the 3' ends. For example, it is possible that the replicase and its associated nascent 1.5 kb RNA transcript corresponding to the TomRSV RNA1 3' terminus disassociated from the template and then initiated synthesis downstream of the coat protein coding region on TomRSV RNA2, resulting in the duplication of the RNA1 3' terminal 1.5 kb sequence on TomRSV RNA2. A similar scenario can be envisioned for the duplication of RNA1 5' terminal sequences on TomRSV RNA2 during (+) strand RNA synthesis. In addition, there are the three internal repeated sequences within RNA2 which would require at least one nonhomologous recombination event.

If nonhomologous recombination is responsible for identity between the 3' ends and between the 5' ends, it is still difficult to explain how (or why) these sequences have been maintained. Fidelity during replication by RNA polymerases has been measured independently several times and is several orders of magnitude lower than that of DNA polymerases (error rates of 10^-3 to 10^-4 compared to 10^-8 to 10^-10) (Steinhauer & Holland, 1986; Steinhauer & Holland, 1987). Such a high error rate would result in rapid changes within the duplicated sequence in the absence of strong selection pressure. It is possible that the sequence duplications occurred recently and have not had the opportunity to diverge. A more probable explanation for the maintenance of nucleotide sequence similarity between the 5' termini and between the 3' termini is that these sequences are frequently exchanged through homologous RNA recombination.

Evidence for homologous RNA recombination is relatively rare among the many different RNA virus groups, but one of the best studied examples occurs in the closely related picornaviruses (reviewed in: King et al., 1987; Lai, 1992).
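To put the polymerase error rates quoted earlier in this section in perspective, the sketch below estimates how often a single copying round would introduce at least one change into a duplicated region. It assumes, as a simplification not made in the thesis, that misincorporations are independent and occur at a uniform rate per nucleotide; the 1533 nt length is that of the 3' terminal repeat given in section 4.2, and the function names are illustrative.

```python
def expected_errors(length_nt, error_rate):
    """Mean number of misincorporations in one copy of the region."""
    return length_nt * error_rate


def prob_error_free(length_nt, error_rate):
    """Probability that one copy of the region contains no error,
    assuming independent errors at a uniform per-nucleotide rate."""
    return (1.0 - error_rate) ** length_nt


REPEAT_LENGTH_NT = 1533  # 3' terminal repeat length given in section 4.2

# Upper and lower RNA polymerase error rates, then the DNA polymerase range.
for rate in (1e-3, 1e-4, 1e-8, 1e-10):
    mean = expected_errors(REPEAT_LENGTH_NT, rate)
    p_ok = prob_error_free(REPEAT_LENGTH_NT, rate)
    print(f"error rate {rate:.0e}: mean errors {mean:.3g}, "
          f"error-free copies {100 * p_ok:.1f}%")
```

Under these assumptions a sizeable fraction of copies of the 3' repeat would differ from their template after a single round of RNA synthesis, which illustrates why perfect identity between RNA1 and RNA2 is unlikely to persist without selection or frequent sequence exchange.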
Another well documented case is the animal coronaviruses (reviewed in: Lai, 1990), and homologous RNA recombination has also been demonstrated to occur in the plant bromoviruses (Bujarski & Kaesberg, 1986). It has been argued that homologous recombination between two picornavirus RNA molecules, each carrying deleterious point mutations introduced as a consequence of the high error rate of RNA-dependent RNA polymerases, is necessary for the maintenance of a viable genome (see King et al., 1987). This is of particular concern among the picornaviruses since it has also been suggested that replication of the picornavirus RNA is only efficiently accomplished by the viral-encoded replicase proteins translated from the same RNA that is to be replicated (i.e. replication occurs preferentially in cis) (Bernstein et al., 1986; Kirkegaard & Baltimore, 1986).

Considering the evolutionary relationship between the nepoviruses and picornaviruses, it is not difficult to suppose that homologous RNA recombination could occur during the replication of TomRSV. In TomRSV, frequent homologous recombination between the 3' ends and between the 5' ends of RNA1 and RNA2 would account for the perfect conservation of nucleotide sequence. Although RNA recombination may explain how the terminal sequence identities were generated and maintained, it remains to be answered why the maintenance of these identical sequences is necessary. By analogy with the suggested replication strategy of picornaviruses (see above), the replication associated proteins encoded by TomRSV RNA1 may function inefficiently in trans. If this were true, efficient replication of RNA2 would not occur. It is possible that, in TomRSV, replication begins in cis with RNA1 and that trans replication of RNA2 occurs only following disassociation and reassociation of the initial negative-strand transcript with the corresponding region in RNA2.
A similar mechanism involving recombination during positive-strand RNA synthesis could account for sequence conservation between the 5' termini of RNA1 and RNA2. A comparable mechanism, occurring during positive-strand transcription, has recently been proposed for leader-primed generation of subgenomic messenger RNAs in coronaviruses (see Lai, 1990). The size of the duplicated sequences in TomRSV may be the minimum required to facilitate efficient replication through RNA recombination between RNA1 and RNA2. Alternatively, the entire length may not be required for recombination but may serve other important functions in addition to a postulated role in recombination.

Recently it has also been demonstrated that the comovirus CPMV B RNA can only be efficiently replicated in cis by the replication associated proteins in protoplasts, and that replication of the M RNA requires the presence of the M RNA-encoded 58K protein; the latter is postulated to direct the B RNA-encoded replication complex to the M RNA (Wellink et al., 1992). There is no experimental evidence for homologous RNA recombination in the comoviruses.

4.5 Translation Initiation

Analysis of the TomRSV RNA genome for ORFs identified the first AUG triplet, at position 78, as the start site for the long ORFs in both RNA1 and RNA2. The sequence context of this AUG was compared to the consensus sequences surrounding the AUG initiation codon as defined by Kozak (1986) in animals and by Lütcke et al. (1987) in plants, and was found to be suboptimal (see Fig. 3.15).
A second in-frame AUG at position 441 is in a favorable Kozak context for the initiation of translation (see Fig. 3.15). A widely accepted model for translation initiation in eukaryotes (Kozak, 1989) states that 40S ribosomal subunits, along with initiation factors, recognize the mRNA via the 5' m7GpppX cap structure (where X is any nucleotide) and scan down the RNA to the first AUG codon; at this point the 60S ribosomal subunit and various elongation factors enter the complex to initiate translation. Translation may initiate at a second downstream in-frame AUG site if the first AUG is in a suboptimal context for translation initiation. Translation in picornaviruses and CPMV, which are not capped, can be initiated by direct binding of ribosomes to internally located sequences (Pelletier et al., 1988; Kaminski et al., 1990; Verver et al., 1991). As a result, translation initiation can occur at the second and/or third in-frame AUG (in addition to the first in-frame AUG) in CPMV M RNA, at the 8th or 9th AUG in poliovirus RNA and at the 11th AUG in encephalomyocarditis picornavirus RNA. Translation initiation of the potyvirus plum pox virus (PPV) RNA has been shown to occur at the second in-frame AUG and not the first in-frame AUG in vitro, and is suggested to also occur at the second AUG in vivo (Riechmann et al., 1991). For PPV, it was also demonstrated that initiation at the second in-frame AUG does not occur by internal entry of the ribosomes.

The site(s) of translation initiation in the TomRSV RNA sequence were assayed both in vitro and in protoplasts (see section 3.12). Translation initiation was only detected from AUG78 both in vitro and in protoplasts. This is surprising, considering the poor sequence context of this AUG codon (see Fig. 3.15). The high U content of the 5' region of TomRSV (see Table 1) suggests that it is not highly structured and, therefore, translation at AUG78 may occur more efficiently than would be predicted by the sequence context of AUG78 (Kozak, 1986).
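The comparison of an AUG with the consensus initiation context, as described above for AUG78 and AUG441, can be sketched as follows. The sketch uses only the two positions usually treated as most influential in the Kozak consensus (a purine at -3 and a G at +4, counting the A of the AUG as +1); this is a simplification of the full animal and plant consensus sequences used in the thesis, and the function names and toy sequence are hypothetical.

```python
def aug_context(rna, aug_index):
    """Classify the context of the AUG starting at 0-based aug_index.

    'strong'   : purine at -3 and G at +4
    'adequate' : only one of the two
    'weak'     : neither
    """
    rna = rna.upper().replace("T", "U")
    if rna[aug_index:aug_index + 3] != "AUG":
        raise ValueError("no AUG at the given index")
    minus3 = rna[aug_index - 3] if aug_index >= 3 else ""
    plus4 = rna[aug_index + 3] if aug_index + 3 < len(rna) else ""
    has_purine = minus3 in ("A", "G")
    has_g = plus4 == "G"
    if has_purine and has_g:
        return "strong"
    if has_purine or has_g:
        return "adequate"
    return "weak"


def in_frame_augs(rna, first_aug_index):
    """0-based indices of AUG codons in the frame set by first_aug_index."""
    rna = rna.upper().replace("T", "U")
    return [i for i in range(first_aug_index, len(rna) - 2, 3)
            if rna[i:i + 3] == "AUG"]


# Hypothetical toy leader: a first AUG in a poor context followed by an
# in-frame AUG in a favorable context.
toy_rna = "CUUUCUAUGCUUACCAUGGCU"
starts = in_frame_augs(toy_rna, 6)
contexts = [aug_context(toy_rna, i) for i in starts]
for index, context in zip(starts, contexts):
    print(f"AUG at nucleotide {index + 1}: {context} context")
```

A classification of this kind only predicts where scanning ribosomes are expected to initiate; as the assays reported here show for AUG78, initiation can still occur efficiently at an AUG in a poor context.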
The possibility that the U rich 5' noncoding region may be a translational enhancer sequence is supported by the observation that the TMV and TEV 5' translation enhancer sequences (see section 1.1.3) have a high AU content, as does the 5' noncoding region of TomRSV RNA.

4.6 Full-Length cDNA Clones of RNA1 and RNA2

The functional analysis of genes and noncoding sequences is greatly assisted by the generation of precise mutations in the sequences to be studied and observing the effects of these changes in vitro or, preferably, in vivo. There are many methods available for the in vitro manipulation of DNA sequences which are not available for RNA sequences. As a result, mutational analysis of viral RNA genomes is difficult and imprecise. However, it has been demonstrated for many RNA viruses that in vitro generated virion sense RNA transcripts, derived from a full-length cDNA copy of the viral RNA placed under the control of a bacteriophage RNA polymerase promoter, are infectious when inoculated onto plants. It is now possible to make desired mutations in the viral cDNA copy, generate RNA transcripts corresponding to the altered sequence, inoculate the RNA onto plants, and assay for differences compared to plants inoculated with RNA transcripts generated from the unaltered cDNA sequence. The result is a very powerful system for analyzing RNA virus gene function and expression.

For future analyses of the TomRSV RNA sequence, attempts were made to construct full-length cDNA clones corresponding to RNA1 and RNA2 fused to the bacteriophage T7 RNA polymerase promoter in the hope that the resulting transcripts would be biologically active upon inoculation onto plants. Full-length clones could not be synthesized using oligonucleotides complementary and homologous to the 3' and 5' termini, respectively, of TomRSV RNA.
Failure to obtain full-length clones by this method may be due to the large size of the TomRSV RNAs and/or extensive secondary structure which would decrease the efficiency of the polymerase elongation reactions. Full-length cDNA clones were therefore pieced together from separate cDNA clones corresponding to the 5' and 3' ends of RNA1 and RNA2 together with previously sequenced cDNA clones (see section 3.3). Full-length clones corresponding to RNA1 (pMR10) and RNA2 (pMRD14) were constructed. Inoculation of transcripts, derived from these clones using T7 RNA polymerase, onto a sensitive local lesion host for TomRSV failed to elicit an infectious response when compared to plants inoculated with virion RNA. As little as 5 ng of TomRSV virion RNA was capable of reproducibly eliciting small necrotic spots on the inoculated leaves one week post inoculation, whereas even 40 µg of synthetic RNA transcript was unable to produce a similar response.

There are a number of possible explanations for this result. One is that the VPg protein, which is covalently linked to the 5' termini of virion RNA, is important for infectivity. It was previously determined that protease treatment of RNA extracted from virus particles of six nepoviruses (TBRV, TomRSV, raspberry ringspot, strawberry ringspot, tobacco ringspot and arabis mosaic virus) decreased or abolished infectivity (Mayo et al., 1982). Similar treatment of CPMV RNA had no effect on infectivity. Biologically active transcripts have been successfully constructed from full-length clones of CPMV, poliovirus and the potyvirus tobacco vein mottling virus, all of which normally have a 5' VPg which, however, is not required for infectivity (Vos et al., 1988; Van der Werf et al., 1986; Riechmann et al., 1990).
Since infectivity of CPMV synthetic transcripts is increased with the addition of a m7G(5')pppG (cap) to the 5' terminus, it was suggested that the cap structure could partially compensate for the VPg by increasing the stability of the viral RNA. Capping of synthetic TomRSV transcripts, however, did not result in infectivity in plants.

For the potexvirus white clover mosaic, it was shown that the length of the 3' poly(A) sequence had a significant effect on the infectivity of synthetic transcripts (Guilford et al., 1991). The poly(A) tail in the TomRSV synthetic transcripts is 39 nucleotides in length and ends in a single nonviral T residue. During the analysis of different cDNA clones corresponding to the 3' end, poly(A) tail lengths longer than 41 nt could not be found. However, an isolate of TomRSV has been reported that did not have a detectable 3' poly(A) tail (Mohan & Chen, 1987), so the length of the poly(A) tail sequence may not be as critical in TomRSV. Addition of nonviral 5' nucleotides can also have a significant effect on the infectivity of synthetic transcripts (Ahlquist et al., 1984b). TomRSV transcripts include a single nonviral G residue at the 5' terminus, which is part of the T7 promoter sequence, and which may prevent infectivity. The two 5' most nucleotides of the TomRSV viral RNA sequence could not be determined in this study, so in the synthetic transcripts these two nucleotides are a U and an A. The majority of nepo-, como-, poty- and picornaviruses have at their 5' termini a U residue followed by either another U or an A. It is possible that the 5' terminal sequence of TomRSV RNA is other than UA, which could have a dramatic effect on infectivity of the synthetic transcripts. Changes in these two nucleotide positions may allow infectivity. The full-length cDNA clones corresponding to RNA1 and RNA2 were constructed from cDNA clones derived from two isolates of TomRSV.
The majority of the sequence used to construct the full-length clones was derived from the original isolate used to determine the sequence of TomRSV. Sequences corresponding to the 5' most 123 bases and 3' most 210 bases of the full-length cDNA clones were obtained from RNA from a second viral isolate. As a result, there are 6 nucleotide differences at the 5' end and 3 nucleotide differences at the 3' end present in the synthetic RNA sequence compared to the sequence as determined from the original TomRSV isolate. It is possible that these nucleotide changes may also have an adverse effect on infectivity. It is also possible that one of the other cDNA clones used to construct the full-length clones contained a deleterious mutation. As discussed previously (see section 4.2.3), viral-encoded RNA polymerases have a very high error rate. Due to the high error rate, populations of RNA viruses are a heterogeneous mixture of related genomes (quasispecies). These differ from the "consensus sequence", with individual molecules having different levels of fitness. It has been estimated that in a wild-type population of Qβ, the "wild-type" sequence may comprise only 15% of the RNA molecules. In poliovirus infected tissue, only 10-15% of viral RNA may actually be viable (for a review see: Domingo et al., 1985). This suggests a high probability that any one cDNA clone may incorporate alterations that could affect viability. Additional errors in the cDNA sequence could have been incorporated due to the high error rate of the reverse transcriptase used for cloning.

Recently, the use of plasmid DNA containing full-length viral cDNA clones inserted between the CaMV 35S promoter and a polyadenylation signal as a direct DNA inoculum resulted in infectivity (Mori et al., 1991; Maiss et al., 1992). Fidelity of the 5' and 3' sequences in these constructs is evidently less critical than in inoculations with the RNA transcripts.
Placing the TomRSV sequences between the CaMV 35S promoter and a polyadenylation signal might have alleviated some of the potential problems just mentioned. In particular, the stability of the RNA would be less critical, since RNA would be continuously produced from the plasmid DNA, possibly counteracting the need for a VPg. The additional nonviral nucleotides at the 5' and 3' ends would also be less critical, and the RNA transcripts would have a polyadenylation signal which could result in a longer 3' poly(A) sequence. The production of infectious TomRSV synthetic transcripts would be an important tool for future studies on the functions of different regions of the TomRSV RNA genome.

BIBLIOGRAPHY

ABOUHAIDAR, M. G. & BANCROFT, J. B. (1978). The initiation of papaya mosaic virus RNA. Canadian Journal of Microbiology 29, 151-156.
ABAD-ZAPATERO, C., ABDEL-MEGUID, S. S., JOHNSON, J. E., LESLIE, A. G. W., RAYMENT, I., ROSSMANN, M. G., SUCK, D. & TSUKIHARA, T. (1980). Structure of southern bean mosaic virus at 2.8 Å resolution. Nature 286, 33-39.
AHLQUIST, P., BUJARSKI, J., KAESBERG, P. & HALL, T. C. (1984). Localization of the replicase recognition site within brome mosaic virus RNA by hybrid-arrested RNA synthesis. Plant Molecular Biology 3, 37-44.
AHLQUIST, P., FRENCH, R., JANDA, M. & LOESCH-FRIES, L. S. (1984b). Multicomponent RNA plant virus infection derived from cloned viral cDNA. Proceedings of the National Academy of Sciences, U.S.A. 81, 7066-7070.
AHLQUIST, P., LUCKOW, V. & KAESBERG, P. (1981). Complete nucleotide sequence of brome mosaic virus RNA3. Journal of Molecular Biology 153, 23-38.
ALLEN, W. R. & DIAS, H. F. (1977). Properties of the single protein and two nucleic acids of tomato ring-spot virus. Canadian Journal of Botany 55, 1028-1037.
ALLISON, R., JOHNSTON, D. E. & DOUGHERTY, W. G. (1986). The nucleotide sequence of the coding region of tobacco etch virus genomic RNA: Evidence for the synthesis of a single polyprotein.
Virology 154, 9-20.
ALWINE, J. C., KEMP, D. J. & STARK, G. R. (1977). Method for detection of specific RNAs in agarose gels by transfer to diazobenzyloxymethyl-paper and hybridization with DNA probes. Proceedings of the National Academy of Sciences, U.S.A. 74, 5350-5354.
ANGENENT, G. C., POSTHUMUS, E., BREDERODE, F. T. & BOL, J. F. (1989). Genome structure of tobacco rattle virus strain PLB: Further evidence on the occurrence of RNA recombination among tobraviruses. Virology 171, 271-274.
ANGENENT, G. C., LINTHORST, H. J. M., BELKUM, A. F. VAN, CORNELISSEN, B. J. C. & BOL, J. F. (1986). RNA 2 of tobacco rattle virus strain TCM encodes an unexpected gene. Nucleic Acids Research 14, 4673-4682.
ARGOS, P. (1988). A sequence motif in many polymerases. Nucleic Acids Research 16, 9909-9919.
ARGOS, P., KAMER, G., NICKLIN, M. J. H. & WIMMER, E. (1984). Similarity in gene organization and homology between proteins of animal picornaviruses and a plant comovirus suggest common ancestry of these virus families. Nucleic Acids Research 12, 7251-7267.
ARGOS, P., RAO, J. K. M. & HARGRAVE, P. A. (1982). Structural prediction of membrane bound proteins. European Journal of Biochemistry 128, 565-576.
ATABEKOV, J. G. & TALIANSKY, M. E. (1990). Expression of a plant virus-coded transport function by different viral genomes. Advances in Virus Research 38, 201-248.
BAILEY, J. M. & DAVIDSON, N. (1976). Methylmercury as a reversible denaturing agent for agarose gel electrophoresis. Analytical Biochemistry 70, 75-85.
BAUR, E. (1904). Zur Aetiologie der infectiosen Panachierung. Ber. Dtsch. Bot. Ges. 22, 453-460.
BAZAN, J. F. & FLETTERICK, R. J. (1989). Comparative analysis of viral cysteine protease structural models. FEBS Letters 249, 5-7.
BAZAN, J. F. & FLETTERICK, R. J. (1988). Viral cysteine proteases are homologous to the trypsin-like family of serine proteases: Structural and functional implications. Proceedings of the National Academy of Sciences, U.S.A. 85, 7872-7876.
BEIJERINCK, M. W.
(1898). Over een contagium vivum fluidum als oorzaak van de vlekziekte der tabaksbladen. Versl. Gewone Vergad. Wis- Natuurkd. Afd., K. Akad. Wet. Amsterdam 7, 229-235.
BENNER, S. A., ELLINGTON, A. D. & TAUER, A. (1989). Modern metabolism as a palimpsest of the RNA world. Proceedings of the National Academy of Sciences, U.S.A. 86, 7054-7058.
BERGH, S. T., KOZIEL, M. G., HUANG, S., THOMAS, R. A., GILLY, D. P. & SIEGEL, A. (1985). The nucleotide sequence of tobacco rattle virus RNA-2 (CAM strain). Nucleic Acids Research 13, 8507-8518.
BERNSTEIN, P., PELTZ, S. W. & ROSS, J. (1989). Poly(A)-poly(A)-binding protein complex is a major determinant of mRNA stability in vitro. Molecular Cell Biology 9, 659-670.
BERNSTEIN, H. D., SARNOW, P. & BALTIMORE, D. (1986). Genetic complementation among poliovirus mutants derived from an infectious cDNA clone. Journal of Virology 60, 1040-1049.
BOUZOUBAA, S., QUILLET, L., GUILLEY, H., JONARD, G. & RICHARDS, K. (1987). Nucleotide sequence of beet necrotic yellow vein virus RNA1. Journal of General Virology 68, 615-626.
BOUZOUBAA, S., ZIEGLER, V., BECK, D., GUILLEY, H., RICHARDS, K. & JONARD, G. (1986). Nucleotide sequence of beet necrotic yellow vein virus RNA2. Journal of General Virology 67, 1689-1700.
BRADFORD, M. M. (1976). A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding. Analytical Biochemistry 72, 248-254.
BRAULT, V., HIBRAND, L., CANDRESSE, T., LE GALL, O. & DUNEZ, J. (1989). Nucleotide sequence and genetic organization of Hungarian grapevine chrome mosaic nepovirus RNA2. Nucleic Acids Research 17, 7809-7819.
BROADBENT, L. (1976). Epidemiology and control of tomato mosaic virus. Annual Review of Phytopathology 14, 75-102.
BRUENN, J. A. (1991). Relationships among the positive and double-stranded RNA viruses as viewed through their RNA-dependent RNA polymerases. Nucleic Acids Research 19, 217-226.
BUJARSKI, J. J. & KAESBERG, P. (1986).
Genetic recombination between RNA components of a multipartite plant virus. Nature 321, 528-531.
BUTTERWORTH, B. E., SHIMSHICK, E. J. & YIN, F. H. (1976). Association of the poliovirus RNA complex with phospholipid membranes. Journal of Virology 19, 457-466.
BUZAYAN, J. M., GERLACH, W. L. & BRUENING, G. (1986). Non-enzymatic cleavage and ligation of RNAs complementary to a plant virus satellite RNA. Nature 323, 349-353.
CALIGUIRI, L. A. & TAMM, I. (1970). Characterization of poliovirus specific structures associated with cytoplasmic membranes. Virology 42, 112-122.
CAMPBELL, R. N. (1985). Longevity of Olpidium brassicae in air-dry soil and the persistence of the lettuce big-vein agent. Canadian Journal of Botany 63, 2288-2293.
CANDRESSE, T., MORCH, M. D. & DUNEZ, J. (1990). Multiple alignment and hierarchical clustering of conserved amino acid sequences in the replication-associated proteins of plant RNA viruses. Research in Virology 141, 315-329.
CAPOOR, S. P. (1949). The movement of tobacco mosaic virus and potato virus X through tomato plants. Annals of Applied Biology 36, 307-319.
CARRINGTON, J. C. & DOUGHERTY, W. G. (1988a). A viral cleavage site cassette: Identification of amino acid sequences required for tobacco etch virus polyprotein processing. Proceedings of the National Academy of Sciences, U.S.A. 85, 3391-3395.
CARRINGTON, J. C., CARY, S. M. & DOUGHERTY, W. G. (1988b). Mutational analysis of tobacco etch virus polyprotein processing: cis and trans proteolytic activities of polyproteins containing the 49-kilodalton proteinase. Journal of Virology 62, 2313-2320.
CARRINGTON, J. C. & DOUGHERTY, W. G. (1987). Small nuclear inclusion protein encoded by a plant potyvirus genome is a protease. Journal of Virology 61, 2540-2548.
CARRINGTON, J. C. & FREED, D. D. (1990). Cap-independent enhancement of translation by a plant potyvirus 5' non-translated region. Journal of Virology 63, 1590-1597.
CECH, T. R. & BASS, B. L. (1986). Biological catalysis by RNA.
Annual Review of Biochemistry 55, 599-629.
CITOVSKY, V., KNORR, D., SCHUSTER, G. & ZAMBRYSKI, P. (1990). The P30 movement protein of tobacco mosaic virus is a single-stranded nucleic acid binding protein. Cell 60, 637-647.
CONVERSE, R. H. & STACE-SMITH, R. (1971). Rate of spread and effect of tomato ring spot virus on red raspberry in the field. Phytopathology 61, 1104-1106.
CRAWFORD, N. M. & BALTIMORE, D. (1983). Genome-linked protein VPg of poliovirus present as VPg and VPg-Up in poliovirus-infected cells. Proceedings of the National Academy of Sciences, U.S.A. 80, 7452-7455.
CUOZZO, M., O'CONNELL, K. M., KANIEWSKI, W., FANG, R. X., CHUA, N. H. & TUMER, N. E. (1988). Viral protection in transgenic tobacco plants expressing the cucumber mosaic virus coat protein or its antisense RNA. Bio/Technology 6, 549-554.
D'ALESSIO, J. M., NOON, M. C., LEY, H. L., III & GERARD, G. F. (1987). One tube double-stranded cDNA synthesis using cloned M-MLV reverse transcriptase. Focus 9 (1), 1-4. Gaithersburg: Bethesda Research Laboratories.
DAVIES, J. W. & HULL, R. (1982). Genome expression of plant positive-strand RNA viruses. Journal of General Virology 61, 1-14.
DELARUE, M., POCH, O., TORDO, N., MORAS, D. & ARGOS, P. (1990). An attempt to unify the structure of polymerases. Protein Engineering 3, 461-467.
DE MAJIA, M. V. G., HIEBERT, E., PURCIFULL, D. E., THORNBURY, D. W. & PIRONE. (1985). Identification of potyviral amorphous inclusion protein as a non-structural virus-specific protein related to helper component. Virology 142, 34-43.
DEMANGEAT, G., C., HEMMER, O., REINBOLT, J., MAYO, M. A. & FRITSCH, C. Virus-specific proteins in cells infected with tomato black ring nepovirus: evidence for proteolytic processing in vivo. Journal of General Virology in press.
DEMANGEAT, G., HEMMER, O., FRITSCH, C., LE GALL, O. & CANDRESSE, T. (1991). In vitro processing of the RNA-2-encoded polyprotein of two nepoviruses: tomato black ring virus and grapevine chrome mosaic virus.
Journal of General Virology 72, 247-252.
DEMANGEAT, G., GREIF, C., HEMMER, O. & FRITSCH, C. (1990). Analysis of the in vitro cleavage products of the tomato black ring virus RNA-1-encoded 250K polyprotein. Journal of General Virology 71, 1649-1654.
DESSENS, J. T. & LOMONOSSOFF, G. P. (1991). Mutational analysis of the putative catalytic triad of the cowpea mosaic virus 24K protease. Virology 184, 738-746.
DE ZOETEN, G. A. & GAARD, G. (1969). Possibilities for inter- and intracellular translocation of some icosahedral plant viruses. Journal of Cell Biology 40, 814-823.
DIENER, T. O. & SCHNEIDER, I. R. (1966). The two components of tobacco ringspot virus nucleic acid: Origin and properties. Virology 29, 100-105.
DOMINGO, E., MARTÍNEZ-SALAS, E., SOBRINO, F., CARLOS DE LA TORRE, J., PORTELA, AGUSTIN, ORTÍN, J., LÓPEZ-GALÍNDEZ, C., PEREZ-BRENA, P., VILLANUEVA, N., NAJERA, R., VANDEPOL, S., STEINHAUER, D., DEPOLO, N. & HOLLAND, J. (1985). The quasispecies (extremely heterogeneous) nature of viral RNA genome populations: biological relevance - a review. Gene 40, 1-8.
DORSSERS, L., VAN DER KROL, S., VAN DER MEER, J., VAN KAMMEN, A. & ZABEL, P. (1984). Purification of cowpea mosaic virus RNA replication complex: identification of a virus-encoded 110,000 dalton polypeptide responsible for RNA chain elongation. Proceedings of the National Academy of Sciences, U.S.A. 81, 1951-1955.
DOUGHERTY, W. G. & HIEBERT, E. (1985). Genome structure and gene expression of plant RNA viruses. In Molecular Plant Virology (editor J. W. Davies) vol II, 23-81.
EGGEN, R. & VAN KAMMEN, A. (1988). RNA replication in comoviruses. In RNA Genetics, vol. 1, 49-69. (P. Ahlquist, J. Holland & E. Domingo Ed.) Boca Raton: CRC Press.
EGGEN, R., KAAN, A., GOLDBACH, R. & VAN KAMMEN, A. B. (1988). Cowpea mosaic virus RNA replication in crude membrane fractions from infected cowpea and Chenopodium amaranticolor. Journal of General Virology 69, 2711-2720.
EVANS, R. K., HALEY, B. E. & ROTH, D. A. (1985).
Photoaffinity labelling of a viral induced proteinfrom tobacco. Journal of Biological Chemistry 260, 7800-7808.F.ENG, D. & DOOLITTLE, R. F. (1987). Progressive sequence alignment as a prerequisite to correctphylogenetic trees. Journal of Molecular Evolution 25, 351-360.FERNANDEZ-MUNOZ. R. & DARNELL, J. (1976). Structural differences between the 5'-termini of viraland cellular mRNA in poliovirus infected cells: possible basis for the inhibition of host proteinssynthesis. Journal of General Virology 18, 719-726.FICKETT, J. W. (1982). Recognition of protein coding regions in DNA sequences. Nucleic AcidsResearch 10, 5303-5318.FRAENKEL-CONRAT, H. (1956). The role of nucleic acid in the reconstitution of active tobacco mosaicvirus. Journal of American Chemical Society 78, 882-883.164FRAENKEL-CONRAT, H. & WILLIAMS, R. C. (1955). Reconstitution of active tobacco mosaic virusfrom its inactive protein and nucleic acid components. Proceedings of the National Academy ofSciences, U.S.A. 41, 690-698.FRANSSEN, H., MOERMAN, M., REZELMAN, G. & GOLDBACH, R. (1984). Evidence that the32,000-dalton protein encoded by the bottom-component RNA of cowpea mosaic virus is aproteolytic processing enzyme. Journal of Virology 50, 183-190.FRENCH, R. & AHLQUIST, P. (1988). Characterization and engineering of sequences controlling in vivosynthesis of brome mosaic virus subgenomic RNA. Journal of Virology 62, 2411-2420.FRY, D. C., KUBY, S. A. & MIDVAN, A. S. (1986). ATP-binding site of adenylate kinase:mechanistic implications of its homology with ms-encoded p21. F1-ATPase and other nucleotide-binding proteins. Proceedings of the National Academy of Sciences, U.S.A. 83, 907-911.GALLIE, D. R., FEDER, J. N., SCHIMKE, R. T. & WALBOT, V. (1991). Functional analysis of thetobacco mosaic virus tRNA-like structure in cytoplasmic gene regulation. Nucleic Acids Research19, 5031-5036.GALLIE, D. R. & WALBOT, V. (1990). 
RNA pseudoknot domain of tobacco mosaic virus canfunctionally substitute for a poly(A) tail in plants and animal cells. Genes & Development 4,1149-1157.GALLIE, D. R., SLEAT, D. E., WATTS, J. W., TURNER, P. C. & WILSON, M. A. (1988).Mutational analysis of the tobacco mosaic virus 5'-leader for altered ability to enhance translation.Nucleic Acids Research 16, 883-893.GALLIE, D. R., SLEAT, D. E., WATTS, J. W., TURNER, P. C. & WILSON, T. M. A. (1987). The 5'leader sequence of tobacco mosaic virus RNA enhances the expression of foreign gene transcriptsin vitro and in vivo. Nucleic Acids Research 15, 3257-3273.GAY, N. L. & WALKER, J. E. (1983). Homology between human bladder cacrinoma oncogene productand mitochondrial ATP-synthase. Nature 301, 262-264.GERGEN, J. P., STERN, R. H. & WENSINK, P. C. (1979). Filter replicas and permanent collections ofrecombinant DNA plasmids. Nucleic Acids Research 7, 2115-2136.GIERER, A. & SCHRAMM, G. (1956). Infectivity of ribonucleic acid from tobacco mosaic virus.Nature 177, 702-703.GODEFROY-COLBURN, T., THIVENT, T. & PINCK, L. (1985). Translational discrimination betweenfour RNA's of alfalfa mosaic virus. A qualitative evaluation. European Journal of Biochemistry147, 549-548.165GOLDBACH, R. (1987). Genomic similarities between plant and animal RNA viruses.Microbiological Sciences 4, 197-202.GOLBACH, R. (1986). Molecular evolution of plant RNA viruses. Annual Review of Phytopathology24, 289-310.GOLDBACH, R. & VAN KAMMEN, A. (1985). Structure, replication and expression of the bipartitegenome of cowpea mosaic virus. In 'Molecular Plant Virology" (J. W. Davies, Ed.) 2, 83-120.GOLDBACH, R. & REZELMAN, G. (1983). Orientation of the cleavage maps of the 200-kilodaltonpolypeptide encoded by the bottom component RNA of cowpea mosaic virus. Journal of Virology46, 614-619.GORBALENYA, A, E., BLINOV, V. M., & KOONIN, E. V. (1985). Prediction of nucleotide-bindingproperties of virus specific proteins from their primary structure. 
Molecular Genetics 11, 30-36.GORBALENYA, A. E. & KOONIN, E. V. (1989a). Viral protein containing the purine NTP-bindingsequence pattern. Nucleic Acids Research 17, 8413-8440.GORBALENYA, A. E., BLINOV, V. M., DONCHENKO, A. P. & KOONIN, E. V. (1989b). An NTP-binding motif is the most conserved sequence in a highly diverged monophyletic group of proteinsinvolved in positive strand RNA viral replication. Journal of molecular Evolution 24, 256-268.GORBALENYA, A. E., DONCHENKO, A. P., BLINOV, V. M., and KOONIN, E. V. (1989c). Cysteineproteases of positive strand RNA viruses and chymotrypsin-like serine proteases. FEBS Lett.243, 103-114.GORBALENYA, A. E., BLINOV, V. M. & KOONIN, E. V. (1985). Prediction of nucleotide-bindingpropterties of virus specific proteins from their primary structure. Molecular Genetics 11, 30-36.GREIF, C., HEMMER, 0. & FRITSCH, C. (1988). Nucleotide sequence of tomato black ring virusRNA-1. Journal of General Virology 69, 1517-1529.GUILFORD, P., BECK, D. L. & FORS IER, R. L. S. (1991). Influence of the poly(A) tail and putativepolyadenylation signal on the infectivity of white clover mosaic potexvirus. Virology 182, 61-67.HABILI, N. & SYMONS, R. H. (1989). Evolutionary relationship between luteoviruses and otherRNA plant viruses based on sequence motifs in putative RNA polymerases and nuceic acidhelicases. Nucleic Acids Research 17, 9543-9555.HAMMOND, A. W. & D'ALESSIO, J. M. (1986). Removal of 5' extensions with mung bean nucleaseusing positive selection plasmids. Focus 8, 4-6.166HANADA, K. & HARRISON, B. D. (1977). Effects of virus genotype and temperature on seedtransmission of nepoviruses. Annuals of applied Biology 85, 79-92.HARRIS, K. F. & MARAMOROSCH, K. (1977). In Aphids as virus vectors. Academic Press INC.NewYork, New York.HARRISON, S. C. (1983). Virus structure: high resolution perspectives. Advances in Virus Research28, 175-239.HARRISON, B. D. & BARKER, H. (1978). Protease-sensitive structure needed for infectivity ofnepovirus RNA. 
Journal of General Virology 40, 711-715.HARRISON, B. D. & MURANT, A. F. (1977a). Nepovirus group. CMI/AAB Description of PlantViruses, no. 185.HARRISON, B. D. & MURANT, A. F. (1977b). Nematode transmissibility of pseudo-recombinantisolates of tomato black ring virus. Annuals of applied Biology 86, 209-212.HARRISON, B. D., MURANT, A. F., MAYO, M. A. & ROBERTS, I. M. (1974). Distribution ofdeterminants for symptom production, host range and nematode transmission between the twoRNA components of raspberry ringspot virus. Journal of General Virology 22, 233-247.HARRISON, B. D., MURANT, A. F. & MAYO, M. A. (1972). Evidence for two functional RNAspecies in raspberry ringspot virus. Journal of General Virology 16, 339-348.HARRISON, S. C., OLSON, A., SCHUTT, C. E., WINKLER, F. K. BRICOGNE, G. (1978). Tomatobushy stunt virus at 2.9 A resolution. Nature 276, 368-373.HAYES, R. J. & BUCK, K. W. (1990). Complete replication of a eukaryotic virus RNA in vitro by apurified RNA-dependant RNA polymerase. Cell 63, 363-368.HELLEN, C. U. T., KRAUSSLICH, H. & WIMMER, E. (1989). Proteolytic processing of polyproteinsin the replication of RNA viruses. Biochemistry 28, 9881-9890.HENIKOFF, S. (1984). Unidirectional digestion with exonuclease HI creates targeted breakpoints for DNAsequencing. Gene 28, 351-359.HERSHEY, A. D. & CHASE, M. (1952). Independant functions of viral protein and nucleic acid ingrowth of bacteriophage. Journal of General Physiology 36, 39-56.HOGEL, J. M., MAEDA, A. & HARRISON, S. C. (1986). Structure and assembly of turnip crinklevirus, I. X-ray crystallographic structure analysis at 3.2 A resolution. Journal of MolecularBiology 191, 625-638.167HOLLAND, J., SPINDLER, K., HORODYSKI, F., GRABAU, E., NICHOL, S., and VANDEPOL, S.(1982). Rapid evolution of RNA genomes. Science 215, 1577-1585.HOLNESS, C. L., LOMONOSSOFF, G. P., EVANS, D. & MAULE, A. J. (1989). 
Identification of theinitiation codons for translation of cowpea mosaic virus middle component RNA using sitedirected mutagenesis of an infectious cDNA clone. Virology 172, 311-320.HUEZ, G., CLEUTER, Y., BRUCK, C., VAN VLOTEN-DOTING, L. & GOLDBACH, R. (1983).Translational stability of plant viral RNAs microinjected into living cells. Influence of a poly(A)segment. European Journal of Biochemistry 130, 205-209.HULL, R. (1989). The movement of viruses in plants. Annual Review of Phytopathology 27, 213-240.ISHIHAMA, A. & NAGATA, K. (1988). Viral RNA polymerases. CRC Critical Reviews inBiochemistry 23, 27-65.ISHIKAWA, M., MESHI, T., WATANABE, Y. & OKADA, Y. (1988). Replication of chimeric tobaccomosaic viruses which carry heterologous combinations of replicase genes and 3' non-codingregions. Virology 164, 290-293.IVANOFF, L. A., TOWATARI, T., RAY, J., KORANT, B. D. & PET I EWAY, JR, S. R. (1986).Expression and site-specific mutagenesis of the poliovirus 3S protease in Escherichiacoli. Proceedings of the National Academy of Sciences, U.S.A. 83, 5392-5396.JANDA, M., FRENCH, R. & AHLQUIST, P. (1987). High efficiency T7 polymerase synthesis ofinfectious RNA from cloned brome mosaic virus cDNA and effects of 5' extensions of transcriptinfectivity. Virology 158, 259-262.JASPERS, E. M. J. (1985). Interaction of alfalfa mosaic virus nucleic acid and protein. In "PlantMolecular Biology" (J. W. Davies, Ed.), CRC Press, Boca Raton, Florida. vol I, 155-221JEFFERSON, R. A., KAVANAGH, T. A. & BEVAN, M. W. (1987). GUS fusions: P-glucuronidase asa sensitive and versatile gene fusion marker in higher plants. EMBO Journal 6, 3901-3907.JONARD, G., RICHARDS, K. E., GUILLEY, H. & HIRTH, L. (1977). Sequences from the assemblynucleation region of TMV RNA. Cell 11, 483-493.JONES, A. T., KINNINMONTH, A. M. & ROBERTS, I. M. (1972). Ultrastructural changes indifferentiated leaf cells infected with cherry leaf roll virus. Journal of General Virology 18, 61-64.JONES, E. Y., STUART, D. I. 
& WALKER, N. P. C. (1989). Structure of tumour necrosis factor.Nature 338, 225-228.JONES, R. A. & FRIBOURG, C. E. (1977). Beetle, contact, and potato true seed transmission of Andeanpotato latent virus. Annual Review of Applied Biology 86, 123-149.168JOYCE, G. F. (1989). RNA evolution and the origins of life. Natrue 338, 217-224.JUPIN, I., QUILLET, L., NIESBACH-KLOSGEN, U., BOUZOUBAA, S., RICHARDS, K., GUILLEY,H. & JONARD, G. (1990). Infectious synthetic transcripts of beet necrotic yellow vein virusRNAs and their use in investigating structure-function relations. In " Viral Genes and PlantPathogenesis " (T.P.Pirone and J. G. Shaw, Eds), pp. 187-204. Springer-Verlag, New York.KAMER, G., and ARGOS, P. (1984). Primary structural comparison of RNA-dependent polymerases fromplant, animal and bacterial viruses. Nucleic Acids Research. 12, 7269-7282.KAMINSKI, A., HOWELL, M. T. & JACKSON, R. J. (1990). Initiation of encephalomyocarditis virusRNA translation: the authentic site is not selected by a scanning mechanism. The EMBO Journal9, 3753-3759.KING, A. M. Q. (1988). Genetic recombination in positive strand RNA viruses. In "RNA Genetics" (E.Domingo, J. J. Holland and P. Ahlquist, Eds) CRC Press, Boca Raton, Florida. Vol 2, 149-165.KING, A. M. Q., ORTLEPP, S. A., NEWMAN, J. W. I., and McCAHON, D. (1987). Geneticrecombination in RNA viruses. In "The Molecular Biology of the Positive Strand RNA Viruses"(D. J. ROWLANDS, M. A. MAYO, and B. W. J. MAHY, Eds.) Academic Press, London, pp.129-152.KIRKEGAARD, K., and BALTIMORE, D. (1986). The mechanism of RNA recombination in poliovirus.Cell 47, 433-443.KOONIN, E. V. (1991a). The phylogeny of RNA-dependant RNA poymerases of positive-strand RNAviruses. Journal of General Virology 72, 2197-2206.KOONIN, E. V., MUSHEGIAN, A. R., RYABOV, E. V. & DOLJA, V. V. (1991b). Diverse groups ofplant RNA and DNA viruses share related movement proteins that may possess chaperone-likeactivity. Journal of General Virology 72, 2895-2903.KOZAK, M. 
(1989). The scanning model for translation: an update. The Journal of Cell Biology 108,229-241.KOZAK, M. (1986). Point mutations define a sequence flanking the AUG initiation codon that modulatestranslation by eukaryotic ribosomes. Cell 44, 283-292.KOZAK, M. (1986). Influences of mRNA secondary structure on initiation by eukaryotic ribosomes.Proceedings of the National Academy of Sciences U. S. A. 83, 2850-2854.KRAUT, J. (1977). Serine proteases: structure and mechanism of catalysis. Annual Review ofBiochemistry 46, 331-358.169KUNKEL, T. A., ROBERTS, J. D. & ZAKOUR, R. A. (1987). Methods in Enzymology. 154 367-382.KY'lE, J. & DOOLITTLE, R. F. (1982). A simple method for displaying the hydropathic character of aprotein. Journal of Molecular Biology 157, 105-132.LA COUR, T. F. M., NYBORG, L., THIRUP, S. & CLARCK, B. F. C. (1985). Structural details of thebinding of guanosine diphosphate to elongation factor Tu from Escherichia coli as studied by X-raycrystallography. The EMBO Journal 4, 2385-2388.LAI, M. M. C. (1992). RNA recombination in animal and plant viruses. Microbiological reviews 56, 61-79.LAI, M. M. C. (1990). Coronavirus: Organization, replication and expression of genome. Annual Revueof Microbiology 44, 303-333.LAIN, S., RIECHMANN, J. L. & GARCIA, J. A. (1990). RNA helicase: a novel activity associated witha protein encoded by a positive RNA virus Nucleic Acids Research 18, 7003-7006.LAWSON, C., KANIEWSKI, W., HALEY, L., ROZMAN, R., NEWILL, C., SANDERS, P. &TUMER, N. E. (1990). Engineering resistance in a mixed virus infection in a commercialpotato cultivar: resistance to potato virus X and potato virus Y in transgenic Russet Burbank.Bio/Technology 8, 127-134.LE GALL, 0., CANDRESSE, T., BRAULT, V. & DUNEZ, J. (1989). Nucleotide sequence of Hungariangrapevine chrome mosaic nepovirus RNA1. Nucleic Acids Research 17, 7795-7807.LIPMAN, D. J., and PEARSON, W. R. (1985). Rapid and sensitive protein similarity searches. Science227, 1435-1441.LITVAK, S., CARRE, D. 
S. & CHAPEVILLE, F. (1970). TYMV RNA as a substrate of the tRNAnucleotidyltransferase. FEBS Letters 11, 316-319.LOESCH-FRIES, L. S., JARVIS, N. P., KRAHN, K. J., NELSON, S. E. & HALL, T. C. (1985).Expression of alfalfa mosaic virus RNA4 cDNA transcripts in vitro and in vivo. Virology 146,177-187.LOMONOSSOFF, G. P. & JOHONSON, J. E. (1991). The synthesis and structure of comovirus capsids.Progamming Biophysics and molecular Biology 55, 107-137.LOMONOSSOFF, G. P., SHANKS, M. & EVANS, D. (1985). The structure of the cowpea mosaic virusreplicative form RNA. Virology 144, 351-362.LOMONOSSOFF, G. P. & SHANKS, M. (1983). The nucleotide sequence of cowpea mosaic virus BRNA. EMBO Journal 2, 2253-2258.170LUNDQUIST, R. E., EHRENFELD, E. & MAIZEL, J V JR. (1974). Isolation of a viral polypeptideassociated with poliovirus RNA polymerase. Proceedings of the National Academy of Sciences,U.S.A. 71, 4773-4781.LUTCKE, H. A., CHOW, K. C., MICKEL, F. C., MOSS, K. A., KERN, H. F. & SCHEELE,G. A. (1987). Selection of AUG initiation codons differs in plants and animals. EMBO Journal6, 43-48.MAISS, E., TIMPE, U., BRISSKE-RODE, A., LESEMANN, D. E. & CASPER, R. (1992). Infectiousin vivo transcripts of a plum pox potyvirus full-length cDNA clone containing the cauliflowermosaic virus 35S RNA promoter. Journal of General Virology 73, 709-713.MANDAHAR, C. L. (1990). Virus Transmission in " Plant Viruses" vol II. CRC Press, Bota Raton,Florida.MANDAHAR, C. L. (1981). Virus transmission through seed and pollen. In Plant Diseases and Vectors:Ecology and Epidemiology (K. Maramorosch & K. F. Harris Ed.) Academic Press, New York.MARGIS, R. & PINCK, L. (1992). Effects of site-directed mutagenesis on the presumed catalytic triad andsubstrate-binding pocket of grapevine fanleaf nepovirus 24-kDa proteinase. Virology 190, 884-888.MARSH, L. E., POGUE, G. P. & HALL, T. C. (1989). Similarities among plant virus (+) and (-) RNAtermini imply a common ancestry with promoters of eukaryotic +RNA. 
Virology 172, 415-427.MARSH, L. E., DREHER, T. W. & HALL, T. C. (1988). Mutational analysis of the core and modulatorsequences of the BMV RNA3 subgenomic promoter. Nucleic Acids Research 16, 981-995.MARSH, L. E. & HALL, T. C. (1987). Evidence implicating a tRNA heritage for the promoters ofpositive strand RNA syntheses in brome mosaic and related viruses. Cold Spring HarborSymposium of Quantitative Biology 52, 331-341.MARTELLI, G. 0. (1975). Nematode Vectors of Plant Viruses, (Lamberti, Taylor & Seinhorst, Ed.).London & New York: Plenum. pp 223.MASASHI, M., MISE, K., KOBAYASHI, K., OKUNO, T. & FURUSAWA, I. (1991). Infectivity ofplasmids containing brome mosaic virus cDNA linked to the cauliflower mosaic virus 35S RNApromotor. Journal of General Virology 72, 234-246.MATTHEWS, R. E. F. (1991). Plant Virology. Academic Press, Inc. San Diego, California.MATTHEWS, R. E. F. (1985a). Viral taxonomy for the non-viralogist. Annual Review of Microbiology39, 451-474.171MATTHEWS, R. E. F. (1985b). Viral taxonomy. Microbiological Sciences 2, 74-75.MAYER, A. (1886). Ueber die mosaikkrankheit des tabaks. Landwirtsch. Vers.-Stn. 32, 451-467.MAYO, M. A., MURANT. A. F. & HARRISON, B.D. (1971). New evidence for the structure ofnepoviruses. Journal of General Virology 12, 175-178.MAYO, M. A., BARKER, H. & HARRISON, B. D. (1979). Polyadenylate in the RNA of fivenepoviruses. Journal of General Virology 43, 603-610.MAYO, M. A., BARKER, H. & HARRISON, B. D. (1982). Specificity and properties of the genome-linked proteins of nepoviruses. Journal of General Virology 59, 149-162.MESHI, T., ISHIKAWA, M., MOTOYOSHI, F., SEMBA, K. & OKADA, Y. (1986). In vitrotranscription of infectious RNAs from full-length cDNAs of tobacco mosaic virus. Proceedings ofthe National Academy of Sciences, U.S.A. 83, 5043-5047.MEYER, M., HEMMER, 0., MAYO, M. A. & FRITSCH, C. (1986). The nucleotide sequence oftomato black ring virus RNA-2. Journal of General Virology 67, 1257-1271.MILLER, W. A., DREHER, T. W. 
& HALL, T. C. (1985). Synthesis of brome mosaic virussubgenomic RNA in vitro by internal initiation on (-)-sense genomic RNA. Nature 313, 68-70.MOHAN, S. & CHEN, T. A. (1987). Absence of 3' polyadenylation in the RNA of a tomato ringspotvirus (TomRSV) isolate. Abstract Phytopathology 77, 1617.MORRISON, D. A. (1979). Transformation and preservation of competent bacterial cells by freezing.Methods in Enzymology 68, 326-331.MURANT, A. F., TAYLOR, M., DUNCAN, G. H. & RASCHKE, J. H. (1981). Improved estimates ofmolecular weight of plant virus RNA by agarose gel electrophoresis and electron microscopy afterdenaturation with glyoxal. Journal of General Virology 53, 321-332.MURANT, A. F., MAYO, M. A., HARRISON, B. D. & GOOLD, R. A. (1972). Properties of virus andRNA components of raspberry ringspot virus. Journal of General Virology 16, 327-338.NEEDLEMAN, S. B. & WUNSCH, C. D. (1970). General method applicable to the search for similaritiesin the amino acid sequence of two proteins. Journal of Molecular Biology 48, 443-453.NEURATH, H. (1984). Evolution of proteolytic enzymes. Science 224, 350-357.NOLLER, H. F. (1984). Structure of ribosomal RNA. Annual Review of Biochemistry 53, 119-162.172PALMENBERG, A. C. (1990). Proteolytic processing of picornaviral polyprotein. Annual Review ofMicrobiology 44, 603-623.PALMENBERG, A. C. PALLANSCH, M. A. & RUECKERT, R. R. (1979). Protease required forprocessing picornavirual coat protein resides in the viral replicase gene. Journal of Virology 32,770-778.PEARSON, W. R. & LIPMAN, D.J. (1988). Improved tools for biological sequence comparison.Proceedings of the National Academy of Sciences, U.S.A. 85, 2444-2448.PELHAM, H. R. B. (1978). Translation of encephalomyocardidtis virus RNA in vitro yields an activeproteolytic processing enzyme. European Journal of Biochemistry 85, 457-461.PELLETIER, J. & SONENBERG, N. (1988). Internal initiation of translation of eukaryotic mRNAdirected by a sequence derived from poliovirus RNA. 
Nature 334, 320-325.PETERS, S. A., VOORHORST, G. B., WERY, J. WELLINK, J. & VAN KAMMEN, A. (1992). Aregulatory role for the 32K protein in proteolytic processing of cowpea mosaic virus polyproteins.Virology 191, 81-89.PINCK, M., REINBOLT, J., A. M., LOUDES, A. M., LE RET, M. & PINCK, L. (1991). Primarystructure and location of the genome-linked protein (VPg) of grapevine fanleaf nepovirus. FEBSLetters 284, 117-119.PINCK, M., YOT, P., CHAPEVILLE, F. & DURANTON, H. M. (1970). Enzymatic binding of valineto the 3' end of TYMV-RNA. Nature 226, 954-956.PIRONE, T. P., (1981). Efficiency and selectivity of the helper-component-mediated aphid transmission ofpurified potyviruses. Phytopathology 71, 922-923.POCH, 0., SAUVAGEUT, I., DELARUE, M. & TORDO, N. (1989). Identification of four conservedmotifs among the RNA-dependant polymerase encoding elements. EMBO Journal 8, 3867-3874.PONZ, F., ROWHANI, A., MIRCETICH, S. M., and BRUENING, G. (1987). Cherry leafroll virusinfections are affected by a satellite RNA that the virus does not support. Virology 160, 183-190.PRICE, W. C. (1936). Specificity of acquired immunity from tobacco-ring-spot diseases.Phytopathology 26, 665-675.PRODY, G. A., BAKOS, J. T., BUZAYAN, J. M., SCHNEIDER, I. R. & BRUENING, G. (1986).Autocatalytic processing of dimeric plant virus satellite RNA Science, 231, 1577-1580.173PRUFER, D., TACKE, E., SCHMITZ, J., KULL, B., KAUFMANN, A. & ROHDE, W. (1992).Ribosomal frameshifting in plants: a novel signal directs the -1 frameshift in the synthesis of theputative viral replicase of potato leafroll luteovirus. EMBO Journal 11, 1111-1117.QUADT, H. & JASPERS, E. M. J. (1989). RNA polymerases of plus-strand RNA viruses of plants.Molecular Plant Microbe Interactions 2, 219-223.RACANIELLO, V. R., and BALTIMORE, D. (1981). Molecular cloning of poliovirus cDNA anddetermination of the complete nucleotide sequence of the virual genome. Proceedings of theNational Academy of Sciences, U.S.A. 78, 4887-4891.RAMSDFIL, D. C. 
(1987). Viral replication, translation, and assembly of nepoviruses. Current topics inVector Research vol. 3, 167-176.RANDLES, J. W., HARRISON, B. D., MURANT, A. F. & MAYO, M. A. (1977). Packaging andbiological activity of the two essential RNA species of tomato black ring virus. Journal ofGeneral Virology 36, 187-194.RAO, A. L. N., DREHER, T. W., MARSH, L. E. & HALL, T. C. (1989). Telomeric function of the t-RNA-like structure of brome mosaic virus RNA. Proceedings of the National Academy ofSciences, U.S.A. 86, 5335-5339.REZAIN, M. A. & FRANCKI, R. I. B. (1973). Replication of tobacco ringspot virus: 1. Detection of alow molecular weight double-stranded RNA from infected plants. Virology 56, 238-249.RIECHMANN, J. L., LAIN, S. & GARCIA, J. A. (1991). Identification of the initiation codon of plumpox potyvirus genomic RNA. Virology, 185, 544-552.RIECHMANN, J. L., LAIN, S. & GARCIA, J. A. (1990). Infectious in vitro transcripts from a plum poxpotyvirus cDNA clone. Virology 177, 710-716.RIGBY, P. W. J., DIECKMANN, M., RHODES, C. & BERG, P. (1977). Labelling deoxyribonucleicacid to high specific activity in vitro by nick translation with DNA polymerase I. Journal ofMolecular Biology 113, 237-251.RITZENTHALER, C., VIRY, M., PINCK, M., MARGIS, R., FUCHS, M. & PINCK, L. (1991).Complete nucleotide sequence and genetic organization of grapevine fanleaf nepovirus RNA1.Journal of General Virology 72, 2357-2365.ROBERTS, I. M. & HARRISON, B. D. (1970). Inclusion bodies and tubular structures in Chenopodiumamaranticolor plants infected with strawberry latent ringspot virus. Journal of General Virology 7,47-54.ROBINSON, D. J., BASKER, H., HARRISON, B. D. & MAYO, M. A. (1980). Replication of RNA-1of tomato black ring virus independently of RNA-2. Journal of General Virology 51, 317-326.174ROCHON, D. M. (1985). In vivo and in vitro encapsidation of host RNA by tobacco mosaic virus coatprotein. Ph.D Thesis, Wayne State University.ROSSMAN, M. G., & JOHNSON, J. E. (1989). 
Icosaliedral RNA virus structure. Annual Review ofBiochemistry 58, 533-537.SAMBROOK, J., FRITSCH, E.F., (1989). Molecular Cloning a Laboratory Manual. second edition. ColdSpring Harbor Laboratory Press.SAMSON, R. W. & IMLE, E. P. (1942). A ring-spot type of virus disease of tomato. Phytopathology32, 1037-1047.SANGER, F., NICKLEN, S. & COULSON, A. R. (1977). DNA sequencing with chain-terminatinginhibitors. Proceedings of the National Academy of Sciences, U.S.A. 74, 5463-5467.SCHNEIDER, I. R., WHITE, R. M. & CIVEROLO, E. L. (1974). Two nucleic acid-containingcomponents of tomato ringspot virus. Virology 57, 139-146.SCOTT, N. W., COOPER, J. I., LIU, Y.Y. & HELLEN, C. U. T. (1992). A 1.5 kb sequence homologyin the 3'-terminal regions of RNA-1 and RNA-2 of a birch isolate of cherry leaf roll nepovirus isalso present, in part, in a rhubarb isolate. Journal of General Virology 73, 481-485.SERGHINI, M. A., FUCHS, M., PINCK, M., REINBOLT, J., WALTER, B. & PINCK, L. (1990).RNA2 of grapevine fanleaf virus: sequence analysis and coat protein cistron location. Journal ofGeneral Virology 71, 1433-1441.SHAFFNER, M., KINIPPERS, R. & STAHL, H. (1989). RNA unwinding activity of SV40 large Tantigen. Cell 57, 955-963.SKUZESKI, J. M., NICHOLS, L.M. & GESTELAND, R. F. (1990). Analysis of leaky viral translationtermination codons in vivo by transient expression of improved13-glucuronidase vectors. PlantMolecular Biology 15, 65-79.SLEAT, D. E., HULL, R., TURNER, P. C. & WILSON, T. M. A. (1988). Studies on the mechamismof translation enhancement by 5'-leader sequence of tobacco mosaic virus RNA European Journalof Biochemistry 175, 75-86.SOUTHERN, E. M. (1975). Detection of specific sequences among DNA fragments separated by gelelectrophoresis. Journal of Molecular Biology 98, 503-517.STACE-SMITH, R. (1984). Tomato ringspot virus. Commonwealth Aricultural Bureaux/Association ofApplied Biologists, Description of Plant Viruses, no. 290.175STACE-SMITH, R., RIECHMANN, M. E. & 'WRIGHT, N. S. 
(1965). Purification and properties oftobacco ringspot virus and two RNA-deficient components. Virology, 24, 487-494.STADEN, R. (1984). Measurements of the effects that coding for a protein has on a DNA sequence andtheir use for finding genes.STAHL, H., DROGE, P. & KNIPPERS, R. (1986). DNA helicase activity of SV40 large tumor antigen.The EMBO Journal 5, 1939-1944.STANLEY, W. M., (1935). Isolation of a crystalline protein possessing the properties of tobacco-mosaicvirus. Science 81, 644-645.STAUFFACHER, C. V., USHA, R., HARRINGTON, M., SCHMIDT, T., HOSUR, M. V., JOHNSON,J. E. (1987). Sturucture of cowpea mosaic virus at 3.5 A resolution. In Crystallography andMolecular Biology (D. Moras, J. Drenth, B. Strandberg, D. Suck, D. Wilson Ed.). PlenumPublishing, New York, pp 293-308.S'IEINHAUER, D. A., and HOLLAND, J. J. (1987). Rapid evolution of RNA viruses. Annual Revue ofMicrobiology 41, 409-433.S FEINHAUER, D. A., and HOLLAND, J. J. (1986). Direct method for quantification of extremepolymerase error frequencies at selected single base sites in viral RNA. Journal of Virology 57,219-228.STRAUSS, J. H. & STRAUSS, E. G. (1988). Evolution of RNA viruses. Annual Review ofMicrobiology 42, 657-683.TABOR, S. & RICHARDSON, C. C. (1987). DNA sequence analysis with a modified bacteriophage T7DNA polymerase. Proceedings of the National Academy of Sciences, U.S.A. 84, 4767-4771.TAKEDA, N., KUHN, R. J., YANG, C. F., TAKEGAMI, T. & WIMMER, E. (1986). Initiation ofpoliovirus plus-strand RNA synthesis in a membrane complex of infected HeLa cells. Journal ofVirology 60, 43-53.TAKEGAMI, T., KUHN, R. J., ANDERSON, C. W. & WIMMER, E. (1983). Membrane-dependanturidylyation of the genome-linked protein VPg of poliovirus. Proceedings of the NationalAcademy of Sciences, U.S.A. 80, 7447-7451.TAUTZ, D. & RENZ, M. (1983). An optimized freeze-squeeze method for the recovery of DNA fragmentsfrom agarose gels. Analytical Biochemistry 132, 14-19.TAYLOR, C. E. & ROBERTSON, W. M. (1975). 
Aquisition, retention and transmission of viruses bynematodes. In: Nematode Vectors of Plant Viruses (F. Lambeth, C. E. Taylor & J. W. Seinhorsteditors) Plenum Press, London and New York, 253-275.176TAYLOR, J. M., ILLMENSEE, R. & SUMMERS, J. (1976). Effective transcription of RNA into DNAby avian sarcoma virus polymerase. Biochimica et biophysica acta 442, 325-330.TOBIN, G. J., YOUNG, D. C. & FLANEGAN, J. B. (1989). Self-catalyzed linkage of poliovirus terminalprotein VPg to poliovirus RNA. Cell 59, 511-519.TONEGUZZO, F., GLYNN, S., LEVI, E., MJOLSNESS, S. & HAYDAY, A. (1988). Use of achemically modified T7 DNA polymerase for manual and automated sequencing of supercoiledDNA. Bio-Techniques 6, 460-469.TREMAINE, J. H. & STACE-SMITH, R. (1968). Chemical composition and biophysical properties oftomato ringspot virus. Virology 35, 102-107.UPPAL, B. N. (1934). The movement of tobacco mosaic virus in leaves of Nicotiana sylvestris. IndianaJournal of Agricultural Sciences 4, 865-873.VAN DER WERE, S., BRADLEY, J., WIMMER, E., STUDIER, F. W. & DUNN, J. J. (1986).Synthesis of infectious poliovirus RNA by purified T7 RNA polymerase. Proceedings of theNational Academy of Sciences, U.S.A. 83, 2330-2334.VANKAN, P., EDOH, D. & FILIPOWICZ, W. (1988). Structure and expression of the U5 snRNA geneof Arabidopsis thaliana. Conserved upstream sequence elements in plant U-RNA genes. NucleicAcids Research 16, 10425-10440.VAN LENT, J., STORMS, M., VAN DER MEER, F., WELLINK, J. & GOLDBACH, R. (1991).Tubular structures involved in movement of cowpea mosaic virus are also formed in infectedcowpea protoplasts. Journal of General Virology 72, 2615-2623.VAN LENT, J., WELLINK, J. & GOLDBACH, R. (1990). Evidence for the involvement of the 58K and48K proteins in the intercellular movement of cowpea mosaic virus. Journal of General Virology71, 2198-223.VAN VLOTEN-DOTING, L. & NEELEMAN, L. (1982). 
Translocation of plant viral RNA's.Encyclopedia of Plant Physiology, News Service 14B, 337-367.VAN WEZFNBEEK. P., VERVER, J., HARMSEN, J., VOS, P. & VAN KAMMEN, A. (1983).Primary structure and gene organization of the middle-component RNA of cowpea mosaic virus.The EMBO Journal 2, 941-946.VERVER, J., GOLDBACH, R., GARCIA, J. A. & VOS, P. (1987). In vitro expression of a full-lengthDNA copy of cowpea mosaic virus B RNA: identification of the B RNA encoded 24-kd protein asa viral protease. The EMBO Journal 6, 549-554.VERVER, J., LE GALL, 0., VAN KAMMEN, A. & WELLINK, J. (1991). The sequence betweennucleotides 161 and 512 of cowpea mosaic virus M RNA is able to support internal initiation oftranslation in vitro. Journal of General Virology, 72, 2339-2345.177VOS, P., JAEGLE, M., WELLINK, J., VERVER, J., EGGEN, R., VAN KAMMEN, A. B. &GOLBACH, R. (1988). Infectious RNA transcripts derived from full-length DNA copies of thegenomic RNAs of cowpea mosaic virus. Virology 165, 33-41.VOS, P., VERVER, J., JAEGLE, M., WELLINK, P. & GOLDBACH, R. (1988). Two viral proteinsinvolved in the proteolytic processing of the cowpea mosaic virus polyprotein. Nucleic AcidsResearch 16, 1967-1985.VRATI, S., MANN, D. A. & REED, K. C. (1987). Alkaline northern blots: transfer of RNA fromagarose gels to Zeta-ProbeTM membrane in dilute NaOH. Molecular Biology Reports 1, (3), 1-4Richmond: Bio-Rad.WALKER, J. E., SARASTE, M., RUNSWICK, M. J. & GAY, N. J. (1982). distantly related sequencesin the a and 13-subunits of ATP synthase, myosin, kinases and other ATP-requiring enzymes and acommon nucleotide binding fold. The EMBO Journal 2, 945-951.WEINER, A. M. & MAIZELS, N. (1987). tRNA-like structures tag the 3' ends of genomic RNAmolecules for replication: Implications for the origin of protein synthesis. Proceedings of theNational Academy of Sciences, U.S.A. 84, 7383-7387.WELLINK, J., LE GALL, 0., VERVER, J., VAN BOKHOVEN, H. & VAN KAMMEN, A. 
(1992).Replication and translation of the cowpea mosaic virus RNAs are tightly linked. ThirdInternational Symposium on Positive Strand RNA Viruses Clearwater, Florida, abstract S7-2.WELLINK, J., VERVER, J., KROON, G. & VAN KAMMEN, A. (1990). Initiation of translation ofcowpea mosaic virus M RNA. VIHth International Congress of Virology Abstracts, Berlin,abstract W53-005.WELLINK, J. & VAN KAMMEN, A. B. (1989). Cell-to-cell transport of cowpea mosaic virus requiresboth the 58K/48K proteins and the capsid proteins. Journal of General Virology 70, 2279-2286.WELLINK, J., REZELMAN, G., GOLDBACH, R. & BEYREUTHER, K. (1986). Determination of theproteolytic processing sites in the polyprotein encoded by the bottom-component of cowpeamosaic virus. Journal of Virology 59, 50-58.WILLSON, M. A. (1985). Nucleocapsid disassembly and eary gene expression by positive-strand RNAviruses. Journal of General Virology 66, 1201-1207.WILLSON, M. A. (1984). Cotranslational diassembly of tobacco mosaic virus in vitro. Virology 137,353-356.WOLF, S., DEOM, C. M., BEACHY, R. N. & LUCUS, W. J. (1989). Movement protein of tobaccomosaic virus modifies plasmodesmatal size exclusion limit. Science 246, 377-379.178XIONG, Y. & EICKBUSCH, T. H. (1990). Origin and evolution of retroelements based upon their reversetranscriptase sequences. EMBO Journal 9, 3353-3362.ZABEL, P., MOERMAN, M., LOMONOSSOFF, G., SHANKS, M. AND BEYREUTHER, K. (1984).Cowpea mosaic virus VPg: sequencing of radiochemically modified protein allows mapping of thegene on B RNA. EMBO Journal 3, 1629-1634.ZAITLIN, M. & HULL, R. (1987). Plant-virus-host interactions. Annual Review of Plant Physiology38, 291-315.ZIMMERN, D. (1986). Evolution of RNA viruses. In RNA Genetics, vol. 1, 49-69. (P.Ahlquist, J. Holland & E. Domingo Ed.) Boca Raton: CRC Press.ZIMMERN, D. (1977). The nucleotide sequence at the origin for assembly on tobacco mosaic virus RNA.Archives of Virology 106, 15-22.179

