UBC Theses and Dissertations


Retrovirus-like promoters in the human genome Feuchter, Anita E. 1991


RETROVIRUS-LIKE PROMOTERS IN THE HUMAN GENOME

ANITA E. FEUCHTER
B.Sc. (Hon.), Queen's University, 1986

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
in
THE FACULTY OF GRADUATE STUDIES
(Genetics Programme)

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
DECEMBER 1991
© Anita Feuchter, 1991

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

______________________
Department of ______
The University of British Columbia
Vancouver, Canada
Date ______

ABSTRACT

Several families of repetitive sequences related to integrated retroviruses (proviruses) have been identified in the human genome. The largest of these families, the RTVL-H family, has close to 1000 members, in addition to several hundred solitary long terminal repeats (LTRs). The similarity of these LTRs in structure and organization to the LTRs of proviruses suggests that they may act as transcriptional regulators of gene expression.

To test this hypothesis, I initially examined the ability of different RTVL-H LTRs to drive expression of the reporter gene chloramphenicol acetyltransferase (CAT) in a variety of human and murine cell lines. These studies revealed that RTVL-H LTRs are heterogeneous in their ability to regulate the expression of linked genes. Although all five LTRs tested could promote expression of the CAT gene, their relative promoter activities as well as their range of activities varied widely.
One LTR, H6, displayed strong promoter activity in human (NTera2D1, 293, Hep2), monkey (COS-1), and mouse (3T3) cells. RNA mapping studies localized the transcription start site to the expected location in the H6 LTR. RTVL-H LTRs were also shown to contain sequences that could increase transcription from the human β-globin promoter and be influenced by SV40 enhancer sequences.

Over the course of these studies, I observed a difference in RTVL-H promoter activity in CV-1 and COS-1 cells (COS-1 cells are African green monkey kidney cells transformed by SV40 and CV-1 cells are their “untransformed” parent cell line). To examine the possibility that this effect was mediated by SV40 encoded proteins, LTR-CAT constructs were cotransfected with expression vectors encoding SV40 proteins. The results of these studies showed that SV40 T antigen could activate expression from certain RTVL-H LTRs 5-30 fold. In addition, this transactivation effect was observed in two CV-1 cell lines containing stably integrated LTR-CAT constructs. These results demonstrate that a known transforming protein can alter the transcriptional capabilities of RTVL-H LTRs.

The results of the above studies suggested that RTVL-H LTRs may have the ability to influence the expression of unrelated cellular genes. Thus, a differential screening strategy was used to identify five cDNA clones that appear to have been promoted from RTVL-H LTRs. In one case, clone AF-4, the LTR has been shown to be normally linked to a CpG island. Another clone, AF-3, appears to be a new insertion of an RTVL-H element into a large gene with a region of homology to yeast CDC4.
The third clone, AF-5, is the result of a splicing event between an RTVL-H element and a novel gene with two regions of homology with phospholipase A2.

Taken together, these results suggest a general evolutionary role for RTVL-H LTRs in the regulation of gene expression and raise the possibility that activation or rearrangements involving these sequences may alter the normal regulation of cellular genes and thus contribute to human disease.

TABLE OF CONTENTS

ABSTRACT
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
ACKNOWLEDGEMENTS

CHAPTER I INTRODUCTION
    Human Retroelements
    Murine Endogenous Retroviruses
    Human Endogenous Retroviruses
    RNA Polymerase II Promoters and the Regulation of Transcription
    Proviral Long Terminal Repeats
    Interactions of Retroelements with the Host Genome
    LTRs as Mutagenic Agents
    Thesis Objectives
    References

CHAPTER II MATERIALS AND METHODS
    Description of LTRs
    Plasmid Constructions
    Cell...
    Transfections
    Generation of PB-3 CAT Cell Lines
    CAT Assays
    RNA Isolation
    Primer Extension Analysis
    Probes
    Southern and Northern Analysis
    Library Screening
    DNA Sequencing and Computer Analysis
    References

CHAPTER III FUNCTIONAL HETEROGENEITY OF A LARGE FAMILY OF HUMAN LTR-LIKE PROMOTERS AND ENHANCERS
    Introduction
    Results
    Discussion
    References

CHAPTER IV SV40 LARGE T ANTIGEN TRANSACTIVATES THE LONG TERMINAL REPEATS OF A LARGE FAMILY OF ENDOGENOUS RETROVIRUS-LIKE SEQUENCES
    Introduction
    Results and Discussion
    References

CHAPTER V HUMAN ENDOGENOUS LTRS REGULATE THE EXPRESSION OF DIVERSE CELLULAR TRANSCRIPTS
    Introduction
    Results
    Discussion
    References

CHAPTER VI SUMMARY AND CONCLUSIONS
    Analyses of RTVL-H Promoter/Enhancer Activity
    Transactivation of RTVL-H LTRs by SV40 Large T Antigen
    Cellular Transcripts Regulated by RTVL-H LTRs
    Concluding Remarks

LIST OF TABLES

TABLE 1. Relative Promoter Activities of RTVL-H LTRs in Human Cells

LIST OF FIGURES

Figure 1-1. Schematic representation of the structure of a typical provirus compared with that of a typical RTVL-H element
Figure 1-2. Comparison of an RTVL-H LTR with a prototypical type C viral LTR
Figure 1-3. Some common mechanisms of LTR mediated mutagenesis
Figure 3-1. Construction of plasmids used to test the promoter activity of RTVL-H LTRs
Figure 3-2a,b,c. Assay of RTVL-H LTR promoter activity in different cell lines
Figure 3-2d,e,f. Assay of RTVL-H LTR promoter activity in different cell lines
Figure 3-3. Sequence comparison of RTVL-H LTRs
Figure 3-4. Bidirectional promoter activity of RTVL-H LTRs
Figure 3-5a. Primer extension analysis
Figure 3-5b,c. Mapping of transcription start sites
Figure 3-6. Activation of RTVL-H LTRs by the addition of a downstream enhancer
Figure 3-7. RTVL-H LTRs can enhance transcription from an upstream promoter
Figure 4-1. Assay of RTVL-H promoter activity in CV-1 and COS-1 cells
Figure 4-2. Competition analysis of the PB-3 LTR
Figure 4-3. SV40 large T antigen can increase the promoter activity of certain RTVL-H LTRs
Figure 4-4a. Deletion analyses of sequences involved in the T response. (a) Sequences of the H6, PB-3, and H7 LTRs (all reported previously, Mager, 1989)
Figure 4-4b,c. Deletion analyses of sequences involved in the T response. (b,c) The activity of the H6-46 CAT and H6CAT constructs is compared in (b) NTera2D1 cells, (c) COS-1 cells
Figure 4-5. Southern analysis of plasmid sequences transfected into COS-1 and CV-1 cells
Figure 4-6. Stimulation of different LTR subtypes by T
Figure 4-7. Trans-activation of stably integrated PB-3CAT constructs by SV40 large T antigen
Figure 5-1. Strategy for isolating cellular LTR promoted genes
Figure 5-2. 5' termini and structure of the NTera2D1 cDNA clones
Figure 5-3. Mapping analysis of clone AF-4
Figure 5-4. DNA sequence of clone AF-3
Figure 5-5. Amino acid sequence comparison of the putative AF-3 protein with conserved repetitive segments of the yeast CDC4 gene product (Yochem and Byers, 1987)
Figure 5-6. Detection of AF-3 related transcripts
Figure 5-7. Comparison of AF-3 and AF-3a
Figure 5-8. Identification of a spliced RTVL-H/genomic transcript
Figure 5-9. Nucleotide and predicted amino acid sequence of clone AF-5
Figure 5-10. Amino acid sequence comparison of the putative AF-5 protein with group I (bovine pancreatic) PLA2, group II (Crotalus atrox venom) PLA2, and a consensus of PLA2 sequences (from Kramer et al., 1989)
Figure 5-11. Detection of AF-5 related transcripts

ACKNOWLEDGEMENTS

I would like to take this opportunity to thank Dr. Dixie Mager for her enthusiastic support, encouragement, and guidance over the last 5 years. In addition to her role as my research supervisor, she has become a treasured friend and I know that our frequent discussions about Science (not to mention Star Trek and L.A. Law) will continue.

I am grateful to the members of my supervisory committee: Drs. Connie Eaves, Keith Humphries, and Ross McGillivray for their interest in my work and for their helpful suggestions. I am also grateful to the senior staff, postdoctoral fellows, graduate students and technical staff of the Terry Fox Laboratory for creating such a stimulating environment in which to work and for making my time here so enjoyable.

My friends in Vancouver (Doug, Janet, Nancy, Ali, Bonnie, Jackie, Heather, Patty, Margaret, and, of course, the members of the S2 society) deserve special mention for taking my mind off work when I needed it and for insisting that I had some human contact while I was writing up my thesis. I would especially like to thank Doug, for being the big (OLDER) brother I never had and for always being there when I needed something, from help in sequencing to defrosting my fridge.

Finally, I would like to thank my family.
Although they are on the other side of the continent, I have felt them with me every day.

To my parents: For your unconditional love, your never ending encouragement, and your generous support throughout the years and for always remembering to ask how dFl3 and the Northerns were doing. You still manage to surprise me! I never would have been able to do this without you.

To Sid: My best friend in the whole world, for taking the time to be there for me no matter how overworked and overtired you were (whether that meant just listening on the phone, reading my introduction over and over again, or travelling all night to see me because I said that I missed you). The hardest part of the last few years was being separated from you. If we got through this we can get through just about anything.

CHAPTER I: INTRODUCTION

Human Retroelements
    1. SINES
    2. LINES
    3. LTR Containing Sequences
        i) THE-1 Elements
        ii) Endogenous Retroviruses or Retrovirus-like Sequences
Murine Endogenous Retroviruses
    1. “True” Endogenous Retroviruses--MuLVs and MMTVs
    2. Provirus-like Elements
        i) Virus-like 30 (VL30) Sequences
        ii) Intracisternal A Particles (IAPs)
Human Endogenous Retroviruses
    1. Rare or Single Copy Type C Elements
    2. Multicopy Elements
    3. RTVL-H Elements
RNA Polymerase II Promoters and the Regulation of Transcription
    1. Promoter Structure
    2. Enhancer Structure
    3. Interactions Between Enhancers and Promoters
    4. The Role of Transcription Activating Proteins
    5. Rearrangements Involving Transcription Activating Proteins
Proviral Long Terminal Repeats
    1. LTR Synthesis
    2. LTR Structure
    3. Functional Studies of Endogenous LTRs
Interactions of Retroelements with the Host Genome
    1. Contribution of Retroelements to the RNA→DNA Flow of Other Sequences
    2. Insertions of Retroelements
    3. Changes in Gene Sequence
    4. Induction of Chromosomal Rearrangements
        i) Amplifications
        ii) Deletions
    5. Generation of Variability
    6. Significance
LTRs as Mutagenic Agents
    1. Insertional Activation by LTRs
        i) Promoter Insertion
        ii) Enhancer Insertion
        iii) Removal of Negative Regulatory Sequences
    2. Insertional Inactivation
    3. Significance
Thesis Objectives
References

CHAPTER I

Introduction

The eukaryotic genome contains a variety of repetitive sequences, ranging in length from a few hundred to several thousand base pairs (bp). The majority of these sequences have structures (such as direct terminal repeats at their 5' and 3' ends, 3' A rich tails, and/or sequence boundaries corresponding exactly to RNA species [Rogers, 1985]), suggesting that they arose through transposition via an RNA intermediate (retrotransposition). If so, retrotranspositions could account for the high copy number and dispersed nature of some of these elements. Indeed, it has been estimated that at least 10% of the human genome is occupied by RNA derived sequences (Rogers, 1985). As a group, these sequences are referred to as retroelements to denote their putative RNA origins.

In most cases, the specific functions, if any, of these sequences have not been defined. It has, however, become clear that eukaryotic genomes have been shaped by processes involving retroelements. For example, in humans various types of retroelements have been causally implicated in such structural changes as deletions (Higgs et al., 1989; Stoppa-Lyonnet et al., 1990; Mager and Goodchild, 1989), duplications (Maeda and Smithies, 1986; Barsh et al., 1983), and, more recently, transpositions (Kazazian et al., 1988; Morse et al., 1988). These findings have revealed that the arrangement of DNA in organisms is much more fluid than had previously been recognized and that considerable scope for adaptation can be gained from these DNA alterations. Indeed, in many cases, retroelement mediated changes have been shown to alter the regulation of cellular genes (Hawley et al., 1982; Ymer et al., 1985; Kongsuwan et al., 1989).
The long terminal repeats (LTRs) of endogenous retrovirus-like sequences may be particularly important in this regard. Since LTRs contain transcriptional control sequences, rearrangements involving these elements have the potential to affect the expression of neighbouring genes. This study attempts to characterize the transcriptional control sequences in the LTRs of a large family of human endogenous retrovirus-like sequences, RTVL-H, and to assess their impact on gene expression.

HUMAN RETROELEMENTS

The human genome contains several distinct types of retroelements. For the purposes of this discussion, these sequences will be classified as 1) short interspersed elements (SINES), 2) long interspersed elements (LINES), and 3) LTR containing sequences, including i) THE-1 elements, and ii) endogenous retroviruses or retrovirus-like sequences.

1. SINES

The first indication of the complexity of eukaryotic DNA came from the classic DNA renaturation experiments of Britten and Kohne (1968). In these analyses, Britten and Kohne showed that most higher organisms possess short stretches of moderately repeated sequences that are separated by longer sequences of low copy number. It is now known that Alu elements (so termed because many members contain a single site for the restriction enzyme AluI) account for the majority of the SINES present in primates. These short (~300 bp) sequences are thought to be retropseudogenes derived from 7SL RNA.
Alu elements have A rich 3' termini that vary in length among individual elements and typical internal RNA polymerase III promoter sequences, and together they constitute at least 6% of the human genome (Alus are present in 10^5 - 10^6 copies in the haploid human genome [Weiner et al., 1986; Mitchell et al., 1991]). The great abundance of these sequences has generally been interpreted as evidence for the transpositional competence of at least some members of this family (Weiner et al., 1986). Indeed, Alu elements are flanked by 5' and 3' direct repeats analogous to the target site duplications generated by bacterial transposon insertions (Weiner et al., 1986). This duplication is presumed to be generated by a staggered nick in the target site with the inserted sequences being joined to the separated genomic ends.

2. LINES

The major class of LINES in humans is the Line-1 (L1) family. L1s are flanked by short target site duplications, have no LTRs, and terminate in a 3' A rich tail (Weiner et al., 1986; Fanning and Singer, 1987). These sequences are present in 10 - 1O copies per haploid human genome; however, many L1 elements are severely truncated at their 5' ends. Consequently, the 3' ends of L1s are present at an approximately five fold higher copy number than are the 5' regions (Voliva et al., 1983; Weiner et al., 1986). The L1 consensus sequence has two open reading frames (ORFs) on the strand that terminates in the A rich tail, and one full length L1 element containing these two intact ORFs has recently been isolated (Dombroski et al., 1991). These ORFs are highly conserved among mammalian L1 families (Fanning and Singer, 1987). Taken together, these features, in addition to the similarity of regions of the polypeptide predicted by the 3' ORF (ORF 2) to known reverse transcriptases, suggest that L1s are capable of retrotransposition (Hattori et al., 1986; Swergold, 1990).
Indeed, new insertions of L1 elements have recently been reported (Kazazian et al., 1988; Morse et al., 1988).

A heterogeneous population of L1 related nuclear transcripts has been observed in various human and monkey cell lines. These transcripts have been found to be derived from both strands of L1 elements (Weiner et al., 1986; Skowronski et al., 1988). Moreover, many are much longer than the ~6 kb L1 unit and are linked to unrelated cellular sequences (Weiner et al., 1986). Thus, these transcripts are probably transcribed from the promoters of nearby cellular genes. An internal RNA polymerase II promoter has recently been discovered in some L1 elements (Swergold, 1990), suggesting that certain L1s may be able to drive their own expression. Indeed, full length (6.5 kb) sense strand L1 poly A RNA has been observed in the cytoplasm of the teratocarcinoma cell line NTera2D1 (Skowronski et al., 1988) but is not detectable in their differentiated counterparts (Skowronski and Singer, 1985). These transcripts appear to represent a subset of the total L1 array since they contain a distinct consensus sequence (reviewed in Boeke and Corces, 1989). Thus, the L1 family most probably consists primarily of defective sequences with a small number (one or more) of functional elements that can direct their own expression and that may be involved in retrotransposition. Interestingly, it has recently been shown that a mouse L1 element that is defective in ORF 2 is still capable of retrotransposition (Evans and Palmiter, 1991).

3. LTR Containing Sequences

i) THE-1 Elements

The THE-1 family consists of about i0 members per haploid human genome in addition to approximately 10^4 solitary THE-1 LTRs (Weiner et al., 1986; Paulson et al., 1985). These sequences are typically 3.2 kb in length and have five bp target site duplications (Weiner et al., 1986; Paulson et al., 1985). Full length THE-1 elements are flanked by 350 bp LTRs.
Since the majority of THE-1 derived transcripts have been found to be transcribed from other promoters, possibly by virtue of their location in other transcription units (Paulson et al., 1987), THE-1 LTRs do not appear to play a large role in the transcriptional regulation of these sequences. THE-1s do not have any internal homology to known retroviral genes and thus have been considered separately from the endogenous retroviruses or retrovirus-like sequences in this discussion. The observation of THE-1 DNA copies in extrachromosomal DNA circles (Paulson et al., 1985) suggests that they may be capable of retrotransposition.

ii) Endogenous Retroviruses or Retrovirus-like Sequences

The genomes of most, if not all, vertebrate species contain sequences structurally analogous to the integrated form of retroviruses (proviruses). As a rule, these elements contain regions of homology with one or more of the major retroviral genes: i) gag, which encodes the structural or matrix proteins that function in packaging of the viral genome into the final nucleocapsid particle, ii) pol, which encodes the replication related reverse transcriptase, protease, and endonuclease activities of the virus, and iii) env, which encodes the viral surface glycoproteins and transmembrane proteins (Jaenisch, 1983; Boeke and Corces, 1989). These sequences are flanked by LTRs that contain sequences required for the transcriptional regulation of the viral gene products, in addition to certain integration functions (Jaenisch, 1983; Boeke and Corces, 1989). Other similarities with retroviruses include the target site duplications characteristic of integration events, and the specific tRNA primer binding sites (PBS), located adjacent to the 5' LTR, that are required for reverse transcription (Boeke and Corces, 1989).
In light of the rapid rate of evolution of infectious retroviruses (1O fold greater than endogenous forms), the clear and extensive homologies between endogenous retroviruses and present day strains of replicating retroviruses have led to the hypothesis that the endogenous state is the normal evolutionary form of retroviruses (Doolittle et al., 1989), with exogenous sequences being a relatively short lived and ultimately non-perpetuating species (Doolittle et al., 1989; Shih et al., 1991). Such bursts of exogenous behaviour could provide one mechanism for potentially valuable genetic reassortment. Conversely, it has also been suggested that endogenous retroviruses are remnants of germ-line retroviral infections (Varmus, 1983). The truth probably lies somewhere between these two extremes.

Endogenous retroviruses differ from their exogenous counterparts in that they are transmitted vertically, as cellular genes. Although some of these sequences, the true endogenous retroviruses encountered most frequently in rodents, encode all the information required to form an infectious retroviral particle, the majority of these elements are defective in this regard and are not transmitted horizontally. Thus, these defective sequences are most accurately referred to as retrovirus-like elements.

Like the other types of retroelements, endogenous retroviruses or retrovirus-like sequences are often reiterated in the host cell genome and can range from one to one thousand copies. Thus, the total DNA of a given organism may contain thousands of retroviral genomes dispersed among the other cellular genes. Until quite recently, however, these sequences had not been analyzed extensively in humans.
In contrast, murine endogenous retroviruses (comprising up to 0.5% of the total mouse genome [Kozak, 1985]) have been well studied and murine endogenous LTRs have been among the first characterized.

MURINE ENDOGENOUS RETROVIRUSES

The genomes of laboratory mice contain a complex array of endogenous retroviruses. There are two main classes of “true” endogenous retroviruses, the Murine Leukemia Viruses (MuLVs) and the Mouse Mammary Tumor Viruses (MMTVs), that encode infectious viruses and that have been associated with high incidences of malignancies in different strains of mice. As far as is known, laboratory mice are unique in the high levels of participation of their endogenous viruses in disease processes (Coffin, 1985). It is likely that this relationship is a result of the original selection for high incidences of tumors when these lines were generated. This selection may then have been accompanied by the selection of a rare set of phenomena involving endogenous proviruses (Coffin, 1985). In addition to the MuLVs and MMTVs, mice also contain several families of endogenous retrovirus-like sequences, each containing between a hundred and several thousand members, that are not known to encode infectious viruses.

1. “True” Endogenous Retroviruses--MuLVs and MMTVs

Laboratory mice contain 40 - 60 MuLV related germline proviruses, dispersed throughout their genome, that can be activated to produce infectious virus (Kozak, 1985; Keshet et al., 1991). These sequences can be classified into two main categories on the basis of the host range conferred by their envelope genes, Ecotropic and NonEcotropic (which can be further subdivided into Xenotropic, Polytropic, and Modified Polytropic viruses [Stoye and Coffin, 1988]). Ecotropic MuLVs can infect mouse cells, but cannot infect cells of heterologous species, while Xenotropic viruses can infect cells of other species, but are not able to infect mouse cells (Kozak, 1985; Keshet et al., 1991).
In addition, polytropic and modified polytropic viruses confer a wider host range to the recombinant viruses often encountered in highly leukemogenic strains (Stoye and Coffin, 1988). As there are relatively few (0 - 6) ecotropic proviruses in the genome of most mouse strains, their genetic analysis has been relatively straightforward. However, due to their complexity and their higher copy numbers, the analysis of nonecotropic viruses has been more difficult. The recent development by Stoye and Coffin (1988) of oligonucleotide probes specific for the different classes of nonecotropic viruses has facilitated studies of the distribution and mutagenic potential of these elements (Frankel et al., 1990).

Various mouse strains can be distinguished by their different patterns of ecotropic viral expression. For example, some strains begin producing virus early in life, resulting in chronic viremia (a typical high virus strain is AKR [Risser et al., 1983]). Other strains either produce virus spontaneously later in life (the low virus strains) or do not produce infectious virus (negative strains). Ecotropic MuLVs can be experimentally induced in virus positive mice by a variety of mechanisms, such as treatment with irradiation, 5-azacytidine, and inhibitors of protein synthesis (Kozak, 1985); however, the efficiency of induction is lower in the low virus strains (McCubrey and Risser, 1982). In addition to the differences in patterns of viral expression, mouse strains are heterogeneous with respect to the numbers and the precise distribution of these proviruses in their genomes (Kozak, 1985), probably because of recent retroviral integrations. This idea is supported by the observation that novel insertions of proviruses in germline DNA can result from exogenous viral infection. For example, Jaenisch (1976) demonstrated that infection of preimplantation embryos with Moloney MuLV, in vitro, could result in new insertions.
Moreover, the analysis of high virus inbred strains has suggested that germline viruses can also produce new integrations. It has been estimated that new proviral copies appear in AKR mice at least every 12 generations of inbreeding (Buckler et al., 1982); however, these types of events have been rarely observed in low virus mice. Thus, it is likely that new integrations occur by a mechanism involving viral infection. It should also be stated that in addition to functional proviral MuLV copies, the mouse genome also contains defective MuLV copies not known to encode infectious viral particles. Interestingly, these defective proviral genomes can be complemented by recombination with other viruses to produce infectious viruses with altered characteristics.

Xenotropic MuLVs are present in about 30 - 60 copies per haploid murine genome (Frankel et al., 1990). Only a few of these endogenous proviruses are inducible as infectious viruses; the rest are thought to represent defective proviruses. Unlike the ecotropic MuLVs, sequences homologous to xenotropic env genes have been found in all strains of laboratory mice, both those that are virus inducible and those that are non-inducible (Stoye and Coffin, 1988). Genetic studies have shown that different strains have similar genomic organization of these sequences, suggesting that these sequences were present in the germline after speciation, but before the development of inbred strains (Hoggan et al., 1983). Indeed, the host range of these viruses would be expected to prevent the generation of new loci by reinfection and integration.
As with the ecotropic viruses, different mouse strains can be classified as high virus strains (that produce high titres in vivo as well as in cultured cells), low virus strains (that have a lower level of spontaneous expression, but that can be induced to produce virus in culture), or as lower, non-inducible strains (that only rarely produce virus in vivo) (Kozak, 1985).

As mentioned previously, high incidences of malignancy have been observed in many strains of laboratory mice. Results from a large number of studies on virus associated leukemogenesis have suggested that proviral insertion into specific chromosomal regions is an important step in the generation of many of these malignancies and that once they have inserted, the proviruses can activate specific sequences within or near the integration site. For example, the analysis of somatically acquired proviruses from naturally occurring mouse lymphomas, tumors induced by MCF and Moloney MuLV, and tumors associated with the expression of germline MuLVs has revealed that in many cases the proviruses inserted in a manner consistent with insertional activation of the gene pim-1 (Cuypers et al., 1984). Other studies have examined known mouse oncogenes for rearrangement or activation in retrovirus induced tumors. In many cases, these analyses have detected alterations in the c-myb and c-myc genes consistent with provirus induced insertional activation (Mushinski et al., 1983; Shen-Ong et al., 1984).

The MMTVs are distinguished from MuLVs by virion morphology, by absence of sequence homology and by their sensitivity to induction by glucocorticoids. As with the MuLVs, inbred mouse strains differ with respect to the number and chromosomal location of their MMTV proviral genes (Peters et al., 1986). Genetic studies have shown that few MMTV loci are expressed and the available data suggest that MMTVs represent a heterogeneous class of retroviruses, several of which are defective (Kozak, 1985).
These viruses can also integrate into mouse chromosomes and can be transmitted either vertically through the germline or horizontally as infectious agents through the milk of infected females (Coffin, 1985).

Increased expression of MMTVs has been linked with the development of mammary cancers in mice. Due to LTR activation by the glucocorticoid response element, most female mice express elevated levels of these viruses during pregnancy, shed high levels into their milk, and in this way infect their offspring (Hynes et al., 1984). As with MuLVs, the role of MMTVs in tumor formation is thought to be through the transcriptional activation of protooncogenes by insertional mutagenesis. Analysis of a number of MMTV integration sites in murine mammary tumors has revealed two common regions, int-1 and int-2, into which the majority (80 - 90%) of new proviral integrations have occurred (Dickson et al., 1984). Integration of MMTVs has also been linked to nonmammary neoplastic disease. For example, males of the high MMTV strain GR show high incidences of lymphoma late in life. In addition, somatically acquired copies of MMTV proviruses have been observed in other tumors.

2. Provirus-like Elements

i) Virus-Like 30 (VL30) Sequences

The VL30 gene family is made up of 100 - 200 provirus-like elements of approximately 5.2 kb, flanked by 600 bp LTRs (Keshet et al., 1991). Various VL30 elements have either a proline or a glycine PBS and are dispersed throughout the mouse genome (Keshet and Itin, 1982; Keshet et al., 1991). Although VL30s have internal gag and pol homologies, they do not contain an env equivalent region since the putative endonuclease domain of the pol gene ends precisely at the 3' LTR (Adams et al., 1988). This lack of a functional env gene may account for the inability of VL30s to leave the cell unless assisted by a helper virus.
Murine VL30s are not known to produce virion structural components; however, they can be efficiently packaged and transmitted as pseudotypes of MuLV viruses and can recombine with other viruses (Kozak, 1985). These putative recombination events may account for some of the observed cases of exogenous viral genomes and endogenous proviral genes containing sequences derived from VL30s and type C viruses. For example, Harvey and Kirsten sarcoma viruses contain rat VL30 and ras sequences that are flanked by sequences derived from MuLV (Kozak, 1985). In addition, hybrid structures with internal MuLV related gag and pol genes flanked by VL30 sequences and VL30 LTRs have been identified in mouse libraries (Itin and Keshet, 1983).

In general, VL30s have considerable sequence heterogeneity and only a very small subset is thought to be transcriptionally active in mouse cell lines (Keshet and Itin, 1982). Accordingly, studies with mouse NIH 3T3 derived VL30 cDNA clones have illustrated that a relatively homogeneous subset, differing primarily in LTR U3 regions, is expressed in these cells (Carter et al., 1983). In vivo, VL30 RNA is expressed in various adult mouse tissues (Harrigan et al., 1989) and is transcriptionally upregulated in transformed cells (Singh et al., 1985). In vitro studies have shown that VL30 expression is growth regulated and inducible by peptide growth factors (Singh et al., 1985).

ii) Intracisternal A Particles (IAPs)

Murine IAPs are non-infectious virus-like particles (resembling immature type B virions) found in mouse oocytes, preimplantation embryos, and certain mouse tumor cells (Kozak, 1985; Kuff and Lueders, 1988). These structures form by budding at the membranes of the endoplasmic reticulum into the cisternal cavities. IAPs have been found only intracellularly and the outer IAP shell is provided by the endoplasmic reticulum as a result of the budding process (Kuff and Lueders, 1988).
The particles contain polyadenylated RNA, have reverse transcriptase activity and can direct synthesis of the major IAP structural protein, p73 (Kuff and Lueders, 1988).

IAPs are encoded by a family of approximately 1000 elements distributed on all mouse chromosomes. The structural organization of these IAP genetic elements is similar to that of retroviral proviruses, with 300 - 500 bp LTRs flanking gag, pol, and env type regions, and a phenylalanine tRNA PBS (Kuff and Lueders, 1988). The internal regions show significant homology with several infectious retroviruses with Mg2+ dependent polymerases such as SRV-1 and MMTV (Mietz et al., 1987) and sequence similarities in the polymerase region with type B, D, and avian type C viruses have been observed (Chiu et al., 1985). Open reading frames encoding the p73 gag and p47 endonuclease proteins have been observed in different elements; however, the env region has been found to contain multiple stop codons in the many different IAP elements that have been sequenced in this region (Kuff and Lueders, 1988). Indeed, although the longest IAP elements could potentially encode an envelope glycoprotein of 40 kD, no IAP envelope protein has been observed.

IAP genetic elements can be categorized on the basis of structural differences into two groups, the type I and type II elements. The majority (~70%) of genomic IAPs fall into the type I class which is comprised of the full length 7.1 kb elements, a 5.4 kb variant, and several deleted forms (Kuff and Lueders, 1988). Type II elements are defined by a characteristic 500 bp insertion that is not present in the type I elements and have major deletions involving the gag and pol regions.

IAP element transcription is directed by the typical retroviral RNA polymerase II regulatory sequences within the LTRs (Lueders et al., 1984; Kuff and Lueders, 1988).
Low levels of IAP RNA expression occur during early mouse development and in many adult tissues; however, these sequences are most abundantly expressed in certain mouse tumor cells (Kuff et al., 1972) and in the thymus of young mice (Kuff and Fewell, 1985). Mouse IAP elements can actively transpose and in many cases these new insertions have been shown to alter expression of cellular genes at the insertion site (a number of these events will be reviewed in detail in the section “LTRs as mutagenic agents”). These findings illustrate that IAPs can be an important source of genetic variability in the mouse and suggest that analogous sequences in other organisms may be involved in similar events.

HUMAN ENDOGENOUS RETROVIRUSES

The initial findings suggesting the presence of endogenous retroviruses in human DNA were the numerous reports of retrovirus-like particles or reverse transcriptase activity in normal human placenta (Kalter et al., 1973; Dirkson et al., 1977), in oocytes (Larsson et al., 1981) and in teratocarcinoma cells (Boller et al., 1983). Furthermore, antigens of retroviral origin have also been detected in human placental cells (Suni et al., 1981; Maeda et al., 1983) and there have been many cases in which some evidence of retroviral activity was observed but no exogenous agent had been identified. Since these initial observations were made, several distinct families of retrovirus-like elements have been identified in the human genome both by hybridization with portions of known retroviruses (Martin et al., 1981; Bonner et al., 1982; Callahan et al., 1982; Ono et al., 1986) and/or putative primer binding sites (Harada et al., 1987), and by chance during investigations of unrelated cellular genes (Mager and Henthorn, 1984; Maeda, 1985). More recently, the polymerase chain reaction has also been used to detect endogenous retrovirus-related sequences (Shih et al., 1989).
The different families of human endogenous retroviruses use specific tRNA PBSs, range in copy number from one element to several hundred and, together with the THE-1 family, constitute a significant fraction (0.3 - 0.6% [Leib-Mosch et al., 1990]) of the human genome. Individual members of several of these families have been analyzed and, in most cases, these elements have the same fundamental structure as exogenous proviruses. Full length elements have 5’ and 3’ LTRs and homology to the three major retroviral genes gag, pol, and env. However, the human endogenous proviral elements examined to date usually contain deletions or have in frame termination codons in one or more of their genes and thus are defective for the formation of an endogenous retroviral particle. Nonetheless, many of these elements are expressed at the RNA level in different tissues, particularly in the placenta (Cohen and Larsson, 1988). This observation is of interest since the early electron microscope studies originally identified retrovirus-like particles in the placenta (Kalter et al., 1973; Dirkson et al., 1977) and in teratocarcinoma cells (Boller et al., 1983). Indeed, although the teratocarcinoma particles have been associated with RNA of the approximate size of a retroviral genomic dimer (Lower et al., 1987), the source of this RNA has not been identified and, as mentioned above, a human endogenous retrovirus that is capable of encoding complete viral particles or infectious virions has not yet been reported. This situation is in marked contrast to the many examples of infectious, fully functional, endogenous retroviruses that have been observed in the genomes of rodents and other vertebrates.
However, as human DNA contains a significant number of endogenous retrovirus-like sequences, only a small proportion of which have been analyzed, the possibility remains that a small number of these sequences may encode fully functional elements, or that the products required for formation of a retroviral virion and/or for retrotransposition may be supplied in trans by several defective elements.

1. Rare or Single Copy Type C Elements

Three single copy endogenous proviruses, ERV-1, ERV-3, and S71, have been identified in the human genome. Sequence analysis of ERV-1 provided the first evidence that human DNA contains retroviral elements. This provirus was originally isolated using the pol gene of an endogenous chimpanzee retrovirus, CH2, as a probe (Bonner et al., 1982) and was subsequently mapped to chromosome 18 (O’Brien et al., 1983). ERV-1 is an incomplete element because it lacks a 5’ LTR and a PBS. Related transcripts have not been observed in human placenta; however, it has not been established whether this element is expressed in a specific tissue or stage of development by an upstream cellular promoter (Larsson et al., 1989).

ERV-3 (HERV-R) is a 9.9 kb complete endogenous provirus that was isolated by homology with CH2 pol and baboon endogenous virus (BaEV) LTR probes (O’Connell et al., 1984). This element has 5’ and 3’ LTRs that contain the typical retroviral transcription regulatory sequences, an arginine PBS, retroviral gag, pol, and env genes, and is flanked by target site duplications characteristic of proviral integration (O’Connell and Cohen, 1984). Sequence analysis has revealed in-frame termination codons in the gag and pol genes and in the env transmembrane region. However, a 1940 bp open reading frame encoding a putative envelope glycoprotein has been identified (Cohen et al., 1985) suggesting that an env glycoprotein of 577 amino acids may be expressed in cells in which this provirus is transcribed. Three major ERV-3 related RNA species have been observed.
These transcripts all initiate at the same site in the 5’ LTR, and use the 5’ splice donor sequence to splice into env sequences. Two of these transcripts extend through the 3’ LTR and undergo a second splicing event into a cellular Kruppel-like gene (Kato et al., 1987; Kato et al., 1990). High levels of ERV-3 expression have been detected in placental chorionic villi (Kato et al., 1987; Larsson et al., 1989) and in most other tissues that have been analyzed. Interestingly, ERV-3 expression is significantly lowered in trophoblastic disease (Kato et al., 1988; Larsson et al., 1989), but the significance of this observation is not known.

S71 was originally isolated by hybridization with simian sarcoma associated virus probes. This element is an incomplete retroviral genome with type C gag and pol sequences and a single 3’ LTR (Leib-Mosch et al., 1986).

2. Multicopy Elements

A number of multicopy families of human ERVs have been characterized which range in size from approximately thirty to several hundred members. A number of these families are related to known vertebrate retroviruses, while others have short regions of homology with different retroviral regions. Several families of ERVs with homology to MMTV exist in the human genome. A subgroup of elements hybridizing with an MMTV gag/pol probe has been termed HERV-K (CUU) on the basis of their lysine PBS (Callahan et al., 1982; Ono et al., 1986). This family consists of retroviral genomes of 6 - 10 kb and is estimated to have approximately 50 members per haploid genome. Several members of the HERV-K family have been characterized, and a composite sequence from two separate clones (HERV-K10) has been derived (Ono et al., 1986). HERV-K10 is a complete retroviral genome of 9.2 kb, with 968 bp LTRs. As the LTRs of HERV-K10 differ by only 2 bp, this provirus may represent a relatively recent integration event.
Indeed, although HERV-K10 is apparently defective, it contains an open reading frame large enough to allow synthesis of full-length polymerase proteins, including reverse transcriptase (Ono et al., 1986; Leib-Mosch et al., 1990). An 8.8 kb HERV-K mRNA that hybridizes with gag, pol, and env probes has been observed in several cell lines. Expression of the HERV-K mRNA has been shown to be inducible by treatment with estradiol followed by progesterone (Ono et al., 1987). Since the HERV-K LTRs contain consensus glucocorticoid or progesterone response elements, it is likely that transcription of this RNA is regulated by its 5’ LTR.

A family of elements, HERV-E, related to MuLV and BaEV retroviruses is present in the genome in about 70 - 100 copies. This family consists of approximately equal numbers of 8.8 kb full-length proviral elements and truncated 6 kb genomes that lack an env gene (Repaske et al., 1983) in addition to solitary LTRs (Steele et al., 1984). The 8.8 kb class of HERV-E elements has a glutamic acid PBS and LTRs that contain typical transcriptional regulatory sequences. In fact, HERV-E mRNAs containing LTR and env sequences have been identified in the normal placenta, as well as in breast and colon carcinomas (Rabson et al., 1983; 1985; Gattoni-Celli et al., 1986). Sequence analysis of one HERV-E element, 4-1, has revealed a 1.3 kb open reading frame in the env region and in frame termination codons in the gag and pol genes (Repaske et al., 1985).

Because proline tRNA is the most commonly used primer for mammalian type C retroviruses (Harada et al., 1979), it is not surprising that three different families of elements containing a proline PBS have been identified (Harada et al., 1987).
These families are distinguished on the basis of LTR sequence differences and are termed HERV-P1, 2, and 3. The HERV-Ps were originally isolated by hybridization to an oligonucleotide probe complementary to a proline PBS (Harada et al., 1987; Kroger and Horak, 1987) and are each present in the human genome in 10 - 40 copies.

RTVL-I elements are 9 kb full length elements with 5’ and 3’ LTRs that have typical transcriptional regulatory sequences and a PBS homologous to isoleucine tRNA. This family was identified during unrelated studies of the human haptoglobin genes and is present in the haploid genome in 15 - 30 copies (Maeda, 1985). Interestingly, there is evidence for three independent insertions of RTVL-I sequences in the haptoglobin gene cluster of higher primates (Maeda and Kim, 1990).

3. RTVL-H elements

The RTVL-H family (the subject of this study) was originally detected by chance during the investigation of three naturally occurring deletions in the β-globin gene cluster (Mager and Henthorn, 1984). RTVL-H is the largest family of human endogenous retrovirus-like sequences and consists of approximately 1000 full length members in addition to several hundred solitary LTRs, that together constitute 0.2% of the human genome (Mager and Henthorn, 1984). RTVL-H elements are found in other primates, but are not detectable by hybridization in the genomes of non-primate species. In situ hybridization has shown that these elements are dispersed on all chromosomes with local concentrations on or near the chromosomal bands 1p31, 7q31, and 11p15 (Fraser et al., 1988). Prototypical RTVL-H sequences are 5.8 kb, with 5’ and 3’ LTRs of 400 - 450 bp that contain a TATAA box and a canonical polyadenylation signal in positions analogous to those found in other retroviral LTRs, a histidine PBS, and internal homology to retroviral gag and pol sequences (RTVL-H structure is shown schematically in figure 1-1) (Mager and Henthorn, 1984; Mager and Freeman, 1987).
No sequence similarity to known env genes has been detected in RTVL-H elements; indeed in the elements sequenced, the pol homology extends to just upstream of the 3’ LTR. Therefore, it is unlikely that RTVL-H elements contain a divergent env gene. Rather, although they were originally termed retrovirus-like, RTVL-H elements may be more similar to retrotransposon-like sequences such as murine VL30, Drosophila melanogaster copia, or the Ty elements of yeast. The large number of RTVL-H elements in the human genome suggests that they may be involved in the generation of rearrangements. This idea is supported by the recent identification of a deletion in two siblings that is the result of homologous recombination between two LTRs of an RTVL-H element (Mager and Goodchild, 1989).

Figure 1-1. Schematic representation of the structure of a typical provirus compared with that of a typical RTVL-H element. Arrows represent long terminal repeats. Regions of sequence similarity with retroviral gag, pol and env genes are shown by the boxes below each element. The open end of the RTVL-H gag box indicates that similarities with known gag genes have not been detected in this region.

RTVL-H expression occurs in a variety of cell lines as well as in some primary tissues. The highest levels of expression have been observed in the human teratocarcinoma cell lines NTera2D1, Tera1, and Tera2, the bladder carcinoma cell line 5637, HeLa cells, and in the amnion and chorion of the normal placenta (Wilkinson et al., 1990). Interestingly, expression is down regulated in Tera2 and NTera2D1 cells induced to differentiate with retinoic acid but is not affected in Tera1, which does not differentiate in response to retinoic acid (Wilkinson and Mager, 1990). Sequence analysis of cDNA clones has revealed that at least 13 different elements are transcribed in NTera2D1 cells (Wilkinson et al., 1990).
In all cells in which RTVL-H is expressed, the major RNA species is 5.4 kb and is detected by probes spanning the entire RTVL-H element. Analysis of cDNA clones has confirmed that this species represents a unit length genomic transcript. In addition to this major species, several smaller RNAs that represent splicing events have been observed as well as larger transcripts of unknown structure (Wilkinson et al., 1990). Northern analysis using single stranded RNA probes has shown that only the RTVL-H sense strand is transcribed at detectable levels in NTera2D1 cells (Wilkinson et al., 1990). In addition, primer extension analysis has shown that the major transcript initiation site maps to the expected location in the 5’ LTR (LTR function will be reviewed in a later section). Together, these observations strongly suggest that the transcription of RTVL-H elements is regulated by the transcriptional control sequences present in their LTRs. However, although the LTR capabilities of single copy elements can be directly extrapolated from Northern and primer extension analyses, the promoter abilities of a highly repetitive family of elements cannot be determined from these types of studies. Therefore, the transcriptional capabilities of individual RTVL-H LTRs are unknown.

RNA POLYMERASE II PROMOTERS AND THE REGULATION OF TRANSCRIPTION

The primary step at which the expression of viral and cellular protein coding genes is controlled is the initiation of RNA polymerase II transcription. The ability of retroelements to retrotranspose into new positions is dependent on the availability of an RNA copy that can act as a substrate for reverse transcriptase and thus is directly related to the transcriptional competence of the element (Curcio et al., 1990).
Furthermore, even in the cases of structural rearrangements that do not involve transpositions, the transcriptional capabilities of retroelements are important because newly introduced promoter and enhancer sequences may influence the expression of nearby genes. Therefore, the transcriptional control sequences of retroelements are largely responsible for their mutagenic capabilities.

Promoters and enhancers are the cis-acting DNA sequence elements containing the information required to regulate gene transcription. These elements have been shown to be modular in nature and are typically composed of discrete sequence motifs containing recognition sites for one or more transcription regulatory proteins (for reviews, Maniatis et al., 1987; Dynan, 1989; Muller et al., 1990). The primary roles of promoters and enhancers differ. In general, promoters are required for accurate and efficient transcript initiation, while enhancers modulate the rate of transcription from promoters. Recent studies have shown that the transcription factor binding sites of which these elements are composed share many properties and, in some cases, may be interchangeable. For these reasons, the line separating these elements is rapidly becoming blurred. Thus, there may be a unified mechanism by which these sequences influence gene expression.

1. Promoter Structure

Detailed molecular analysis of a number of different RNA polymerase II promoters has revealed a common structural organization. A typical promoter consists of an AT rich motif (the TATAA box) that is located approximately 30 bp upstream of the transcription initiation site. Specific mutagenesis of the TATAA box results in 5’ heterogeneity in start sites (Grosschedl and Birnsteil, 1980; Grosveld et al., 1981) and/or decreased initiation frequency (Grosveld et al., 1982).
Thus, this region is functionally analogous to the Pribnow box of prokaryotic promoters, in that it appears to be essential for accurately positioning and initiating the start of transcription. Promoters also contain a variable number of upstream promoter elements (UPEs), typically located 30 - 110 bp upstream of the start site (Dynan, 1989), that are involved in modulating promoter strength. The saturation mutagenesis experiments of Myers et al. (1986) have confirmed that these conserved TATAA and UPE sequences are, indeed, the functional elements mediating promoter activity of the β-globin gene.

A variety of UPEs from different genes have now been identified. These elements fall into two classes. Some UPEs, for example the CAAT and G/C boxes, appear in the promoter regions of many genes transcribed by RNA polymerase II and are thought to be important for constitutive expression. In addition, several less common UPEs have been implicated in mediating regulated expression (Mitchell and Tjian, 1989). Analyses of these elements have shown that the majority can act independently of their orientation (Kadonaga et al., 1986; Maniatis et al., 1987). However, the insertion of nucleotides between a UPE and the TATAA box, particularly the insertion of odd multiples of half a DNA turn, often results in a marked decrease in transcription levels (McKnight, 1982; Takahashi et al., 1986). These findings imply that the UPE must lie on the same face of the DNA helix as the TATAA box to be functionally active. This constraint presumably reflects an interaction between proteins bound to the UPEs and those bound to the TATAA box.

In addition to the “TATAA containing” promoters discussed above, there are many viral and cellular genes that do not contain obvious upstream TATAA boxes.
The promoters of these genes fall into two categories: the GC rich promoter elements associated with “housekeeping genes” (discussed in chapter 5) and the remaining promoters which have no TATAA homology and are not GC rich (for example, the promoters of the terminal deoxynucleotidyltransferase gene [Landau et al., 1984] and the T cell receptor β chain genes [Anderson et al., 1988]). Recently, a discrete initiator motif (Smale and Baltimore, 1989) has been shown to direct transcription initiation in some promoters of this second class.

2. Enhancer Structure

Enhancers were originally detected as cis-acting genetic elements that increased the rate of transcription from a promoter located at a distant site (Grosschedl and Birnsteil, 1980; Banerji et al., 1981; Fromm and Berg, 1983). These elements were shown to exert their effects over long distances (over 1000 bp), independent of their orientation or position relative to the transcript initiation site (Serfling et al., 1985). This mode of activity had little precedent in early studies of prokaryotic transcriptional regulation and generated much conjecture about the mechanism by which these elements function. Subsequent studies have shown that enhancers are, in fact, quite similar to promoters in that they are also modular elements composed of combinations of discrete transcription factor binding sites which in some cases are interchangeable with promoter elements. For example, the “octamer” sequence found within the immunoglobulin enhancer has also been identified as a component of several different promoters (Bohmann et al., 1987; Parslow et al., 1987). Furthermore, in experimental constructs, the presence of the SV40 enhancer immediately adjacent to the β-globin promoter decreases the effect of detrimental UPE mutations. At a more distant site, however, the SV40 enhancer loses its ability to substitute for UPEs (Treisman and Maniatis, 1985).
There are also many cases in which a single UPE, when multimerized experimentally, will enhance transcription from a remote location (Ondek et al., 1987; Courey et al., 1989). Thus, other than for a small number of UPEs (such as the CAAT box) that seem to occur in a preferred location and orientation, there appears to be little functional distinction between specific promoter and enhancer components.

The significance of the modular nature of enhancers is emphasized by the studies of Herr and Clarke (1986). In their experiments, specific SV40 enhancer modules were mutated such that transcription levels were reduced. The analysis of phenotypic revertants revealed a high incidence of duplications of the remaining unmutated motifs. Such studies have led to the general hypothesis that direct repeats are an essential component of enhancer structure. However, although direct repeats frequently occur in enhancers, many enhancers (e.g. polyoma virus strain A2, hepatitis B virus, and the majority of cellular enhancers) do not contain obvious repeats (Serfling et al., 1985). Thus, it might not be repetition per se but the presence of multiple transcription activator binding sites that is required for the enhancer’s characteristic “action at a distance”. It is conceivable that the presence of multiple protein-protein and protein-DNA interactions can stabilize the formation of transcription complexes among widely separated DNA elements. Taken together, the results of the above studies imply that enhancers are not so much physical entities with clearly definable boundaries but, rather, an effect exerted by a variety of different sequence motifs (Serfling et al., 1985).

It is generally accepted that enhancers and UPEs confer temporal, tissue specific, or constitutive patterns of expression on their linked promoters through interactions between bound transcription activating proteins and the transcriptional apparatus.
However, there has been much speculation about the precise mechanism by which these complexes can influence transcriptional activity at a distant site. The models which have most often been considered for this process are “looping” and “scanning”. In the looping model, transcription initiation is enhanced via the direct interactions of enhancers and UPEs with the promoter through specifically bound proteins, the intervening DNA looping to allow the interaction. In the scanning model, enhancers or UPEs are used as an entry site for the polymerase molecule or a transcription activating protein which scans along the DNA until it reaches the promoter. Two recent studies demonstrate that enhancers can stimulate transcription without being covalently linked to the promoter and hence support the looping model. Muller et al. (1989) have observed that the SV40 or cytomegalovirus enhancers can stimulate transcription from the rabbit β-globin promoter in trans when linked by a biotin-streptavidin bridge. Other experiments using interlocked circular sequences (one containing the promoter, the other the enhancer) have revealed that an RNA polymerase I enhancer also can stimulate transcription in trans (Dunaway and Droge, 1989). In either case, it is difficult to imagine a protein scanning across these constructs.

3. Interactions Between Promoters and Enhancers

The modular nature of promoters and enhancers is key to the regulation of transcription in response to diverse intracellular signals. An interesting example of interactions between different modular elements has been shown for the polyoma virus enhancer. Rochford et al. (1987) constructed a hybrid enhancer from polyoma virus and Moloney murine leukemia virus enhancer modules. Surprisingly, unlike either of the parent enhancers, the recombinant construct displayed a strong pancreas specificity.
These findings suggest that, beyond the contributions of individual modules, tissue specificity may be greatly influenced by specific combinations of modules. Regulated promoters and enhancers often contain modules that interact with constitutively expressed transcription activating proteins (e.g. Sp-1 binding sites are found in a number of different promoters [Dynan, 1989]). In these cases, they may act to prime the transcriptional apparatus for acute response to induction by regulatory proteins and/or to ensure low levels of basal transcription.

Analyses of numerous viral and cellular enhancers have revealed that they can confer tissue specificity on any promoter element to which they are linked (Dynan, 1989; Muller et al., 1990; Maniatis et al., 1987). Indeed, enhancers are now commonly used in gene transfer techniques to impose regulated patterns of expression on hybrid gene constructs. This inherent flexibility suggests that transcriptional activation is stimulated by a non-specific interaction between the transcriptional machinery and the transcription activating proteins bound to promoter and enhancer modules.

4. The Role of Transcription Activating Proteins

A large number of transcription factors which bind specifically to various promoter or enhancer components have now been identified. These factors fall into two groups. The general transcription factors, which include TFIIA, TFIIB, TFIID (the TATAA box binding protein), TFIIE/F, and RNA polymerase II, are involved in forming the preinitiation complex via an ordered assembly at the promoter (Buratowski et al., 1989; Lin and Green, 1991; Sharp, 1991). These factors are necessary and, in some cases, sufficient for the initiation of basal transcription. A second class of factors, the transcriptional activators, are sequence specific DNA binding proteins that stimulate transcriptional activity when bound to their recognition sites (UPEs and enhancer motifs [Ptashne, 1988; Mitchell and Tjian, 1989; Lin and Green, 1991]).
It is presumed that these activators can increase the rate of basal transcription via an interaction with one or more of the general transcription factors or the basal apparatus.

Analyses of these transcription activating proteins have revealed that they also have a modular organization, being composed of separable DNA binding and transcriptional activation domains. This structural scheme was first established for the GAL4 transcriptional activating protein of yeast. In their classic “domain swap” experiments, Brent and Ptashne (1985) replaced the DNA binding domain of GAL4 with that of a bacterial repressor (LexA). The resulting chimeric protein enhanced transcription from reporter constructs downstream of LexA binding sites. Further studies have shown that GAL4 can stimulate transcription in mammalian, insect, and plant cells as well as in yeast, provided that a GAL4 binding site is present (Ptashne, 1988). That DNA binding and activation domains are interchangeable has since been confirmed in many other systems (Courey and Tjian, 1988; Webster et al., 1988). Moreover, it appears that the precise positioning of these domains within the proteins is flexible. These findings suggest that each module represents an independent structural domain. While the presence of multiple structural domains in these proteins is not unexpected, it is remarkable that they can be mixed and matched with such flexibility. Indeed, this enormous flexibility is reminiscent of the variable positioning of UPEs and enhancer elements with respect to the transcript initiation site. The functional redundancy of these elements argues strongly against any dependence of the activating events on specific interactions between the promoter and activating domain. Rather, it is possible that UPEs and enhancers act to tether transcriptional activators to the DNA molecule, thereby increasing their local concentrations. It would follow that, in most cases, the particular activation domain used would be inconsequential.
This idea is consistent with the observation that the DNA binding and activator domains from most of the transcription factors exhibit a limited number of structural designs (Ptashne, 1988).

5. Rearrangements Involving Transcriptional Control Elements

The ability of the modular DNA and protein elements to be experimentally mixed and matched with such flexibility suggests that genomic rearrangements involving these sequences could have a tremendous effect on the regulation of nearby genes. Indeed, it has been proposed that these types of events may play a larger part in shaping genomes than do changes in gene sequence (Finnegan, 1989). Several cases in which genomic rearrangements have resulted in new patterns of gene expression have been reported. For example, the deregulated expression of c-myc observed in Burkitt’s lymphoma is a result of its translocation downstream of strong transcriptional promoters/enhancers of the immunoglobulin or T cell receptor loci. Other cellular genes which have been activated by this mechanism include bcl-2 (Cleary et al., 1986), ttg (McGuire et al., 1989), and interleukin-3 (IL-3) (Grimaldi and Meeker, 1989). Recently, the t(1;19) translocation observed in approximately 40% of all acute lymphocytic leukemias (ALLs) has been shown to fuse the 5’ end of the gene encoding the transcription factor E12 to the 3’ end of a putative homeoprotein, prl. This fusion results in the production of a chimeric transcription factor in which the helix loop helix DNA binding domain of E12 has been replaced by the homeodomain of prl (Nourse et al., 1990; Kamps et al., 1990). As a consequence, an activation domain may be fused to a DNA binding specificity not normally expressed in developing lymphocytes. The resulting protein could be a novel transcriptional activator important in the pathogenesis of this class of ALL.
Taken together, these observations emphasize the potential of promoter/enhancer elements and transcription factors to generate new patterns of gene expression in vivo. Indeed, in addition to these genomic events, gene activation is often a result of retroviral infection. In these cases, insertion of strong promoter or enhancer sequences within the retroviral LTR may alter the levels or specificity of expression of cellular genes. For example, the activation of Spi-1, a gene that encodes a DNA binding transcriptional activator similar to ets, by Friend virus may be an important step in the transformation of erythroid cells in Friend disease (Ben-David and Bernstein, 1991). As the human genome contains an extensive array of endogenous LTR-containing sequences, it is quite likely that these elements also may contribute to genomic change.

PROVIRAL LONG TERMINAL REPEATS

Proviral DNAs are flanked by long terminal repeats (LTRs) derived from the 5' and 3' ends of the viral genome. The LTRs are critical components of the retrovirus, since they contain all of the regulatory information controlling the initiation and termination of viral transcription, in addition to certain integration functions (Temin, 1981, 1982; Srinivason et al., 1984). Indeed, studies have shown that defective retroviruses or retrovirus-like sequences can be experimentally propagated when various viral proteins are supplied in trans, provided that these elements have complete, functional LTRs (Temin, 1982; Varmus, 1982). As retroviral LTRs often contain strong transcriptional promoters and enhancers, any rearrangements involving retroviruses will have the potential to drastically alter the expression of nearby genes. For this reason, the LTRs of endogenous retroviruses may also have profound effects on the genomes in which they reside.

1. LTR Synthesis

A typical retroviral LTR is divided into three regions: U3, R, and U5 (see figure 1-2). In this form, LTRs exist only in the proviral (integrated) DNA state, whereas retroviral genomic RNA has the structure R-U5-gag-pol-env-U3-R. Retroviral LTRs are formed during reverse transcription of the viral genomic RNA into a cDNA copy in the following way. Reverse transcription begins at the PBS, near the 5' end of the viral RNA, using a specific tRNA molecule as the primer. When the 5' end of the RNA is reached, the RNase H activity associated with the reverse transcriptase molecule digests the 5' end of the viral RNA so that the newly formed DNA copy with its attached primer can "jump" to the 3' end of the viral RNA and hybridize with the R region at the 3' terminus of the RNA. The reverse transcription reaction then continues through U3 to produce an almost complete strand of DNA (the minus strand). The result of this switch and extension is to add a U3 sequence to the R-U5 end of the DNA and thus produce the 5' LTR. A similar series of events adds a U5 segment to the 3' end, forming the 3' LTR (Temin, 1981; Varmus, 1983).

2. LTR Structure

The complete nucleotide sequences of a large number of exogenous and endogenous retroviral LTRs have been determined (Chen and Barker, 1984; Majors, 1990). Analysis of these sequences has revealed that a great deal of variability exists in the LTRs of even closely related strains of retroviruses. For example, the LTRs of different retroviruses can vary in size from 325 bp to 1300 bp (Temin, 1981). However, in spite of the overall differences, certain homologies in nucleotide sequence and in LTR structure are present (figure 1-2; Temin, 1981, 1982; Majors, 1990).

The observed differences in LTR sizes, in most cases, reflect the variability of the U3 region of LTRs, which, in mammals, can range from 342 bp in Feline sarcoma virus to 1197 bp in MMTV (Chen and Barker, 1984).
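The net effect of the template switches described under LTR Synthesis can be followed with a toy model. The sketch below is illustrative only: it treats the genome as an ordered list of region labels rather than nucleotides, it ignores the tRNA primer and RNase H steps, and the names RNA_GENOME and provirus_from_rna are hypothetical labels chosen for this example.

```python
# Toy model: regions are labels, not nucleotide sequences (an assumption
# made for illustration; no enzymology is simulated here).
RNA_GENOME = ["R", "U5", "gag", "pol", "env", "U3", "R"]


def provirus_from_rna(rna):
    """Return the region order of the provirus derived from a genomic RNA."""
    if rna[:2] != ["R", "U5"] or rna[-2:] != ["U3", "R"]:
        raise ValueError("expected an R-U5 ... U3-R genomic RNA layout")
    internal = rna[2:-2]
    # The first jump and extension add a U3 copy upstream of the 5' R-U5.
    five_prime_ltr = ["U3"] + rna[:2]
    # The analogous step adds a U5 copy downstream of the 3' U3-R.
    three_prime_ltr = rna[-2:] + ["U5"]
    return five_prime_ltr + internal + three_prime_ltr


print("-".join(provirus_from_rna(RNA_GENOME)))
```

In this model both ends of the provirus come out as U3-R-U5, which is the sense in which the two LTRs of a newly integrated provirus are identical.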
The presence of numerous promoter and enhancer-like sequences in U3 (Golemis et al., 1990) has led to the hypothesis that this region, in general, harbors the eukaryotic transcriptional control sequences. This theory has been substantiated by many studies in which retroviral LTRs have been used to drive the expression of a linked reporter gene in heterologous cells. The strongest enhancer activity attributed to several retroviruses has been localized to a 50 - 100 bp repeat in the 5' region of U3 (Gluzman, 1985) that often contains the SV40 core enhancer sequence, TG&pppA (Weiher et al., 1983) and/or the MMTV glucocorticoid response element AGAACA (Miksicek et al., 1986) in addition to other UPEs (Golemis et al., 1990). Thus, the U3 region appears to play a critical role in determining the tissue specificity and pathogenicity of many retroviruses. Indeed, it has been shown that the disease specificity determinant of Balb/c endogenous (DesGroseillers et al., 1983), Moloney (DesGroseillers and Jolicoeur, 1984) and Friend MuLV (Chotis et al., 1984) retroviruses lies within the enhancer region of U3 and that this specificity correlates with the transcriptional activity of the LTR in the appropriate cell type.

[Figure 1-2 labels: (a) PROTOTYPICAL LTR: U3 (250-450), R (60-70), U5 (50-80); TG inverted repeat, enhancers (50-100 bp repeat), CCAAT (40-50), TATAAA (24-26), promoter, GC CAP site, AATAAA poly A signal (15-17), CA poly A site, CA inverted repeat. (b) RTVL-H LTR: U3 (345), R (65), U5 (40); TGTCA inverted repeat, 47 bp repeats, CCAAA (46), TATAAA (24), GC, AATAAA (16), CA, inverted repeat (4/5 matches).]

Figure 1-2. Comparison of an RTVL-H LTR with a prototypical type C viral LTR. (a) Structure and conserved sequences of a mammalian type C retroviral LTR. Values in parentheses are numbers of nucleotides. (b) Sequence features of a typical RTVL-H LTR.

The 3' part of U3 typically contains a CCAAT box and a TATAA box located approximately 75 and 22 - 26 bp upstream of the transcription start site, respectively (figure 1-2; Temin, 1981, 1982).
As in eukaryotic genes, these sequences function to position the RNA polymerase II molecule and to determine promoter strength.

The beginning of the R region is defined by the transcriptional start (or CAP) site, usually GC. The polyadenylation signal AATAAA also appears in R and is followed 15 - 17 bp downstream by the polyadenylation site, CA, that marks the end of R. Under normal circumstances, the poly A signal in the 5' LTR is not used, allowing the generation of full length viral RNAs that terminate in the 3' LTR. Control of transcriptional termination is thought to involve secondary structure of the viral RNA (Temin, 1981, 1982). An inverted repeat observed at the U3 - R boundary of some LTRs has been implicated in this transcriptional termination (Temin, 1981) since this structure would only be present in RNA derived from the 3' LTR (Temin, 1982). In addition, the sequence TTGT is often found 10 - 25 bp 3' of R and may also play a role in transcriptional termination. A feature that all proviral LTRs share is a 4 - 15 bp inverted repeat that, in most cases, ends with the dinucleotide TG...CA (Temin, 1981, 1982).

3. Functional Studies of Endogenous LTRs

The LTRs of exogenous retroviruses contain some of the strongest transcriptional promoters and enhancers known (Coffin, 1985). Therefore, it has been of interest to determine the transcriptional capabilities of endogenous retroviruses. Of the different types of endogenous retroviruses, the LTRs of the murine elements have been the most extensively studied. Many of these sequences (e.g. IAP, VL30 and LTR-IS) are known to be present in hundreds of copies in normal mouse cells. This observation suggests that a large number of potentially functional retroviral transcriptional activators exist in mouse DNA. Analysis of the expression of many of these elements in various cell types has suggested that at least some of their LTRs are capable of transcriptional activity.
However, because of the repetitive nature of these sequences, it is extremely difficult to determine whether a large number of these LTRs are capable of low levels of transcription or whether all of the observed transcripts are derived from a subset of highly active elements. Thus, a number of studies have examined the transcriptional capabilities of individual LTRs linked to reporter genes. These studies have revealed that many endogenous murine LTRs are indeed capable of acting as promoters and/or enhancers of gene transcription (Lueders et al., 1984; Kohrer et al., 1985; Rotman et al., 1986).

For example, VL30 elements have been shown to be constitutively expressed in all differentiated mouse tissues; however, examination of different VL30 LTR sequences has revealed a high degree of diversity. Transient assays of five randomly isolated VL30 LTRs have shown variable, usually low levels of promoter activity (although one VL30 LTR had promoter activities five to six fold higher than the murine sarcoma virus LTR) (Rotman et al., 1986). Thus, it is likely that the functional variability is a reflection of the structural heterogeneity of these elements. This idea is supported by the fact that LTR sequence diversity has been localized to the U3 region (Eaton and Norton, 1990). Interestingly, further studies have shown that VL30 LTRs can be significantly (approximately twenty fold) stimulated in transient assays by co-expression of mutant ras genes (Owen et al., 1990). The expression from these LTRs is unaffected in a revertant cell line that is transformation defective.

Another middle repetitive family of endogenous LTR-like sequences, LTR-IS, also has transcriptionally competent LTRs. However, in this case, promoter abilities are weak and dependent on the proximity of enhancer sequences (Kohrer et al., 1985).

Murine IAPs have been the best studied of the endogenous LTRs.
Sequence analysis of over twenty different IAP LTRs (reviewed in Kuff and Lueders, 1988) has shown that they are quite variable in size and sequence. However, unlike most LTRs, the major IAP LTR sequence differences map to the R region. These differences are primarily due to duplication of a CT stretch of nucleotides downstream of the CAP site, which results in R regions ranging from 66 to 222 bp (Christy et al., 1985), and do not appear to correlate with IAP class. IAP LTRs contain the typical LTR transcriptional regulatory sequences indicated in figure 1-2 and several cloned IAP LTRs have been shown to be capable of promoting transcription of linked reporter genes in promoter assays. In one study, Lueders et al. (1984) showed that stably integrated IAP-LTR reporter gene constructs efficiently promoted gene expression in rat cells. Furthermore, in transient assays, randomly isolated LTRs were shown to have high levels of activity in COS-1 and CV-1 cells. Interestingly, the levels of activity observed in monkey cells were five fold higher than in mouse cells.

The promoter abilities of individual IAP LTRs have also been shown to vary considerably (Christy and Huang, 1988). IAP LTRs have been shown to be stimulated by a variety of agents, including dexamethasone (Emanoil-Ravier et al., 1988), and various nuclear oncogenes, including Simian Virus 40 large T antigen, c-myc, and p53 (Luria and Horowitz, 1986). The analysis of deletion mutants has shown that the U3 region houses the IAP promoter and that sequences downstream of the CCAAT box determine basal promoter activities while upstream sequences modulate the levels of transcription (Christy and Huang, 1988). The U3 regions of these LTRs contain several nuclear protein binding domains and the sequences of these sites are highly conserved among IAP LTRs that show significant divergence in other regions (Falzon and Kuff, 1988).
DNA methylation state correlates inversely with endogenous IAP expression and in vitro methylation can strongly suppress IAP promoter activity (Feenstra et al., 1986). Correspondingly, it has recently been suggested that methylation may alter the interaction of a purified DNA binding protein, EBP-80, with its recognition site in the IAP LTR (Falzon and Kuff, 1991).

INTERACTIONS OF RETROELEMENTS WITH THE HOST GENOME

The high copy numbers of many of the different types of retroelements have led to much speculation about their possible functions. Although it is possible that certain individual elements have evolved into a functional role, it is now generally believed that these sequences constitute "selfish DNA", i.e. sequences concerned only with their own propagation (Orgel and Crick, 1980). Such genetic components may be maintained simply because the rate at which their numbers are reduced by natural selection is balanced by the rate at which they direct their own multiplication and reinsertion into the host genome. Nonetheless, even if the different families of retroelements do not provide specific cellular functions, their potential to cause mutations may be advantageous to the host organism. In fact, the potential for beneficial variation created by increased mutation rates has been suggested as a reason for the tolerance of a large viral load in the mouse (Stoye and Coffin, 1988). Thus, in addition to the often detrimental effects of these sequences in causing disease, retroelements may play a large role in the generation of the genetic diversity required for evolution. This idea is supported by the observation that, in lower eukaryotes, rearrangements involving retroelements account for a large proportion of the new mutations encountered (Shapiro, 1983). In mammals, several instances of retroelement induced mutations have recently been observed.
Indeed, with the possible exception of base substitutions, all of the changes in gene structure and function that are required for evolution can be generated either directly or indirectly by retroelements. For example, new insertions can alter the expression of cellular genes and the reverse transcriptase activity thought to be encoded by some retroelements could potentially catalyze the RNA → DNA flow of information. In addition, gross chromosomal changes such as amplifications, deletions, inversions, and translocations could result from homologous recombinations between different copies of a retroelement. Such large scale changes in chromosomal architecture may alter the pattern of inheritance of groups of genes by recombining linkage groups or by altering the frequency of recombination among genes (Finnegan, 1989).

1. Contribution of Retroelements to the RNA → DNA Flow of Other Sequences

The structure of sequences such as SINEs and processed pseudogenes (specifically the flanking target site duplications, the lack of introns, and/or the presence of A rich 3' termini) suggests that they have inserted into the genome after reverse transcription of an RNA copy. Because these sequences do not encode functions that could be responsible for reverse transcription and integration, it is generally believed that reverse transcriptase activities donated by other retroelements are responsible for the dispersion of these elements (Finnegan, 1989). This idea is supported by the recent identification of a reverse transcriptase activity from NTera2D1 cells (Deragon et al., 1990). Preliminary studies suggest that this activity is the reverse transcriptase recently shown to be encoded by the L1 ORF2 (Dombroski et al., 1991; Mathias et al., 1991).
This discovery raises the possibility that sequences expressed in NTera2D1 cells and possibly in analogous normal primitive cells, but which do not encode their own reverse transcriptase activity, may nevertheless be involved in the reverse flow of genetic information.

2. Insertions of Retroelements

As discussed previously, retroelements are so termed because they are believed to have been amplified and dispersed throughout the genome as a result of an RNA mediated process. In addition, many human retroelements are structurally similar to the known retrotransposons of yeast and Drosophila. Together, these observations have led to the general belief that at least some modern day human retroelements are capable of transposition. Nevertheless, it is only recently that the first examples in humans of de novo retroelement insertions have been reported. For example, two independent L1 insertions into exon 14 of the factor VIII gene have been identified among 240 individuals with hemophilia A (Kazazian et al., 1988). In another case, a tumor specific insertion of an L1 element into the second intron of c-myc has been observed in a breast carcinoma (Morse et al., 1988). These L1 insertions were both detected during the characterization of mutations of genetic loci implicated in the pathogenesis of a particular disease. Because human genetic studies are biased towards the analysis of detrimental mutations, germ-line retroelement insertions that do not confer a detectable phenotype and somatic insertions that do not confer a growth advantage will often be missed. However, the advent of more sensitive techniques for genetic analysis may reveal that such events are much more common than had previously been expected.

3. Changes in Gene Sequence

Retroelements may directly affect gene sequence by inserting into coding regions and thereby altering splicing patterns, causing frameshift mutations, or causing premature terminations.
Conversely, point mutations of retroelements naturally residing within genes may mutate the gene indirectly. One interesting example has recently been identified as a causative mutation in gyrate atrophy of the choroid and retina. In this case, an Alu sequence normally found within intron three of the ornithine δ-aminotransferase gene has undergone a G → C transversion that created a splice donor site. The presence of this new splice donor site resulted in the use of an upstream cryptic splice acceptor site in the Alu element, which in turn resulted in a splice mediated insertion of the Alu element into the mRNA transcript. The result of this mutation was inactivation of the gene (Mitchell et al., 1991). This finding illustrates a mechanism whereby point mutations of an endogenous retroelement can have a profound influence on the expression of a cellular gene.

4. Induction of Chromosomal Rearrangements

Although the mechanisms described above can play a role in retroelement mediated mutation, by far the most frequently encountered cases of retroelement mutations have been caused by homologous recombination. Indeed, the high copy number of many retroelement families coupled with their dispersal suggests many opportunities for the generation of DNA rearrangements. Depending on the position and orientation of the retroelement involved, homologous recombination can result in deletion, duplication, translocation, or inversion. For example, intrachromatid recombination between sequences lying in the same orientation at different sites on a chromosome will delete the intervening DNA. When homologous recombination occurs as an interchromatid event, the intervening segments are deleted from one chromatid and duplicated on the other. Reciprocal recombinations between sequences lying in opposite orientations can result in inversion of the internal DNA.
When the recombination event involves sequences on different chromosomes, the result is either a dicentric chromosome and a chromosomal fragment or a reciprocal chromosomal translocation. Several examples of retroelements that have been involved in such structural changes have now been described.

i) Amplifications

Since duplicated genes can be modified by mutation to encode slightly different products or to have different patterns of expression without depriving the organism of the original gene product, amplification may open new avenues for evolution (Finnegan, 1989; Stavenhagen and Robins, 1988). There have been several cases of retroelement mediated amplifications. For example, the γ-globin genes of Old World monkeys appear to have resulted from recombination between L1 elements flanking the ancestral gene (Maeda and Smithies, 1986; Fitch et al., 1991). Duplications of low density lipoprotein (LDL) receptor exons in hypercholesterolemia have been attributed to recombination between Alu elements (Lehrman et al., 1987). In addition, the human growth hormone and chorionic somatomammotrophin gene cluster may have evolved from a single ancestral gene as a result of recombination between flanking Alu elements (Barsh et al., 1983).

ii) Deletions

Homologous recombinations between repeated elements have been causally implicated in a large number of the deletions analyzed. Recombinations between Alu elements are believed to account for many of the deletions in the α-globin gene cluster that cause thalassemia (Higgs et al., 1989) and in the complement C1 inhibitor gene that cause angioedema (Stoppa-Lyonnet et al., 1990). A frequent type of recombination event involving endogenous retroviruses is reciprocal crossing over between the homologous LTRs at either end of the element. This recombination results in deletion of the internal region of the provirus and leaves behind a solitary LTR.
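The LTR-LTR crossover just described can be sketched on a list of labelled segments. This is a minimal illustration under stated assumptions: the segment names, the function recombine_direct_repeats, and the choice to place the crossover at the start of each repeat are hypothetical simplifications, since a real crossover can occur anywhere within the homologous LTRs.

```python
def recombine_direct_repeats(segments, repeat="LTR"):
    """Model an intrachromatid crossover between two direct repeats.

    Returns (chromosome, excised): the locus left on the chromosome and
    the fragment lost from it. Segment labels are illustrative only.
    """
    positions = [i for i, name in enumerate(segments) if name == repeat]
    if len(positions) < 2:
        raise ValueError("need two copies of the repeat to recombine")
    first, second = positions[0], positions[1]
    # Everything from the first repeat up to (not including) the second
    # is looped out, so one repeat copy stays behind on the chromosome.
    chromosome = segments[:first] + segments[second:]
    # In this model the excised fragment carries the internal region
    # together with the other repeat copy.
    excised = segments[first:second]
    return chromosome, excised


provirus_locus = ["flank_5", "LTR", "gag", "pol", "env", "LTR", "flank_3"]
chromosome, excised = recombine_direct_repeats(provirus_locus)
print(chromosome)  # ['flank_5', 'LTR', 'flank_3']
print(excised)     # ['LTR', 'gag', 'pol', 'env']
```

The chromosome retains a single LTR between the original flanking sequences, which is the configuration referred to in the text as a solitary LTR.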
The human genome contains a large number (~1000) of solitary RTVL-H LTRs that are unassociated with full length elements. It is likely that most of these elements are the result of homologous recombination between the 5' and 3' LTRs of intact RTVL-H elements because transposition of solitary LTRs has not been described. This idea is supported by the recent identification of an RTVL-H variant in which two siblings have a solitary LTR at a locus normally containing an intact element (Mager and Goodchild, 1989). Moreover, RTVL-H elements were originally detected because one member is located near the breakpoints of three naturally occurring deletions in the human β-globin gene that cause thalassemia or hereditary persistence of fetal hemoglobin (Mager and Henthorn, 1984). Although the role of RTVL-H in the generation of these β-globin deletions is unknown, these observations together with the large number of both solitary RTVL-H LTRs and full length RTVL-H elements in the genome raise the possibility that RTVL-H elements may be frequently involved in such events.

5. Generation of Variability

The presence of large numbers of retroelements provides an enormous resource for genetic variability that goes far beyond the usual chromosomal changes and base mutations. There is some evidence in Drosophila that retroelement insertions represent a major fraction of the total spontaneous mutation rate (Temin and Engels, 1984). It has been suggested that in most cases where two closely related species differ greatly in their DNA content, the difference occurs primarily in the moderately repetitive components. Indeed, divergence between species might involve the gradual conversion of moderately repetitive to unique sequences through mutation or the acquisition of a new function (Temin and Engels, 1984). For example, the promoter of the human salivary amylase gene has been shown to be provided by an upstream γ-actin pseudogene.
Interestingly, an endogenous LTR has also been identified upstream of this gene (Emi et al., 1988). This structure is not present in the mouse salivary amylase gene and may, in fact, be responsible for differences in the patterns of expression of this gene observed between these organisms. Additionally, the role of retroelements in providing general variability is supported by the observation of polymorphisms, particularly those involving Alu elements, that have not been associated with any particular phenotype (Matera et al., 1990).

6. Significance

The above examples illustrate the potential of retroelements to generate genomic diversity either by retrotransposition or by the induction of gross chromosomal changes. One important consequence of these types of rearrangements is the juxtaposition of transcriptional regulatory sequences within the retroelement, particularly those that contain LTRs, with new combinations of cellular sequences. These alterations may have profound effects upon the regulation of nearby genes, altering either the timing or tissue specificity of their expression. When such changes occur in the germline, they may provide the variation necessary for evolution. In somatic cells, many of these events will result in detrimental mutations and thus may be important in the pathology of certain diseases.

LTRS AS MUTAGENIC AGENTS

Retroviruses and retrovirus-like sequences may be particularly competent mutagens. Since these sequences are flanked by LTRs containing control sequences for transcriptional initiation and termination, it is likely that structural changes such as those described above will affect the regulation of cellular genes in the vicinity of the rearrangement. Thus LTRs may exert a wide range of effects on surrounding cellular sequences.

The first indications that retroviruses can be powerful mutagens came from studies of certain tumors induced by exogenous retroviruses, such as avian leukosis virus (ALV).
In many cases, the tumors induced by these viruses were shown to be clonal descendants of a single viral integration event. Furthermore, the viral integration sites were conserved among different tumors induced by a particular retrovirus (Hayward et al., 1981; Neel and Hayward, 1981). Similarly, studies on the arrangement of endogenous MuLVs in inbred mouse strains revealed that distinct integrations of these sequences were associated with several heritable mutations (Jenkins et al., 1981; Jaenisch et al., 1983; Stoye et al., 1988). Together, these findings implied that the site of proviral insertion was causally related to the generation of these mutations. Since these early observations, there have been many further examples of retroviral insertional mutagenesis, most commonly resulting in malignancies, and the mutagenic potential of these sequences has been widely accepted. The analysis of a large number of these proviral insertions has identified several common mechanisms by which these events may activate or inactivate nearby cellular genes (illustrated schematically in figure 1-3 and discussed below). These mechanisms are largely governed by the unique properties of proviral LTRs, and their result depends on the nature of the insertion with respect to different structural and functional domains of the affected gene.

[Figure 1-3 panel labels: Insertional Activation: 1. Promoter Insertion; 2. Enhancer Insertion; 3. Replacement of Negative Regulatory Elements. Insertional Inactivation: 1. Gene Disruption (frameshift mutations, disrupted splice signals, premature polyadenylation).]

Figure 1-3. Some common mechanisms of LTR mediated mutagenesis. Lines flanked by small black arrows (LTRs) represent proviral genomes. The box containing an arrow in part 3 represents a negative regulatory element. Insertional Activation: 1. Promoter insertion occurs when a retroviral LTR inserts in the 5' region of a gene, in the same transcriptional orientation, and overrides transcriptional signals in the normal gene promoter. 2. Insertion of a provirus either downstream of the gene promoter or upstream in the opposite transcriptional orientation often alters tissue specificity or levels of gene expression via enhancers present in the LTR sequence. 3. Either of these mechanisms can be accompanied by the displacement of negative regulatory elements that typically down regulate gene expression. Insertional Inactivation: Proviral integration into the coding regions of a gene can result in inactivation by causing frameshift mutations, by disrupting splice signals, or by causing premature polyadenylation via polyadenylation signals present in the LTRs. It should be noted that results of retroviral integrations are also dependent on the region into which they insert. For example, premature polyadenylation has the potential to activate expression (see text for examples) and promoter insertion may shut down expression in a cell type in which the LTR promoter is not active.

1. Insertional Activation by LTRs

i) Promoter Insertion

Mutagenesis by promoter insertion was first proposed to explain the activation of the c-myc gene in ALV induced avian bursal lymphomas (Hayward et al., 1981). The observation that in virtually all of these tumors, the ALV proviral sequences were located either within or adjacent to the c-myc gene led to the hypothesis that strong transcriptional promoters within the ALV LTR were responsible for the elevation in levels of c-myc RNA. This model was supported by the discovery of chimeric myc transcripts initiating within the ALV LTR (Hayward et al., 1981).
Since these initial observations, there have been many further examples of gene activation by retroviral promoter insertion and a number of common features have emerged.

Promoter insertion most commonly occurs when a retroviral genome integrates upstream or in the 5' region of a gene, in the same transcriptional orientation, allowing transcription initiating in either the 5' or 3' LTR to drive expression of the gene. When the 3' LTR is used, the integrated proviruses often have deletions in the proviral genome that remove or inactivate the 5' LTR (van Lohuizen and Berns, 1990), thus relieving transcriptional overlap interference (Cullen et al., 1984) and allowing transcription from the 3' LTR. Similarly, homologous recombination between the LTRs of an endogenous proviral element may remove transcriptional constraints on the 3' LTR and thus stimulate the expression of a downstream gene. For example, Friend murine leukemia virus has been implicated in the activation of the c-Ki-ras proto-oncogene in the bone marrow derived murine cell line 416B. Primer extension and DNA sequence analyses have identified transcripts that initiate in the 3' LTR of an integrated Friend provirus with a deleted 5' LTR, suggesting that the 20 - 30 fold enhancement of c-Ki-ras expression in these cells is due to promoter insertion. This rearrangement has not disturbed the normal coding potential of the gene (Trusko et al., 1989).

There have been several instances in which rearrangements involving endogenous provirus-like sequences have activated cellular genes. For example, in one case, an insertion of a 4.2 kb IAP element into codon 88 of the c-mos gene in the mouse plasmacytoma XRPC24 (Kuff et al., 1983) resulted in its activation.
Indeed, this rearranged c-mos locus could transform NIH 3T3 cells, while the unrearranged gene could not (Canaani et al., 1983). Although this IAP insertion was in the opposite transcriptional orientation to the c-mos gene, chloramphenicol acetyltransferase (CAT) promoter assays revealed that this particular LTR can promote gene expression in both transcriptional orientations (Horowitz et al., 1984). Furthermore, two RNA transcription start sites were mapped to the LTR/mos junction and to 10 bp upstream of this junction within the IAP LTR. These observations suggest that in these cells the c-mos gene is under the control of a cryptic promoter functioning in the 3' → 5' orientation with respect to the IAP LTR. An IAP insertion has also been shown to activate expression of a cellular gene in another mouse plasmacytoma, MPC11. Here, the IAP inserted in the same transcriptional orientation, 18 bp upstream of the putative interleukin-6 (IL-6) transcriptional start site, displacing the normal IL-6 promoter to a position 2.5 kb upstream of its normal position, and resulting in the constitutive expression of the IL-6 gene in these cells (Blankenstein et al., 1990).

The oncomodulin gene in the rat is an interesting example in which an LTR has evolved into the normal promoter of a cellular gene. Studies have shown that the oncomodulin gene is normally controlled by an LTR promoter related to IAPs (Banville and Boie, 1989; Furter et al., 1989).
The structural and sequence similarity of oncomodulin to the parvalbumin gene suggests that this gene arose by duplication of an ancestral gene, with a subsequent rearrangement placing the LTR upstream of one copy in the same transcriptional orientation (Banville and Boie, 1989). If so, this scenario may be responsible for the observed differences in tissue specificity between these genes.

In addition to the examples of promoter insertions outlined above, there have also been reports in which a different type of chimeric transcript initiating in the 5' LTR of a provirus has been identified. In these cases, inefficient processing at the 3' LTR poly A site and the subsequent splicing to exons in a flanking gene have resulted in the generation of fusion transcripts. Some examples have been caused by ALV splicing into the c-erbB gene in chicken erythroleukemias (Goodwin et al., 1986) and by MuLV splicing into exon 1 of the c-myb proto-oncogene in mouse lymphosarcomas (Shen-Ong et al., 1986). In humans, similar events include the regulation of a Kruppel-like gene by the ERV-3 endogenous retrovirus (Kato et al., 1990) and the recent identification of an RTVL-H element spliced into the calbindin gene in a cell line derived from a prostate metastasis (Liu and Abraham, 1991).

It should be stated that in addition to the changes in transcriptional regulation of the downstream gene mediated by promoter sequences in the LTRs, these events may also alter the fundamental properties of the gene product by removing important upstream regions. For example, in cases where 5' RNA sequences are removed, important coding regions may be lost from the rearranged gene and/or the translation initiation site may be altered.
The effect of these events on the cellular gene depends on the specific gene involved, on the LTR signals used, and on the location of the insertion.

ii) Enhancer Insertion

The positional flexibility and the ability of enhancers to function over long distances suggest that in addition to the mechanisms described above, retroviral LTRs should have the ability to activate cellular genes at distant sites. Indeed, enhancer activation is a commonly identified mechanism of retroviral insertional mutagenesis and has been proposed to explain activation of c-myc by ALV in some chicken bursal lymphomas (Payne et al., 1982), of the int genes by MMTV in murine mammary carcinomas (Nusse et al., 1984) and of the pim-1 and c-myc genes in MuLV induced T cell lymphomas (Cuypers et al., 1984). In these cases, the proviruses are typically positioned such that the LTR promoter sequences are not used (i.e. either at the 3' end of the gene, or at the 5' end in the opposite transcriptional orientation). In most cases, the effect of these insertions is the elevation of expression of otherwise unaltered genes, often in tissues where these genes are normally silent.

There have also been a number of cases in which endogenous retrovirus-like sequences have activated cellular genes by enhancer insertion. A well known example is that of the interleukin-3 (IL-3) gene in the murine cell line WEHI-3B. In these cells, the constitutive expression of IL-3 has been shown to be caused by insertion of a 5.2 kb IAP element 215 bp upstream of the IL-3 TATAA box, in the opposite orientation (Ymer et al., 1985). The observation that WEHI-3B cDNA clones initiate at the same site as those derived from other cell lines suggests that the WEHI-3B IL-3 transcripts are transcribed from the normal IL-3 promoter, which is intact in the rearranged gene (Ymer et al., 1985).
Consequently, activation may be due to the displacement of a silencer that has been identified 1376 - 1841 bp upstream of the murine IL-3 transcript initiation site (Lee, 1989), to the presence of transcriptional enhancers within the IAP LTR, or to a combination of both of these mechanisms. Intriguingly, WEHI-3B cells also carry an IAP mediated activation of the Hox 2.4 homeobox gene. In this case, the IAP element has inserted into exon 1 in the same transcriptional orientation, and promoter sequences in the 3’ LTR are thought to drive expression of a fused and truncated IAP/Hox 2.4 mRNA encoding a normal Hox 2.4 protein (Blatt et al., 1988; Kongsuwan et al., 1989). Recently, gene transfer experiments have shown that expression of a combination of the IL-3 and Hox 2.4 genes elicited myeloid leukemia in normal bone marrow cells, while expression of the IL-3 gene alone did not have this effect (Perkins et al., 1990). This finding raises the interesting possibility that the combination of these two IAP mediated activation events may be responsible for the transformation of WEHI-3B cells. In addition to these examples, there have also been a number of reports in which insertions of IAP elements have activated genes encoding hemopoietic growth factors such as GM-CSF and IL-3 (Stocking et al., 1988; Durhsen et al., 1990; Heberlain et al., 1990) and, in this way, have transformed hemopoietic cells to factor independence.

An example in which a proviral LTR has evolved into the normal enhancer of a cellular gene is the mouse sex limited protein gene. This gene is thought to have evolved through duplication of the gene encoding the 4th component of complement (C4) and is androgen dependent due to enhancers present in a provirus-like LTR located 2 kb upstream (Stavenhagen and Robins, 1988).

The murine N-myc gene contains a common proviral insertion site in MuLV induced T cell lymphomas.
In this case, the majority of insertions occur in the 3’ untranslated region and result in a 3’ truncated N-myc RNA that encodes an unaltered N-myc protein (i.e. transcription begins at the normal myc promoter but terminates in the 3’ LTR, which provides a novel poly A signal). The strong clustering of proviral integrations in this 100 bp region suggests that, in addition to enhancer donation, this premature polyadenylation may remove a negative regulatory element at the 3’ end of the gene (VanLohuizen et al., 1989).

iii) Removal of Negative Regulatory Signals

This mechanism is not easily distinguishable from enhancer insertion; indeed, there have been several cases in which both mechanisms may operate simultaneously. For example, the role of deletion of sequences in the 3’ untranslated region of the N-myc gene by MuLV insertions has been discussed above. Similarly, the activation of pim-1 by MuLVs in T cell lymphomas often involves the removal of AU rich motifs that can confer instability on the pim-1 mRNA (VanLohuizen and Berns, 1990).

2. Insertional Inactivation

LTR mediated insertional inactivation may be the result of premature polyadenylation by signals in the LTR, or of lower or altered levels of LTR driven transcription. Non-LTR mediated mechanisms include alterations in gene splicing and frameshift mutations. In general, proviral integration into or close to a gene is more likely to be deleterious than to activate expression. However, as these rearrangements typically occur in only one allele, any adverse effects may be compensated for by the remaining unaffected allele on the other chromosome. Indeed, even if both copies of a gene are inactivated, in somatic cells such recessive events are unlikely to be recognized unless they confer a growth advantage. Nonetheless, certain cases of insertional inactivation have been recognized.
For example, a recessive lethal mutation in Mov13 mice is caused by an MoMuLV insertion in the first intron of the α1(I) collagen gene. This insertion is thought to be responsible for the ~20 - 100 fold decrease in transcription, via a change in chromatin structure, since the locus containing the insertion lacks a DNAse I hypersensitive site present in the unrearranged allele. The observations that the integrated provirus is transcriptionally inactive in these cells, and that no transcripts initiating in the 3’ LTR can be detected, are consistent with this idea (Hartung et al., 1986).

In another study, the κ light chain genes from two hybridoma cell lines defective in κ light chain synthesis were compared with the wild-type genes. The mutant genes were found to have IAP sequences inserted into one of their introns, possibly resulting in the lowered levels of κ chain expression in these lines (Hawley et al., 1982; 1984). Similarly, molecular analysis of elements inserted into mouse actin pseudogenes revealed insertions of several different LTR-like elements (Min Man et al., 1987).

3. Significance

These examples clearly suggest that endogenous retroviral LTRs contain sequences that are capable of controlling the expression of cellular genes and thus that rearrangements involving these elements may alter the regulation of nearby genes. Such changes may be detrimental, as in the cases of proto-oncogene activation that result in malignancy, or they may cause beneficial changes that are maintained in evolution.

THESIS OBJECTIVES

The long term goal of this project was to gain insight into the potential impact that the presence of RTVL-H LTRs may have on the human genome. Although the long terminal repeats of exogenous retroviruses and murine endogenous retroviruses or retrovirus-like sequences have been well studied, comparatively little is known about the role that similar sequences may play in human cells.
Nonetheless, the striking similarity of RTVL-H LTRs in structure and organization to the LTRs of proviruses (figure 1-2) and the observations of RTVL-H related transcripts in human cells (Wilkinson et al., 1990) raise the possibility that RTVL-H sequences may act as transcriptional regulators of human gene expression. This research project involves direct efforts to test this hypothesis.

Three specific objectives were undertaken:

1. Analysis of LTR Transcriptional Regulatory Capabilities (Chapter 3)
The first objective of this study was to analyze the promoter/enhancer capacities of individual RTVL-H LTRs to determine the degree of heterogeneity in their activities and their potential cell type specificities.

2. Investigation of Transactivation of RTVL-H LTRs (Chapter 4)
It is likely that the transcriptional capabilities of RTVL-H LTRs can be controlled by both endogenous and exogenous trans-acting factors. The second objective of this study was to investigate an initial observation suggesting that SV40 large T antigen can transactivate RTVL-H LTRs.

3. Identification of LTR Promoted Cellular Genes (Chapter 5)
The numerous RTVL-H LTRs in the human genome may exert their greatest genomic effect through actions on nearby cellular genes. The third objective of this study was to isolate and characterize transcripts from candidate genes that are promoted from RTVL-H LTRs.

REFERENCES

Adams, S. E., P. D. Rathgen, C. A. Stanway, S. M. Fulton, M. H. Malim, W. Wilson, J. Ogden, L. King, S. M. Kingsman, and A. J. Kingsman. 1988. Complete nucleotide sequence of a mouse VL30 element. Mol. Cell. Biol. 8:2989-2998.
Anderson, S. J., H. S. Chou, and D. H. Loh. 1988. A conserved sequence in the T cell receptor β-chain promoter region. Proc. Natl. Acad. Sci. USA. 85:3551-3554.
Banerji, J., S. Rusconi, and W. Schaffner. 1981. Expression of a globin gene is enhanced by remote SV40 sequences. Cell. 27:299-308.
Banville, D. and Y. Boie. 1989.
Retroviral long terminal repeat is the promoter of the gene encoding the tumor associated calcium binding protein oncomodulin in the rat. J. Mol. Biol. 207:481-490.
Barsh, G. S., P. H. Seeburg, and R. E. Gelinas. 1983. The human growth hormone gene family: Structure and evolution of the chromosomal locus. Nucl. Acids. Res. 11:3939-3958.
Ben-David, Y. and A. Bernstein. 1991. Friend virus-induced erythroleukemia and the multistage nature of cancer. Cell. 66:831-834.
Blankenstein, T., Z. Qin, W. Li, and T. Diamantstein. 1990. DNA rearrangement and constitutive expression of the interleukin-6 gene in a mouse plasmacytoma. J. Exp. Med. 171:965-970.
Blatt, C., D. Aberdam, R. Schwartz, and L. Sachs. 1988. DNA Rearrangement of a homeobox gene in myeloid leukemic cells. Embo J. 7:4283-4290.
Boccaccio, C., J. Deschatrette, and M. Meunier-Rotival. 1990. Empty and occupied insertion site of the truncated LINE-1 repeat located in the mouse serum albumin encoding gene. Gene. 88:181-186.
Boeke, J. D., and V. G. Corces. 1989. Transcription and reverse transcription of retrotransposons. Annu. Rev. Microbiol. 43:403-434.
Bohmann, D., W. Keller, T. Dale, H. Scholer, G. Tebb, and I. Mattaj. 1987. A transcription factor which binds to the enhancers of SV40, immunoglobulin heavy chain and U2 snRNA genes. Nature 325:268-272.
Boller, K., H. Frank, J. Lower, and R. Kurth. 1983. Structural organization of unique retrovirus-like particles budding from human teratocarcinoma cell lines. J. Gen. Virol. 64:2549-2559.
Bonner, T., C. O’Connell, and M. Cohen. 1982. Cloned endogenous retroviral sequences from human DNA. Proc. Natl. Acad. Sci. USA. 79:4709-4713.
Brent, R. and M. Ptashne. 1985. A eukaryotic transcriptional activator bearing the DNA specificity of a prokaryotic repressor. Cell. 43:729-736.
Britten, R. J. and D. E. Kohne. 1968. Repeated sequences in DNA. Science. 161:529-540.
Buckler, C. E., M. D. Hoggan, H. W. Chan, J. F. Sears, A. S. Khan, J. L. Moore, J. W. Hartley, W. P. Rowe, and M. A. Martin. 1982.
Cloning and characterization of an envelope specific probe from Xenotropic murine leukemia proviral DNA. J. Virol. 41:228-236.
Buratowski, S., S. Hahn, L. Guarente, and P. A. Sharp. 1989. Five intermediate complexes in transcription initiation by RNA polymerase II. Cell. 56:549-561.
Callahan, R., W. Drohan, S. Tronick, and J. Schlom. 1982. Detection and cloning of human DNA sequences related to the mouse mammary tumor virus genome. Proc. Natl. Acad. Sci. USA. 79:5503-5507.
Canaani, E., O. Dreazon, G. Rechavi, D. Rain, J. B. Cohen, and D. Givol. 1983. Activation of the c-mos oncogene by the LTR of an Intracisternal A particle gene. Embo. J. 3:2937-2941.
Carter, A. T., J. D. Norton, and R. J. Avery. 1983. A novel approach to cloning transcriptionally active retrovirus-like genetic elements from mouse cells. Nucl. Acids Res. 11:6243-6254.
Chatis, P. A., C. A. Holland, J. E. Silver, T. N. Fredrickson, N. Hopkins, and J. W. Hartley. 1984. A 3’ end fragment encompassing the transcriptional enhancers of nondefective Friend virus confers erythroleukemogenicity on Moloney murine leukemia virus. J. Virol. 52:248-254.
Chen, H. R. and W. C. Barker. 1984. Nucleotide sequences of the retroviral long terminal repeats and their adjacent regions. Nucl. Acids. Res. 12:1767-1778.
Chiu, I., R. C. Huang, and S. A. Aaronson. 1985. Genetic relatedness between intracisternal A particles and other major oncovirus genera. Virus. Res. 3:1-11.
Christy, R. J., A. H. Brown, B. B. Gourlie, and R. C. C. Huang. 1985. Nucleotide sequences of murine intracisternal A particle gene LTRs have extensive variability within the R region. Nucl. Acids. Res. 13:289-302.
Christy, R. J. and R. C. C. Huang. 1988. Functional analysis of the long terminal repeats of intracisternal A particle genes: Sequences within the U3 region determine both the efficiency and direction of promoter activity. Mol. Cell. Biol. 8:1093-1102.
Cleary, M. L., S. D. Smith, and J. Sklar. 1986.
Cloning and structural analysis of cDNAs for bcl-2 and a hybrid bcl-2/immunoglobulin transcript resulting from the t(14;18) translocation. Cell. 47:19-28.
Coffin, J. 1985. “Endogenous Viruses”. In: RNA Tumor Viruses. (Ed. R. Weiss, N. Teich, H. Varmus, and J. Coffin). Cold Spring Harbor Laboratory, Cold Spring Harbor, N. Y. pp 1109-1203.
Cohen, M., M. Powers, C. O’Connell, and N. Kato. 1985. Nucleotide sequence of the env gene from the human provirus ERV 3 and isolation and characterization of an ERV 3 specific cDNA. Virology. 147:449-458.
Courey, A. J. and R. Tjian. 1988. Analysis of SP-1 in vivo reveals multiple transcriptional domains, including a novel glutamine-rich activation motif. Cell. 55:887-898.
Courey, A. J., D. A. Holtzman, S. P. Jackson, and R. Tjian. 1989. Synergistic activation by the glutamine-rich domains of human transcription factor Sp-1. Cell. 59:827-836.
Cullen, B. R., P. T. Lomedico, and G. Ju. 1984. Transcriptional interference in avian retroviruses-implications for the promoter insertion model of leukaemogenesis. Nature. 307:241-245.
Curcio, M. J., A-M. Hedge, J. D. Boeke, and D. J. Garfinkel. 1990. Ty RNA levels determine the spectrum of retrotransposition events that activate gene expression in Saccharomyces cerevisiae. Mol. Gen. Genet. 220:213-221.
Cuypers, H. T., G. Selton, W. Quint, M. Zijlstra, E. R. Maandag, W. Boelens, P. van Wezenbeek, C. Melief, and A. Berns. 1984. Murine leukemia virus-induced T cell lymphomagenesis: Integration of proviruses in a distinct chromosomal region. Cell. 37:141-150.
Deragon, J. M., S. Sinnet, and D. Labuda. 1990. Reverse transcriptase activity from human embryonal carcinoma cells NTera2D1. Embo. J. 9:3363-3368.
DesGroseillers, L. and P. Jolicoeur. 1984. Mapping the viral sequences conferring leukemogenicity and disease specificity in Moloney and amphotropic murine leukemia viruses. J. Virol. 52:448-456.
DesGroseillers, L., E. Rassart, and P. Jolicoeur. 1983.
Thymotropism of murine leukemia virus is conferred by its long terminal repeat. Proc. Natl. Acad. Sci. USA. 80:4203-4207.
Dickson, C., R. Smith, S. Brookes, and G. Peters. 1984. Tumorigenesis by Mouse Mammary Tumor Virus: Proviral activation of a cellular gene in the common integration region int-2. Cell. 37:529-536.
Dirkson, E. R. and J. A. Levy. 1977. Virus-like particles in placentas from normal individuals and patients with systemic lupus erythematosus. J. Natl. Cancer. Inst. 59:1187-1192.
Dombroski, B. A., S. L. Mathias, E. J. Nanthakumar, A. F. Scott, and H. H. Kazazian Jr. 1991. Isolation of the L1 gene responsible for a retrotransposition event in man. Am. J. Hum. Genet. Suppl. 49:403.
Doolittle, R. F., D. F. Feng, M. S. Johnson, and M. A. McClure. 1989. Origins and evolutionary relationships of retroviruses. Quart. Rev. Biol. 64:1-30.
Dunaway, M. and P. Droge. 1989. Transactivation of the Xenopus rRNA gene promoter by its enhancer. Nature. 341:657-659.
Durhsen, U., J. Stahl, and N. W. Gough. 1990. In vivo transformation of factor-dependent hemopoietic cells: Role of intracisternal A-particle transposition for growth factor gene activation. Embo. J. 9:1087-1096.
Dynan, W. S. 1989. Modularity in promoters and enhancers. Cell. 58:1-4.
Eaton, L. and J. D. Norton. 1990. Independent regulation of mouse VL30 retrotransposon expression in response to serum and oncogenic cell transformation. Nucl. Acids. Res. 18:2069-2077.
Emanoil-Ravier, R., G. Mercier, M. Canivet, M. Garcette, J. Lasneret, F. Peronnet, M. Best-Belpomme, and J. Peries. 1988. Dexamethasone stimulates expression of transposable type A intracisternal retrovirus genes in mouse (Mus musculus) cells. J. Virol. 62:3867-3869.
Emi, M., A. Horii, N. Tomita, T. Nishide, M. Ogawa, T. Mori, and K. Matsubara. 1988. Overlapping two genes in human DNA: A salivary amylase gene overlaps with a gamma actin pseudogene that carries an integrated human endogenous retroviral DNA. Gene. 62:229-235.
Evans, J. P. and R. D. Palmiter. 1991.
Retrotransposition of a mouse L1 element. Proc. Natl. Acad. Sci. USA. 88:8792-8795.
Falzon, M. and E. L. Kuff. 1988. Multiple protein binding sites in an intracisternal A particle long terminal repeat. J. Virol. 62:4070-4077.
Falzon, M. and E. L. Kuff. 1991. Binding of the transcription factor EBP-80 mediates the methylation response of an intracisternal A particle long terminal repeat promoter. Mol. Cell. Biol. 11:117-125.
Fanning, T. G. and M. F. Singer. 1987. LINE-1: A mammalian transposable element. Biochim. Biophys. Acta. 910:203-212.
Feenstra, A., J. Fewell, K. Lueders, and E. Kuff. 1986. In vitro methylation inhibits the promoter activity of a cloned intracisternal A-particle LTR. Nucl. Acids. Res. 14:4343-4352.
Finnegan, D. J. 1989. Eukaryotic transposable elements and genome evolution. Trends. Genet. 5:103-107.
Fitch, D. H., W. J. Bailey, D. Tagle, M. Goodman, L. Sieu, and J. L. Slightom. 1991. Duplication of the γ-globin gene mediated by L1 long interspersed repetitive elements in an early ancestor of simian primates. Proc. Natl. Acad. Sci. USA. 88:7396-7400.
Frankel, A. D. and P. S. Kim. 1991. Modular structure of transcription factors: Implications for gene regulation. Cell. 65:717-719.
Frankel, W. N., J. P. Stoye, B. A. Taylor, and J. M. Coffin. 1990. A linkage map of endogenous murine leukemia proviruses. Genetics. 124:221-236.
Fraser, C., Humphries, R. K., and Mager, D. L. 1988. Chromosomal distribution of the RTVL-H family of human endogenous retrovirus-like sequences. Genomics. 2:280-287.
Fromm, M. and P. Berg. 1983. Simian virus 40 early and late region promoter functions are enhanced by the 72 base pair repeat inserted at distant locations and inverted orientations. Mol. Cell. Biol. 3:991-999.
Furter, C. S., C. W. Heizmann, and M. W. Berchtold. 1989. Isolation and analysis of a rat genomic clone containing a long terminal repeat with high similarity to the oncomodulin rat leader sequence. J. Biol. Chem. 264:18276-18279.
Gattoni-Celli, S., K. Kirsch, S. Kalled, and K.
Isselbacher. 1986. Expression of type C related endogenous retroviral sequences in human colon tumors and colon cancer cell lines. Proc. Natl. Acad. Sci. USA. 83:6127-6131.
Gluzman, Y. (ed). 1985. Eukaryotic transcription. The role of cis and trans-acting elements in initiation. Cold Spring Harbor Laboratory, Cold Spring Harbor, New York.
Golemis, E., N. A. Speck, and N. Hopkins. 1990. Alignment of U3 region sequences of mammalian type C viruses: Identification of highly conserved motifs and implications for enhancer design. J. Virol. 64:534-542.
Goodwin, R. G., F. M. Rotman, P. Callahan, H-J. Kung, P. A. Maroney, and T. W. Nilsen. 1986. c-erbB activation in avian leukosis virus-induced erythroblastosis: Multiple epidermal growth factor receptor mRNAs are generated by alternative RNA processing. Mol. Cell. Biol. 6:3128-3133.
Grimaldi, J. and T. C. Meeker. 1989. The t(5;14) chromosomal translocation in a case of acute lymphocytic leukemia joins the interleukin-3 gene to the immunoglobulin heavy chain gene. Blood. 73:2081-2085.
Grosschedl, R. and M. L. Birnstiel. 1980. Identification of regulatory sequences in the prelude sequences of an H2A histone gene by the study of specific deletion mutants in vivo. Proc. Natl. Acad. Sci. USA. 77:1432-1436.
Grosveld, G. C., C. K. Shewmaker, P. Jat, and R. A. Flavell. 1981. Localization of DNA sequences necessary for transcription of the rabbit β-globin gene in vitro. Cell. 25:215-226.
Grosveld, G. C., E. deBoer, C. K. Shewmaker, and R. A. Flavell. 1982. DNA sequences necessary for transcription of the rabbit β-globin gene in vitro. Nature. 295:120-126.
Harada, F., G. Peters, and J. Dahlberg. 1979. The primer tRNA for Moloney murine leukemia virus DNA synthesis: Nucleotide sequence and aminoacylation of tRNAPro. J. Biol. Chem. 21:10979-10985.
Harada, F., N. Tsukada, and N. Kato. 1987. Isolation of three kinds of human endogenous retrovirus-like sequences using tRNAPro as a probe. Nucl. Acids. Res. 15:9153-9162.
Harrigan, M. T., G. Baughman, N.
Campbell, and S. Bourgeois. 1989. Isolation and characterization of glucocorticoid and cyclic AMP-induced genes in T lymphomas. Mol. Cell. Biol. 9:3438-3446.
Hartung, S., R. Jaenisch, and M. Breindl. 1986. Retrovirus insertion inactivates mouse α1(I) collagen gene by blocking initiation of transcription. Nature. 320:365-367.
Hattori, M., S. Kuhara, O. Takenaka, and Y. Sakaki. 1986. L1 family of repetitive DNA sequences in primates may be derived from a sequence encoding a reverse transcriptase related protein. Nature. 321:625-628.
Hawley, R. G., M. J. Shulman, and N. Hozumi. 1984. Transposition of two different Intracisternal A particle elements into an immunoglobulin kappa chain gene. Mol. Cell. Biol. 4:2565-2572.
Hawley, R. G., M. J. Shulman, H. Murialdo, D. M. Gibson, and N. Hozumi. 1982. Mutant immunoglobulin genes have repetitive DNA elements inserted into their intervening sequences. Proc. Natl. Acad. Sci. USA. 79:7425-7429.
Hayward, W. S., B. G. Neel, and S. M. Astrin. 1981. Activation of a cellular onc gene by promoter insertion in ALV induced lymphoid leukosis. Nature. 290:475-480.
Heberlain, C., M. Kawai, M-J Franz, G. Beck-Engeser, C. P. Daniel, W. Ostertag, and C. Stocking. 1990. Retrotransposons as mutagens in the induction of growth autonomy in hemopoietic cells. Oncogene. 5:1799-1807.
Herr, W. and H. Clarke. 1986. The SV40 enhancer is composed of multiple functional elements that can compensate for one another. Cell. 45:461-470.
Higgs, D. R., M. A. Vickers, A. O. M. Wilkie, I. M. Pretorius, A. P. Jarman, and D. J. Weatherall. 1989. A review of the molecular genetics of the human α-globin gene cluster. Blood. 73:1081-1104.
Hoggan, M. D., C. E. Buckler, J. F. Sears, W. P. Rowe, and M. A. Martin. 1983. Organization and stability of endogenous xenotropic murine leukemia virus DNA in mouse genomes. J. Virol. 45:473-477.
Horowitz, M., S. Luria, G. Rechavi, and D. Givol. 1984. Mechanism of activation of the mouse c-mos oncogene by the LTR of an Intracisternal A particle gene. Embo.
J. 3:2937-2941.
Hynes, N. E., B. Groner, and R. Michalides. 1984. Mouse Mammary Tumor Virus: Transcriptional control and involvement in tumorigenesis. Adv. Cancer. Res. 41:155-184.
Itin, A. and E. Keshet. 1983. Apparent recombinants between virus-like (VL-30) and murine leukemia virus-related sequences in mouse DNA. J. Virol. 47:178-184.
Jaenisch, R. 1976. Germ line integration and Mendelian transmission of the exogenous Moloney leukemia virus. Proc. Natl. Acad. Sci. USA. 73:1260-1264.
Jaenisch, R. 1983. Endogenous retroviruses. Cell. 32:5-6.
Jaenisch, R., K. Harbers, A. Schnieke, J. Lohler, I. Chumakov, D. Jahner, D. Grotkopp, and E. Hoffman. 1983. Germline integration of Moloney murine leukemia virus at the Mov13 locus leads to recessive lethal mutation and early embryonic death. Cell. 32:209-216.
Jenkins, N. A., N. G. Copeland, B. A. Taylor, and B. K. Lee. 1981. Dilute (d) coat colour mutation of DBA/2J mice is associated with the site of integration of an ecotropic MuLV genome. Nature. 293:370-374.
Kadonaga, J. T., K. A. Jones, and R. Tjian. 1986. Promoter specific activation of RNA polymerase II transcription by Sp-1. TIBS. 11:20-23.
Kalter, S. S., R. J. Helmke, and R. C. Heberling. 1973. C-type particles in normal human placentas. J. Natl. Cancer. Inst. 50:1081-1084.
Kamps, M. P., C. Murre, X-H. Sun, and D. Baltimore. 1990. A new homeobox gene contributes to the DNA binding domain of the t(1;19) translocation protein in pre-B ALL. Cell. 60:547-555.
Kato, N., E. Larsson, and M. Cohen. 1988. Absence of expression of a human endogenous retrovirus is correlated with choriocarcinoma. Int. J. Cancer. 41:380-385.
Kato, N., S. Pfeifer-Ohlsson, M. Kato, E. Larsson, J. Rydnert, R. Ohlsson, and M. Cohen. 1987. Tissue specific expression of human provirus ERV 3 mRNA in human placenta: Two of the three ERV 3 mRNAs contain human cellular sequences. J. Virol. 61:2182-2191.
Kato, N., K. Shimotohno, D. VanLeeuwen, and M. Cohen. 1990.
Human proviral mRNAs down regulated in choriocarcinoma encode a zinc finger protein related to Kruppel. Mol. Cell. Biol. 10:4401-4405.
Kazazian, H. H., C. Wong, H. Youssoufian, A. F. Scott, D. G. Phillips, and S. E. Antonarakis. 1988. Haemophilia A resulting from de novo insertion of L1 sequences represents a novel mechanism for mutation in man. Nature. 332:164-166.
Keshet, E. and A. Itin. 1982. Patterns of genomic distribution and sequence heterogeneity of a murine “retrovirus-like” multigene family. J. Virol. 43:50-58.
Keshet, E., R. Schiff, and A. Itin. 1991. Mouse retrotransposons: A cellular reservoir of long terminal repeat (LTR) elements with diverse transcriptional specificities. Adv. Cancer Res. 56:215-251.
Kohrer, K., I. Grummt, and I. Horak. 1985. Functional RNA polymerase II promoters in solitary retroviral long terminal repeats (LTR-IS elements). Nucl. Acids. Res. 13:2631-2645.
Kongsuwan, K., J. Allen, and J. M. Adams. 1989. Expression of Hox 2.4 homeobox gene directed by proviral insertion in a myeloid leukemia. Nucl. Acids. Res. 17:1881-1892.
Kozak, C. A. 1985. Retroviruses as chromosomal genes in the mouse. Adv. Cancer. Res. 44:295-336.
Kuff, E. L., A. Feenstra, K. Lueders, G. Rechavi, D. Givol, and E. Canaani. 1983. Homology between an endogenous viral LTR and sequences inserted in an activated oncogene. Nature. 302:547-548.
Kuff, E. L. and J. W. Fewell. 1985. Intracisternal A particle expression in normal mouse thymus tissue: Gene products and strain related variability. Mol. Cell. Biol. 5:474-483.
Kuff, E. L., K. K. Lueders, H. L. Ozer, and N. A. Wivel. 1972. Some structural and antigenic properties of intracisternal A particles occurring in mouse tumors. Proc. Natl. Acad. Sci. USA. 69:218-222.
Kuff, E. L. and K. K. Lueders. 1988. The intracisternal A particle gene family: Structural and functional aspects. Adv. Cancer. Res. 51:183-276.
Landau, N. R., T. P. St John, I. L. Weissman, S. C. Wolf, A. E. Silverstone, and D. Baltimore. 1984.
Cloning of terminal transferase cDNA by antibody screening. Proc. Natl. Acad. Sci. USA. 81:5836-5840.
Larsson, E., N. Kato, and M. Cohen. 1989. Human endogenous proviruses. Curr. Top. Microbiol. Immunol. 148:115-132.
Larsson, E., B. O. Nilsson, and S. Widehn. 1981. Morphological and microbiological signs of endogenous C-virus in human oocytes. Int. J. Cancer. 28:551-557.
Lee, J. 1989. Molecular organization and regulation of the murine interleukin-3 gene. Ph.D. thesis. Australian National University.
Leib-Mosch, C., R. Brack, T. Werner, V. Erfle, and R. Hehlmann. 1986. Isolation of an SSAV-related endogenous sequence from human DNA. Virology. 155:666-677.
Leib-Mosch, C., R. Brack-Werner, T. Werner, M. Bachmann, O. Faff, V. Erfle, and R. Hehlmann. 1990. Endogenous retroviral elements in human DNA. Cancer. Res. (Suppl). 50:5636s-5642s.
Lehrman, M. A., J. L. Goldstein, D. W. Russell, and M. S. Brown. 1987a. Duplication of seven exons in LDL receptor gene caused by Alu-Alu recombination in a subject with familial hypercholesterolemia. Cell. 48:827-835.
Lueders, K., J. Fewell, E. Kuff, and T. Koch. 1984. The long terminal repeat of an endogenous Intracisternal A particle gene functions as a promoter when introduced to eukaryotic cells by transfection. Mol. Cell. Biol. 4:2128-2135.
Lin, Y-S. and M. R. Green. 1991. Mechanism of action of an acidic transcriptional activator in vitro. Cell. 64:971-981.
Liu, A. Y. and B. Abraham. 1991. Expression of a hybrid human endogenous retrovirus and calbindin gene in a prostate cell line. Cancer Res. 51:4107-4110.
Lower, J., E. M. Wondrak, and R. Kurth. 1987. Genome analysis and reverse transcriptase activity of human teratocarcinoma derived retroviruses. J. Gen. Virol. 68:2807-2815.
Luria, S. and M. Horowitz. 1986. The long terminal repeat of the intracisternal A particle as a target for transactivation by oncogene products. J. Virol. 57:998-1003.
Maeda, N. 1985. Nucleotide sequence of the haptoglobin and haptoglobin-related gene pair.
The haptoglobin-related gene contains a retrovirus-like element. J. Biol. Chem. 260:6698-6709.
Maeda, N. and H-S. Kim. 1990. Three independent insertions of retrovirus-like sequences in the haptoglobin gene cluster of primates. Genomics. 8:671-683.
Maeda, N. and O. Smithies. 1986. The evolution of multigene families. Annu. Rev. Genet. 20:81-108.
Maeda, S., R. C. Mellors, J. W. Mellors, L. B. Jerabek, and I. A. Zervoudakis. 1983. Immunohistologic detection of antigen related to primate type C retrovirus p30 in normal human placenta. Am. J. Pathol. 112:347-356.
Mager, D. L. and Freeman, J. D. 1987. Human endogenous retrovirus-like genome with type C pol sequences and gag sequences related to human T cell lymphotrophic viruses. J. Virol. 61:4060-4066.
Mager, D. L. and N. L. Goodchild. 1989. Homologous recombination between the LTRs of a human endogenous retrovirus-like element causes a 5 kb deletion in two siblings. Am. J. Hum. Genet. 45:848-854.
Mager, D. L. and P. S. Henthorn. 1984. Identification of a retrovirus-like repetitive element in human DNA. Proc. Natl. Acad. Sci. USA. 81:7510-7514.
Majors, J. 1990. The structure and function of retroviral long terminal repeats. Curr. Top. Microbiol. Immunol. 157:49-92.
Maniatis, T., S. Goodbourn, and J. A. Fischer. 1987. Regulation of inducible and tissue specific gene expression. Science. 236:1237-1244.
Martin, M. A., T. Bryan, S. Rasheed, and A. S. Khan. 1981. Identification and cloning of endogenous retroviral sequences present in human DNA. Proc. Natl. Acad. Sci. USA. 78:4892-4896.
Matera, A. G., U. Hellmann, M. F. Hintz, and C. W. Schmid. 1990. Recently transposed Alu repeats result from multiple source genes. Nucl. Acids. Res. 18:6019-6023.
Mathias, S., A. Gabriel, J. Boeke, and A. Scott. 1991. Demonstration of a human LINE-encoded reverse transcriptase activity. Am. J. Hum. Genet. Suppl. 49:38.
McCubrey, J. and R. Risser. 1982. Genetic interactions in induction of endogenous murine leukemia virus from low leukemic mice. Cell.
28:881-888.
McGuire, E. A., R. D. Hockett, K. M. Pollock, M. F. Bartholdi, S. J. O’Brien, and S. J. Korsmeyer. 1989. The t(11;14)(p15;q11) in a T-cell acute lymphoblastic leukemia cell line activates multiple transcripts including ttg-1, a gene encoding a potential zinc finger protein. Mol. Cell. Biol. 9:2124-2132.
McKnight, S. L. 1982. Functional relationships between transcriptional control signals of the thymidine kinase gene of herpes simplex virus. Cell. 31:355-365.
Mietz, J. A., Z. Grossman, K. K. Lueders, and E. L. Kuff. 1987. Nucleotide sequence of a complete mouse intracisternal A-particle genome: Relationship to known aspects of particle assembly and function. J. Virol. 61:3020-3029.
Miksicek, R., A. Heber, W. Schmid, U. Daneseh, G. Posseckert, M. Beato, and G. Schultz. 1986. Glucocorticoid responsiveness of the transcriptional enhancer of Moloney Murine Sarcoma Virus. Cell. 46:283-290.
Miller, J., A. D. McLachlan, and A. Klug. 1985. Repetitive zinc binding domains in the protein transcription factor IIIA from Xenopus oocytes. Embo J. 4:1609-1614.
Min Man, Y., H. Delius, and D. P. Leader. 1987. Molecular analysis of elements inserted into mouse γ actin processed pseudogenes. Nucl. Acids. Res. 15:3291-3304.
Mitchell, G. A., D. Labuda, G. Fontaine, J. M. Saudubray, J. P. Bonnefont, S. Lyonnet, L. C. Brody, G. Steel, C. Obie, and D. Valle. 1991. Splice mediated insertion of an Alu sequence inactivates ornithine δ-aminotransferase: A role for Alu elements in human mutation. Proc. Natl. Acad. Sci. U.S.A. 88:815-819.
Mitchell, P. J. and R. Tjian. 1989. Transcriptional regulation in mammalian cells by sequence specific DNA binding proteins. Science. 254:371-378.
Morse, B., P. G. Rothberg, V. J. South, J. M. Spandorfer, and S. M. Astrin. 1988. Insertional mutagenesis of the myc locus by a LINE-1 sequence in a human breast carcinoma. Nature. 333:87-90.
Muller, H-P., J. M. Sogo, and W. Schaffner. 1989.
An enhancer stimulates transcription in trans when attached to the promoter via a protein bridge. Cell. 58:767-777.
Muller, H-P., and W. Schaffner. 1990. Transcriptional enhancers can act in trans. Trends. Genet. 6:300-304.
Mushinski, J. F., M. Potter, S. R. Bauer, and E. P. Reddy. 1983. DNA rearrangement and altered RNA expression of the c-myb oncogene in mouse plasmacytoid lymphosarcomas. Science. 220:795-798.
Myers, R. M., K. Tilly, and T. Maniatis. 1986. Fine structure genetic analysis of a globin promoter. Science. 232:613-618.
Neel, B. G. and W. S. Hayward. 1981. Avian leukosis virus-induced tumors have common proviral integration sites and synthesize discrete new RNAs: Oncogenesis by promoter insertion. Cell. 23:323-334.
Nourse, J., J. D. Mellentin, N. Galili, J. Wilkinson, E. Stanbridge, S. D. Smith, and M. L. Cleary. 1990. Chromosomal translocation t(1;19) results in synthesis of a homeobox fusion mRNA that codes for a potential chimeric transcription factor. Cell. 60:535-545.
Nusse, R., A. van Ooyen, D. Cox, Y. K. T. Fung, and H. Varmus. 1984. Mode of proviral activation of a putative mammary oncogene (int-1) on mouse chromosome 15. Nature. 307:131-136.
O’Brien, S., T. Bonner, M. Cohen, C. O’Connell, and W. Nash. 1983. Mapping of an endogenous retroviral sequence to human chromosome 18. Nature. 303:74-77.
O’Connell, C. and M. Cohen. 1984. The LTR sequences of a novel human endogenous retrovirus. Science. 226:1204-1206.
O’Connell, C., S. O’Brien, W. G. Nash, and M. Cohen. 1984. ERV-3, a full length human endogenous retrovirus: Chromosomal localization and evolutionary relationships. Virology. 138:225-235.
Ondek, B., A. Shepard, and W. Herr. 1987. Discrete elements within the SV40 enhancer region display different cell specific enhancer activities. Embo J. 6:1017.
Ono, M., K. Kawakami, and H. Ushikubo. 1987. Stimulation of expression of the human endogenous retrovirus genome by female steroid hormones in human breast cancer line. J. Virol. 61:2059-2062.
Ono, M., T. Yasunaga, T.
Miyata, and H. Ushikubo. 1986. Nucleotide sequence of human endogenous retrovirus genome related to the mouse mammary tumor virus genome. J. Virol. 60:589-598.

Orgel, L. E. and F. H. C. Crick. 1980. Selfish DNA: The ultimate parasite. Nature. 284:604-607.

Owen, R. D., D. M. Bortner, and M. C. Ostrowski. 1990. ras oncogene activation of a VL30 transcriptional element is linked to transformation. Mol. Cell. Biol. 10:1-9.

Parslow, T. G., S. D. Jones, B. Bond, and K. Yamamoto. 1987. The immunoglobulin octanucleotide: Independent activity and selective interaction with enhancers. Science. 235:1498-1501.

Paulson, K. E., N. Deka, C. W. Schmid, R. Misra, C. W. Schindler, M. G. Rush, L. Kadyk, and L. Leinwand. 1985. A transposon-like element in human DNA. Nature. 316:359-361.

Paulson, K. E., A. G. Matera, N. Deka, and C. W. Schmid. 1987. Transcription of a human transposon-like sequence is usually directed by other promoters. Nucl. Acids. Res. 15:5199-5215.

Payne, G. S., J. M. Bishop, and H. E. Varmus. 1982. Multiple arrangements of viral DNA and an activated host oncogene in bursal lymphomas. Nature. 295:209-213.

Perkins, A., K. Kongsuwan, J. Visvader, J. Adams, and S. Cory. 1990. Homeobox gene expression plus autocrine growth factor production elicits myeloid leukemia. Proc. Natl. Acad. Sci. USA. 87:8398-8402.

Peters, G., M. Placzek, S. Brookes, C. Kozak, R. Smith, and C. Dickson. 1986. Characterization, chromosome assignment, and segregation analysis of endogenous proviral units of mouse mammary tumor virus. J. Virol. 59:535-544.

Ptashne, M. 1986. Gene regulation by proteins acting nearby and at a distance. Nature. 322:697-701.

Ptashne, M. 1988. How eukaryotic transcriptional activators work. Nature. 335:683-689.

Rabson, A., Y. Hamagishi, P. Steele, M. Tykocinski, and M. Martin. 1985. Characterization of human endogenous retroviral envelope RNA transcripts. J. Virol. 56:176-182.

Rabson, A., P. Steele, C. Garos, and M. Martin. 1983.
mRNA transcripts related to full length endogenous retroviral DNA in human cells. Nature. 306:604-607.

Repaske, R., R. O’Neill, P. Steele, and M. Martin. 1983. Characterization of partial nucleotide sequence of endogenous type C retrovirus segments in human chromosomal DNA. Proc. Natl. Acad. Sci. USA. 80:678-682.

Repaske, R., P. Steele, R. O’Neill, A. Rabson, and M. Martin. 1985. Nucleotide sequence of a full length human endogenous retroviral segment. J. Virol. 54:764-772.

Risser, R., J. M. Horowitz, and J. McCubrey. 1983. Endogenous mouse leukemia viruses. Annu. Rev. Genet. 17:85-121.

Rochford, R., B. A. Campbell, and L. P. Villarreal. 1987. A pancreas specificity results from the combination of polyomavirus and Moloney murine leukemia virus enhancer. Proc. Natl. Acad. Sci. USA. 84:449-453.

Rogers, J. H. 1985. The structure and evolution of retroposons. Int. Rev. Cytol. 93:231-279.

Rotman, G., A. Itin, and E. Keshet. 1986. Promoter and enhancer activities of long terminal repeats associated with cellular retrovirus-like (VL-30) elements. Nucl. Acids. Res. 14:645-658.

Serfling, E., M. Jasin, and W. Schaffner. 1985. Enhancers and eukaryotic gene transcription. Trends. Genet. 1:224-230.

Shapiro, J. A. 1983. “Mobile Genetic Elements”. Acad. Press. N.Y.

Sharp, P. A. 1991. TFIIB or not TFIIB? Nature. 351:16-18.

Shen-Ong, G. L. C., E. P. Reddy, M. Potter, and J. F. Mushinski. 1984. Disruption and activation of the c-myb locus by M-MuLV insertion in plasmacytoid lymphosarcomas induced by Pristane and Abelson viruses. Curr. Top. Microbiol. Immunol. 113:41-46.

Shen-Ong, G. L., H. C. Morse III, M. Potter, and J. F. Mushinski. 1986. Two modes of c-myb activation in virus-induced mouse myeloid tumors. Mol. Cell. Biol. 6:380-392.

Shih, A., E. Coutavas, and M. G. Rush. 1991. Evolutionary implications of primate endogenous retroviruses. Virology. 182:495-502.

Shih, A., R. Misra, and M. G. Rush. 1989.
Detection of multiple novel reverse transcriptase coding sequences in human nucleic acids: Relation to primate retroviruses. J. Virol. 63:64-75.

Singh, K., S. Saragusti, and M. Botchan. 1985. Isolation of cellular genes differentially expressed in mouse NIH 3T3 cells and a simian virus 40 transformed derivative: Growth specific regulation of VL30 genes. Mol. Cell. Biol. 5:2590-2598.

Skowronski, J. and M. F. Singer. 1985. Expression of cytoplasmic LINE-1 transcript is regulated in a human teratocarcinoma cell line. PNAS. 82:6050-6054.

Skowronski, J., T. G. Fanning, and M. F. Singer. 1988. Unit length LINE-1 transcripts in human teratocarcinoma cells. Mol. Cell. Biol. 8:1385-1397.

Smale, S. T., and D. Baltimore. 1989. The “initiator” as a transcription control element. Cell. 57:103-113.

Srinivasan, A., E. P. Reddy, C. Y. Dunn, and S. A. Aaronson. 1984. Molecular dissection of transcriptional control elements within the long terminal repeat of the retrovirus. Science. 223:286-289.

Stavenhagen, J. B. and D. M. Robins. 1988. An ancient provirus has imposed androgen regulation on the adjacent mouse sex-limited protein gene. Cell. 55:247-254.

Steele, P., A. Rabson, T. Bryan, and M. Martin. 1984. Distinctive termini characterize two families of human endogenous retroviral sequences. Science. 225:943-947.

Stocking, C., C. Loliger, M. Kawai, S. Suciu, N. Gough, and W. Ostertag. 1988. Identification of genes involved in growth autonomy of hemopoietic cells by analysis of factor independent mutants. Cell. 53:869-879.

Stoppa-Lyonnet, D., P. E. Carter, T. Meo, and M. Tosi. 1990. Clusters of intragenic Alu repeats predispose the human C1 inhibitor locus to deleterious rearrangements. Proc. Natl. Acad. Sci. USA. 87:1551-1555.

Stoye, J. P. and J. M. Coffin. 1987. The four classes of endogenous murine leukemia proviruses. Genetics. 124:221-236.

Stoye, J. P. and J. M. Coffin. 1988. Polymorphism of murine endogenous proviruses revealed by using virus class specific oligonucleotide probes. J. Virol.
62:168-175.

Stoye, J. P., S. Fenner, G. E. Greenoak, C. Moran, and J. M. Coffin. 1988. Role of endogenous viruses as mutagens: The hairless mutation of mice. Cell. 54:383-391.

Suni, J., T. Wahlstrom, and A. Vaheri. 1981. Retrovirus p30-related antigen in human syncytiotrophoblasts and IgG antibodies in cord-blood sera. Int. J. Cancer. 28:559-566.

Swergold, G. D. 1990. Identification, characterization, and cell specificity of a human LINE-1 promoter. Mol. Cell. Biol. 10:6718-6729.

Takahashi, K., M. Vigneron, H. Matthes, A. Wildeman, M. Zenke, and P. Chambon. 1986. Requirement of stereospecific alignments for initiation from the simian virus 40 early promoter. Nature. 319:121-126.

Tchenio, T. and T. Heidmann. 1991. Defective retroviruses can disperse in the human genome by intracellular transposition. J. Virol. 65:2113-2118.

Temin, H. M. 1981. Structure, variation, and synthesis of retrovirus long terminal repeat. Cell. 27:1-3.

Temin, H. M. 1982. Function of the retrovirus long terminal repeat. Cell. 28:3-5.

Temin, H. M. and W. Engels. 1984. “Movable genetic elements and evolution”. In: Evolutionary Theory: Paths into the future. Ed. J. W. Pollard. John Wiley and Sons, Ltd. pp 173-201.

Treisman, R. and T. Maniatis. 1985. Simian virus 40 enhancer increases number of RNA polymerase II molecules on linked DNA. Nature. 315:72-75.

Trusko, S. P., E. K. Hoffman, and D. L. George. 1989. Transcriptional activation of c-Ki-ras protooncogene resulting from retroviral promoter insertion. Nucl. Acids. Res. 17:9259-9265.

van Lohuizen, M. and A. Berns. 1990. Tumorigenesis by slow transforming retroviruses: An update. Biochim. Biophys. Acta. 1032:213-235.

van Lohuizen, M., M. Breur, and A. Berns. 1989. N-myc is frequently activated by proviral insertions in MuLV induced T cell lymphomas. EMBO J. 8:133-136.

van Nie, R., A. A. Verstraeten, and J. de Moes. 1977. Genetic transmission of mammary tumor virus by GR mice. Int. J. Cancer. 19:383-390.

Varmus, H. 1982. Form and function of retroviral proviruses. Science.
216:812-820.

Varmus, H. E. 1983. “Retroviruses”. In: Mobile genetic elements. J. A. Shapiro (ed). Acad. Press. New York.

Verstraeten, A. A. and R. van Nie. 1978. Genetic transmission of mammary tumor virus in the DBAf mouse strain. Int. J. Cancer. 21:473-475.

Voliva, C. F., C. L. Jahn, M. B. Comer, C. A. Hutchison III, and M. A. Edgell. 1983. The L1Md Long interspersed repeat family in the mouse: Almost all examples are truncated at one end. Nucl. Acids. Res. 11:8847-8859.

Webster, N. J. G., S. Green, J. R. Jin, and P. Chambon. 1988. The hormone-binding domains of the estrogen and glucocorticoid receptors contain an inducible transcription activation function. Cell. 54:199-207.

Weiher, H., M. Konig, and P. Gruss. 1983. Multiple point mutations affecting the simian virus 40 enhancer. Science. 219:626-631.

Weiner, A. M., P. L. Deininger, and A. Efstratiadis. 1986. Nonviral retrotransposons: Genes, pseudogenes, and transposable elements generated by the reverse flow of genetic information. Ann. Rev. Biochem. 55:631-661.

Wilkinson, D. A., J. D. Freeman, N. L. Goodchild, C. A. Kelleher, and D. L. Mager. 1990. Autonomous expression of RTVL-H endogenous retrovirus-like elements in human cells. J. Virol. 64:2157-2167.

Wilkinson, D. A. and D. L. Mager. 1990. Expression of the putative retrotransposon, RTVL-H is repressed by retinoic acid induced differentiation of human teratocarcinoma cell lines. Proceedings of the 6th International Conference on Differentiation of Normal and Neoplastic Cells; p. 53.

Ymer, S., W. Q. J. Tucker, C. J. Sanderson, A. J. Hapel, H. D. Campbell, and I. G. Young. 1985. Constitutive synthesis of interleukin 3 by leukemia cell line WEHI-3B is due to retroviral insertion near the gene. Nature. 317:255-258.

CHAPTER II

Materials and Methods

DESCRIPTION OF LTRS

The RTVL-H LTRs used in this study were obtained from either genomic or cDNA libraries (see figure 3-1a).
In each case, the portion of the LTR analyzed extended from a highly conserved StuI restriction enzyme site, eight bp downstream of the start of the LTR, to a slightly variable point near the polyadenylation (poly A) signal. The genomic clones containing the 3’ LTR from the RTVL-H1 element (3’R1) and the 5’ LTR from the RTVL-H2 element (5’R2) have previously been described (Mager and Henthorn, 1984; Mager and Freeman, 1987). Both of these LTRs have a HindIII site three bp downstream of the poly A signal and were isolated as StuI/HindIII fragments. The PB-3, N10-14, 3’Hb, H6, and H7 LTRs were isolated from cDNA libraries. Except for 3’Hb, each of these LTRs had functioned to polyadenylate transcripts (see below), and terminated in a poly A tail that was removed before the LTRs were tested for promoter activity. Each of these LTRs was subcloned into the plasmid pUC18 and the poly A tail was deleted, either by digestion with an appropriate restriction enzyme or by exonuclease III (exo III) digestion, followed by isolation of the LTR fragment via digestion with StuI and a second enzyme specific to the pUC18 polylinker, as described below. For this reason, each LTR subcloned into the CAT construct contains a few nucleotides of downstream pUC18 polylinker sequences. The PB-3 LTR was isolated from a human phytohemagglutinin stimulated peripheral blood cDNA library and had functioned to polyadenylate a non-RTVL-H related transcript (Mager, 1989). The region used in this analysis was isolated from a pUC18-PB-3 construct, after exo III treatment, by a StuI/HindIII digestion. It extended to a position one bp downstream of the poly A signal and contained six bp of plasmid derived sequence. The N10-14 LTR is the 3’ polyadenylated LTR from an NTera2D1 cDNA clone that appears to represent a spliced RTVL-H transcript (Wilkinson et al., 1990).
The LTR fragment used in this study was isolated from a pUC18-N10-14 construct by a StuI/SphI digest and extended from the 5’ StuI site to the first A of the poly A signal followed by two bp of plasmid sequence. The H6 LTR is the 3’ polyadenylated LTR isolated from a Hep2 cDNA clone that contains approximately 500 bp of upstream internal RTVL-H sequences (Mager, 1989). The portion of the LTR analyzed extended from the StuI site to one bp downstream of the poly A signal. It contained seven bp of polylinker sequences generated from a StuI/BamHI digestion. The 3’Hb LTR is the 3’ LTR from RTVL-H cDNA clone cH-4, reported previously (Mager, 1989). The portion of this LTR analyzed was isolated by a StuI/Ksp632I digest of LTR sequences and extended 42 bp 3’ of the poly A signal. The H7 LTR is from cDNA clone cH-7 and had functioned to polyadenylate a heterologous transcript (Mager, 1989). A StuI/BstNI H7 LTR fragment was subcloned into pUC18 and the fragment used in construction of the H7-CAT vector was then isolated via a BamHI/PstI double digest. This fragment contained 14 bp of plasmid sequences upstream of the StuI site and 4 bp of plasmid sequences downstream of the StuI site.

PLASMID CONSTRUCTIONS

The expression vectors pSVAOCAT(X), pSV2ACAT, pSVAOCAT(LR), pSV232ACAT(LR), pSVAβGCAT(X), and pSVAGCAT(LR) have previously been described (Kadesch and Berg, 1986; Kadesch et al., 1986; Henthorn et al., 1988) and were kindly provided by Dr. Paula Henthorn and Dr. Tom Kadesch. Test vectors were constructed by blunt end ligation of each LTR fragment into the HindIII site of pSVAOCAT(X) or pSVAOCAT(LR) (as shown in figure 3-1) or into the XbaI site of pSVAGCAT(X) (figure 3-7) in both the forward and reverse orientations. The H6-46-CAT and PB3-45-CAT deletion constructs were generated using exo III digestions of the H6 and PB-3 LTRs (indicated in figure 4-4a). The deleted LTR fragments were then blunt end ligated into the HindIII site of pSVAOCAT(X).
The SV40 large and small T expression vectors pRSVT/t and pSV-t/cDNA (Loeken et al., 1988) were kindly provided by Dr. Mary Loeken. The pSVw2At plasmid vector was constructed by deleting the small t coding region of pSV-t/cDNA and was also provided by Dr. Mary Loeken.

CELL LINES

The adenovirus-transformed human embryonal kidney cell line 293 was obtained from the American Type Culture Collection. Hep2 cells (a subline of HeLa; Lavappa, 1978) were obtained from Dr. Paula Henthorn. The human teratocarcinoma cell line NTera2D1 (Andrews et al., 1984) was provided by Dr. Peter Andrews. Mouse Ltk- and NIH 3T3 cells, and COS-1 cells (African green monkey kidney cells transformed by SV40), were provided by Dr. Keith Humphries. The CV-1 cell line (the non-transformed parent of COS-1 cells) was obtained from Dr. Jurgen Vielkind. All cells were grown in Dulbecco Modified Eagle Medium supplemented with either 10% horse serum (293 cells) or 10% fetal calf serum (others) and were maintained at 37 °C with 5% CO2.

TRANSFECTIONS

Subconfluent cell cultures were transfected with supercoiled plasmid DNA by calcium phosphate precipitation (Graham and van der Eb, 1973), as follows. For CAT assays, cesium chloride purified plasmid DNA (10 µg, unless otherwise indicated in the figure legend) was resuspended in 500 µl of 0.25 M CaCl2. The DNA solution was added, dropwise, to 500 µl of 2 x HBS buffer (50 mM Hepes, 280 mM NaCl, 1.5 mM Na2HPO4, pH 7.12) and was mixed by gently bubbling N2 through the HBS/DNA mixture. The resulting solution was then incubated at room temperature for 30 minutes, and added directly to the subconfluent cell culture (3 x 10⁶ cells/10 mm tissue culture dish for 293 cells, 10⁶ cells/dish for all other cell types) that had been plated 16 hours earlier. For RNA preparations, cells were transfected as outlined above, except that 150 x 25 mm tissue culture dishes were transfected with 150 µg of plasmid, in a total volume of 5 ml. After transfection, the cells were incubated at 37 °C with 5% CO2.
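As a quick arithmetic check on the transfection mix described above, the following minimal sketch computes the final concentrations obtained when 500 µl of DNA in 0.25 M CaCl2 is combined with 500 µl of 2 x HBS. It assumes the two volumes are simply additive; the function and variable names are illustrative only and are not part of the original protocol.

```python
# Illustrative sketch: final concentrations in the calcium phosphate
# transfection mix, assuming the two 500 µl volumes are simply additive.

# Volumes taken from the protocol text (µl).
DNA_CACL2_VOLUME_UL = 500.0
HBS_VOLUME_UL = 500.0
TOTAL_VOLUME_UL = DNA_CACL2_VOLUME_UL + HBS_VOLUME_UL

# Components of 2 x HBS (mM), as listed in the protocol text.
HBS_2X_MM = {"Hepes": 50.0, "NaCl": 280.0, "Na2HPO4": 1.5}


def final_concentration(stock_conc, stock_volume_ul, total_volume_ul):
    """Concentration of one component after dilution into the total volume."""
    return stock_conc * stock_volume_ul / total_volume_ul


def transfection_mix(dna_ug=10.0):
    """Return the final composition of the 1 ml precipitate mix.

    Salt concentrations are in mM; DNA is reported in µg/ml.
    """
    mix = {
        name: final_concentration(conc, HBS_VOLUME_UL, TOTAL_VOLUME_UL)
        for name, conc in HBS_2X_MM.items()
    }
    # 0.25 M CaCl2 is 250 mM in the DNA solution.
    mix["CaCl2"] = final_concentration(250.0, DNA_CACL2_VOLUME_UL, TOTAL_VOLUME_UL)
    # Plasmid DNA per ml of final mix.
    mix["DNA_ug_per_ml"] = dna_ug / (TOTAL_VOLUME_UL / 1000.0)
    return mix


if __name__ == "__main__":
    for component, value in transfection_mix().items():
        print(f"{component}: {value:g}")
```

Under these assumptions the mix is 1 x HBS with half the starting CaCl2 concentration, which is the usual reason the HBS is prepared at twice its working strength.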
Fresh medium was added to the cells 24 hours post-transfection. In most cases, cells were harvested 48 hours post-transfection. When co-transfected with an expression vector, however, cells were harvested 72 hours post-transfection.

GENERATION OF PB-3CAT CELL LINES

A 10-fold excess of PB-3CAT was co-transfected into CV-1 cells with pSV2neo (Southern and Berg, 1982). Colonies were selected for ability to grow in G418 medium (1 mg/ml) and DNA from the resulting cell lines was analyzed by Southern blot hybridization for the presence of integrated PB-3CAT sequences.

CAT ASSAYS

Cell extracts were assayed for CAT activity by the method of Gorman et al. (1982). Cells were harvested 44-50 hours post single transfection or 70-74 hours post co-transfection, resuspended in 100 µl of 0.25 M Tris and lysed by repeated freeze/thawing. Cell debris was removed by centrifugation (at 10K, for 5 minutes in an Eppendorf tabletop microfuge). The resulting extract was then assayed for CAT activity at 37 °C for 30-60 minutes, with 20 µl of a solution of 4 mM acetyl coenzyme A and 0.2 µCi of ¹⁴C chloramphenicol used as a substrate. In the case of figure 4-6, 30 µl aliquots of the 3’Hb reaction were assayed at 15 minute intervals to ensure that the measurements were within the linear range of CAT activity. Percentage conversion of chloramphenicol into its acetylated forms was determined by scintillation counting of the radioactive spots after their resolution by thin layer chromatography and autoradiography. Experiments were repeated 3 to 8 times and average percentage conversion values determined.

RNA ISOLATION

Cells from subconfluent 150 x 25 mm tissue culture dishes were harvested by trypsinization and resuspended in 8 ml of GIT buffer (4 M guanidine isothiocyanate; 0.25 M sodium acetate, pH 6.0; 0.8% β-mercaptoethanol; Davis et al., 1986). Nuclear DNA was sheared by passing the suspension through a 23-gauge needle repeatedly, until it was no longer viscous (10-15 times).
The lysed cell suspension was then layered over 4 ml of CsCl buffer (5.7 M CsCl; 0.25 M sodium acetate, pH 6.0) and ultracentrifuged overnight at 32,000 rpm in a Beckman SW41 rotor. After centrifugation, the supernatant was removed by suction and the RNA pellet was resuspended in 0.3 M sodium acetate and ethanol precipitated.

PRIMER EXTENSION ANALYSIS

The oligonucleotide indicated in figure 3-5b was 5’ end labelled and primer extension analysis was performed (Paulson et al., 1987) with 50 µg total cellular RNA and 2 x 10⁵ cpm of 5’ end labelled oligonucleotide primer, in the following way. The end labelled primer (in 10 µl of buffer A [300 mM NaCl; 10 mM Tris, pH 7.5; 1 mM EDTA]) was added to the dried down RNA sample and hybridized at 48 °C for 1.5 hours. The sample was then ethanol precipitated and resuspended in 25 µl of buffer B [60 mM KCl; 25 mM Tris, pH 8.3; 10 mM MgCl2; 30 mM β-mercaptoethanol; 1 mM each dNTP; 2 units/µl RNAguard (RNase inhibitor obtained from Pharmacia)]. Reverse transcriptase (15-30 units) was added and the reaction was incubated at 42 °C for 1 hour. The products were treated with 300 µg/ml RNase A at 40 °C for 60 minutes, phenol/chloroform extracted, ethanol precipitated, and run on a 6% acrylamide sequencing gel alongside the appropriate plasmid, which had been sequenced by a modification of the chain termination method (Tabor and Richardson, 1987) using the same oligonucleotide primer.

PROBES

Probes used were either isolated fragments or whole plasmids and are indicated in the appropriate figure legend. The unincorporated plasmid DNA used in figure 4-5 was isolated as described in Kay and Humphries (in press). The RTVL-H fragments used as probes are shown in figures 5-1 and 5-8. The U3 specific probe is a 323 base pair (bp) StuI/SphI fragment of the H6 LTR (Mager, 1989).
The U5 specific probes were described previously (Mager, 1989) and are a 108 bp HindIII/SspI fragment isolated from RTVL-H2 (Mager and Freeman, 1987) and a 102 bp HindIII/DraI fragment isolated from the 3’ 1 element, RTVL-H1, (Mager and Henthorn, 1984). Since these probes contain the 3’ 58 bp of an RTVL-H LTR but different 3’ genomic flanking sequences, only clones hybridizing to both were classified as U5 positives. The internal RTVL-H probes used were: (1) a 140 bp XmnI/NdeI fragment from RTVL-H1; (2) a 1 kb BglII fragment; (3) a 900 bp HindIII/EcoRI fragment; (4) a 1 kb EcoRI fragment; (5) an 800 bp StuI fragment; (6) a 265 bp NcoI/OxaNI fragment mapping 140 bp downstream of the splice donor site; and (7) a 210 bp BglII/HindIII fragment mapping to immediately downstream of an observed cluster of RTVL-H acceptor sites (Wilkinson et al., 1990). Probes 2 to 7 were all isolated from RTVL-H2. The non-RTVL-H probes used are described in the appropriate figure legends. All probes were radioactively labelled by the random primer method (Feinberg and Vogelstein, 1983).

SOUTHERN AND NORTHERN ANALYSIS

For Southern blot analysis, DNA was restriction digested, electrophoresed in 0.8-1% agarose gels, transferred to either nitrocellulose or Zetaprobe (Biorad) membranes by the method of Southern (1975) and the membranes either baked or fixed by exposure to UV light. Transfers of cloned DNA were hybridized in 6 x SSC (0.9 M NaCl, 0.09 M Na citrate), 1 x Denhardt’s solution (0.02% ficoll, 0.02% bovine serum albumin, 0.02% polyvinyl pyrrolidone), and 1% sodium dodecyl sulphate (SDS) in the presence of 3 x 10⁵ cpm/ml of radioactive probe. Genomic Southerns were hybridized in 3 x SSC, 2% SDS, 20 mM sodium phosphate, pH 6.8, and 10 x Denhardt’s solution in the presence of 500 µg/ml denatured salmon sperm DNA and 2-6 x 10⁶ cpm/ml radioactive probe.
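The hybridization and wash buffers in this section are specified as multiples of SSC, anchored by the 6 x composition given above (0.9 M NaCl, 0.09 M Na citrate). The sketch below shows how the salt concentrations at other strengths, and the dilution volumes needed to reach them, follow from that one definition. The 20 x stock used in the usage example is an assumption for illustration, since the text does not state which stock strength was prepared, and the helper names are hypothetical.

```python
# Illustrative sketch: salt concentrations at different "x SSC" strengths,
# derived from the 6 x SSC composition quoted in the text.

# Molar concentrations in 6 x SSC, as stated in the text.
SSC_6X_MOLAR = {"NaCl": 0.9, "Na citrate": 0.09}


def ssc_concentrations(strength):
    """Molar concentration of each salt at the given strength (3 means 3 x SSC)."""
    if strength < 0:
        raise ValueError("strength must be non-negative")
    return {salt: molar * strength / 6.0 for salt, molar in SSC_6X_MOLAR.items()}


def dilution_volumes(stock_strength, target_strength, final_volume_ml):
    """Volumes (ml) of SSC stock and of water needed to reach the target strength."""
    if target_strength > stock_strength:
        raise ValueError("target strength cannot exceed stock strength")
    stock_ml = final_volume_ml * target_strength / stock_strength
    return stock_ml, final_volume_ml - stock_ml


if __name__ == "__main__":
    # Strengths mentioned in this section for hybridization, washes and transfer.
    for strength in (0.1, 3, 6, 10):
        print(f"{strength} x SSC:", ssc_concentrations(strength))
    # Assumed 20 x stock (not stated in the text) diluted to 1 litre of 3 x SSC.
    print(dilution_volumes(20, 3, 1000))
```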
Radioactive probe and salmon sperm carrier DNA were denatured by the addition of 1/10 volume of 0.4 M NaOH at room temperature for 5 minutes, followed by neutralization with an equal amount of 0.4 N HCl. Post hybridization washes were at 65 °C in 3 x SSC, 1% SDS. In some cases, a final wash at 0.1 x SSC, 0.5% SDS at 65 °C was performed.

For Northern analysis, total cellular RNA was electrophoresed in 1% agarose wick gels (containing 0.66 M formaldehyde, 2 x MOPS buffer (0.04 M 3-[N-morpholino]propanesulphonic acid; 0.01 M sodium acetate; 0.002 M EDTA)) at 1 V/cm and transferred to Zetaprobe membrane in 10 x SSC. Hybridizations were at 65 °C in 5 x Denhardt’s solution; 1.5 x SSPE [3 M NaCl; 0.2 M NaH2PO4; 0.02 M EDTA], 1% SDS in the presence of salmon sperm DNA and radioactive probe, as described for genomic Southerns [above]. Final post hybridization washes were at 50 °C, in 0.1 x SSC, 0.5% SDS.

LIBRARY SCREENING

A λgt10 NTera2D1 cDNA library (kindly provided by Drs. Maxine Singer and Ronald Thayer; [Skowronski et al., 1988]) was differentially screened (as described in chapter 5 and in Sambrook et al., [1989]) with the probes shown in figures 5-1 and 5-8, and described above. Hybridization conditions were: 65 °C in 6 x SSC, 1 x Denhardt’s solution, 1% SDS. Post hybridization washes were at 65 °C in 3 x SSC, 1% SDS. Phage clones of interest were plaque purified and cDNA inserts were subcloned into pUC18 or Bluescript plasmid vectors.

DNA SEQUENCING AND COMPUTER ANALYSIS

The appropriate fragments were subcloned into pUC18 and sequenced using the dideoxy chain termination method of Tabor and Richardson (1987), as follows. The denatured template (1 µg) was annealed with the prepared primer (2 ng) in 5 µl water, 1 µl annealing buffer (280 mM Tris, pH 7.5; 100 mM MgCl2; 350 mM NaCl) at 37 °C for 15-20 minutes.
The sample was then added to 0.6 µl labelling mix (2 mM of each dNTP, in 10 mM Tris, pH 7.5), 0.5 µl 300 mM dithiothreitol, 0.3 µl ³²P-dCTP (3000 Ci/mmol), and 1.5 units T7 DNA polymerase and incubated at room temperature for 3 minutes. 2.25 µl of the above solution was then added to 1.25 µl of each termination mix (150 µM of each dNTP plus 15 µM of the appropriate ddNTP; all in 40 mM Tris, pH 7.5; 10 mM MgCl2; 50 mM NaCl) in tubes prewarmed to 37 °C for 6 minutes. The reaction was terminated by the addition of 8 µl of formamide loading buffer (80% deionized formamide; 10 mM EDTA, pH 8.0; 1 mg/ml bromophenol blue, and 1 mg/ml xylene cyanol) and the samples were run on a 6% acrylamide sequencing gel. Sequences were analyzed using software provided by the Genetics Computer Group (Devereux et al., 1984).

REFERENCES

Andrews, P. W., I. Damjanov, D. Simon, G. S. Banting, C. Carlin, N. C. Dracopoli, and J. Fogh. 1984. Pluripotent embryonal carcinoma clones derived from the human teratocarcinoma cell line Tera-2. Lab. Invest. 50:147-162.

Davis, L. G., M. D. Dibner, and J. F. Battey. 1986. In: Basic Methods in Molecular Biology. New York, New York: Elsevier.

Devereux, J., P. Haeberli, and O. Smithies. 1984. A comprehensive set of sequence analysis programs for the VAX. Nucl. Acids. Res. 12:387-395.

Feinberg, A. P. and B. Vogelstein. 1983. A technique for radiolabelling DNA restriction endonuclease fragments to high specific activity. Anal. Biochem. 132:6-13.

Gorman, C. M., L. F. Moffat, and B. H. Howard. 1982. Recombinant genomes which express chloramphenicol acetyltransferase in mammalian cells. Mol. Cell. Biol. 2:1044-1051.

Graham, F. and A. van der Eb. 1973. A new technique for the assay of infectivity of human adenovirus-5 DNA. Virology. 52:456-457.

Henthorn, P., P. Zervos, M. Raducha, H. Harris, and T. Kadesch. 1988. Expression of a human placental alkaline phosphatase gene in transfected cells: Use as a reporter for studies of gene expression. Proc. Natl. Acad. Sci. USA. 85:6342-6346.

Kadesch, T.
and P. Berg. 1986. Effects of the position of the simian virus 40 enhancer on expression of multiple transcription units in a single plasmid. Mol. Cell. Biol. 6:2593-2601.

Kadesch, T., P. Zervos, and D. Ruezinsky. 1986. Functional analysis of the murine IgH enhancer: Evidence for negative control of cell type specificity. Nucl. Acids. Res. 14:8209-8221.

Kay, R., and R. K. Humphries. 1991. New vectors and procedures for isolating cDNAs encoding cell surface proteins by expression cloning in COS cells. Methods. Mol. Cell. Biol. 2:254-265.

Lavappa, K. S. 1978. Survey of ATCC stocks of human cell lines for HeLa contamination. In Vitro. 14:469-475.

Loeken, M., I. Bikel, D. M. Livingston, and J. Brady. 1988. Trans-activation of RNA polymerase II and III promoters by SV40 small t antigen. Cell. 55:1171-1177.

Mager, D. L. 1989. Polyadenylation function and sequence variability of the long terminal repeats of the human endogenous retrovirus-like family RTVL-H. Virology. 173:591-599.

Mager, D. L. and J. D. Freeman. 1987. Human endogenous retroviruslike genome with type C pol sequences and gag sequences related to human T cell lymphotropic viruses. J. Virol. 61:4060-4066.

Mager, D. L. and P. S. Henthorn. 1984. Identification of a retrovirus-like repetitive element in human DNA. Proc. Natl. Acad. Sci. USA. 81:7510-7514.

Paulson, K. E., A. G. Matera, N. Deka, and C. W. Schmid. 1987. Transcription of a human transposon-like sequence is usually directed by other promoters. Nucl. Acids. Res. 15:5199-5215.

Sambrook, J., E. F. Fritsch, and T. Maniatis. 1989. Molecular cloning: A laboratory manual. 2nd ed. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.

Skowronski, J., T. G. Fanning, and M. F. Singer. 1988. Unit length LINE-1 transcripts in human teratocarcinoma cells. Mol. Cell. Biol. 8:1385-1397.

Southern, E. M. 1975. Detection of specific sequences among DNA fragments separated by gel electrophoresis. J. Mol. Biol. 98:503-517.

Southern, P. J. and P. Berg. 1982.
Transformation of mammalian cells to antibiotic resistance with a bacterial gene under control of the SV40 early region promoter. J. Mol. Appl. Genet. 1:327-341.

Tabor, S. and C. C. Richardson. 1987. Sequence analysis with a modified bacteriophage T7 DNA polymerase. Proc. Natl. Acad. Sci. USA. 84:4767-4771.

Wilkinson, D. A., J. D. Freeman, N. L. Goodchild, C. A. Kelleher, and D. L. Mager. 1990. Autonomous expression of RTVL-H endogenous retroviruslike elements in human cells. J. Virol. 64:2157-2167.

CHAPTER III

Functional Heterogeneity of a Large Family of Human LTR-like Promoters and Enhancers

A paper of the same title, by A. Feuchter and D. Mager, was written from this chapter and has been published in Nucleic Acids Research.

INTRODUCTION

In recent years, several classes of elements structurally related to integrated retroviruses (proviruses) or retrotransposons have been discovered in the human genome (for review, see Cohen and Larsson, 1988). In many cases, the LTR-like sequences associated with these elements contain potential transcriptional regulatory sequences that are analogous to those found in “true” retroviral LTRs (Steele et al., 1984; O’Connell and Cohen, 1984; Ono et al., 1986). These findings raise the possibility that human endogenous LTR-like sequences may function to regulate the expression of linked genes. Such genes could be those contained within the retrovirus-like element itself or they could be adjacent cellular genes. Indeed, in mouse cells, several genetic rearrangements involving LTR-like sequences have been implicated in gene inactivation (by insertional mutagenesis [Hawley et al., 1982; Min Man et al., 1987]) and in gene activation (by acting as mobile promoters or enhancers [Cohen et al., 1983; Gattoni-Celli et al., 1983; Ymer et al., 1985; Kongsuwan et al., 1989]).

A proviral LTR contains three functional regions: U3, R, and U5.
U3 and U5 are unique sequences derived from the 3’ and 5’ ends of the viral RNA, while R is a short sequence present at both termini of the RNA genome. The strongest viral transcriptional enhancer sequences are typically found in U3, as are transcriptional promoter signals such as the CCAAT and TATAA boxes, which are located approximately 75 and 22-26 bp upstream of the transcriptional start site, respectively. The transcriptional start site, usually GC, defines the beginning of the R region which, in mammalian retroviruses, also contains the polyadenylation signal followed 15-20 bp downstream by the polyadenylation site, CA (LTR structure is reviewed in Temin, 1981; Varmus, 1982; Chen and Barker, 1984). LTRs are found not only in proviruses but also in retrotransposons such as Ty in yeast (Kingsman and Kingsman, 1988) and copia in Drosophila melanogaster (Finnegan, 1985), where they serve similar functions.

RTVL-H LTRs contain direct repeats, a TATAA box, and a polyadenylation signal in positions analogous to those found in prototypical LTRs. RNAs homologous to RTVL-H sequences are found in certain human cells (Johansen et al., 1989; Wilkinson et al., 1990) and RTVL-H LTRs promote the endogenous expression of these elements (Wilkinson et al., 1990). These observations indicate that RTVL-H LTRs contain transcriptional regulatory sequences that function in particular cell types. Since the human genome contains such large numbers of these LTRs, it is of interest to determine the functional heterogeneity and cell type specificities of these elements. In this study, the ability of different RTVL-H LTRs to drive expression of the reporter gene chloramphenicol acetyltransferase (CAT) has been measured in various mammalian cells. The five LTRs tested here had widely different abilities to promote CAT gene expression, could be activated by heterologous enhancer sequences and could also act to enhance a heterologous promoter.
RTVL-H LTRs thus constitute a large family of endogenous sequences with the potential to affect human gene expression.

RESULTS

RTVL-H LTR Promoter Activities Are Heterogeneous

To examine the functional capacities of different RTVL-H LTRs, a system in which the bacterial gene coding for CAT can be placed under the transcriptional control of heterologous promoter sequences was used. Because eukaryotic cells do not contain CAT, the level of this enzyme in cells transfected with a CAT plasmid is a quantitative measurement of the strength of the promoter under consideration. The LTRs used in this analysis and the construction of the vectors used are outlined in figure 3-1. The parent plasmid pSVAOCAT(X) is promoterless, with the CAT coding sequences followed by SV40 early polyadenylation signals. An SV40 poly A signal dimer (Kadesch and Berg, 1986) has been inserted as indicated to prevent “readthrough” transcription from cryptic promoters located elsewhere in the plasmid. pSVAOCAT(LR) is a modification of pSVAOCAT(X) in which the SV40 early enhancer region, LR, (Kadesch and Berg, 1986) has been inserted downstream of the CAT gene as shown. The vectors pSV2ACAT and pSVA232CAT(LR) (not shown) were used as positive controls for experiments involving the CAT(X) and CAT(LR) constructs, respectively. The SV40 early promoter/enhancer region directs transcription in both of these vectors; however, in the 232ACAT(LR) construct the enhancer sequences have been deleted from the promoter region and replaced by the LR segment, as indicated in figure 3-1b.

A unique HindIII site located ~35 bp upstream of the CAT initiation codon was used to insert the portions of the LTRs shown, in both the forward (f) and reverse (r) orientation, defined by analogy with other LTRs (figure 3-1). The LTRs tested were derived from genomic or cDNA libraries as shown and as described in Chapter 2.
Two randomly isolated genomic LTRs as well as LTRs derived from transcripts expressed in three different cell types (figure 3-1a) were analyzed to increase the probability of detecting LTRs with different functional capabilities. In the case of the cDNA clones, LTR-containing segments were isolated from the 3’ end of the transcript for analysis. These sequences should be similar, if not identical, to the linked 5’ LTR, as the 5’ and 3’ LTRs of intact RTVL-H elements show an average of 96% sequence identity while unlinked RTVL-H LTRs are 75-95% homologous (Mager, 1989). The U3 region of the 5’ LTR most probably acted as the promoter for two of these transcripts (see below) but would not be present in the cDNA clone used because transcription initiates at the 5’ boundary of R (reviewed in Chapter 1; Temin, 1981; Varmus, 1982; Chen and Barker, 1984). The NTera2D1 cDNA clone from which the N10-14 LTR was isolated corresponds to a transcript which initiated in the 5’ LTR (Wilkinson et al., 1990; figure 3-1a). The 5’ terminus of this cDNA clone mapped to 40 bp downstream of the TATAA box in the LTR. Thus, the N10-14 LTR tested here should be very close in sequence to the linked 5’ LTR which functions in NTera2D1 cells. The H6 LTR is derived from a partial Hep2 (HeLa) cDNA clone which also most probably represents an LTR-initiated transcript, as unit-length RNAs are found in these cells (Wilkinson et al., 1990). The PB-3 LTR had functioned to polyadenylate an unrelated transcript and was chosen because it differs substantially in sequence from the other LTRs tested (Mager, 1989).

The various LTRCAT(X) constructs were transfected into the human embryonal carcinoma line NTera2D1, the human embryonal kidney line 293, and the HeLa subline Hep2 by calcium phosphate mediated precipitation. Cell lysates were measured for CAT activity 44-50 hours later. A typical CAT assay is shown in figure 3-2 (a, b, and c) and the relative promoter activities are summarized in Table 1.
The results clearly indicate that some LTRs are capable of directing CAT expression in these cells, although their individual promoter strengths vary considerably. The H6 LTR consistently showed the strongest promoter capabilities; its activity was approximately 84%, 44% and 38% of the SV40 promoter in 293, Hep2 and NTera2D1 cells, respectively. The N10-14 LTR showed substantially weaker activity, 4.4% and 1.3% of SV40, in NTera2D1 and 293 cells respectively, and did not promote detectable levels of CAT activity in Hep2 cells. The other 3 LTRs tested showed no activity above background levels in these three human cell lines.

To determine whether the RTVL-H LTR promoters showed species or tissue specificity, the constructs were tested in mouse (Ltk-, NIH 3T3) and monkey (COS-1) cells (figure 3-2d, e, and f). The H6 LTR again showed strong promoter activity in NIH 3T3 and COS-1 cells, but did not function to a detectable level in Ltk- cells. No other RTVL-H LTR tested promoted transcription in the two mouse cell lines used. In COS-1 cells, however, several of the LTRs functioned to some extent (figure 3-2f). The H6 LTR remained the strongest and the PB-3 LTR, which did not function in any of the previously examined cells, displayed a significant level of activity. Low activities were also promoted by the 5’R2 LTR.

Figure 3-1. Construction of plasmids used to test the promoter activity of RTVL-H LTRs. (a) The RTVL-H LTRs tested were isolated from genomic or cDNA libraries (see Materials and Methods). Straight lines indicate RTVL-H interior sequences and thick arrows and boxes indicate complete LTR sequences and partial LTR sequences (due to polyadenylation), respectively.
“An” indicates that the polyadenylation signal within the LTR has been used. Wavy lines represent non-RTVL-H related genomic sequences, and the dotted lines in the N10-14 interior sequences indicate a putative splicing event (Wilkinson et al., 1990). (b) Insertion of the putative U3 and R regions of the LTRs, in both the forward and reverse orientations, into the CAT expression vectors (forward orientation is shown). The 3’ extent of each LTR sequence varied by a few nucleotides, as described in Materials and Methods. The LTR sequences were inserted at the unique HindIII site (H). Striped, CAT coding sequences; black, pBR322 sequences; stippled, SV40 sequences. “An” denotes the location of SV40 derived polyadenylation signals. One set of constructs had the SV40 early enhancer region, LR, inserted as shown. The “X” refers to the multiple cloning site of the CAT(X) constructs which replaces 545 bp of plasmid sequence in the CAT(LR) constructs.

Figure 3-2 (a, b, c). Assay of RTVL-H LTR promoter activity in different cell lines. Cell extracts (100 µl) were prepared 48 hours post transfection with the indicated plasmid and assayed at 37°C for 60 minutes as previously described (Kadesch et al., 1986). CAT activity was measured by conversion of 14C chloramphenicol to its acetylated forms. The products were separated by thin layer chromatography and detected by autoradiography. Extracts prepared from a) NTera2D1, b) 293, and c) Hep2 cells are shown. The bottom spot in each lane is the unreacted chloramphenicol. The spots directly above, in the positive lanes, are the mono (lower two spots) and di-acetylated products.

Figure 3-2 (d, e, f). Assay of RTVL-H LTR promoter activity in different cell lines.
Cell extracts (100 µl) were prepared 48 hours post transfection with the indicated plasmid and assayed at 37°C for 60 minutes as previously described (Kadesch et al., 1986). CAT activity was measured by conversion of 14C chloramphenicol to its acetylated forms. The products were separated by thin layer chromatography and detected by autoradiography. Extracts prepared from d) NIH 3T3, e) Ltk-, and f) COS-1 cells are shown. The bottom spot in each lane is the unreacted chloramphenicol. The spots directly above, in the positive lanes, are the mono (lower two spots) and di-acetylated products.

Table 1. Relative Promoter Activities of RTVL-H LTRs in human cells.

Promoter        Cell Type
                NTera2D1             293                  Hep2
promoterless    0.2 ± 0.02 (3)       0.2 ± 0.05 (3)       3.1 ± 1.4 (2)
PB-3            0.3 ± 0.1 (3)        0.1 ± 0.01 (3)       1.8 ± 0.6 (3)
5’R2            0.3 ± 0.07 (3)       0.5 ± 0.2 (3)        2.2 ± 0.7 (3)
3’R1            1.0 ± 0.7 (3)        0.6 ± 0.4 (3)        2.8 ± 1.1 (3)
N10-14          4.4 ± 0.8 (3)        1.3 ± 0.4 (3)        2.8 ± 1.1 (3)
H6              38.3 ± 17.7 (8)      84.3 ± 13.9 (6)      44.4 ± 22.3 (3)
SV40            100.0 ± 19.0 (8)     100.0 ± 7.5 (6)      100.0 ± 44.6 (3)

Cell extracts were prepared as in the legend to Fig. 3-2. The percentage conversion of chloramphenicol to its acetylated forms was determined by cutting spots from the TLC plate and measuring the radioactivity in a liquid scintillation counter with toluene-based scintillation fluid. The data shown are the mean ± standard deviation of the mean for the number of experiments shown in parentheses. Values are normalized to the CAT activity of pSV2ACAT transfected cells.

Although not evident in the figures shown, the 3’R1 LTR showed low levels of activity in NTera2D1 and COS-1 cells in other experiments. Interestingly, the N10-14 LTR, which showed weak activity in the NTera2D1 and 293 cells, did not function to a significant level in the COS-1 cells.
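The normalization described in the legend to Table 1 can be sketched in a few lines of code. This is a minimal illustration only, and it rests on two assumptions: that percentage conversion is the acetylated counts divided by the total counts recovered from a lane, and that each value is then expressed relative to the pSV2ACAT control set to 100. The function names and the scintillation counts are hypothetical and are not data from these experiments.

```python
def percent_conversion(acetylated_cpm, unreacted_cpm):
    """Percentage of chloramphenicol converted to its acetylated forms.

    Assumes conversion = acetylated counts / (acetylated + unreacted counts).
    """
    total = acetylated_cpm + unreacted_cpm
    if total <= 0:
        raise ValueError("total counts must be positive")
    return 100.0 * acetylated_cpm / total


def relative_activity(sample_conversion, control_conversion):
    """Express a sample's conversion as a percentage of the control's."""
    if control_conversion <= 0:
        raise ValueError("control conversion must be positive")
    return 100.0 * sample_conversion / control_conversion


# Hypothetical counts per minute, chosen only to illustrate the arithmetic.
control = percent_conversion(acetylated_cpm=6000, unreacted_cpm=4000)
sample = percent_conversion(acetylated_cpm=1500, unreacted_cpm=8500)
print(relative_activity(sample, control))
```

With these made-up counts the control converts 60% of the substrate and the test construct 15%, so the test construct would be reported at 25% of the control.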
Further tests using the H6CAT(X) construct revealed promoter activity in a variety of cell types, including human primary bone marrow fibroblasts, mouse P-19 embryonal carcinoma cells, and 1 day fish (Oryzias latipes) embryos. Thus, the H6 LTR contains strong promoter sequences with a wide range of activity. The other LTRs tested here contain weaker promoters with more limited specificities.

This heterogeneity in promoter function may be attributed to sequence differences between the LTRs, since those tested here range from 75 to 93% identical to each other. A sequence comparison of the LTRs used in this study is shown in figure 3-3. Figure 3-3a compares the sequences of H6, N10-14, 3’R1, and 5’R2 to a consensus LTR generated from these LTRs plus eight other genomic LTRs (Mager, 1989). The PB-3 LTR is not included in this comparison because its sequence differs substantially from the others (see below). As expected, figure 3-3a shows that several single nucleotide differences are scattered throughout the LTR sequences. Interestingly, the H6 LTR differed significantly in sequence from the consensus LTR between positions 130-220. This region contains several single nucleotide substitutions which are unique to H6, as well as two unique short deletions. Another difference of note in H6 occurs at positions 245-254. In this region, two nucleotide substitutions, which again are unique to H6, create a potential Sp-1 binding site, CCCCGCCCC (Dynan and Tjian, 1983).

Unlike the 5’R2 and N10-14 LTRs, which also showed a more restricted range of activity, the PB-3 LTR was capable of relatively high levels of promoter activity in COS-1 cells. It is thus interesting to note that PB-3 is quite similar in sequence to H6 in a region where H6 differs from the other LTRs (see positions 132-160 in figure 3-3b). Downstream of this region, however, the PB-3 sequence differs completely from the other LTRs (see positions 170-350 in figure 3-3b).
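The two kinds of sequence comparison used above, scanning an LTR for the Sp-1 recognition sequence and scoring the percent identity of two aligned LTRs, can be illustrated with a small sketch. It is a hedged example and not the method used to prepare figure 3-3: the function names, the toy sequences, and the convention of counting a gap opposite a nucleotide as a mismatch are all assumptions made for illustration. Only the motif CCCCGCCCC is taken from the text.

```python
def find_motif(sequence, motif):
    """Return the 1-based start position of every match, overlaps included."""
    sequence = sequence.upper()
    motif = motif.upper()
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + 1)
        start = sequence.find(motif, start + 1)
    return positions


def percent_identity(aligned_a, aligned_b):
    """Percent identical columns between two pre-aligned sequences.

    Gaps are written as '-'. Columns where both sequences have a gap are
    skipped; a gap opposite a nucleotide counts as a mismatch (an assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


SP1_SITE = "CCCCGCCCC"  # motif quoted in the text (Dynan and Tjian, 1983)

# Hypothetical toy sequences, not taken from any RTVL-H LTR.
toy_ltr = "TTGTGACCCCCGCCCCTGCC"
print(find_motif(toy_ltr, SP1_SITE))
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```

In the toy fragment the motif is found once, starting at position 8, and the two ten-nucleotide sequences differ at a single position, giving 90% identity.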
PB-3 has been designated a “type II” RTVL-H LTR based on these extensive sequence differences (Mager, 1989).

Figure 3-3. Sequence comparison of RTVL-H LTRs. a) The sequences of the H6, N10-14, 5’R2, and 3’R1 LTRs are compared to a consensus sequence generated from these LTRs plus eight other genomic LTRs (Mager, 1989). Capital letters indicate nucleotide differences, relative to the consensus, which occur only in one of the LTRs shown and lower case letters are nucleotide differences occurring in multiple RTVL-H LTRs. Asterisks represent deletions, relative to the consensus, which occur only in one of the LTRs shown and dashes indicate deletions which occur in multiple LTRs. The dots at the end of the LTRs indicate the extent of the LTR tested in this analysis. A potential Sp-1 binding site in the H6 LTR is boxed. b) The sequences of the PB-3 and H6 LTRs are compared to the consensus. Dots indicate length differences. The TATAA boxes are overlined. Abbreviations: Y, C or T; R, A or G; W, A or T. The sequences shown here have been reported elsewhere (Mager and Henthorn, 1984; Mager and Freeman, 1987; Mager, 1989).

Bidirectional Promoter Activity

In order to determine whether RTVL-H LTRs could promote transcription in the opposite orientation, the activities of the PB-3 and H6 LTRrCAT constructs were compared with those of the LTRCAT(X) constructs in 293 and COS-1 cells. Figure 3-4 illustrates the results of one of these experiments. The H6rCAT(X) construct functioned at moderate levels in both cell types. Furthermore, this construct also displayed significant activities in NTera2D1 cells but not in Ltk- cells.
Although the H6 “reverse promoter” activity showed similar cell specificity to that of the forward promoter, the relative strength of the reverse promoter was weaker than that of the forward promoter. The PB3rCAT(X) construct showed activity in COS-1 cells but not in 293 or in any other of the cell lines tested. Again, the reverse promoter had the same pattern of cell specificity as the forward promoter. None of the other LTRs tested promoted significant levels of CAT activity in the reverse orientation.

RNA Mapping Analysis

To determine whether the CAT transcripts were initiated within the LTR, primer extension experiments were performed to define transcriptional start sites. RNA from 293 cells, either mock transfected or transfected with the H6fCAT(X) or H6rCAT(X) constructs, was isolated and primer extension analysis was performed using an oligonucleotide primer complementary to RNA sequences downstream of the LTR/CAT junction. Results of this analysis are shown in figure 3-5a. One major extended product was consistently observed using RNA from cells transfected with the H6fCAT(X) construct. It maps to the expected initiation site in the LTR, 23-24 nucleotides downstream of the TATAA box (see figure 3-5b). Smaller bands, such as the one shown in figure 3-5a, were occasionally observed but were not consistent in size between experiments.

Several RNA start sites were found using the H6rCAT(X) transfected RNA. These transcripts were mapped to a region between 30-50 nucleotides downstream of a GC-rich segment which is preceded by a TTAA (figure 3-5c). The exact positions of these transcriptional start sites varied among experiments but were consistently located in the same region. This heterogeneity of start sites is not unexpected, as the function of the TATAA box is primarily to correctly position the start of transcription (for review, see Maniatis et al., 1987). In cases where a TATAA box is altered or absent, decreases in initiation frequency and heterogeneity of start sites have been observed (Grosveld et al., 1982). Nevertheless, in both the forward and reverse H6-CAT(X) constructs, transcription has been initiated within the LTR.

Figure 3-4. Bidirectional promoter activity of RTVL-H LTRs. CAT constructs containing the PB-3 or H6 LTRs in the forward (f) or reverse (r) orientation were transfected into the cell lines shown (A, 293; B, COS-1) and assayed for CAT activity 44 hours later as described in figure 3-2 and in Materials and Methods.

Figure 3-5a. Primer extension analysis. (a) RNA from 293 cells, either transfected with H6CAT(X) or H6rCAT(X), or mock transfected, was hybridized with a 5’ end labeled 21 bp oligonucleotide primer derived from the 5’ end of the CAT gene. The hybrids were extended with reverse transcriptase and the cDNA products were electrophoresed on a sequencing gel alongside a sequencing reaction using the same primer. In the forward orientation, a major band (dark arrow) of 178 bp was observed in all experiments. Secondary products (e.g. white arrow, 125 bp) were occasionally observed, but their size varied among experiments. Several extended products were found in the H6rCAT(X) transfected RNA and are indicated by arrows. The size of the arrow indicates the relative abundance of the transcript as determined by band intensity.

B
1 TCCTATAAAACGGCCCCACCCCTATCTCCCTTCGCTGACTCTCTTTTCGGACTCAGCCCG 60
61 CCTGCCCCCAGGTGAAATAAACGGGATCAGCTTGGCGAGATTTTCAGGAGCTAAGGAAG 120
121 CTAAAATGGAGAAAAAAATCACTGGATATACCACCGTTGATATATCCCAATGGCATCGTA 180
181 AAGAACATTTTGAGGCATTTCAGTCAGTTGCTCAATGTAC 220

C
1 GTTAAGGTGGGGCA.AGAACAAATCACCATGGTGGAATGTCATCAGGGCGGGGCAGG 60
61 GCCTTTTCAGTTCTTTTGTGATTCTTCAGTTACTTCAGGCCATCTGGGCGTATACGTGCA 120
121 AGTCACAGGGGATGCGATGGCTTGGCTTGGGCTCAGAGGAGCTTGGCGAGATTTTCAGGA 180
181 GCTAAGGAAGCTAA.AATGGAGAAAAAAATCACTGGATATACCACCGTTGATATATCCCAA 240
241 TGGCATCGTAAAGAACATTTTGAGGCATTTCAGTCAGTTGCTCAATGTACC 291

Figure 3-5b and c. (b, c) Mapping of transcription start sites. Results of the primer extension analysis were used to map the transcription start sites in the forward (b) and reverse (c) orientations. Start sites have been localized to the regions indicated by the arrows. The region homologous to the primer is shown by a dashed underline with an asterisk indicating the 5’ end label. A vertical line separates LTR sequences from linker and CAT sequences.

RTVL-H LTRs Can Be Activated by SV40 Enhancer Sequences

As most of the LTRs tested displayed relatively weak promoter activity, it was important to test whether transcription directed from RTVL-H LTRs could be augmented by heterologous enhancer sequences. The H6 and PB-3 LTRs were chosen for analysis in this respect because they differed significantly in both their sequence (see figure 3-3) and in their pattern of promoter function. These LTRs were inserted, in both orientations, into the vector pSVAOCAT(LR) (figure 3-1b), which contains no promoter sequences but has an SV40 early enhancer region inserted downstream of the CAT gene. The promoter activity of these LTRs under the influence of the SV40 enhancer was compared to their activities in the absence of an enhancer in NTera2D1 and COS-1 cells (figure 3-6). The addition of SV40 enhancer sequences increased the activity of the H6 LTR in both orientations and in both cell types. Expression from the PB-3 LTR was significantly increased in COS-1 cells (figure 3-6b) but increased only to barely detectable levels in the NTera2D1 cells (figure 3-6a).
Thus, the extent of induction of the PB-3 LTR promoter was dependent upon the cell in which it was tested.

Figure 3-6. Activation of RTVL-H LTRs by the addition of a downstream enhancer. The activity of the PB-3 and H6 LTRs, in both orientations, was compared to their activity in the presence of an SV40 enhancer in (a) NTera2D1 and (b) COS-1 cells.

The H6 LTR Contains Enhancer Sequences

To determine whether RTVL-H LTRs contain a distinct enhancer activity, the H6 LTR was inserted, in both orientations, downstream of a human β-globin promoted CAT gene in the expression vector pSVAβGCAT(X) (figure 3-7a). As the human β-globin promoter is strongly enhancer dependent (Mager, 1989), any activity observed can be directly attributed to RTVL-H enhancer sequences. The vector pSVAβGCAT(LR) (figure 3-7a), which contains the SV40 early enhancer fragment, LR, was used as a positive control. The results of one experiment are shown in figure 3-7b and c. The enhancerless pSVAβGCAT(X) construct promoted only very low levels of CAT expression in NTera2D1 or 3T3 cells. Addition of H6 LTR sequences in either orientation, downstream of the CAT gene, activated the β-globin promoter to significant levels in both cell types. The SV40 enhanced control, pSVAβGCAT(LR), is shown for comparison.

Figure 3-7. RTVL-H LTRs can enhance transcription from an upstream promoter. (a) Construction of plasmids used to test the enhancer abilities of RTVL-H LTRs. The H6 LTR was inserted, in both orientations, into a unique ThaI site found in the multiple cloning site (X) of the CAT(X) constructs. The SV40 enhancer sequence, LR, was inserted into the BamHI site of the multiple cloning region in the positive controls. Black, pBR322 sequences; stippled, SV40 sequences; striped, CAT sequences; white, human β-globin promoter. Test constructs containing the H6 LTR, in both orientations, were transfected into (b) NTera2D1 cells and (c) NIH 3T3 cells and assayed for CAT activity.

DISCUSSION

Diversity of RTVL-H Promoters

In a parallel study on RTVL-H expression, our laboratory has shown that RTVL-H transcripts initiating in the 5’ LTR and terminating in the 3’ LTR are abundant in NTera2D1 cells and are also found in Hep2 cells and some other cell lines (Wilkinson et al., 1990). Of the primary tissues examined, little or no expression was detected in adult blood or fibroblast samples while amnion and chorion membranes from normal placenta had significant levels of expression (Wilkinson et al., 1990). These results indicate that endogenous tissue-specific expression of these sequences occurs and may be biologically relevant.

In this study, the functional capacities of individual LTRs have been examined in an attempt to determine the range of promoter/enhancer functions that this large family of sequences may possess. Of the five LTRs tested, the H6 LTR derived from a Hep2 transcript promoted high levels of CAT gene expression in a wide variety of cell types. Figure 3-3 shows that the H6 LTR contains several nucleotide differences and two length differences that are not found in the other LTRs tested. Southern blot analysis using a specific subregion probe (spanning positions 139-219 in figure 3-3) indicates that LTRs closely related to H6 are present in 40-60 copies per haploid genome (D. Mager, unpublished observations). This finding indicates that a distinct subpopulation of RTVL-H LTRs, with potentially strong promoter activity, exists in the genome. Two of the nucleotide differences unique to H6 create a potential Sp-1 binding site (boxed in figure 3-3a).
An Sp-1 binding site is also found in a similar location in the myeloproliferative sarcoma virus LTR and may be important for expression of this retrovirus in the embryonal carcinoma line F9 (Hilberg et al., 1987). The possible importance of this and other H6-specific sequence differences in conferring increased promoter strength or range of activity to this LTR is currently under investigation.

Over the course of these investigations, certain correlations between the cDNA library of origin and the cell types in which a particular LTR was active became apparent. For example, the N10-14 LTR was derived from an NTera2D1 library and was most active in these cells. The H6 LTR was derived from a Hep2 cDNA library and was the only LTR tested to function to a detectable level in this cell line. In addition, the two LTRs not derived from transcripts, 5’R2 and 3’R1, functioned very weakly, if at all, in the cell lines tested. Thus, although there could be factors affecting endogenous expression of these sequences which are not detectable using transient assays, the results presented here do appear to parallel the endogenous capabilities of different RTVL-H LTRs.

It was also observed that some of the LTRs tested were capable of bidirectional promoter activity, either alone (see figure 3-4) or when activated by heterologous enhancer sequences (figure 3-6). Interestingly, bidirectional promoter activity has also been reported for the LTRs of murine retrovirus-like intracisternal A particle (IAP) elements (Christy et al., 1985). The “reverse” and “forward” RTVL-H LTR promoters showed similar cell specificities. This shared specificity may occur because the same upstream regulatory elements are used by both upstream and downstream components of the promoter.
This finding implies that some RTVL-H LTRs are able to influence the expression of upstream as well as downstream genes.

RTVL-H LTR Sequences Can Interact with Heterologous Enhancers and Promoters

In this study, it has been shown that expression of the CAT gene from the PB-3 and H6 LTR promoters can be augmented by the presence of SV40 early enhancer sequences. However, the degree of enhancement observed depended on the basal level of promoter activity. For example, in the presence of the SV40 early enhancer, the expression from the H6 LTR was raised from a level of 38% of the SV40 early promoter to a level approximately equal to the SV40 early promoter in NTera2D1 cells. Although expression from PB-3 was also increased by the addition of the LR enhancer fragment, it could not be increased to a level comparable to the SV40 early promoter. The 5’R2 and N10-14 LTRs could also be enhanced to low levels of activity in cells in which their basal activities were not detectable. This finding raises the possibility that genomic rearrangements which place a heterologous enhancer in the proximity of an RTVL-H LTR may activate the LTR. The extent of this activation would depend upon the particular LTR involved. The PB-3 and H6 LTRs both contain a TATAA box; however, the sequences surrounding this element (positions 170-325 and 344-355, in figure 3-3b) are not closely related. Therefore, it may be that some as yet unrecognized elements which can affect promoter activity are involved.

It was also observed that the H6 LTR can enhance expression from the human β-globin promoter. Enhancers are typically modular elements made up of short, often repeated units. It has been shown that combining different units can result in new tissue specificities conferred by the enhancer (for example, see Rochford et al., 1987). Therefore, it is possible that if smaller regions of the LTR had been tested, differences in cell type restrictions may have been observed.
As the H6 LTR was not fragmented in this study, it is likely that these results reflect the effect that an intact RTVL-H LTR of this type may have on a nearby cellular promoter.

Significance of These Findings

Several retrovirus-like families in mice have been implicated in both the inactivation and activation of cellular genes. For example, comparison of the κ light chain genes of two hybridoma cell lines defective in κ light chain gene synthesis with the wild type genes revealed that the mutant genes had IAP element insertions in their introns, suggesting that the insertions caused gene inactivation (Hawley et al., 1982). IAP elements have also been implicated in gene activation events. An interesting example occurs in the myelomonocytic murine leukemia cell line WEHI-3B. This line constitutively produces the growth factor Interleukin-3 due to an IAP insertion 5’ of the gene (Ymer et al., 1985) and expresses the Hox2.4 homeobox gene due to an IAP insertion in the first exon of this gene (Kongsuwan et al., 1989; Blatt et al., 1988). In both cases, transcription has been shown to originate within the LTR. Therefore, insertions of endogenous retrovirus-like elements have altered the pattern of expression of these genes and may have contributed to the transformation of this cell line.

While H6 promoted high levels of CAT expression, the other LTRs tested in this study displayed a much more limited range of activity and much weaker levels of promoter strength. That most of the LTRs tested contained relatively weak promoter sequences is not unexpected. The human genome contains between 2000-3000 RTVL-H LTRs in addition to several thousand LTR-like sequences belonging to different families (Chapter 1; Mager and Henthorn, 1984; Paulson et al., 1985; Larsson et al., 1989). If many of these sequences had strong promoter capabilities they would probably require cellular regulation so as not to interfere with normal cellular processes.
It may, however, confer a selective advantage to an organism to have a variety of weak promoters in the genome which could be activated by rearrangements or other genetic changes to impose new patterns of regulation on cellular genes. Support for this hypothesis comes from the identification of two rodent genomic LTR insertions which have evolved into a regulatory role. Transcription of the mouse sex-limited protein gene is androgen dependent due to the enhancer donation of an upstream endogenous proviral insertion (Stavenhagen and Robins, 1988). It has also been shown that the oncomodulin gene of the rat is promoted by a solitary LTR related to IAP elements (Banville and Boie, 1989). Thus, similar events involving RTVL-H LTRs may be detectable in human cells.

Here, I have demonstrated that RTVL-H LTRs are functionally diverse and can promote the expression of a linked gene as well as interact with heterologous promoters and enhancers. Although transposition of RTVL-H elements has not yet been demonstrated, a rearrangement involving an existing RTVL-H proviral element has been detected (Mager and Goodchild, 1989). These findings raise the possibility that rearrangements or activation of these endogenous sequences may contribute to alterations in human gene expression.

REFERENCES

Banville, D. and Y. Boie. 1989. Retroviral long terminal repeat is the promoter of the gene encoding the tumor-associated calcium-binding protein oncomodulin in the rat. J. Mol. Biol. 207:481-490.
Blatt, C., D. Aberdam, R. Schwartz, and L. Sachs. 1988. DNA rearrangement of a homeobox gene in myeloid leukemic cells. EMBO J. 7:4283-4290.
Chen, H. R. and W. C. Barker. 1984. Nucleotide sequences of the retroviral long terminal repeats and their adjacent regions. Nucl. Acids. Res. 12:1767-1778.
Christy, R. J. and R. C. C. Huang. 1988. Functional analysis of the long terminal repeats of intracisternal A particle genes: Sequences within the U3 region determine both the efficiency and direction of promoter activity.
Mol. Cell. Biol. 8:1093-1102.
Cohen, M. and E. Larsson. 1988. Human endogenous retroviruses. Bioessays. 9:191-196.
Cohen, J. B., T. Unger, G. Rechavi, E. Canaani, and D. Givol. 1983. Rearrangement of the oncogene c-mos in mouse myeloma NSI and hybridomas. Nature. 306:797-799.
Dynan, W. S. and R. Tjian. 1983. The promoter-specific transcription factor Sp-1 binds to upstream sequences in the SV40 early promoter. Cell. 35:79-87.
Finnegan, D. J. 1985. Transposable elements in eukaryotes. Int. Rev. Cytol. 93:281-326.
Gattoni-Celli, S., W. Hsiao, and I. B. Weinstein. 1983. Rearranged c-mos locus in a MOPC 21 murine myeloma cell line and its persistence in hybridomas. Nature. 306:795-796.
Grosveld, G. C., E. deBoer, C. K. Shewmaker, and R. A. Flavell. 1982. DNA sequences necessary for transcription of the rabbit β-globin gene in vivo. Nature. 295:120-126.
Hawley, R., M. Shulman, H. Murialdo, D. Gibson, and N. Hozumi. 1982. Mutant immunoglobulin genes have repetitive DNA elements inserted into their intervening sequences. Proc. Natl. Acad. Sci. USA. 79:7425-7429.
Hilberg, F., C. Stocking, W. Ostertag, and M. Grez. 1987. Functional analysis of a retroviral host-range mutant: Altered long terminal repeat sequences allow expression in embryonal carcinoma cells. Proc. Natl. Acad. Sci. USA. 84:5232-5236.
Johansen, T., Holm, T., and Bjorklid, E. 1989. Members of the RTVL-H family of human endogenous retrovirus-like elements are expressed in placenta. Gene 79:259-267.
Kadesch, T. and P. Berg. 1986. Effects of the position of the simian virus 40 enhancer on expression of multiple transcription units in a single plasmid. Mol. Cell. Biol. 6:2593-2601.
Kadesch, T., P. Zervos, and D. Ruezinsky. 1986. Functional analysis of the murine IgH enhancer: Evidence for negative control of cell type specificity. Nucl. Acids. Res. 14:8209-8221.
Kingsman, A. J., and S. M. Kingsman. 1988. Ty: A retroelement moving forward. Cell. 53:333-335.
Kongsuwan, K., J. Allen, and J. M. Adams. 1989.
Expression of Hox 2.4 homeobox gene directed by proviral insertion in a myeloid leukemia. Nucl. Acids. Res. 17:1881-1892.
Larsson, E., N. Kato, and M. Cohen. 1989. Human endogenous proviruses. Curr. Top. Microbiol. Immunol. 148:115-132.
Mager, D. L. 1989. Polyadenylation function and sequence variability of the long terminal repeat of the human endogenous retrovirus-like family RTVL-H. Virology. 173:591-599.
Mager, D. L. and J. D. Freeman. 1987. Human endogenous retroviruslike genome with type C pol sequences and gag sequences related to human T cell lymphotropic viruses. J. Virol. 61:4060-4066.
Mager, D. L. and N. L. Goodchild. 1989. Homologous recombination between the LTRs of a human retrovirus-like element causes a 5 kb deletion in two siblings. Am. J. Hum. Genet. 45:848-854.
Mager, D. L. and P. S. Henthorn. 1984. Identification of a retrovirus-like repetitive element in human DNA. Proc. Natl. Acad. Sci. USA. 81:7510-7514.
Maniatis, T., S. Goodbourn, and J. Fischer. 1987. Regulation of inducible and tissue specific gene expression. Science. 236:1237-1244.
Min Man, Y., H. Delius, and D. Leader. 1987. Molecular analysis of elements inserted into mouse γ actin processed pseudogenes. Nucl. Acids. Res. 15:3291-3304.
O’Connell, C. and M. Cohen. 1984. The LTR sequences of a novel human endogenous retrovirus. Science. 226:1204-1206.
Ono, M., T. Yasunaga, T. Miyata, and H. Ushikubo. 1986. Nucleotide sequence of human endogenous retrovirus genome related to mouse mammary tumor virus genome. J. Virol. 60:589-598.
Paulson, K.E., N. Deka, S. Schmid, R. Misra, C. Schindler, M. Rush, L. Kadyk, and L. Leinwand. 1985. A transposon-like element in human DNA. Nature. 316:359-361.
Stavenhagen, J. B. and D. M. Robins. 1988. An ancient provirus has imposed androgen regulation on the adjacent mouse sex-limited protein gene. Cell. 55:247-254.
Steele, P.E., A.B. Rabson, T. Bryan, and M.A. Martin. 1984. Distinctive termini characterize two families of human endogenous retroviral sequences.
Science. 225:943-947.
Rochford, R., B. A. Campbell, and L. P. Villarreal. 1987. A pancreas specificity results from the combination of polyomavirus and Moloney murine leukemia virus enhancer. Proc. Natl. Acad. Sci. USA. 84:449-453.
Temin, H. M. 1981. Structure, variation and synthesis of the retrovirus long terminal repeat. Cell. 27:1-3.
Varmus, H. E. 1982. Form and function of retroviral proviruses. Science. 216:812-820.
Wilkinson, D. A., J. D. Freeman, N. L. Goodchild, C. A. Kelleher, and D. L. Mager. 1990. Autonomous expression of RTVL-H endogenous retroviruslike elements in human cells. J. Virol. 64:2157-2167.
Ymer, S., W. Tucker, C. J. Sanderson, A. J. Hapel, H. D. Campbell, and I. G. Young. 1985. Constitutive synthesis of interleukin-3 by leukaemia cell line WEHI-3B is due to retroviral insertion near the gene. Nature. 317:255-258.

CHAPTER IV

SV40 Large T Antigen Transactivates the Long Terminal Repeats of a Large Family of Human Endogenous Retrovirus-like Sequences

A paper of the same title by A. Feuchter and D. Mager was written from this chapter and is in press in Virology.

INTRODUCTION

Detailed analyses of RTVL-H LTRs have shown that their individual promoter capabilities and cell specificities vary widely (Chapter 3; Feuchter and Mager, 1990). In this report, the SV40 large T antigen is shown to induce expression in vitro of a normally quiescent RTVL-H LTR in addition to significantly increasing the promoter activities of several, but not all, RTVL-H LTRs.

The SV40 large T antigen (T) is a multifunctional phosphoprotein that encodes a variety of biochemical activities required for the productive infection of permissive cells (Livingston and Bradley, 1987). During lytic infection, T initiates viral DNA replication, autoregulates its own transcription (Reed et al., 1976; Rio et al., 1980) and activates late gene expression (Brady et al., 1984; Brady and Khoury, 1985; Keller and Alwine, 1985). T also exhibits striking effects on host cellular functions.
This protein stimulates cellular DNA synthesis after infection and induces cells to overexpress nuclear RNA (Ide et al., 1977; Whelley et al., 1978). Although the ability of T to induce tumors in animals (Brinster et al., 1984; Topp et al., 1981; Windle et al., 1990) and to transform a variety of cell types in culture (Eddy et al., 1962; Srinivasan et al., 1989) has been well documented, the molecular mechanisms by which T exerts these effects remain unclear. It has been proposed, however, that the interaction of T with two cellular proteins, Rb (the product of the retinoblastoma susceptibility gene) and p53, both suspected negative regulators of cell growth, may play a role in the transforming ability of T (DeCaprio et al., 1988; Stanbridge, 1990). In addition, T encodes an activity or activities capable of transactivating several viral and cellular promoters. Of these, the transcriptional activation of the SV40 late promoter by T has been the most thoroughly studied (Brady et al., 1984; Keller and Alwine, 1985; Brady and Khoury, 1985; Gallo et al., 1989; Wildeman, 1989).

It has recently been shown that direct binding of T is not necessary for transcription from the SV40 late promoter (Tack and Beard, 1985) and that a T mutant which cannot be transported to the nucleus can transactivate SV40 gene expression (Pannuti et al., 1987; Wildeman, 1989). In addition, cellular factors that bind to regions of the late promoter required for activation have been found to have altered and more stable binding characteristics in the presence of T (Gallo et al., 1988; 1990). Together, these observations suggest that T does not mediate transactivation directly, but rather that it activates, modifies, and/or induces certain cellular transacting factors which then interact with the late promoter (Gallo et al., 1988; 1990). Presumably, these cellular factors would also be competent to activate the endogenous genes that they normally regulate.
This idea is consistent with the observation that T can activate several different viral and cellular promoters including the adenovirus E2 (Loeken et al., 1986) and E3 (Alwine, 1985) promoters, the chicken α2 collagen (Alwine, 1985), the human heat shock protein 70 (Taylor et al., 1989), and ε and β globin (Cao et al., 1989) promoters and the Rous Sarcoma virus (Alwine, 1985) and Intracisternal-A particle endogenous retroviral LTRs (Luria and Horowitz, 1986).

RESULTS AND DISCUSSION

RTVL-H LTR Promoter Activities Vary in COS-1 and CV-1 cells

In previous studies, it was observed that all of the seven RTVL-H LTRs tested could promote expression of the chloramphenicol acetyltransferase (CAT) gene in transient assays, but that their relative promoter activities and cell specificities varied widely (Chapter 3; Feuchter and Mager, 1990; unpublished observations). Over the course of those investigations, I became particularly interested in one RTVL-H LTR, PB-3, that had no promoter activity in any cells other than in COS-1 cells (African green monkey kidney cells transformed by SV40). This observation (see figure 3-2) suggested that a factor present in these cells is involved in the transcriptional activation of this LTR. To investigate the possibility that cell specific transactivators may affect LTR expression, various LTR-CAT constructs were transfected into both COS-1 cells and their untransformed parent line, CV-1, by calcium phosphate mediated precipitation. Cell lysates were measured for CAT activity 44-50 hours later. Results from a typical experiment are shown in figure 4-1. None of the LTRs other than H6 promoted detectable levels of CAT activity in CV-1 cells in numerous different experiments. In the COS-1 cells, however, although the 5’R2 and 3’R1 LTRs did not promote significant levels of CAT activity, the PB-3 and 5’R2 LTRs were activated.
These results suggest that a factor present in COS-1 cells but not in CV-1 cells can activate certain RTVL-H LTRs.

A Factor Present in COS-1 Cells Interacts with an RTVL-H LTR

In order to determine whether the enhanced promoter activity observed in COS-1 cells was due to specific interactions with RTVL-H LTR sequences, competition assays were performed in which the PB-3 CAT construct was co-transfected with a twenty fold excess of either PB-3 pUC (the PB-3 LTR inserted into the HincII site of pUC18) or pUC18 alone. The results of one such experiment (figure 4-2) demonstrate that the PB-3 LTR does not function in CV-1 cells, regardless of the presence of competitor. In COS-1 cells, however, expression from PB-3 CAT is significantly decreased when excess PB-3 LTR sequences in pUC18 are added. This finding indicates that a factor present in COS-1 cells interacts with the PB-3 LTR to stimulate transcription and that this factor can be sequestered by excess PB-3 sequences.

Figure 4-1. Assay of RTVL-H LTR promoter activity in CV-1 and COS-1 cells. Cell extracts were prepared 48 hours post transfection and assayed for 60 minutes as previously described. CAT activity was measured by conversion of 14C chloramphenicol to its acetylated forms. The products were separated by thin layer chromatography and detected by autoradiography. The lower spots in each lane are the unreacted chloramphenicol. The spots directly above, in the positive lanes, are the mono (lower two spots) and di-acetylated products.

Figure 4-2. Competition analysis of the PB-3 LTR.
The PB-3 CAT construct was co-transfected into CV-1 or COS-1 cells with a twenty fold excess of either PB-3 pUC or pUC18. Cells were assayed for CAT activity as described in the legend to figure 4-1 and in Chapter 2. The autoradiograph shown is a representative result of three separate experiments.

SV40 T Antigen Transactivates RTVL-H LTRs

To determine whether the SV40 T antigen is responsible for the increased RTVL-H LTR promoter activity observed in COS-1 cells, CV-1 cells were next co-transfected with LTR-CAT constructs and with vectors expressing either the SV40 large and small T antigens (pRSV T/t), the small t antigen alone (pSV t/cDNA), or a negative control (pSVw2At). A representative result is shown in figure 4-3 and illustrates that the presence of T does increase the promoter activity of certain RTVL-H LTRs. The PB-3 LTR does not promote detectable levels of CAT activity when co-transfected with equal amounts of either pSVw2At or pSV t/cDNA. However, when co-transfected with pRSV T/t, this LTR is induced to promote high levels of CAT activity. Conversely, the H7 LTR, which has a moderate basal promoter activity, was not detectably stimulated by T in this or in any subsequent experiments. This observation suggests that the T mediated stimulation of RTVL-H promoter activity is not a general phenomenon, but rather that it is specific for certain individual LTRs.

These results have been interpreted as evidence for the transcriptional activation of RTVL-H LTRs by T.
However, these findings do not exclude the possibility that T activates RTVL-H sequences through post-transcriptional events, such as by increasing RNA stability. It has recently been suggested that T may post-transcriptionally regulate mRNA levels of the DNA polymerase α and proliferating cell nuclear antigen genes (Koniecki et al., 1991). However, the majority of transcripts stimulated by T are believed to be activated at the transcriptional level (Ide et al., 1977; Whelley et al., 1978; Brady and Khoury, 1985; Cao et al., 1989; Taylor et al., 1989). Consequently, it is most likely that RTVL-H LTRs are also transcriptionally stimulated by T. The region of RTVL-H LTRs most likely to contribute to their varying levels of response to T is U3, their most highly variable subsection (Mager, 1989). A sequence comparison of three of the LTRs used in this study (H6, PB-3, and H7) is shown in Figure 4-4a and illustrates the variability of the U3 region. U3 contains the LTR transcriptional regulatory signals and extends from the 5’ end of the LTR to the transcript initiation (CAP) site.

Figure 4-3. SV40 large T antigen can increase the promoter activity of certain RTVL-H LTRs. CAT constructs containing either the PB-3 or H7 LTRs were co-transfected into CV-1 cells with an equal amount of the T expression vector RSV T/t or a negative control, as shown.

Thus, U3 sequences are not present in the transcript, and cannot participate in post-transcriptional control. The transcripts produced from the LTR-CAT constructs used in the previous experiments are, however, expected to contain 40-50 bases of RTVL-H LTR sequence downstream of the CAP site (see figure 4-4a).
To investigate the roles of these regions in mediating the response to T, I have made use of the RTVL-H deletion constructs PB3-45-CAT and H6-46-CAT, in which 45 and 46 bases, respectively, have been removed from the 3’ end of the LTR segments used in the PB-3 CAT and H6-CAT constructs (shown in figure 4-4a). RNA transcribed from these deletion constructs should contain only 3 and 2 bases of RTVL-H sequence (GCT and CC, respectively) linked to the CAT gene. The promoter activity of H6-CAT is significantly higher than that of the shorter H6-46-CAT in NTera2D1 cells (see figure 4-4b for a representative experiment). This difference suggests a role for sequences downstream of the transcription initiation site in regulating H6 LTR activity in NTera2D1 cells. However, in COS-1 cells (figure 4-4c) the removal of these sequences had no effect on the observed CAT activity from either the H6 or PB-3 LTRs. This finding suggests that the effects of T on RTVL-H LTRs are transcriptional and that the LTR sequences required for mediating this effect do, indeed, occur in U3.

The Increase in CAT Activity is Not Due to SV40 Mediated Replication of the Test Plasmids

In addition to its diverse roles as a transcriptional regulator, SV40 T antigen is known to promote viral replication (Tegtmeyer, 1972). To ensure that the increase in CAT activity observed in the presence of T is not due to SV40 mediated replication of the transfected RTVL-H CAT constructs, the methylation status of PB-3 CAT and pSV2ACAT that had been transfected into bacterial, CV-1, and COS-1 cells was examined. Mbo I, Sau3A, and Dpn I are isoschizomers which will differentially digest DNA, depending on the state of methylation of their recognition sequence, GATC.
Mbo I and Dpn I will only cut DNA that has been replicated in eukaryotic and prokaryotic cells, respectively, while Sau3A will cleave regardless of the methylation state of the DNA (Nelson and McClelland, 1991).

A
H6 C C GC
PB3 TGTCAGGCCTCTGAGCCCAAGCCAAG*CATCGCATCCCCTGTGATTTGCACGTATACATC 60
H7 T T C AT * A CC
H6 G AT C G GGCCCTGCCCC
PB3 CAGATGGCCTAAAGTAACTGAAGATCCACAAAAGAAGTAAAAA*********TAGCCTTA 120
H7 G C G C
H6 G T T GTG TA CC
PB3 ACTGATGACATTCCACCATTGTGATTTGTTCCTGCCCCACCCTAACTGATCAATGTACTT 180
H7 A C T
H6 G AT TG TTCT GGCTC G C TGAGC CCTT GA
PB3 TGTAATCTCCCCCACCCTTAAGAAGGTACTTTGTAATCTTCCCCACCCTTAAGAAGGTTC 240
H7 G T T AC C T
H6 CCCCCGCCC G *****************A A A CC
PB3 TTTGTAATTCTCCCCACCCTTGAGAATGTACTTTGTGAGATCCAC* *CCTGCCCACJ 300
H7 T T C A CT C
H6 *T GA G A TT CATTA CTT T A **CGGCCCC CC CT T
PB3 CATTGCTCTTAACTTCACCGCCTAACCCAAAACCTATAAGAACTAATGATAATC*CATCA 360
H7 C C T C T CG AC
H6 G c
PB3 CCCTTCGC’rGACTCTCTTTTCGGACTCAGCCCACCTGCACCCAGGTGAAATAAAC
H7 T T CT G

Figure 4-4a. Deletion analysis of sequences involved in the T response. (a) Sequences of the H6, PB-3, and H7 LTRs (all reported previously, Mager, 1989). The sequence of the PB-3 LTR is shown, as are differences in H6 and H7 compared to PB-3. Asterisks represent length differences. The TATAA box is underlined and the CAP site is indicated by an arrow. The portion of each LTR used in initial CAT constructs extends from position 8 to the end of the sequences shown. The dashes at the 3’ end of H7 indicate that a shorter region of this LTR was tested. The 3’ termini of the H6-46 and PB3-45 LTR deletions are indicated by open triangles.

Figure 4-4b, c. Deletion analysis of sequences involved in the T response. (b) The activity of the H6-46 CAT and H6 CAT constructs is compared in NTera2D1 cells.
(c) Activities of the H6-46 vs H6 and PB3-45 vs PB-3 CAT constructs are compared in COS-1 cells.

Figure 4-5 illustrates that the positive control, pSV2ACAT, exhibits the same digestion pattern whether isolated from CV-1 cells or bacterial cells. However, when isolated from COS-1 cells, some plasmid sequences are digested by Mbo I, but not by Dpn I, as would be expected from a plasmid which had been replicated in response to T (see arrows in figure 4-5). The PB-3 CAT construct, however, was not differentially digested in COS-1 cells (figure 4-5), indicating that this plasmid was not stimulated to replicate in response to SV40 T antigen.

LTRs of Different Subtypes are Stimulated by Large T Antigen

RTVL-H LTRs have previously been classified into three subtypes, 1, 1a, and 2, based on sequence variations and structural differences (Mager, 1989; unpublished data). Over the course of these studies, consistent differences have been observed in both the transient promoter capabilities and the endogenous expression among these LTR subtypes (Chapter 3; Feuchter and Mager, 1990; Wilkinson and Mager, unpublished observations). To quantitate the response of LTRs with these known differences in sequence and promoter activities to the presence of T, LTRs representative of each subtype were selected for further analysis. The resulting LTR-CAT constructs were co-transfected into CV-1 cells with either the pSVw2At or pRSV T/t vectors, and the degree of stimulation of CAT activity in the presence of T was then measured. The results of several experiments are summarized in figure 4-6. Approximately five and twelve fold increases in CAT activity were achieved upon co-transfection of pRSV T/t with the 5’R1 (type 1) and 3’Hb (type 1a) LTRs, respectively. However, although activity from the PB-3 CAT (type 2) construct was induced to thirty fold over its activity in the absence of T, the H7 LTR (also type 2) was unaffected by T.
These results suggest that the ability of a particular LTR to be trans-activated by T is not directly related to the broad sequence differences used to classify these LTRs. Similarly, the degree of stimulation by T does not appear to be related to subtype, since the type 1 and 1a LTRs showed similar levels of induction and since the PB-3 LTR, which was stimulated 30 fold, and the H7 LTR, which was not significantly affected by T, are both type 2.

Figure 4-5. Southern analysis of plasmid sequences transfected into COS-1 and CV-1 cells. Unintegrated PB-3 CAT and pSV2ACAT DNA was isolated 48 hours post transfection as described in Chapter 2, digested with the enzymes shown and hybridized with radioactively labelled PB-3 CAT plasmid. Digestion patterns are compared with purified plasmid DNA isolated from bacterial cells. Arrows indicate bands of interest (see text).

LTR      Fold induction
PB-3     30
3’Hb     12
5’R1     5
H7       1.6

Figure 4-6. Stimulation of different LTR subtypes by T. LTR-CAT constructs were co-transfected into CV-1 cells with either pSVw2At (white boxes) or RSV T/t (grey boxes) and assayed for CAT activity. Results shown for each LTR are the mean and standard deviations of the percent conversions of chloramphenicol to its mono and di-acetylated forms over three experiments. Fold inductions were calculated from these values. To ensure that the calculated inductions were accurate, samples were measured during the linear phase of the CAT reaction. Thus, as the different constructs were assayed independently, their percentage conversion values are not directly comparable.

Sequence comparisons of the particular LTRs used in this study have not led directly to readily identifiable motifs which may mediate the response to T, or to any sequences significantly similar to known T binding sites.
This observation is consistent with the fact that little sequence similarity has been found among promoters that are activated by T. In fact, a mutant heat shock 70 promoter with all putative T binding sites removed retains the ability to be stimulated by T (Taylor et al., 1989). These observations agree with the finding that T can stimulate transcription in the absence of sequence specific binding to the activated promoter (Tack and Beard, 1985) and that transactivation by T does not require its nuclear localization (Pannuti et al., 1987; Wildeman, 1989). Recent studies (Gallo et al., 1988; 1990) have shown that certain simian DNA binding factors have altered activities in the presence of T. It has also been observed that mRNA and protein levels of the transcription factor Sp1 are increased in SV40 infected cells (Saffer et al., 1990). Consequently, it is possible that several cellular transcription factors with different recognition sequences may be modified and/or induced by T to interact with different promoters. Thus, it is probable that T activates RTVL-H sequences through either a direct or indirect interaction with the cellular proteins normally involved in regulating RTVL-H transcription. Identification of these factors and the sequences with which they interact will aid in our understanding of the mechanisms governing the regulation of this large family of retroviral sequences. For example, although the unresponsive H7 LTR and the responsive PB-3 LTR are highly related (89% identical), there are a number of nucleotide differences between them (figure 4-4a). Future studies of the sequences involved in the T response will concentrate on these variations.

T Antigen Can Transactivate Stably Integrated PB-3 CAT Constructs

Analysis of the effect of T on the transcriptional activity of endogenous RTVL-H elements is hampered by their high copy number in the genome and by their differing levels of response to T.
Therefore, to study the effect of T on a specific integrated RTVL-H LTR, two CV-1 cell lines (PB3-1 and PB3-2) containing 200-300 copies of stably integrated PB-3 CAT constructs were generated. The cell lines were transfected with pSVw2At and pRSV T/t and assayed for CAT activity 72 hours later. A typical result is shown in figure 4-7. Untransfected cells, or those into which the negative control was transfected, showed no CAT activity. This finding is consistent with results of the PB-3 CAT transient assays (Figures 4-1, 4-2, and 4-3, and Feuchter and Mager, 1990). When the T producing plasmid was transfected, however, each of these lines produced significant levels of CAT enzyme. Thus, in two different cell lines, stably integrated PB-3 is responsive to T antigen.

Figure 4-7. Trans-activation of stably integrated PB-3 CAT constructs by SV40 large T antigen. The cell lines CV-1, PB3-1 and PB3-2 were transfected with RSV T/t or pSVw2At and assayed for CAT activity.

Significance of These Findings

This study supports the hypothesis that SV40 large T antigen can transactivate certain human RTVL-H LTRs. In addition to regulating the levels of expression of RTVL-H genomic sequences (Wilkinson et al., 1990), some human RTVL-H LTRs may be involved in regulating the expression of linked cellular genes. This idea is supported by the recent identification of several distinct cellular non RTVL-H transcripts which initiate correctly in RTVL-H LTRs (see Chapter 5). As the human genome harbors approximately 3000 related LTRs, it is conceivable that a significant number of these promoters and the sequences that they regulate will be stimulated by T or by other oncogenes. Although these investigations have concentrated on human RTVL-H elements, it is worthwhile to note that similar numbers of RTVL-H elements exist in the genomes of apes and old world monkeys (Johansen et al., 1989; Goodchild and Mager, unpublished observations).
Thus, SV40 infection may trigger endogenous expression of RTVL-H elements in its natural host (Rhesus monkey) or in other monkeys which may become infected.

REFERENCES

Alwine, J.C. 1985. Transient gene expression control: Effects of transfected DNA stability and trans-activation by viral early proteins. Mol. Cell. Biol. 5:1034-1042.
Andrews, P. W. 1984. Retinoic acid induces neuronal differentiation of a cloned human embryonal carcinoma cell line in vitro. Dev. Biol. 103:285-293.
Andrews, P.W., I. Damjanov, D. Simon, G. S. Banting, C. Carlin, N. C. Dracopoli, and J. Fogh. 1984. Pluripotent embryonal carcinoma clones derived from the human teratocarcinoma cell line Tera-2. Lab. Invest. 50:147-162.
Brady, J., J.B. Bolen, M. Radnovich, N. Salzman, and G. Khoury. 1984. Stimulation of simian virus 40 late gene expression by simian virus 40 tumor antigen. Proc. Natl. Acad. Sci. USA. 81:2040-2044.
Brady, J., and G. Khoury. 1985. Trans activation of the simian virus 40 late transcription unit by T antigen. Mol. Cell. Biol. 5:1391-1399.
Brinster, R. L., H. Y. Chen, A. Messing, T. van Dyke, A. J. Levine, and R.D. Palmiter. 1984. Transgenic mice harboring SV40 T antigen genes develop characteristic brain tumors. Cell. 37:367-379.
Cao, S.X., H. Mishoe, J. Elion, P. E. Berg, and A. N. Schechter. 1989. Activation of the human ε and β globin promoters by SV40 T antigen. Biochem. J. 258:769-776.
DeCaprio, J. A., J. W. Ludlow, J. Figge, J-Y. Shew, C-M. Huang, W-H. Lee, E. Marsilio, E. Paucha, and D. M. Livingston. 1988. SV40 large tumor antigen forms a specific complex with the product of the retinoblastoma susceptibility gene. Cell. 54:275-283.
Eddy, B.E., G.S. Borman, G.E. Grubbs and R.D. Young. 1962. Identification of the oncogenic substance in rhesus monkey kidney cells as simian virus 40. Virology. 17:65-72.
Feinberg, A.P. and B. Vogelstein. 1983. A technique for radiolabeling DNA restriction endonuclease fragments to high specific activity. Anal. Biochem. 132:6-13.
Feuchter, A. and D. Mager. 1990.
Functional heterogeneity of a large family of human LTR-like promoters and enhancers. Nucl. Acids. Res. 18:1261-1270.
Gallo, G.J., G. Gilinger, and J. Alwine. 1988. Simian virus 40 T antigen alters the binding characteristics of specific simian DNA binding factors. Mol. Cell. Biol. 8:1648-1656.
Gallo, G.J., M.C. Gruda, J.R. Manuppello, and J. Alwine. 1990. Activity of simian DNA binding factors is altered in the presence of simian virus 40 (SV40) early proteins: Characterization of factors binding to elements involved in activation of the SV40 late promoter. J. Virol. 64:173-184.
Gorman, C.M., L.F. Moffat, and B.H. Howard. 1982. Recombinant genomes which express chloramphenicol acetyltransferase in mammalian cells. Mol. Cell. Biol. 2:1044-1051.
Graham, F. L., and A. J. Van der Eb. 1973. A new technique for the assay of infectivity of human adenovirus 5 DNA. Virology. 52:456-467.
Ide, T., S. Whelley, and R. Baserga. 1977. Stimulation of RNA synthesis in isolated nuclei by partially purified preparations of simian virus 40 T-antigen. Proc. Natl. Acad. Sci. USA. 74:3189-3192.
Johansen, T., T. Holm, and E. Bjorklid. 1989. Members of the RTVL-H family of human endogenous retrovirus-like elements are expressed in placenta. Gene. 79:259-267.
Kay, R. and R. K. Humphries. Methods. Mol. Cell. Biol., In press.
Keller, J.M. and J.C. Alwine. 1985. Analysis of an activatable promoter: Sequences in the simian virus 40 late promoter required for T-antigen-mediated transactivation. Mol. Cell. Biol. 5:1859-1869.
Koniecki, J., P. Nugent, J. Kordowska, and R. Baserga. 1991. Effect of the SV40 T antigen on the post-transcriptional regulation of the proliferating cell nuclear antigen and DNA polymerase α genes. Cancer Res. 51:1465-1471.
Livingston, D.M. and M.K. Bradley. 1987. The Simian virus 40 large T antigen: A lot packed into a little. Mol. Biol. Med. 4:63-80.
Loeken, M.R., G. Khoury, and J. Brady. 1986.
Stimulation of the adenovirus E2 promoter by simian virus 40 T antigen or E1a occurs by different mechanisms. Mol. Cell. Biol. 6:2020-2026.
Loeken, M., I. Bikel, D.M. Livingston, and J. Brady. 1988. Trans-activation of RNA polymerase II and III promoters by SV40 small t antigen. Cell. 55:1171-1177.
Luria, S. and M. Horowitz. 1986. The long terminal repeat of the intracisternal A particle as a target for transactivation by oncogene products. J. Virol. 57:998-1003.
Mager, D. L. 1989. Polyadenylation function and sequence variability of the long terminal repeats of the human endogenous retrovirus-like family RTVL-H. Virology. 173:591-599.
Nelson, M. and M. McClelland. 1991. Site specific methylation: effect on DNA modification methyltransferases and restriction endonucleases. Nucl. Acids. Res. 19:2045-2071.
Pannuti, A., A. Pascucci, G. LaMantia, L. Fisher-Fantuzzi, C. Vesco, and L. Lania. 1987. Trans-activation of cellular and viral promoters by a transforming nonkaryophilic simian virus 40 large T antigen. J. Virol. 61:1296-1299.
Reed, S.I., G.R. Stark, and J.C. Alwine. 1976. Autoregulation of simian virus 40 gene A by T antigen. Proc. Natl. Acad. Sci. USA. 73:3083-3087.
Rio, D., A. Robbins, R. Myers, and R. Tjian. 1980. Regulation of simian virus 40 early transcription in vitro by a purified tumor antigen. Proc. Natl. Acad. Sci. USA. 77:5706-5710.
Saffer, J. D., S. P. Jackson, and S. J. Thurston. 1990. SV40 stimulates expression of the transacting factor Sp1 at the mRNA level. Genes Dev. 4:659-666.
Southern, E. M. 1975. Detection of specific sequences among DNA fragments separated by gel electrophoresis. J. Mol. Biol. 98:503-517.
Southern, P. J. and P. Berg. 1982. Transformation of mammalian cells to antibiotic resistance with a bacterial gene under control of the SV40 early region promoter. J. Mol. Appl. Genet. 1:327-341.
Srinivasan, A., K.W.C. Peden, and J. Pipas. 1989. The large tumor antigen of simian virus 40 encodes at least two distinct transforming functions. J.
Virol. 63:5459-5463.
Stanbridge, E. J. 1990. Human tumor suppressor genes. Annu. Rev. Genet. 24:615-657.
Tack, L. and P. Beard. 1985. Both trans-acting factors and chromatin structure are involved in the regulation of transcription from the early and late promoters in simian virus 40 chromosomes. J. Virol. 54:207-218.
Taylor, I.C.A., W. Soloman, B.M. Weiner, E. Paucha, M. Bradley, and R.E. Kingston. 1989. Stimulation of the human heat shock protein 70 promoter in vitro by simian virus 40 large T antigen. J. Biol. Chem. 264:16160-16164.
Tegtmeyer, P. 1972. Simian virus 40 deoxyribonucleic acid synthesis: The viral replicon. J. Virol. 10:591-598.
Topp, W.C., D. B. Rifkin, and M. J. Sleigh. 1981. SV40 mutants with an altered small t protein are tumorigenic in newborn hamsters. Virology. 111:341-350.
Whelly, S., I. Toshinori, and R. Baserga. 1978. Stimulation of RNA synthesis in isolated nucleoli by preparations of simian virus 40 T antigen. Virology. 88:82-91.
Wildeman, A. 1989. Transactivation of both early and late simian virus 40 promoters by large tumor antigen does not require nuclear localization of the protein. Proc. Natl. Acad. Sci. USA. 86:2123-2127.
Wilkinson, D. A., J. D. Freeman, N. L. Goodchild, C. A. Kelleher, and D. L. Mager. 1990. Autonomous expression of RTVL-H endogenous retroviruslike elements in human cells. J. Virol. 64:2157-2167.
Windle, J. J., D. M. Albert, J. M. O’Brien, D. M. Marcus, C. M. Disteche, R. Bernards, and P. M. Mellon. 1990. Retinoblastoma in transgenic mice. Nature. 343:665-669.

CHAPTER V

Human Endogenous LTRs Regulate the Expression of Diverse Cellular Transcripts

Two manuscripts are currently being prepared from this chapter.

INTRODUCTION

The promoter insertion model of gene activation by retroviruses was first proposed ten years ago (Neel et al., 1981; Payne et al., 1981) to explain the activation of c-myc in the bursal lymphomas of chickens infected with avian leukosis virus (ALV) (Chapter 1; Hayward et al., 1981).
Since then, there have been over forty further examples of retrovirus-induced gene activation (reviewed in Chapter 1) and LTR promoter insertion has become accepted as an important mechanism of mutagenesis and carcinogenesis by retroviruses.

In mammals, particularly in mice, there have been several reports in which endogenous retrovirus-like sequences have acted as insertional mutagens in cultured cells (Chapter 1; Hawley et al., 1982; Ymer et al., 1985; Greenberg et al., 1985; Shell et al., 1987; Blatt et al., 1988; Kongsuwan et al., 1989; Blankenstein et al., 1990). An interesting example occurs in the murine myelomonocytic leukemia cell line WEHI-3B. This line constitutively produces the growth factor interleukin-3 due to an IAP insertion 5’ of the gene (Ymer et al., 1985) and expresses the Hox 2.4 homeobox gene due to an IAP promoter insertion in the first exon of this gene (Blatt et al., 1988; Kongsuwan et al., 1989). Two cases of germ line events that have been maintained in evolution have also been described (Stavenhagen and Robins, 1988; Banville and Boie, 1989). In one case, transcription of the mouse sex limited protein gene has been shown to be androgen dependent due to the enhancer donation of an upstream proviral element (Stavenhagen and Robins, 1988). It has also been shown that the oncomodulin gene of the rat is promoted by a solitary LTR related to IAP elements (Banville and Boie, 1989).

The human genome contains several distinct families of endogenous retrotransposon-like sequences, which include both retrovirus-like elements (Chapter 1; Larsson et al., 1989) and non-LTR containing sequences typified by the L1 family (Chapter 1; Fanning and Singer, 1987). In 1988, the first examples of gene inactivation by insertion of L1 sequences were reported (Morse et al., 1988; Kazazian et al., 1988). However, no occurrences of gene activation or altered regulation by recent insertion of human retrotransposons have been identified.
Nonetheless, it is quite possible that such events do occur in the human genome. Indeed, since the human mutations that are typically analyzed have a detrimental phenotypic effect, there has been little opportunity to observe cases where retrotransposons have imposed altered, but not necessarily detrimental, regulation on an adjacent gene.

This report describes a strategy to systematically search for cellular genes under the transcriptional control of an LTR-like sequence by identifying chimeric transcripts that initiate correctly in an endogenous LTR. The human genome harbors many LTR-containing elements that may have the capacity to activate cellular genes. However, most families of these elements are either low in copy number (1-100) (Larsson et al., 1989) or have been shown to be primarily transcribed from external promoters (Paulson et al., 1987). Thus, I have chosen to use the RTVL-H family of endogenous retrovirus-like sequences (Mager and Henthorn, 1984) to increase the likelihood of finding LTR-driven genes. This class of elements contains close to 1000 members, in addition to several hundred solitary LTRs, dispersed on all chromosomes (Fraser et al., 1988). RTVL-H LTRs have been extensively characterized and have varying levels of promoter activity when linked to a reporter gene (Chapter 3; Feuchter and Mager, 1990). Taken together, these findings raise the possibility that these sequences may affect the expression of adjacent cellular genes. In this report, the identification and analysis of five NTera2D1 cDNA clones that appear to be derived from different RTVL-H LTR promoted transcripts are described. Analysis of one clone has revealed that the LTR is naturally linked to a CpG island. In another clone, the evidence strongly suggests that it arose via a recent insertion of an RTVL-H element into a novel gene with a region of homology to yeast CDC4.
A third clone appears to be the result of a splicing event in which an RTVL-H transcript has spliced into a novel phospholipase A2 related gene.

RESULTS

Strategy to Identify RTVL-H Promoted Transcripts

A proviral LTR is divided into 3 functional regions, U3, R, and U5 (Chapter 1; Temin, 1982; Varmus and Brown, 1989). U3 contains the retroviral transcription regulatory signals and strongly contributes to tissue specificity and levels of viral gene expression (Varmus and Brown, 1989). In all retroviruses, the 5' terminus of R is defined by the transcription initiation site, usually GC, and the R region of mammalian retroviruses contains the polyadenylation signal followed 15-20 bases downstream by the polyadenylation site. Because of this arrangement, the 5' end of a transcript promoted by an LTR does not contain U3 sequences. A full length retroviral transcript will, however, contain U3 sequences at the 3' end, as indicated in figure 5-1a, and will hybridize with all retroviral probes. Unit length RTVL-H transcripts of this structure are abundant in the human teratocarcinoma cell line NTera2D1, in other transformed cell lines, and in normal placenta (Wilkinson et al., 1990). However, in addition to these full length transcripts, it is possible that genomes containing large families of endogenous retrovirus-like sequences such as RTVL-H may also produce transcripts of the alternative structures diagrammed in figures 5-1b, c, and d. Type "b" transcripts, which represent an intact solitary LTR contained within a larger transcript, will hybridize with U3 and U5 probes, but not with any internal probes. Heterologous type "c" transcripts that have been polyadenylated by an LTR will hybridize only with the U3 specific probe, as would transcripts of type "a" that had been truncated at their 5' ends. Indeed, our laboratory has identified several cellular transcripts that are polyadenylated by an RTVL-H LTR (Mager, 1989; Goodchild and Mager, unpublished observations).
Transcripts of type "d", which have correctly initiated in an LTR but contain no other retroviral sequences, would hybridize only with the U5 probe. If one or more of these types of transcripts are present in cells, their isolation from a cDNA library should be relatively straightforward. Thus, I have exploited the structural differences between type "d" and the other expected transcripts in a differential hybridization strategy to screen for cellular transcripts that have been promoted by an RTVL-H LTR.

Because the NTera2D1 cell line expresses high levels of endogenous RTVL-H elements (Wilkinson et al., 1990) and because six of seven RTVL-H LTRs tested were shown to promote expression of a linked heterologous reporter gene in these cells (Chapter 3; Feuchter and Mager, 1990), many LTRs are expected to be active in NTera2D1 cells. Thus, to increase the probability of identifying cellular transcripts initiating in an RTVL-H LTR, a cDNA library derived from NTera2D1 cells was used for this screen.

Isolation and Analysis of RTVL-H Promoted Transcripts

I screened 250,000 phages from the NTera2D1 cDNA library with the U3 probe, the U5 probes, probe 1, and a combination of probes 2, 3, 4, and 5 (as described in figure 5-1 and in Chapter 2). Four distinct phage clones positive only for the U5 specific probes (designated AF-1, AF-2, AF-3 and AF-4) were identified and sequenced. Figure 5-2a shows a comparison of the 5' termini of these cDNA clones with the sequence of a consensus RTVL-H element. The 5' end of each clone begins within the R region of an RTVL-H LTR. Homology to RTVL-H extends, in all cases, through U5 to the 3' terminus of the LTR.
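The differential hybridization logic described above (which probes are expected to light up for transcript types "a" to "d") can be sketched as a small lookup. This is only an illustration: the names `EXPECTED_PROBES` and `classify_clone` are hypothetical, and the label "internal" is an assumed shorthand that lumps probes 1 to 5 together.

```python
# Illustrative sketch only: names are hypothetical, and "internal" is an
# assumed shorthand for the internal RTVL-H probes (probes 1-5).
EXPECTED_PROBES = {
    "a": frozenset({"U3", "U5", "internal"}),  # full length RTVL-H transcript
    "b": frozenset({"U3", "U5"}),              # solitary LTR inside a larger transcript
    "c": frozenset({"U3"}),                    # transcript polyadenylated by an LTR
    "d": frozenset({"U5"}),                    # transcript initiated in an LTR
}


def classify_clone(positive_probes):
    """Return the transcript type whose expected probe set matches exactly.

    Returns None when the pattern matches none of the four types.
    Note that a 5' truncated type "a" clone can mimic type "c",
    as the text points out; this lookup cannot tell them apart.
    """
    hits = frozenset(positive_probes)
    for transcript_type, expected in EXPECTED_PROBES.items():
        if hits == expected:
            return transcript_type
    return None


if __name__ == "__main__":
    # A clone positive only for the U5 probes is the class sought in the screen.
    print(classify_clone({"U5"}))
    print(classify_clone({"U3", "U5"}))
```

Matching on the exact probe set, rather than on the presence of U5 alone, mirrors the requirement that the selected clones be positive only for the U5 specific probes.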
Beyond the 3' terminus of the LTR, the sequences of the isolated clones diverge from the consensus internal RTVL-H sequences and from each other, as would be expected of cellular sequences being transcribed from either a solitary LTR or the 3' LTR of an intact RTVL-H element.

[Figure 5-1: diagram of probes 1-5, U3 and U5 above four genomic arrangements and their transcripts, labelled a, b, c and d; graphic not recoverable from the extracted text.]
Figure 5-1. Strategy for isolating cellular LTR promoted genes. Genomic DNA is represented by thick lines (striped lines are RTVL-H sequences and black lines are unrelated cellular sequences). The functional domains U3, R, and U5 are indicated in the LTRs (large open arrows). The probes used in the differential screening strategy are shown at the top of the figure and are described in Chapter 2. Possible genomic arrangements of RTVL-H elements and the expected transcripts (dashed lines) are labelled a, b, c, and d. Probes that would be expected to hybridize with cDNA clones derived from these potential transcripts are indicated.

The structures of these cDNA clones are illustrated schematically in figure 5-2b. Clone AF-1, of approximately 1.9 kb, was partially sequenced (137 bp at the 5' end and 258 bp at the 3' end) and found to contain a 3' A-rich terminus which was part of an Alu element (Schmid and Jelinek, 1982) within the clone. Thus, this tract of poly A was most likely used to prime the transcript during the construction of the library and consequently is not expected to represent a legitimate polyadenylation event. Clone AF-2, a 1412 bp cDNA, was sequenced in its entirety and contains no open reading frames. A search of the Genbank database (release 68) revealed no significant sequence similarity with known sequences other than with an Alu element extending from position 328 to 601.
As these cDNA clones were not promising candidates for LTR promoted protein coding genes, they were not analyzed further.

The AF-4 cDNA clone was originally isolated as a head to head fusion between a putative LTR promoted transcript and position 149 of a 5' truncated human epoxide hydrolase cDNA (Jackson et al., 1987) that was most likely generated during construction of the cDNA library. The portion representing the 459 bp putative LTR promoted transcript is shown. Clone AF-3 is 2516 bp with an 852 bp long open reading frame, beginning 2 bp upstream of the 3' end of the LTR (position 67). AF-3 contains ATG sequences at positions 319, 337, 370, and 583. The ATG at position 370 is in a favorable context for translational initiation (Kozak, 1987). These latter two clones are discussed further below.

Clone AF-4 Contains a GC Island Linked to the LTR

Sequencing of the AF-4 clone revealed RTVL-H R and U5 sequences linked to a highly G + C rich sequence (411 bp) with features characteristic of a CpG island (Bird, 1986). Specifically, the non-LTR region is 73% G + C with a comparable number of CpG and GpC dinucleotide pairs (52% and 55%, respectively). CpG islands are large (usually 0.5 to 2 kb) unmethylated regions that have been found to be associated with the 5' end of many vertebrate genes (for review see Bird, 1986, 1987). Because these sequences are not deficient in the dinucleotide CpG, as is the rest of the vertebrate genome (Bird, 1986), rare cutting restriction enzymes that contain CpG in their recognition sequence have been useful in identifying these CpG islands (MacLeod et al., 1991; Bird, 1987). Figure 5-3a is a restriction map illustrating the high frequency of restriction sites diagnostic of CpG islands in this sequence.

To investigate the structure of this region in the genome, the genomic locus in NTera2D1 cells and an unrelated leukemia cell line, KG-1a (Koeffler and Golde, 1980), was studied by Southern blot analysis with the AF-4 specific probe shown in figure 5-3a.
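The composition criteria quoted above for the non-LTR region of AF-4 (G + C fraction, and CpG compared with GpC dinucleotides) can be computed for any sequence along the following lines. This is a minimal sketch: the function name `composition` is hypothetical, the input is assumed to be a plain string of A, C, G and T, and the short sequence in the usage example is a toy string, not AF-4 data.

```python
def composition(seq):
    """Return (G+C fraction, CpG count, GpC count) for a DNA string.

    Hypothetical helper: assumes an unambiguous sequence of A, C, G, T.
    """
    s = seq.upper()
    if not s:
        raise ValueError("empty sequence")
    gc_fraction = (s.count("G") + s.count("C")) / len(s)
    # Walk over every adjacent pair of bases and tally the two dinucleotides.
    cpg = sum(1 for i in range(len(s) - 1) if s[i:i + 2] == "CG")
    gpc = sum(1 for i in range(len(s) - 1) if s[i:i + 2] == "GC")
    return gc_fraction, cpg, gpc


if __name__ == "__main__":
    # Toy sequence for illustration only.
    gc, cpg, gpc = composition("ATCGGCAT")
    print(f"G+C = {gc:.0%}, CpG = {cpg}, GpC = {gpc}")
```

In bulk vertebrate DNA CpG is depleted relative to GpC, so similar counts for the two dinucleotides are what the text treats as characteristic of a CpG island.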
Figure 5-3b illustrates that this probe hybridizes with the same single 5 kb HindIII fragment in DNA from NTera2D1 and KG-1a cells. This result indicates that AF-4 is present in the genome as a single copy locus. Knowledge of RTVL-H LTR structure was next exploited to analyze the genomic linkage of the LTR to the AF-4 CpG island in the following way: RTVL-H LTRs vary in size from 400-450 bp and 95% (19/20) of those sequenced contain a StuI restriction enzyme site at their 5' terminus (Mager, 1989; unpublished data). Thus, if a complete LTR is associated with the CpG island, the AF-4 probe (a 170 bp EarI/EcoRI fragment) should detect an approximately 800 bp fragment in a StuI/SmaI double digest of genomic DNA (as diagrammed in figure 5-3a). Figure 5-3c illustrates that a band of the predicted size is, indeed, found in DNA from both NTera2D1 and KG-1a cells. In addition, 25% of sequenced LTRs contain an MspI site approximately 300 bp from their 3' ends (figure 5-3a). Thus, if the LTR in question contains this site, the 170 bp EarI/EcoRI probe would detect an approximately 550 bp MspI fragment. A fragment of this size is seen in MspI digested NTera2D1 and KG-1a DNA (figure 5-3d). These results are consistent with a genomic structure as diagrammed in figure 5-3a and indicate that this arrangement is not specific to NTera2D1 cells.

[Figure 5-2: (a) alignment of the 5' termini of clones AF-1 to AF-4 with the RTVL-H consensus across R and U5; (b) clone diagrams labelled AF-1, 1.9 kb (Alu); AF-2, 1412 bp (Alu); AF-4, 459 bp; AF-3, 2516 bp (ATG, TGA, ORF of 284 aa). Graphic and sequence alignment not recoverable from the extracted text.]
Figure 5-2. 5' termini and structure of the NTera2D1 cDNA clones. (a) The 5' termini of the clones AF-1 to AF-4 are aligned with an RTVL-H consensus sequence that extends from the first base of R in the 5' LTR into RTVL-H internal sequences. LTR derived sequences are boxed and the functional domains R and U5 are indicated. The match lines above each sequence indicate identity with the RTVL-H consensus rather than with the element directly above. In the consensus sequence, derived from 10 LTRs reported in Mager (1989), R represents A or G and Y represents C or T. The two classes of poly A signal represented in the clones, AAATAAA (AF-1) and ATTAAA (AF-2 and AF-3), have previously been observed in different RTVL-H LTRs (Mager, 1989) and are underlined. Match lines are shown between AF-2 and AF-3 in this region to illustrate this point. (b) Structure of cDNA clones. The expected structure of an LTR promoted transcript is compared with the structures of the clones identified. Clones are drawn to scale, except for the LTR subregions R and U5 (open arrows), which have been enlarged for clarity. Striped boxes are non-RTVL-H cellular sequences. The open box designates an open reading frame in clone AF-3 extending 183 amino acids from the ATG that is in favorable Kozak consensus (also see figure 5-4).

The AF-4 cDNA clone originally isolated is not polyadenylated and is shorter than most CpG islands, which suggests that it is probably a 3' truncation of a cellular LTR promoted transcript. Given the strong association between cellular genes and 5' CpG islands, it is likely that regions downstream of the AF-4 sequence will be found to encode a gene.
Although the 170 bp EarI/EcoRI probe was used to rescreen the NTera2D1 cDNA library and to perform Northern analysis of total RNA from several different human cell lines, related clones or transcripts were not detected. As it has previously been shown that the majority of RTVL-H LTRs have relatively weak promoter activity (Chapter 3; Feuchter and Mager, 1990) and a somewhat limited range of expression (Wilkinson et al., 1990), the difficulty in observing the full length LTR promoted transcript is not unexpected. Conversely, it is possible that transcription originating in the LTR is a rare occurrence and that the CpG associated sequences are usually regulated by promoter sequences located downstream of the LTR. If this is the case, the portion of the CpG island isolated might not typically lie within the transcribed region and thus would be unable to detect downstream transcripts.

[Figure 5-3: (a) restriction map of clone AF-4 with a 100 bp scale bar and the 550 bp and 800 bp fragments marked; (b, c, d) Southern blots with bands of interest at 4.5 kb, 800 bp and 550 bp. Graphic not recoverable from the extracted text.]
Figure 5-3. Mapping analysis of clone AF-4. (a) Restriction map of clone AF-4. Solid lines indicate the cloned cDNA sequence. Dashed lines represent the putative upstream portion of the RTVL-H LTR expected in the AF-4 genomic locus. Symbols represent restriction enzyme sites diagnostic of CpG islands as follows: •, ThaI (CGCG); +, NarI (GGCGCC); 0, EagI (CGGCCG); A, SacII (CCGCGG); X, XmaI (CCCGGG); and V, NaeI (GCCGGC). The 170 bp EarI/EcoRI probe (dark box) used is shown, as are the restriction enzyme sites used in Southern analysis. The relationship of the 800 bp StuI/SmaI fragment and the 550 bp MspI fragment observed in Southern analysis to the putative genomic structure of AF-4 is indicated. (b, c, and d) Southern analysis. The AF-4 specific probe shown in part a was hybridized with (b) HindIII, (c) StuI/SmaI, and (d) MspI digests of genomic DNA from KG-1a and NTera2D1 cell lines. Size markers are shown for reference and the bands of interest are indicated by the arrows.

Clone AF-3 was Most Probably Generated via a Recent Transposition of an RTVL-H Element into a Cellular Exon

Clone AF-3 is a 2516 bp polyadenylated cDNA with the 3' terminal 68 bp of RTVL-H R and U5 sequences at its 5' end. The DNA sequence of the non-LTR portion of the clone is not significantly similar to any sequence in the Genbank database (release 68). The sequence of this clone is shown in figure 5-4 and contains an 852 bp region free of termination codons extending from position 67 (2 bp upstream of the 3' end of the LTR) to position 918, followed by a 1598 bp 3' untranslated region. The ORF contains four potential translation initiation (ATG) sites, one of which is in correct Kozak consensus context (indicated in figures 5-2 and 5-4), and encodes a putative protein of 284 amino acids from the start of the region, or 183 amino acids from the Kozak ATG at position 370. A search of the PIR database (release 28) for proteins with homology to the predicted amino acid sequence of AF-3 revealed significant resemblance (42% identity over 60 amino acids and 28% identity over a larger region of 108 amino acids) to a repeated motif found in the carboxyl half of the yeast cell division cycle 4 (CDC4) gene (Yochem and Byers, 1987). Figure 5-5 shows an alignment between repeated motifs in AF-3 and CDC4 and illustrates the conserved arrangement of these motifs (in all of the AF-3 and in four of the six CDC4 repeats, the repeated segments are separated by 9 amino acids).
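As a quick consistency check on the AF-3 reading frame coordinates quoted above (positions 67 to 918, the Kozak ATG at 370, a 2516 bp clone), the lengths follow from simple inclusive-coordinate arithmetic. The helper names below are hypothetical and the sketch assumes 1-based, inclusive positions.

```python
def span_length(start, end):
    """Length in bp of a 1-based, inclusive interval (assumed convention)."""
    return end - start + 1


def codon_count(start, end):
    """Number of whole codons in an in-frame, inclusive interval."""
    length = span_length(start, end)
    if length % 3 != 0:
        raise ValueError("interval is not a whole number of codons")
    return length // 3


# Coordinates as quoted in the text for clone AF-3.
ORF_START, ORF_END = 67, 918
KOZAK_ATG = 370
CLONE_LENGTH = 2516

if __name__ == "__main__":
    print(span_length(ORF_START, ORF_END))   # 852 bp free of stop codons
    print(codon_count(ORF_START, ORF_END))   # 284 codons from the start of the region
    print(codon_count(KOZAK_ATG, ORF_END))   # 183 codons from the ATG at position 370
    print(CLONE_LENGTH - ORF_END)            # 1598 bp from position 919 to the 3' end
```

The 284 and 183 amino acid figures and the 1598 bp 3' region are all consistent with an open frame of 852 bp ending at position 918.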
Although AF-3 contains three tandem CDC4-like repeats, the greatest sequence similarity to the CDC4 motif occurs in the third segment (positions 224-268), in which the infrequently found amino acids tryptophan and histidine are well conserved.

[Figure 5-4: nucleotide sequence of clone AF-3 (2516 bp) with the predicted amino acid sequence beneath the open reading frame; sequence not recoverable from the extracted text.]
Figure 5-4. DNA sequence of clone AF-3. The LTR sequences are overlined with an arrow. Four potential translation initiation sites (ATGs) are overlined; the one which most closely fits the Kozak consensus is bracketed. The predicted potential open reading frame extends to the stop codon TGA at position 919. The amino acid sequence of the putative protein encoded by AF-3 is indicated below.

[Figure 5-5: alignment of AF-3 segments (138-179), (180-223) and (224-268) with CDC4 segments (377-417), (418-458), (459-502), (526-565), (566-607) and (628-667); alignment not recoverable from the extracted text.]
Figure 5-5. Amino acid sequence comparison of the putative AF-3 protein with conserved repetitive segments of the yeast CDC4 gene product (Yochem and Byers, 1987). The segments have been aligned to highlight regions of sequence identity (boxed). Residues that are repeated in the predicted AF-3 sequence, but not in CDC4, are marked with an asterisk.

Northern analysis using the 2.3 kb EcoRI fragment of AF-3 as a probe (lower part of figure 5-6) did not detect a transcript of a size similar to the cDNA clone in any of the different human and monkey cell lines tested. However, high levels of a transcript of approximately 10 kb were detected in all of these cell lines, including NTera2D1. There are at least three possible explanations for this observation: i) If the primary AF-3 transcript is promoted from an LTR and contains alternate polyadenylation signals, the 2.6 kb clone originally isolated may have been generated via an upstream, rarely used polyadenylation signal. In this case, it is not likely that this strategy would have identified the more abundant 10 kb cDNA clone, due to the limitations of generating and isolating long cDNAs. ii) Another potential explanation for these findings is that the RTVL-H LTR resides within an intron of the gene encoding the larger transcript and only rarely promotes transcription of downstream sequences. Since the more abundant 10 kb transcript would splice the LTR out of its coding sequences, it would not have been detectable using RTVL-H probes. iii) A third possibility is that the particular NTera2D1 cells used by Skowronski et al. (1988) to produce this cDNA library contained an RTVL-H rearrangement that was not present in the NTera2D1 cells used for Northern analyses. Consequently, the expected 2.6 kb LTR promoted transcript would not be present in the NTera2D1 cells used in figure 5-6.

To help distinguish between these possibilities, a cDNA clone representing the 10 kb transcript was compared to the original AF-3 clone. Several cDNA clones derived from the 10 kb transcript were isolated by rescreening the same library for phage clones that hybridized with the 160 bp EarI/EcoRI and 330 bp EcoRI/PvuII probes shown in figure 5-6 but which did not hybridize with the RTVL-H U5 specific probes. Nineteen clones which met these criteria were identified and five were selected for further analysis. One of these clones, AF-3a, 3.2 kb in size, contains the same 2.3 kb EcoRI fragment found in clone AF-3 (figure 5-6), in addition to approximately 850 bp 5' to this fragment.
If the 2.6 kb AF-3 cDNA clone was promoted from an RTVL-H LTR residing within an intron that is normally spliced out of the larger transcript, the sequences derived from both types of cDNA clones should be identical at their 3' ends and diverge at a potential 3' splice acceptor sequence present in the 2.6 kb clone. Figure 5-7a shows a comparison of the 2.6 kb AF-3 sequence with that of the 3.2 kb AF-3a clone upstream of the common 5' EcoRI site. The point of sequence divergence does not identify a potential splice acceptor site, but rather occurs at the 3' terminus of the LTR in clone AF-3. This precise divergence at the LTR boundary strongly suggests that an RTVL-H element has inserted into one allele of the gene encoding the 10 kb transcript.

[Figure 5-6: Northern blot with size markers at 9.5, 7.5, 4.4, 2.4, 1.4 and 0.24 kb, above a schematic of clone AF-3 and its probe fragments; lane labels and graphic not recoverable from the extracted text.]
Figure 5-6. Detection of AF-3 related transcripts. 20 µg of total cellular RNA from different cell lines was hybridized with an AF-3 specific probe. Size markers (in kb) are shown for reference. The lower part of the figure shows a schematic diagram of cDNA clone AF-3 and three fragments used as probes. The 2.3 kb EcoRI fragment (striped box) was used in the Northern analysis shown above. The 160 bp EarI/EcoRI (white box) and the 330 bp EcoRI/PvuII (dark box) fragments were used to isolate additional cDNA clones (see text). To ensure that the bands observed were not due to DNA contamination of the samples or to nonspecific hybridization, this filter was rehybridized with the 587 bp ApaI/HincII fragment of AF-5 that was used in figure 5-11.

If a rearrangement involving insertion of an RTVL-H element has occurred in NTera2D1 cells and become fixed in this cell line, one would expect to detect this rearrangement by Southern blot analysis.
However, such analyses performed in our laboratory using a probe from AF-3a which spans the presumed insertion site have failed to detect differences between NTera2D1 DNA and DNA of unrelated cell lines, although these experiments do indicate that the locus is single copy. DNA samples derived from different passages of NTera2D1 cells obtained recently from the laboratory of Dr. M. Singer, in which the library was constructed, also showed no differences. Thus, it is quite possible that the RTVL-H insertion occurred in the particular NTera2D1 cells used in 1986 to construct the library but that the cells containing the insertion were subsequently lost from the population.

When the sequence derived from the 5' region of the 3.2 kb clone is combined with the non-LTR portion of clone AF-3, a continuous open reading frame of 1545 bp is observed. The deduced amino acid translation of this reading frame is shown in figure 5-7b, where the region with homology to CDC4 is indicated. Further searches of the Genbank and PIR databases with the 5' region (not present in clone AF-3) revealed no other significant homologies. Efforts are currently underway to isolate and characterize cDNA clones spanning the entire 10 kb transcript.

Identification of a Chimeric Transcript Utilizing an RTVL-H Splice Donor Site

Previous studies in our laboratory have identified several spliced RTVL-H transcripts in NTera2D1 cells (Wilkinson et al., 1990) and have determined that a consensus 5' donor sequence located approximately 150 bp downstream of the 5' LTR was being used in all cases (Wilkinson et al., 1990).
This finding, together with the observation of a smear of transcripts (700-1600 bp) that hybridize only with RTVL-H probes upstream of the 5' splice donor site in Northern blots (Wilkinson et al., 1990), predicts that chimeric transcripts in which an RTVL-H element has spliced into an exon of a downstream gene may exist in NTera2D1 cells. Such splicing events may occur via deletion or mutation of the appropriate RTVL-H splice acceptor sequences, or conversely, by mutation involving sequences controlling splicing of the downstream gene. To identify transcripts of this type, the NTera2D1 library (250,000 phages) was rescreened for cDNA clones that hybridized with probe 1 and the U5 specific probes, but which did not hybridize with the RTVL-H probes 2-5 used previously (figure 5-1) or with probe 6 (figure 5-8a), which maps just downstream of the splice donor sequences.

[Figure 5-7: (a) alignment of the 5' regions of clones AF-3 and AF-3a across R and U5; (b) composite nucleotide sequence with deduced amino acid translation. Sequences not recoverable from the extracted text.]
Figure 5-7. Comparison of AF-3 and AF-3a. (a) Sequences of the 2.6 kb AF-3 and the 3.2 kb related clone, AF-3a, are compared. The RTVL-H LTR sequences in clone AF-3 are indicated. Sequence identity extends precisely to the 3' end of the LTR in clone AF-3, where the sequences diverge. Numbering refers to clone AF-3. (b) Composite open reading frame derived from clones AF-3 and AF-3a. The portion flanked by arrows was derived from the overlapping portion of the two clones. The region upstream was derived solely from AF-3a and the region downstream was derived solely from AF-3. The EcoRI site, GAATTCC, at the beginning of the sequence was generated during cloning. The regions corresponding to the CDC4-like repeats shown in figure 5-5 are underlined.

Three phage clones that met these criteria were identified. However, two of these clones were subsequently shown to hybridize very weakly with a probe mapping downstream of the 3' splice acceptor site typically used by RTVL-H elements (probe 7 in figure 5-8). Future studies will address the structure of these clones. One 2.4 kb cDNA clone that hybridized only to probe 1 and the U5 probes was purified and sequenced. Figure 5-8b is a comparison between the 5' region of this clone, AF-5, and relevant regions of an RTVL-H consensus sequence. This figure illustrates that the clone begins in an RTVL-H LTR one bp downstream of the expected CAP site. As anticipated, the homology with RTVL-H extends to the 5' splice donor site and terminates precisely at the G residue shown, as expected in a spliced transcript. This structure strongly suggests that AF-5 is, indeed, the product of a chimeric splicing event between an RTVL-H element and a cellular exon.

Clone AF-5 Represents a Novel Phospholipase A2 Related Gene

The complete nucleotide sequence of clone AF-5 and the deduced amino acid sequence (707 amino acids) encoded by its open reading frame are shown in figure 5-9. The ORF shown begins within the LTR sequence and continues in frame into the non-RTVL-H AF-5 sequence. Clone AF-5 contains a 253 bp 3' noncoding region and is not polyadenylated, suggesting that this clone is a 3' truncation of a larger transcript. A search of the PIR database for proteins with homology to the predicted amino acid sequence of AF-5 revealed two phospholipase A2 (PLA2)-like repeats within the putative AF-5 protein (underlined in figure 5-9).
[Figure 5-8. Panel (a): schematic of an RTVL-H provirus and linked cellular exons, with a possible readthrough spliced transcript; labels recoverable from the extracted source include U3, U5, S.D., S.A., and probes 1, 6, and 7. Panel (b): sequence comparison, reproduced below without the identity marks, which are not recoverable; a few characters are uncertain.]

Cap Site
RTVL-H  CAAATCTTATAAAACGGCCCCACTCCTATCTCCCTTTGCTGACTCTCTTTTTGGACTCAG...
AF-5    CTGACTCTCTTTTCGGACTCAG...

Splice Donor
RTVL-H  ...CCAAGAAACATTTCACCAATTTCAAATCCGGTAAGTGGCCTCTTTTTACTGTCTTCTCCA
AF-5    ...CCAAGAAACATCTCACCTTTCAAATCCGCTACCAGGAGGGTGGCCAGCTCAGTGGT...

Figure 5-8. Identification of a spliced RTVL-H/genomic transcript. (a) Strategy for isolating chimeric spliced cDNAs. Genomic DNA is represented in the upper portion of the figure. The thick striped line flanked by arrows is an RTVL-H provirus. Hatched boxes are the exons of a linked unrelated cellular gene. A possible readthrough spliced transcript is shown below. The probes used include those shown in the figure, in addition to the probes shown in figure 5-1. Probes expected to hybridize with a cDNA derived from a transcript with the structure shown are darkened. (b) Structure of clone AF-5. The 5' terminal sequence of clone AF-5 is shown and compared with an RTVL-H consensus. Homology with RTVL-H begins one bp downstream of the RTVL-H CAP site and extends to the expected position in the RTVL-H splice donor site, as shown. The three dots represent 200 bp of homology with RTVL-H that has been excluded from this figure for clarity.

[Figure 5-9: nucleotide sequence of clone AF-5 with the predicted amino acid sequence. The sequence text is not recoverable from the extracted source.]

Figure 5-9. Nucleotide and predicted amino acid sequence of clone AF-5. RTVL-H derived sequences are overlined. The 3' terminus of the LTR is indicated by an arrowhead and an ATG within the RTVL-H LTR is bracketed. The amino acid sequence of the putative protein encoded by AF-5 is indicated below. The PLA2-like repeats shown in figure 5-10 are underlined by the dotted arrows. The 587 bp probe used in the Northern analysis of AF-5 (figure 5-11) extends from the HincII site (GTTGAC) at position 280 to the ApaI site (GGGCCC) at position 867.

PLA2s are a diverse family of enzymes that hydrolyze the sn-2 fatty acyl ester bond of phospholipids to produce free fatty acids and lysophospholipids (Kramer et al., 1989). PLA2s perform a variety of biological functions, such as regulating general phospholipid metabolism, controlling membrane fluidity, and initiating the release of arachidonic acid from phospholipids as part of the onset of inflammation. PLA2s are abundant in mammalian pancreatic juices and in the venom of snakes and bees, where they play a digestive role. However, they occur in trace amounts in all cell types (Kramer et al., 1989). Extracellular forms of PLA2 from a variety of sources have been characterized and cellular genes encoding several have been cloned and sequenced (Seilhamer et al., 1986, 1989; Kramer et al., 1989; Kusunoki et al., 1990). The three dimensional structure of recombinant human rheumatoid arthritic synovial fluid PLA2 has also recently been determined (Wery et al., 1991). Intriguingly, none of the PLA2s examined have a repeated structure similar to AF-5. These secreted PLA2s are small (130-146 aa) rigid proteins in which the key active site residues and the alignment of cysteines are highly conserved. These enzymes have been classified as belonging to group I or group II based on the positioning of certain Cys residues (Henrikson et al., 1977; Davidson and Dennis, 1990). Figure 5-10 is an amino acid sequence comparison of the AF-5 PLA2-like repeats with bovine pancreatic PLA2 (group I), Crotalus atrox venom PLA2 (group II), and a PLA2 consensus, and illustrates the conservation of residues characteristic of PLA2s in the AF-5 repeats. Although the most striking similarity of both repeats to the PLA2s is in the calcium binding loop (residues 25-37 and 49), these sequences have differences at two key residues. G30 is replaced by an R in the first repeat and D49 is substituted by a basic residue in both repeats.
Alterations at these positions have been observed in some naturally occurring inactive homologs of PLA2 (Davidson and Dennis, 1990). The lipophilic residues Leu2, Phe5, and Ile9 of the conserved alpha helical amino terminal segment (Kramer et al., 1989) are conservatively substituted in all positions of the 2nd repeat and at position 9 of the 1st. Residues implicated in the hydrolytic mechanism of other PLA2s (Asp99, Ala102, Ala103 [Dufton and Hider, 1983; Renetseder et al., 1985]) are maintained except for Ala103 of the first repeat, which has undergone a conservative substitution to an Ile. The active site residues His48, Asp99, Tyr52, and Tyr73 (Kramer et al., 1989) are less well conserved, as both tyrosine residues are substituted non conservatively in each case.

[Figure 5-10: alignment of AF-5 (1st repeat), AF-5 (2nd repeat), type 1, type 2, and consensus PLA2 sequences, numbered from 1 to 130. The alignment itself is not recoverable from the extracted source.]

Figure 5-10. Amino acid sequence comparison of the putative AF-5 protein with group I (bovine pancreatic) PLA2, group II (Crotalus atrox venom) PLA2, and a consensus of PLA2 sequences (from Kramer et al., 1989). Regions of sequence identity are boxed. Asterisks represent residues that are repeated in the predicted AF-5 sequence but that are not conserved in PLA2s. Numbering is as in Kramer et al. (1989).

The AF-5 PLA2-like repeats both have the characteristic group I-like half cystines at positions 11 and 77 and retain 9/13 and 8/13 residues of the elapid loop (positions 54-66; Henrikson et al., 1977; Kramer et al., 1989).
Interestingly, the 2nd PLA2-like segment also contains a half cystine at residue 50 and a carboxyl extension ending in a half cystine at position 132, features characteristic of the type II enzymes.

To investigate the levels and specificity of expression of AF-5 related transcripts, Northern analysis using RNA from NTera2D1, Tera-1 (an unrelated human teratocarcinoma cell line), K562, Hep 2, and COS-1 cells was performed. The results of a typical experiment are shown in figure 5-11. Of the lines tested, expression of AF-5 is limited to the teratocarcinoma cell lines, with higher levels in Tera-1 cells. Transcripts of 9.4, 5.2, 3.8, and 2.5 kb are observed. Although it is tempting to speculate that the 2.5 kb transcript is related to the 3' truncated 2.4 kb AF-5 clone originally identified, the origins of the transcripts observed are unclear. If an RTVL-H LTR is involved in the normal regulation of this gene, the different sized transcripts may have been generated via alternate splicing or polyadenylation of the LTR promoted transcript. Conversely, if the LTR is upstream of or within an intron of a functional transcription unit, the AF-5 related transcripts may be heterogeneous with respect to the presence of RTVL-H sequences. Northern analysis of these AF-5 related transcripts with RTVL-H probes is complicated by the fact that endogenous RTVL-H elements express high levels of similarly sized transcripts in these cell lines (Wilkinson et al., 1990). Therefore, future studies will attempt to determine the origin of these transcripts via the use of specific oligonucleotide probes that will hybridize only with transcripts containing both RTVL-H and AF-5 related sequences.

[Figure 5-11: Northern blot images with marked bands at 9.4, 5.2, 3.8, and 2.5 kb and an actin control. Lane labels are not recoverable from the extracted source.]

Figure 5-11. Detection of AF-5 related transcripts. 20 µg of total RNA was hybridized with the 587 bp ApaI/HincII fragment of AF-5 (see legend of figure 5-9). The results of two experiments are shown.
Blots were rehybridized with a chicken β-actin probe (a 1.9 kb PstI fragment).

DISCUSSION

Strategy to Identify RTVL-H Promoted Cellular Transcripts

The RTVL-H family of endogenous retrovirus-like elements consists of approximately 1000 full length members in addition to several hundred solitary LTRs dispersed on all chromosomes. The abundance and dispersed nature of these LTRs suggest that some may direct the expression of unrelated cellular genes, either because they have evolved to play this role or because of recent genomic rearrangements. This idea is supported by recent findings suggesting that retroviruses preferentially integrate into transcriptionally active “open” genome regions, such as DNAse I hypersensitive sites (Vijaya et al., 1986; Rohdewohld et al., 1987) and CpG islands (Scherdin et al., 1990) that frequently demarcate the 5' controlling region of genes. Similarly, other studies have shown that cellular recombination events, for example immunoglobulin gene rearrangements (Blackwell et al., 1986) and mating type switching in yeast (Klar et al., 1984), as well as the integration of foreign DNA (Schulz et al., 1987), are enhanced in transcriptionally active genomic regions. Thus, it is likely that, either by virtue of their original dispersal in the genome or as the result of more recent integration events, at least some RTVL-H elements integrated into or adjacent to genes and may have functionally replaced their native promoters. In this study, a differential screening strategy was successfully employed to identify several NTera2D1 clones which either hybridized only to RTVL-H U5 specific probes or only to probes mapping upstream of the RTVL-H 5' splice donor. Sequence analysis has revealed that the corresponding transcripts were most likely promoted from an RTVL-H LTR.
In each case, the 5' end of the clone begins within the R region of an RTVL-H LTR and the LTR homology extends either to the 3' terminus of the LTR (clones AF-1, AF-2, AF-3, and AF-4) or to the RTVL-H splice donor sequence (clone AF-5). Thus, the strategy is a simple and direct method for identifying cellular LTR promoted genes. It will detect genes under the normal transcriptional control of an LTR as well as recent rearrangements resulting in LTR driven expression. Moreover, as long as some transcripts are produced, this technique has the potential to identify subtle mutations, such as the down regulation of transcription of one allele, as well as the increased or deregulated transcription more classically associated with retroviral promoter insertion.

The five clones identified in this screen ranged from 1.4-2.7 kb in size. Nevertheless, it is possible that NTera2D1 cells contain larger LTR promoted transcripts which were not identified. This strategy is based on the fact that the clones of interest will contain short regions of homology with RTVL-H at their 5' ends. Larger transcripts with this structure may have been difficult to clone as their U5 regions may have been underrepresented in the library due to the difficulties in generating long cDNA copies. Future studies using randomly primed libraries will be useful in identifying these putative transcripts.

Identification of a GC Island Linked to an RTVL-H LTR

One incomplete clone, AF-4, which consists of a CpG island linked to an RTVL-H LTR, was identified. Because CpG islands have been associated with the 5' end of genes and because recent reports indicate that retroviruses integrate preferentially into transcriptionally active regions, it is probable that a cellular gene is located in close proximity to this LTR. AF-4 associated transcripts could not be detected on a Northern blot, possibly because this sequence is normally transcribed at low levels or is only infrequently promoted from the LTR.
However, Southern analysis indicated that the normal genomic position of the LTR is in close proximity to the CpG island. Since RTVL-H LTRs contain enhancer sequences capable of activating heterologous genes (Chapter 3; Feuchter and Mager, 1990), it is possible that the RTVL-H LTR normally contributes to the regulation of a gene associated with the CpG island by providing enhancer activity. Isolation of the genomic regions associated with this LTR and CpG island may clarify the nature of RTVL-H involvement in transcription of these sequences.

A Probable Recent RTVL-H Insertion into a CDC4 Related Gene

The clone AF-3 appears to have been generated from an RTVL-H LTR located within an exon of a large cellular gene containing a region of homology with the yeast CDC4 gene repeated motif. Sequencing of related cDNA clones strongly suggests that AF-3 was promoted from an RTVL-H LTR that had recently inserted into this gene in the cells used to produce the library. As RTVL-H is structurally similar to known retrotransposons such as endogenous IAPs, Ty, and Copia elements, it is likely that the insertion was mediated by retrotransposition. If so, this event would be the first example of a transposition involving an endogenous retrovirus-like element in human cells.

The predicted protein sequence of AF-3 contains a repeated motif similar (42% identity over 60 aa) to that found at the carboxyl half of the yeast CDC4 gene (Yochem and Byers, 1987). A homologous motif (18-19% identity with CDC4), repeated contiguously, makes up the β subunit of bovine retinal transducin, a G protein involved in signal transduction in photoreceptor rod cells (Fong et al., 1986). The functional role of this motif in CDC4 and β transducin is not clear.
However, that this motif is functionally important in these proteins is demonstrated by the observation that all known CDC4 mutants map to this region and that the entire repeated segment is required for a functional CDC4 protein (Breck Byers, personal communication), in addition to the fact that the transducin β subunit is entirely made up of this repeat. Thus, the presence of these sequences in the AF-3 clone, together with the high levels of expression of the related 10 kb transcript in all the cell lines tested, suggests that this gene encodes an essential protein. As this repeated domain is shared with a cell division cycle gene and a component of a G protein, it is quite possible that the gene is somehow involved in growth control. Isolation and analysis of the entire coding region of the 10 kb transcript should yield insight into the function of this protein.

An RTVL-H Element has Spliced into a Novel PLA2-Related Gene

One clone, AF-5, in which an RTVL-H element has used its 5' splice donor sequence to splice into an exon of a cellular gene, was identified. The predicted amino acid sequence of the non-RTVL-H derived portion of this clone contains two regions at its carboxyl end with striking homology to secreted forms of the enzyme PLA2. Although the protein and, more recently, genomic and cDNA sequences of PLA2s from several different tissues and species have been analyzed, no PLA2 containing this repeated structure has previously been reported. The AF-5 PLA2-like repeats contain certain amino acid alterations that have been observed in some naturally occurring, inactive homologs (Davidson and Dennis, 1990). Such inactive forms have been frequently observed in snake venoms, where they can increase the potency of active PLA2s, possibly by preventing the binding of PLA2 inhibitors or PLA2 degradation by proteases.
The presence of these inactive PLA2s in several different subgroups of venom PLA2s has been used to support the contention that a similar regulatory system exists in mammals (Davidson and Dennis, 1990) and raises the possibility that if AF-5 does, indeed, prove to be enzymatically inactive, it may have such a role.

The AF-5 clone isolated here does not contain a signal sequence; however, it does contain the highly conserved pattern of Cys residues characteristic of secreted PLA2s. This degree of conservation would not be expected in a cytosolic protein that is fully exposed to a reducing cytoplasmic environment. In fact, a human cytosolic PLA2 which lacks homology with both the secreted forms of PLA2 and with AF-5 has recently been isolated (Clark et al., 1991).

The expression of AF-5 related transcripts in teratocarcinoma cell lines which express high levels of endogenous RTVL-H elements is consistent with a role for RTVL-H in the regulation of this gene. However, the extent of RTVL-H involvement is unclear. In avian systems, transcripts reading through the 3' LTR have been shown to account for 15% of the total viral mRNA. Moreover, mutation of a single base in the polyadenylation signal increased readthrough to 80% of viral RNA (Herman and Coffin, 1986; Keshet et al., 1991). Thus, it is conceivable that RTVL-H readthrough transcripts account for a significant proportion of the NTera2D1 and Tera-1 RNAs. The different sized transcripts observed in these cells may be due to alternative splicing or 3' processing, or conversely, may represent transcription from a non-RTVL-H cellular promoter. Future studies will address the genomic structure of this gene. The identification of upstream exons may reveal the presence of a signal sequence or other structural features that may shed light on the function of AF-5.

The AF-5 clone was generated via splicing of RTVL-H 5' sequences to a downstream exon.
Two cases involving the similar processing of exogenous retroviral transcripts with cellular oncogenes have recently been reported. Integration of avian leukosis virus (ALV) into the c-erb-B gene yielded transcripts initiating in the 5' LTR and linking ALV gag sequences to c-erb-B or splicing from gag to env and from env into c-erb-B (Goodwin et al., 1986). In MoMLV tumors, transcripts initiating in the 5' LTR and splicing between gag and exon 1 of c-myb have been observed (Shen-Ong et al., 1986). As this mechanism, by necessity, links proviral LTRs to gene exons, it may be a frequent mechanism of gene regulation by endogenous retrovirus-like sequences.

REFERENCES

Andrews, P. W., I. Damjanov, D. Simon, G. S. Banting, C. Carlin, N. C. Dracopoli, and J. Fogh. 1984. Pluripotent embryonic carcinoma clones derived from the human teratocarcinoma cell line Tera-2. Lab. Invest. 50:147-162.
Banville, D. and Y. Boie. 1989. Retroviral long terminal repeat is the promoter of the gene encoding the tumor-associated calcium-binding protein oncomodulin in the rat. J. Mol. Biol. 207:481-490.
Bird, A. P. 1986. CpG-rich islands and the function of DNA methylation. Nature. 321:209-213.
Bird, A. P. 1987. CpG islands as gene markers in the vertebrate nucleus. Trends Genet. 3:342-347.
Blackwell, T. K., M. W. Moore, G. Yancopoulos, H. Suh, S. Lutzker, E. Selsing, and F. W. Alt. 1986. Recombination between immunoglobulin variable region gene segments is enhanced by transcription. Nature. 324:585-589.
Blankenstein, T., Z. Qin, W. Li, and T. Diamantstein. 1990. DNA rearrangement and constitutive expression of the interleukin 6 gene in a mouse plasmacytoma. J. Exp. Med. 171:965-970.
Blatt, C., D. Aberdam, R. Schwartz, and L. Sachs. 1988. DNA rearrangement of a homeobox gene in myeloid leukaemic cells. EMBO J. 7:4283-4290.
Clark, J. D., L. L. Lin, R. W. Kriz, C. S. Ramesha, L. A. Sultzman, A. Y. Lin, N. Milona, and J. L. Knopf. 1991.
A novel arachidonic acid-selective cytosolic PLA2 contains a Ca2+-dependent translocation domain with homology to PKC and GAP. Cell. 65:1043-1051.
Davidson, F. F. and E. A. Dennis. 1990. Evolutionary relationships and implications for the regulation of phospholipase A2 from snake venom to human secreted forms. J. Mol. Evol. 31:228-238.
Davis, L. G., M. D. Dibner, and J. F. Battey. 1986. In: Basic methods in molecular biology. Science Publishing, Inc., New York.
Dufton and Hider. 1983. Classification of phospholipase A2 according to sequence: Evolutionary and pharmacological implications. Eur. J. Biochem. 137:545-551.
Fanning, T. G., and M. F. Singer. 1987. LINE-1: A mammalian transposable element. Biochim. Biophys. Acta. 910:203-212.
Feinberg, A. P. and B. Vogelstein. 1983. A technique for radiolabelling DNA restriction endonuclease fragments to high specific activity. Anal. Biochem. 132:6-13.
Feuchter, A., and D. Mager. 1990. Functional heterogeneity of a large family of human LTR-like promoters and enhancers. Nucl. Acids Res. 18:1261-1270.
Fong, H. K. W., J. B. Hurley, R. S. Hopkins, R. Miake-Lye, M. S. Johnson, R. F. Doolittle, and M. I. Simon. 1986. Repetitive segmental structure of the transducin β subunit: Homology with the CDC4 gene and identification of related mRNAs. Proc. Natl. Acad. Sci. USA. 83:2162-2166.
Fraser, C., R. K. Humphries, and D. L. Mager. 1988. Chromosomal distribution of the RTVL-H family of human endogenous retrovirus-like sequences. Genomics. 2:280-287.
Goodwin, R. G., M. Rotman, T. Callahan, H-J. Kung, P. A. Maroney, and T. W. Nilsen. 1986. c-erbB activation in avian leukosis virus-induced erythroblastosis: Multiple epidermal growth factor receptor mRNAs are generated by alternative RNA processing. Mol. Cell. Biol. 6:3128-3133.
Greenberg, R., R. Hawley, and K. B. Marcu. 1985. Acquisition of an Intracisternal A-Particle element by a translocated c-myc gene in a murine plasma cell tumor. Mol. Cell. Biol. 5:3625-3628.
Hawley, R. G., M. J. Shulman, H. Murialdo, D. M.
Gibson, and N. Hozumi. 1982. Mutant immunoglobulin genes have repetitive DNA elements inserted into their intervening sequences. Proc. Natl. Acad. Sci. (USA). 79:7425-7429.
Hayward, W. S., B. G. Neel, and S. M. Astrin. 1981. Activation of a cellular onc gene by promoter insertion in ALV-induced lymphoid leukosis. Nature. 290:475-479.
Henrikson, R. L., E. T. Kreuger, and P. S. Keim. 1977. Amino acid sequence of phospholipase A2-α from the venom of Crotalus adamanteus. J. Biol. Chem. 252:4913-4921.
Herman, S. A. and J. M. Coffin. 1986. Differential transcription from the long terminal repeats of integrated avian leukosis virus DNA. J. Virol. 60:497-505.
Jackson, M. A., J. A. Craft, and B. Burchell. 1987. Nucleotide and deduced amino acid sequence of human liver microsomal epoxide hydrolase. Nucl. Acids Res. 15:7188.
Kazazian Jr., H. H., C. Wong, H. Youssoufian, A. F. Scott, D. G. Phillips, and S. E. Antonarakis. 1988. Haemophilia A resulting from a de novo insertion of L1 sequences represents a novel mechanism for mutation in man. Nature. 332:164-166.
Keshet, E., R. Schiff, and A. Itin. 1991. Mouse retrotransposons: A cellular reservoir of long terminal repeat (LTR) elements with diverse transcriptional specificities. Adv. Cancer Res. 56:215-251.
Koeffler, H. P. and D. W. Golde. 1980. Human myeloid leukemia cell lines: A review. Blood. 56:344-350.
Kongsuwan, K., J. Allen, and J. M. Adams. 1989. Expression of Hox 2.4 homeobox gene directed by proviral insertion in a myeloid leukemia. Nucl. Acids Res. 17:1881-1892.
Kozak, M. 1987. An analysis of 5' noncoding sequences from 699 vertebrate messenger RNAs. Nucl. Acids Res. 15:8125-8132.
Kramer, R. M., C. Hession, B. Johansen, G. Haynes, P. McGray, E. P. Chow, R. Tizard, and R. B. Pepinsky. 1989. Structure and properties of a human non pancreatic phospholipase A2. J. Biol. Chem. 264:5768-5775.
Kuff, E. L. and K. K. Lueders. 1988. The Intracisternal A-particle gene family: Structural and functional aspects. Adv. Cancer Res.
51:183-276.
Kusunoki, C., S. Satoh, M. Kobayashi, and M. Niwa. 1990. Structure of genomic DNA for rat platelet phospholipase A2. Biochim. Biophys. Acta. 1087:95-97.
Larsson, E., N. Kato, and M. Cohen. 1989. Human endogenous proviruses. Curr. Topics Microbiol. Immunol. 148:115-132.
MacLeod, D., R. Lovell-Badge, S. Jones, and I. Jackson. 1991. A promoter trap in embryonic stem (ES) cells selects for integration of DNA into CpG islands. Nucl. Acids Res. 19:17-23.
Mager, D. L. 1989. Polyadenylation function and sequence variability of the long terminal repeats of the human endogenous retrovirus-like family RTVL-H. Virology. 173:591-599.
Mager, D. L. and J. D. Freeman. 1987. Human endogenous retrovirus-like genome with type C pol sequences and gag sequences related to human T cell lymphotropic viruses. J. Virol. 61:4060-4066.
Mager, D. L., and P. S. Henthorn. 1984. Identification of a retrovirus-like repetitive element in human DNA. Proc. Natl. Acad. Sci. (USA). 81:7510-7514.
Min Man, Y., Delius, H., and Leader, D. 1987. Molecular analysis of elements inserted into mouse γ actin processed pseudogenes. Nucl. Acids Res. 15:3921-3304.
Morse, B., P. G. Rothberg, V. J. South, J. M. Spandorfer, and S. M. Astrin. 1988. Insertional mutagenesis of the myc locus by a LINE-1 sequence in a human breast carcinoma. Nature. 333:87-90.
Neel, B. G., and Hayward, W. S. 1981. Avian leukosis virus-induced tumors have common proviral integration sites and synthesize discrete new RNAs: Oncogenesis by promoter insertion. Cell. 23:323-334.
Nilsen, T. W., P. A. Maroney, R. G. Goodwin, F. M. Rottman, L. B. Crittenden, M. A. Raines, and H. Kung. 1985. c-erbB activation in ALV induced erythroblastosis: Novel mRNA processing and promoter insertion result in expression of an amino truncated EGF receptor. Cell. 41:719-726.
Payne, G. S., S. A. Courtneidge, L. B. Crittenden, A. M. Fadly, J. M. Bishop, and H. E. Varmus. 1981.
Analysis of avian leukosis virus DNA and RNA in bursal tumors: Viral gene expression is not required for maintenance of the tumor state. Cell. 23:311-322.
Paulson, K. E., A. G. Matera, N. Deka, and C. W. Schmid. 1987. Transcription of a human transposon-like sequence is usually directed by other promoters. Nucl. Acids Res. 15:5199-5215.
Peters, G. 1989. "Oncogenes at viral integration sites", in Oncogenes. Ed. D. M. Glover & B. D. Hames. IRL Press.
Renetseder, R., S. Brunie, B. W. Dijkstra, J. Drenth, and P. B. Sigler. 1985. A comparison of the crystal structures of phospholipase A2 from bovine pancreas and Crotalus atrox venom. J. Biol. Chem. 260:11627-11634.
Rohdewohld, H., H. Weiher, W. Reik, R. Jaenisch, and M. Breindl. 1987. Retrovirus integration and chromatin structure: Moloney murine leukemia proviral integration sites map near DNAse I hypersensitive sites. J. Virol. 61:336-343.
Schell, B. P. Szurek, and W. Dunnick. 1987. Interruption of two immunoglobulin heavy-chain switch regions in murine plasmacytoma P3.26Bu4 by insertion of retrovirus like element ETn. Mol. Cell. Biol. 7:1364-1370.
Scherdin, U., K. Rhodes, and M. Breindl. 1990. Transcriptionally active genome regions are preferred targets for retrovirus integration. J. Virol. 64:907-912.
Schmid, C. W., and W. R. Jelinek. 1982. The Alu family of dispersed repetitive sequences. Science. 216:1065-1070.
Schulz, M., U. Freisen-Rabien, R. Jessberger, and W. Doerfler. 1987. Transcriptional activities of mammalian genomes at sites of recombination with foreign DNA. J. Virol. 61:344-353.
Seilhamer, J. J., W. Pruzanski, P. Vadas, S. Plant, J. Miller, J. Kloss, and L. K. Johnson. 1989. Cloning and recombinant expression of phospholipase A2 present in rheumatoid arthritic synovial fluid. J. Biol. Chem. 264:5335-5338.
Seilhamer, J., T. Randall, M. Yamanaka, and L. Johnson. 1986. Pancreatic phospholipase A2: Isolation of the human gene and cDNAs from porcine pancreas and human lung. DNA. 5:519-527.
Skowronski, J., T. G. Fanning, and M.
Singer. 1988. Unit-length LINE-1 transcripts in human teratocarcinoma cells. Mol. Cell. Biol. 8:1385-1397.
Shen-Ong, G. L., H. C. Morse III, M. Potter, and J. F. Mushinski. 1986. Two modes of c-myb activation in virus-induced mouse myeloid tumors. Mol. Cell. Biol. 6:380-392.
Stavenhagen, J. B., and D. M. Robins. 1988. An ancient provirus has imposed androgen regulation on the adjacent mouse sex-limited protein gene. Cell. 55:247-254.
Temin, H. M. 1982. Function of the retrovirus long terminal repeat. Cell. 28:3-5.
Varmus, H. and P. Brown. 1989. Retroviruses. In: Mobile DNA, Berg D. E. and M. M. Howe (eds). Am. Soc. Microbiol., Wash. D. C. pp. 53-108.
Vijaya, S., D. L. Steffen, and H. L. Robinson. 1986. Acceptor sites for retroviral integrations map near DNAse I hypersensitive sites in chromatin. J. Virol. 60:683-692.
Wery, J. P., H. Schevitz, D. Clawson, J. Bobbitt, E. Dow, G. Gamboa, T. Goodson Jr., R. Herman, H. Kramer, D. McClure, E. Mihelich, J. Putnam, J. Sharp, D. Stark, C. Teater, M. Warrick, and N. Jones. 1991. Structure of recombinant human rheumatoid arthritic synovial fluid phospholipase A2 at 2.2 Å resolution. Nature. 352:79-82.
Wilkinson, D. A., J. D. Freeman, N. L. Goodchild, C. A. Kelleher, and D. L. Mager. 1990. Autonomous expression of RTVL-H endogenous retrovirus-like elements in human cells. J. Virol. 64:2157-2167.
Ymer, S., Tucker, W. Q. J., Sanderson, C. J., Hapel, A. J., Campbell, and Young, L G. 1985. Constitutive synthesis of interleukin-3 by leukemia cell line WEHI-3B is due to retroviral insertion near the gene. Nature. 317:255-258.
Yochem, J. and B. Byers. 1987. Structural comparison of the yeast cell division cycle gene CDC4 and a related pseudogene. J. Mol. Biol. 195:233-245.

CHAPTER VI

Summary and Conclusions

In recent years, several families of sequences related to integrated retroviruses (proviruses) have been discovered in the human genome. The largest of these families, RTVL-H, has close to 1000 members in addition to several hundred solitary LTRs.
The striking similarity of these LTRs in structure and organization to the LTRs of proviruses and the observation of autonomous RTVL-H expression in several cell lines and in normal placenta suggested that they may act as transcriptional regulators of gene expression. Consequently, the primary objectives of this study were to assess individual RTVL-H LTRs for promoter and enhancer activity and to investigate the possibility that these elements may be involved in the regulation of cellular genes either in normal cells or as the result of rearrangements.

Analyses of RTVL-H Promoter/Enhancer Activity

To determine whether various RTVL-H LTRs can act as transcriptional regulators of gene expression in vitro, I made use of a vector system in which the bacterial gene coding for CAT can be placed under the transcriptional control of heterologous promoter sequences. The RTVL-H LTRs used in this study were isolated from both cDNA (peripheral blood, Hep2, and NTera2D1) and genomic libraries. Cell lines used were human embryonal carcinoma (NTera2D1), human embryonal kidney (293), HeLa, 3T3, and COS-1 cells. The results of these studies showed that all five LTRs tested were capable of promoting transcription in at least one of the above cell types. However, a great deal of heterogeneity was observed in both the level of promoter activity of a particular LTR and the cell types in which it functioned. The most striking differences observed were between the H6 LTR (derived from a Hep2 cDNA library), which was capable of promoting transcription to a high level in all cell types, and the PB-3 LTR (from the peripheral blood cDNA library), which was functional only in COS-1 cells. In addition to its promoter capabilities, the H6 LTR was shown to contain a transcriptional enhancer in transient assays.
The observation that the majority of the LTRs tested displayed a weaker, more limited range of promoter activity was not unexpected, since the presence of a large number of mobile sequences with strong promoter activity may prove to be detrimental to the genome and thus be selected against. Indeed, the human genome contains a relatively low number (50-100) of LTRs that are similar in sequence to H6 (termed type Ia), and studies in our laboratory have shown that the type Ia LTRs are a relatively recent addition to the pool of RTVL-H elements, since they are present in the genomes of the great apes (orangutan, gibbon, gorilla, and human) but not in the old world monkeys, which do contain type I and II LTRs (D. Mager and N. Goodchild, unpublished observations). Interestingly, sequence comparisons suggest that type Ia LTRs formed via a recombination event between a type I and a type II LTR, and Northern analysis using probes specific for the different LTR subclasses indicates that type Ia LTRs are expressed in the widest range of cell types (D. Wilkinson, unpublished data). These findings, together with the promoter studies presented in this thesis, indicate that the older type I and II LTRs are more tightly regulated than the younger population of type Ia LTRs. This raises the possibility that the species may require time, in this case millions of years, to develop mechanisms to control the expression of newly generated promoters of this type.

These studies suggest several future lines of investigation. Currently, deletion analyses and mobility shift assays are being used to determine important regulatory regions within the different LTR subtypes.
Strains of transgenic mice have also recently been produced containing the β-galactosidase gene under the transcriptional control of the H6 LTR. These mice will be used to better determine the tissue and developmental regulation of this interesting class of RTVL-H LTRs.

Transactivation of RTVL-H LTRs by SV40 Large T Antigen

Over the course of my analysis of LTR promoter activity, I have found that the LTRs tested varied significantly in their cell specificity. In particular, I noticed that one LTR, PB-3, promoted gene expression in COS-1 cells (SV40 transformed monkey kidney cells) but not in its untransformed parent line, CV-1. This observation suggested that SV40 encoded proteins might be activating the LTR. Competition experiments in which an excess of plasmid containing only the PB-3 LTR was cotransfected with a PB-3CAT reporter gene construct confirmed that this effect was specific for the PB-3 LTR and suggested that a transacting factor was being sequestered by the excess PB-3. Cotransfection experiments with vectors expressing the SV40 large T antigen (T) have confirmed that T is involved in this transactivation and that the effect of T on different LTRs differs. These findings suggest that a large number of RTVL-H LTRs may be affected by this and other oncogenes and open many lines of future investigations. Presently, members of our laboratory are interested in identifying other potential transactivators of RTVL-H LTRs. The combination of deletion analysis and mobility shift assays will identify important control regions within different LTRs. Once these relevant regions are identified, footprinting analysis will be performed to better delineate important transcription factor binding sites.
Such studies, comparing COS-1 and CV-1 nuclear extracts with purified T protein, will not only identify potential binding sites but will also indicate whether T binds directly or if other cellular proteins are involved.

CELLULAR TRANSCRIPTS REGULATED BY RTVL-H LTRs

In rodents, there are several cases in which LTR-like sequences have altered the expression of cellular genes (reviewed in Chapter 1), but no similar events had been reported in humans at the initiation of the work reported in this thesis. Nonetheless, the abundance and dispersed nature of RTVL-H elements suggested that some may be involved in regulating the expression of unrelated cellular genes, either because they have evolved to play this role or because of recent genomic rearrangements. Thus, to identify cellular genes which have been promoted by an RTVL-H LTR, I have used a differential screening strategy to identify cDNA clones that would hybridize only with RTVL-H U5 sequences. This strategy identified five NTera2D1 cDNA clones that appear to be derived from different RTVL-H LTR promoted transcripts. The identification of several different LTR promoted transcripts from one library screen may seem surprising in light of the fact that these types of events have not frequently been encountered in human cells. Many of the examples of insertional mutagenesis by rodent endogenous LTRs were discovered fortuitously, during the analysis of specific genes or mutations. Since the human genome has not been studied as extensively, there has been little opportunity to observe these events in human cells and I am unaware of any other systematic search for human LTR promoted genes.

The clones isolated here were representative of different mechanisms by which retroviral LTRs might be expected to exert their effects.
They included an element which is normally located upstream of a CpG island, and thus may be involved in the normal regulation of the associated gene, a probable recent transposition event in which an LTR may have imposed altered regulation on a novel CDC4-related gene, and one case in which an upstream RTVL-H element has spliced into a cellular exon. Furthermore, two apparent LTR promoted transcripts containing no open reading frames were also identified. It has been shown that some RTVL-H LTRs contain bidirectional promoters and that some can act to enhance heterologous promoters (Chapter 3; Feuchter and Mager, 1990). Thus, since the screening method used here would not normally detect transcripts enhanced by an LTR or promoted by an LTR in the “opposite” orientation, the examples reported here are most probably an underrepresentation of the types of LTR regulated cellular transcripts present in NTera2D1 cells. Nonetheless, this strategy will certainly be useful in identifying further examples of LTR mediated gene expression in different cells and possibly by different types of endogenous LTRs. Indeed, an analogous strategy has recently been employed to identify a gene promoted by a mouse IAP LTR and expressed in placenta (Chang-Yeh et al., 1991).

This study describes the initial identification and characterization of these clones and suggests many future experiments. For example, the AF-4 clone is derived from an LTR next to a CpG island. Although related transcripts were not observed, the possibility that transcriptional enhancers present within the LTR contribute to the expression of a gene linked to the CpG island may be further investigated. Firstly, genomic probes 5’ and 3’ of the LTR sequence can be isolated and used to screen Northern blots.
Transcripts under the control of an RTVL-H enhancer would be expected to be expressed in embryonal carcinoma cells.

This strategy also identified a clone, AF-3, that appeared to result from a new insertion of an RTVL-H element into a gene with homology to yeast CDC4. Future plans include the cloning of the genomic region corresponding to the proposed insertion site. In addition, cDNA clones encompassing approximately 6.5 kb of the 10 kb AF-3 related transcript have now been identified. The sequencing of these and additional clones spanning the entire 10 kb transcript should provide insight into the function of this gene which appears to be constitutively expressed.

This study also identified a clone, AF-5, apparently derived from a splicing event in which an RTVL-H element has spliced into a downstream gene with homology to PLA2 proteins. Future studies will be of interest to address the origin of the different AF-5 related transcripts observed. Oligonucleotide probes specific for the RTVL-H/PLA2-like junction will identify which of the transcripts are derived from this splicing event, and the library will be rescreened with probes specific for the non-RTVL-H portion of the clone in an effort to isolate upstream exons. In addition, expression vectors will be used to determine whether the AF-5 encoded protein has PLA2-like enzyme activity or whether it is an inactive homolog.

Concluding Remarks

Retroviral LTRs are widely accepted as mutagenic agents. In addition to the many reports of oncogenic activation by exogenous retroviruses, and by non defective endogenous viruses, retrovirus-like elements or retrotransposons, particularly those of the IAP family in mice, have been implicated in both the inactivation and activation of cellular genes (Chapter 1; Kuff and Lueders, 1988). In contrast to their frequency in rodents, these types of mutations by human endogenous LTRs have not often been reported.
Undoubtedly, one reason for this difference is that humans, unlike rodents, are not known to contain fully functional endogenous retroviruses. In addition, members of the murine IAP family appear to be particularly competent to undergo transposition events, and have been implicated in a large proportion of the retroelement induced mutations observed in mice (Kuff and Lueders, 1988). Nonetheless, the high copy number of RTVL-H elements in the human genome suggested that this family may have the potential to induce such mutations. In fact, the only other description of LTR mediated transcription of a human cellular gene also involves an RTVL-H element (Liu and Abraham, 1991). In this case, a cDNA clone in which an RTVL-H element has spliced into the second exon of the human calbindin gene was identified in a cell line derived from a prostate metastasis. However, whether this event was caused by a recent RTVL-H insertion or is the consequence of a mutation involving an element normally located upstream of the calbindin gene has yet to be determined. The single copy human endogenous proviral element HERV-3 has been shown to splice into a Kruppel related gene located downstream of its genomic locus (Kato et al., 1987; Kato et al., 1990). This case differs from the RTVL-H examples in that the HERV-3 transcripts extend through the 3’ LTR and use a splice donor downstream of the provirus. Interestingly, an abundant RTVL-H related transcript has been observed in NTera2D1 cells, which does read through the 3’ LTR into flanking cellular sequences (Wilkinson et al., 1990). Thus, the many reports of LTR promoted genes in rodents and now in humans suggest a general evolutionary role for these sequences in the regulation of cellular genes.

REFERENCES

Chang-Yeh, A., D. E. Mold, and R. C. C. Huang. 1991. Identification of a novel murine IAP promoted placenta-expressed gene. Nucl. Acids Res. 19:3667-3672.

Feuchter, A., and D. Mager. 1990. Functional heterogeneity of a large family of human LTR-like promoters and enhancers. Nucl. Acids Res. 18:1261-1270.

Kato, N., S. Pfeifer-Ohlsson, M. Kato, E. Larsson, J. Rydnert, R. Ohlsson, and M. Cohen. 1987. Tissue specific expression of human provirus ERV3 mRNA in human placenta: Two of the three ERV3 mRNAs contain human cellular sequences. J. Virol. 61:2182-2191.

Kato, N., K. Shimotohno, D. VanLeeuwen, and M. Cohen. 1990. Human proviral mRNAs down regulated in choriocarcinoma encode a zinc finger protein related to Kruppel. Mol. Cell. Biol. 10:4401-4405.

Kuff, E. L., and K. K. Lueders. 1988. The intracisternal A particle gene family: Structural and functional aspects. Adv. Cancer Res. 51:183-276.

Liu, A. Y., and B. A. Abraham. 1991. Expression of a hybrid human endogenous retrovirus and calbindin gene in a prostate cell line. Cancer Res. 51:4107-4110.

Wilkinson, D. A., J. D. Freeman, N. L. Goodchild, C. A. Kelleher, and D. L. Mager. 1990. Autonomous expression of RTVL-H endogenous retrovirus-like elements in human cells. J. Virol. 64:2157-2167.




