Open Collections

UBC Theses and Dissertations


Transcriptional regulatory elements in the long terminal repeats of the human endogenous retrovirus,… Nelson, David Troy 1997


Full Text

TRANSCRIPTIONAL REGULATORY ELEMENTS IN THE LONG TERMINAL REPEATS OF THE HUMAN ENDOGENOUS RETROVIRUS, HERV-H

by

DAVID TROY NELSON

B.Sc. (Microbiology), The University of Victoria, 1991

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES, MEDICAL GENETICS PROGRAM

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
August 1997
© David Troy Nelson, 1997

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

The University of British Columbia
Vancouver, Canada
DE-6 (2/88)

ABSTRACT

HERV-H sequences comprise a large family of human endogenous retrovirus-like elements. Previous DNA sequence comparisons of HERV-H long terminal repeats (LTRs) have led to their classification into three subtypes, Type I, Ia and II. Type Ia appears to have been generated by recombination between Type I and II LTRs. These subtypes differ in evolutionary age and transcriptional activity, with Type Ia LTRs being younger in evolutionary terms and possessing stronger promoter function than the other two subtypes. In this study, possible mechanisms responsible for the functional difference between LTRs have been explored. Type I and II LTRs each contain different sets of repeated segments in their U3 regions which are disrupted in Type Ia LTRs. Using reporter gene assays, both types of repeated segments were shown to suppress activity of the human β-globin gene promoter when cloned at a distant site.
Both sets of repeats also repress promoter activity of a Type Ia LTR when directly inserted within its U3 region. In further support of these findings, removal of the strongly suppressing Type II repeat set from a Type II LTR increased promoter activity in one test cell line. However, this result was not observed in all cell lines or with both LTR types, emphasizing the complexity and cell-type dependence of HERV-H promoter regulation. In addition, using deletion constructs, two positive regulatory segments have been localized within the Type Ia LTR, both of which contain a potential binding site for the transcription factor Sp1. Gel mobility shift assays demonstrated that fragments containing these sites do bind Sp1. Although Type I LTRs are generally similar to Type Ia LTRs in the regions surrounding the Sp1 sites, there are sequence differences within the sites. Gel shift analysis revealed no, or much reduced, Sp1 binding of Type I LTR fragments containing these sites. Thus, it appears that the loss of repeated suppressor elements and the acquisition of Sp1 binding sites have both contributed to the relatively strong transcriptional activity of the Type Ia LTRs.

TABLE OF CONTENTS

ABSTRACT
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
ACKNOWLEDGMENTS
INTRODUCTION
1. NEGATIVE REGULATION OF TRANSCRIPTION
    Binding Site Competition
    Quenching Transcription Activators
    The Utility of Short Range Repression
    Repressors of the Silencer Type
2. RETROELEMENTS OF THE HUMAN GENOME
    A. Long Terminal Repeats
    B. LTR containing Retroelements in Humans
        THE-1
        Human Endogenous Retroviruses
    C. Class I HERVs
        HERV-ERI Superfamily
        Class I HERV continued
        HERV-I (RTVL-I)
        HERV-P Families
        ERV-9
        Other Single Class I Members
        HRES-I
    D. Class II HERVs
        HERV-K
    E. HERV-H
3. IMPACT OF RETROELEMENTS ON THE HUMAN GENOME
    Rearrangements
    Insertion Events
    Effects on Adjacent Cellular Genes
    Donation of Promoter Function
    Polyadenylation
    HERV Encoded Proteins
    Summary
4. THESIS OBJECTIVES
MATERIALS AND METHODS
    LTR Descriptions and applied PCR methods
    Transfections and CAT assays
    Plasmid Constructs
    DNA Sequencing
    Gel mobility shift assays
RESULTS
    Features of different HERV-H LTR types
    Type I and II repeat regions can repress activity of a heterologous promoter
    Presence of Type I and II repeat regions decreases activity of a Type Ia LTR
    Effects of removing the repeat regions from Type I and II LTRs depend on test cell line
    Deletion analysis of the H6 Type Ia LTR identifies positive regulatory regions
    Comparison of Sp1 binding to the Type Ia and Type I LTRs
DISCUSSION
REFERENCES CITED

LIST OF TABLES

Table 1: Distribution and functional characteristics of different HERV-H LTR types
Table 2: Comparison of Type I and Ia LTRs in regions of the Sp1 sites in the H6 LTR

LIST OF FIGURES

Figure 1. General structural features of Long Terminal Repeats (LTRs)
Figure 2. Example of an autoradiograph of a generic CAT assay TLC plate exposed for 24 hrs
Figure 3. Graphic representation of the strategy employed in removing the repeat sets from the HERV-H, N10-14 Type I and PB-3 Type II LTRs
Figure 4. Representation of the features of the three HERV-H LTR subtypes
Figure 5. CAT assay vector testing the effects of the repeats on the activity of a β-globin promoter in 293 cells
Figure 6. CAT assay vector testing the effects of the repeats on the activity of a Type Ia HERV-H LTR
Figure 7. Summary of CAT assays on lysates from transient transfections of Ntera 2D1 cells
Figure 8. Autoradiograph of a sample CAT assay on lysates from transient transfections of 293 cells
Figure 9. Representations of the "repeat free" HERV-H Type I and II LTRs
Figure 10. Summary of CAT assays on lysates from transient transfections of the 293 cell line
Figure 11. Summary of CAT assays on lysates from transient transfections of Ntera 2D1 cells
Figure 12. Representation and sequence of the H6 Type Ia LTR
Figure 13. Representation of serial deletion constructs of the H6 LTR paired with a summary of their relative promoter activities in Ntera 2D1 cells
Figure 14. Representations of the H6 Type Ia and N10-14 Type I LTRs and the fragments derived from them for use in gel mobility shift assays
Figure 15. Autoradiographs of gel mobility shift assays

ACKNOWLEDGMENTS

I feel extremely fortunate to have worked with Dixie Mager during my time as a graduate student. She is a wonderful supervisor and terrific role-model. I most appreciate her patience (which I tested) and her persistence (which I also tested). I want to thank her for the friendly, productive way she approached our challenges and to say I admire her balanced outlook on work and life. The Mager Lab was a great place to be. I thank Jack and Nancy for showing me the ropes. I owe much to Doug Freeman. He added his own special charm to the lab and always had time to give me technical assistance. Paul is just plain super. I could not have asked for a better "partner in crime" both in the lab and "on the outside". I would like to thank my thesis committee members for doing their duty. In particular, I thank Keith Humphries for effort above and beyond the call of duty and for being a great neighbour. I also thank all of the other neighbours in the Terry Fox Lab community. It was a great group that provided a lot of support and assistance. The real bedrock of support for me has always been my parents and I thank them very much for all the opportunity they have given me. They are a big part of everything I have done and will ever do. I also thank my Aunt Linda and my Grandmother, who have always shown they care.
Finally, I thank Monica for her commitment to me and my goals. We make a great team and I am so grateful for her support. I look forward to many future opportunities to return it. To everyone mentioned here, and any I've missed out: It took a while but we did it!

Introduction

Transposable elements appear to be ubiquitous and play an important role in the evolution of host genomes (Berg and Howe, 1989). Their spread is controlled by factors contributed by the host and by the elements themselves and, because an uncontrolled rate of transposition would be deleterious to the host cell, this process must be tightly regulated. It is probable, therefore, that transposable elements and their hosts evolve together such that the host also affects the evolution of the mobile element. One mode of potential regulation is at the level of transcription. For example, the ability of retrotransposons to transpose is dependent on the amount of RNA available as a substrate for reverse transcription, and this level is controlled by both trans acting host factors and cis acting sequences within the element (Berg and Howe, 1989; Boeke and Corces, 1989). Characterisation of factors regulating transcription of a particular family of transposable elements may lend insight into its evolutionary history, its potential for future transpositions and its ability to affect expression of adjacent genes.

Endogenous retroviruses can be considered as retrotransposons bearing similarity to exogenous retroviral genomes. In humans, such elements, termed HERVs, are classified into several families ranging in copy number from one to 1000 members per haploid genome (for reviews see Larsson et al., 1989; Wilkinson et al., 1994). Most HERVs have been present in the primate germ line for at least 25 million years but it is unclear why some families have amplified to a greater extent than others during the course of evolution.
The HERV family with the greatest number of full elements, as opposed to solitary LTRs, is the HERV-H (or RTVL-H) family which contains approximately 1000 members and close to 1000 solitary LTRs found on all chromosomes (Mager and Henthorn, 1984; Fraser et al., 1988). Most HERV-H elements share several deletions in pol and env and thus are translationally defective (Mager and Freeman, 1987; Wilkinson et al., 1990). A minority (~5-10%) are undeleted and so could potentially encode protein (Hirose et al., 1993; Wilkinson et al., 1993), although an element with full open reading frames has not yet been identified. Retrotransposition of HERV-H has not been demonstrated experimentally, but structural similarities to retroviruses and the detection of spliced forms in the genome strongly suggest that they expanded primarily via a virus-like retrotransposition process (Goodchild et al., 1995).

An analysis of the transcriptional regulatory sequences within the LTRs of HERV-H elements is of interest for two primary reasons. First, their abundance in the genome suggests that they could alter the expression of nearby genes and, indeed, instances have been identified in which HERV-H LTRs or other human endogenous retroviruses act to regulate heterologous cellular genes (Ting et al., 1992; Kato et al., 1990; Feuchter-Murthy et al., 1993; Di Cristofano et al., 1995). Second, as stated above, elucidation of LTR functional capacities may help in the understanding of the evolution and present day activity of HERV-H and perhaps other HERV families.

HERV-H LTRs can be classified into three main subtypes, Types I, Ia and II, based on different repeated segments in their U3 regions (Mager, 1989; Goodchild et al., 1993). HERV-H elements with Type I or II LTRs are more numerous and their existence can be traced further back in time than elements with Type Ia LTRs (Goodchild et al., 1993), indicating that Type Ia elements represent a more recently expanded subpopulation.
Interestingly, Type Ia LTRs are expressed endogenously in the widest range of cell lines and are also the strongest promoters when tested individually in reporter gene assays (Feuchter and Mager, 1990; Goodchild et al., 1993). In this study, sequences that contribute to the promoter function of a Type Ia LTR have been characterised. The results presented in this thesis suggest that the strong activity of these LTRs, relative to Type I and II LTRs, is at least partly due to a lack of negative regulatory elements and to the presence of binding sites for the Sp1 transcription factor.

As an introduction to this material, the following includes a general review of the negative regulation of transcription and an overview of some of the retroelements found in the human genome. More detailed consideration is given to the HERV-H family of endogenous retroviruses. Motivation for studying human endogenous retroviruses is illustrated in a discussion of the impact of these and other retroelements on the human genome. The introduction then concludes with an outline of the objectives of this thesis.

1. Negative Regulation of Gene Transcription

Three major mechanisms have been described whereby negative regulators exert their effect on the expression of a given gene (reviewed in Renkawitz, 1990; Clark and Docherty, 1993; and Johnson, 1995). A negative regulatory protein can compete with an activator protein for a DNA binding site. Alternatively, the repressor can bind a sequence near the sequence bound by the activator and repress its function in a short range manner. This method of repression is termed quenching. These two repression mechanisms are termed passive as the repression is simply blocking the function of an activator. A third negative regulation mechanism is termed active repression. A repressor of this type functions as a "reverse enhancer" or silencer, binding a target sequence and acting directly on the transcription machinery to down regulate transcription.

Binding Site Competition: Blocking the binding of an activator to its target sequence is a relatively straightforward mechanism by which a protein can act as a negative regulator. This regulatory protein can, but need not, bind the identical site that is recognised by the activator. The negative regulator can bind an overlapping or an adjacent site as long as its location sterically hinders activator binding to its own target. This mechanism does not require the negative regulatory protein to possess any additional functional domains (reviewed in Clark and Docherty, 1993). An example can be found in the LTR of the human immunodeficiency virus type-1 (HIV). The HIV promoter contains a GC-Box which is the binding site recognised by the ubiquitous transcription activator Sp1 (Westin and Schaffner, 1988) and related family members, Sp3 and Sp4 (Kingsley and Winoto, 1992; Hagen et al., 1992). Binding of this GC-Box by the members Sp1 and Sp4 leads to activation of transcription. Binding of the sequence by Sp3 does not support transcription and has an inhibitory effect by competing for the binding site with Sp1 (Majello et al., 1994). Though widely utilised in prokaryotes, examples of this mechanism in eukaryotes remain quite rare. Johnson (1995) suggests this may be a result of the numerous activator binding sites present in the typical eukaryotic promoter. It would be impractical to have a repressor corresponding to each activator binding site. Some notable exceptions are discussed below in a section on the implications and utility of short range repression mechanisms.

Quenching of transcription activators: Quenching is an interesting and relatively common case of a transcription activator being neutralised by a transcription repressor. Both the activator and repressor are able to bind their respective DNA target sites but the repressor still impedes the function of the activator.
This effect is dependent on the repressor binding site being near the activator binding site and is alternatively referred to as short range repression (reviewed in Johnson, 1995; Renkawitz, 1990). In most cases the specific mechanism by which the repressor debilitates the activator is yet to be determined. Repressors that quench activators can do so nonstoichiometrically (Gray et al., 1994). Surprisingly, due to the proximity of the binding sites for the repressor and the activator, binding of one factor may facilitate binding of the other. Binding of unrelated DNA binding proteins to nucleosomal DNA has been shown to be inherently cooperative, with this cooperativity being independent of any other protein function (Adams and Workman, 1995). The proteins involved in quenching need not be dedicated repressors. This negative regulation is observed between a number of sets of proteins previously described to be activators of transcription. An example can be found in the insulin control element (ICE) of the insulin gene promoter. The ICE is bound by the activators E2A and c-Jun which are antagonistic in this context, each quenching the transcription activator function of the other (Robinson et al., 1995).

The utility of short range repression: A major implication of the limited range of quenching and binding site competition is the possibility of having independent sets of regulator regions, in one promoter, each with its own specific repressor. If the range of the repressor were longer, like those of the silencer type repressors discussed below, this level of complexity would not be possible. Striking examples of this can be found in the regulation of genes expressed in Drosophila development. The zinc finger repressor, snail, is active against a variety of activators but over a limited range. A separation of 50 bp between binding sites allows much stronger suppression than a separation of 120 bp (Gray et al., 1994).
The use of multiple short range repressors in promoter organisation is key to the complex expression pattern of the pair-rule gene even-skipped (eve). This gene is transcribed in seven clearly defined stripes along the Drosophila embryo (MacDonald et al., 1986). A model presented by Small et al. (1991) was derived from their own work and that of others. Complex autonomous regulatory elements control the expression of eve in each of the seven stripes. For example, the regulatory element responsible for eve expression in stripe 2 contains binding sites for the activators bicoid and hunchback as well as sites for the repressor Kruppel. In cells contained within stripe 2, bicoid and hunchback bind and activate eve expression. At the boundary of stripe 2, and in all cells posterior, Kruppel binds and deactivates the stripe 2 control element. In the cells of stripe 3, eve expression is activated by the stripe 3 control element. The stripe 2 element is still "off" but the repression of Kruppel is specific to this element and leaves the promoter sensitive to other signals in stripes further along the embryo. This system demonstrates both the power of negative regulatory elements in gene regulation and the specific utility of short range repression mechanisms.

Repressors of the "Silencer" type: While the above discussion of repressors has focused on the protein components involved, the discussion of silencer type repressors begins by referring specifically to the DNA elements involved. The first silencer element described was from yeast and is a sequence which represses transcription while behaving in a way analogous to an enhancer element. HMRE is a sequence from the promoter of the silent mating type gene HMRa which suppresses transcription from heterologous promoters in a position and orientation independent manner (Brand et al., 1985).
Since this original finding, negative regulatory elements (NREs), which act as silencer type repressors of transcription, have been widely described. The precise mechanism of action of silencer type repressors is not well characterised, as investigations to date have tended to focus on the behaviour of the NRE and not on the proteins that bind them. In many cases, however, proteins have been identified that bind the NRE, including examples that bind the NRE in the HIV-1 LTR (Calvert et al., 1991), the osteocalcin gene (Frenkel et al., 1994) and the original silencer example, HMRE (Shore and Nasmyth, 1987; Bell and Stillman, 1992). Three major possibilities have been suggested as mechanisms by which these types of repressor proteins could act on the general transcription machinery. Binding could simply sterically interfere with complex assembly or, alternatively, binding could lock the assembly in an inactive state by inhibiting phosphorylation or conformational changes. Essentially these interactions would disrupt the steps in the formation of the initiation complex or block steps required in the transition leading to clearance of the promoter. A final possibility is the physical tethering of the polymerase and associated proteins to the promoter which would also block the promoter clearance event (reviewed in Clark and Docherty, 1993; Johnson, 1995).

A distinct feature of this type of repression is that it can be employed despite numerous other transcription enhancing elements that may be active in a given promoter. In reality the control of a complex eukaryotic promoter is an interplay between all of these promoter elements. The human insulin promoter contains an NRE with multiple binding sites which normally represses transcription but under some conditions acts to increase reporter gene transcription (Clark et al., 1995).
Dissection of the NRE has identified the binding site responsible for the positive effect on transcription as that of Oct-1, a ubiquitous activator. Similarly, the osteocalcin gene promoter contains multiple negative acting elements mapping around a positive element. The two main negative acting sequences are located 42 bp apart but the loss of either deactivates the NRE. These two elements, tested for transcription regulatory activity, can be overcome by the adjacent positive acting sequence. Inclusion of an additional section of the promoter containing another negative element will again swing the balance of the region to suppress test promoter transcription (Frenkel et al., 1994). These results emphasise the way that silencers and enhancers in a promoter are balanced against each other to provide a means for complex gene regulation.

2. Retroelements of the Human Genome

A theme common to a wide range of eukaryotic transposable elements, including many human examples, is the use of an RNA intermediate in the transposition mechanism. Members of this large division of elements are termed retroelements to distinguish this group from the large group of transposons utilising a DNA intermediate. Retroelement fidelity requires that the RNA copy includes all of the information present in the original DNA element. This is not a characteristic of transcripts derived from genes using typical RNA polymerase II promoters. The retroelement RNA species also requires a means to copy itself back to a DNA form. This is believed to be accomplished via an RNA dependent DNA polymerase or reverse transcriptase. These two fundamental requirements are met by various members of the family in different ways, dependent on their individual structure and strategy.

A. Long Terminal Repeats

Both the Alu and L1 families of retroelements contain internal promoters with transcription start sites at the upstream limit of the elements.
As a result of this arrangement, RNA transcripts include the entire sequence of these elements (Ullu and Tschudi, 1984; Swergold, 1990; Minakami et al., 1992). The next level of complexity exhibited by retroelements is the use of the long terminal repeat or LTR. These direct repeats eliminate the need for an internal promoter in ensuring the complete sequence of the element is retained in the RNA intermediate. In both the true retroviruses and the endogenous retroelements, the 5' LTR serves as promoter for the element while the identical or nearly identical 3' LTR provides a polyadenylation signal. With this arrangement a transcript will not contain sequences from the 5' end of the 5' LTR or the 3' end of the 3' LTR, but due to their duplicate nature each partially transcribed LTR is able to complement the other in the reverse transcription process. In this way both LTRs are reconstructed yielding a complete element, retaining all functions of the original (reviewed in Varmus and Brown, 1989).

[Figure 1 schematic: a 5' LTR and a 3' LTR, each divided into U3, R and U5, flanking the internal sequences; labelled features are enhancers, repressors, the TATA box, the transcript start site, the polyadenylation signal, the polyadenylation site and the polyadenylated retroviral transcript.]

Figure 1. Major features of the Long Terminal Repeats (LTRs) present in retroviruses and endogenous retroelements. All features shown are usually present in both the 5' and 3' LTRs due to their identical or near identical sequences. For figure clarity, they are depicted here on separate LTRs, where they are normally active in retroviruses.

The endogenous element HERV-H, which is the focus of this thesis, is an example of an element bounded by LTRs. Sizes of HERV-H LTRs vary between 400 and 450 bp. HERV-H LTRs have many features in common with retroviral LTRs (see Figure 1) and as such are divided into the three regions U3, R, and U5. It is from the genomic transcript of the retrovirus that the regions of the LTR are named.
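
As a rough illustration of the complementation described above, the following Python sketch tracks only which LTR segments are present in the proviral DNA and in its transcript. It is a bookkeeping aid under simplifying assumptions, not a model of the reverse transcription mechanism: the names `Provirus`, `transcribe` and `rebuild_provirus` and the toy sequences are hypothetical, RNA is written with DNA letters, and the lengths of the three regions are taken as known.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provirus:
    """Toy integrated element: LTR + internal sequences + LTR."""
    u3: str        # present only at the 3' end of the transcript
    r: str         # repeated at both ends of the transcript
    u5: str        # present only at the 5' end of the transcript
    internal: str  # sequences between the two LTRs

    @property
    def ltr(self) -> str:
        # Each complete LTR is ordered U3-R-U5.
        return self.u3 + self.r + self.u5

    def dna(self) -> str:
        # Assumes identical 5' and 3' LTRs, as in an infectious retrovirus.
        return self.ltr + self.internal + self.ltr


def transcribe(p: Provirus) -> str:
    """Transcript from the U3-R boundary of the 5' LTR to the end of R in the 3' LTR.

    The 5' U3 and the 3' U5 are therefore absent from the transcript.
    """
    return p.r + p.u5 + p.internal + p.u3 + p.r


def rebuild_provirus(transcript: str, u3_len: int, r_len: int, u5_len: int) -> str:
    """Reassemble two full LTRs from the partial LTR copies at the transcript ends.

    Pure string bookkeeping (hypothetical helper); region lengths must be positive.
    """
    if min(u3_len, r_len, u5_len) <= 0:
        raise ValueError("region lengths must be positive")
    r = transcript[:r_len]
    u5 = transcript[r_len:r_len + u5_len]
    u3 = transcript[-(r_len + u3_len):-r_len]
    internal = transcript[r_len + u5_len:-(r_len + u3_len)]
    ltr = u3 + r + u5
    return ltr + internal + ltr


if __name__ == "__main__":
    toy = Provirus(u3="AAAA", r="CC", u5="GGG", internal="TTTTTT")
    rna = transcribe(toy)
    print(rna)
    print(rebuild_provirus(rna, u3_len=4, r_len=2, u5_len=3) == toy.dna())
```

The point of the sketch is only that the transcript, although missing one region at each end, still carries one copy of U3, one of U5 and two of R, which is enough information to restore both LTRs.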
The R region is repeated at both ends of the transcript, while the U5 region is unique to the 5' end of the transcript, and the U3 region is represented only at the 3' end. The HERV-H LTR U3 region contains direct repeats upstream of a TATA box with the U3 - R boundary defined by the transcription start site. Primer extension analysis with RNA from Ntera 2D1 cells indicated this site is 24 bp downstream of the TATA box's first T (Wilkinson et al., 1990). The R region includes the first bases of transcribed sequence, the polyadenylation signal and the polyadenylation site at a short set of CA repeats. Beyond the polyadenylation site is the U5 region extending approximately 45 bp to the end of the LTR. In the infectious retroviruses the 5' and 3' LTRs are identical, but in endogenous elements mutations begin to accumulate in the LTRs of integrated elements. Pairs of HERV-H LTRs from single elements have been reported to be 96% identical, on average (Mager, 1989).

Some similar sequences and motifs are commonly observed between otherwise unrelated LTRs. As promoters, LTRs require either TATA boxes or initiator (Inr) elements. A variety of transcription factor binding sites are also located in the LTR to support its function as a promoter. The human immunodeficiency virus (HIV) LTR has recognition sites for the common transcription factors AP1, NF-κB and Sp1 among others (Zeichner et al., 1991). These sites, as well as a large number of other transcription factor binding sites, can also be found in the LTRs of other retroviruses and endogenous retroviruses. A feature of many LTRs, not as well characterised as many transcription factor binding sites, is the presence of direct repeats. As described in later sections, both HERV-H and ERV-9 have unique direct repeat structures in their LTRs (Mager, 1989; Lania et al., 1992).
One well characterised example is the octamer repeats of the mouse mammary tumour virus LTR which are recognised by the Oct transcription factors and are required for LTR basal promoter activity (Bruggemeier et al., 1991; Buetti, 1994). The role of the HERV-H repeats in LTR function is investigated later in this thesis.

B. LTR Containing Retroelements in Humans

THE-1: Most elements using LTRs contain genes potentially involved in reverse transcription but this is not the case with the most abundant human LTR containing element, THE-1. The THE-1 LTRs are approximately 350 bp, flanking an internal region of 1.6 kb with no known homology (Paulson et al., 1985). The LTR flanked structure coupled with the 5 bp insertion site duplication implicates retrotransposition as the method of THE-1 amplification. Current numbers in the genome are high, around 10^.

Human Endogenous Retroviruses: The most complex LTR containing retroelements in the human genome share homology to the structure and sequence of the infectious retroviruses and as a result are referred to as endogenous retroviruses. Organisation is highly conserved between all retroviruses and endogenous retroviruses. In full length elements, the Long Terminal Repeats (LTRs), described previously, flank three major internal coding regions. Downstream of the 5' LTR is the gag open reading frame (ORF) which codes for the major viral core protein. The middle of the three ORFs is always the pol gene, responsible for the reverse transcriptase and integrase. The ORF nearest the 3' LTR is that of the env gene, coding for the viral envelope protein. The major difference between endogenous retroviruses and infectious or true retroviruses is the integrity of these ORFs. Endogenous retroviruses can retrotranspose with gag and pol genes provided in trans and have no requirement for the env gene product in this process (Tchenio and Heidmann, 1991).
As a result, all endogenous retroviral elements yet characterised (see below) have some disruption to these ORFs which would be intolerable in an infectious retrovirus. Human endogenous retroviral (HERV) elements are sub-divided into two groups depending on pol domain homologies. The endogenous groups are named Class I and Class II and are related to two of the four infectious retroviral pol gene homology groups (reviewed in Wilkinson et al., 1994). An additional tool in the naming of HERV elements involves the identification of the 18 bp HERV primer binding site (PBS). This key structure is located near the 5' LTR and is complementary to a section of one of the cellular encoded tRNAs which binds the site and serves to prime reverse transcriptase. The single letter code for the amino acid represented by this complementary tRNA is included in the name of the element. For example, HERV-H is so named because it is a human endogenous retroviral element with a PBS most closely matching the tRNA for the amino acid histidine. Limitations of this nomenclature include unrelated elements sharing the same PBS and deleted elements lacking it.

C. Class I HERVs

A diverse group of endogenous retroviral elements make up the Class I HERVs. They show a wide range in copy number from single elements to numbers estimated near one thousand. As a rule they are defective in some aspect of retroviral coding or expression. Recently, the organisation within the class has been clarified by the suggestion of a large family encompassing many of the low copy number HERVs. This group is the HERV-ERI super family (Wilkinson et al., 1994).

The HERV-ERI Super Family: The most abundant member of the HERV-ERI super family is HERV-E, which occurs primarily in two forms, each with copy numbers in the 35 - 50 range
The first HERV-E elements were isolated based on cross hybridisation with a portion of an endogenous element found in African green monkey genomic DNA (Martin et al., 1981). That element had in turn been identified by hybridisation with a murine leukemia virus derived probe. This method of isolating endogenous elements, based on homology to existing endogenous retroviral sequences or infectious retroviruses, reoccurs frequently in studies of HERVs. HERV-E is one of a growing number of HERVs with a subset of elements being full length. In this case, half of the elements are found in the 8.8 kb form with approximately 490 bp LTR's flanking sequences with homology to gag pol and env retroviral sequences (Steele et al., 1984; Repaske et al., 1985). The remainder of the HERV-E elements are of a deleted type, approximately 6 kb in length, lacking env sequences. While env deletions are very common among endogenous retroviruses, the truncated HERV-E form shows an additional alteration unique to this subgroup. The LTR's of the full length element are not present. The internal sequences are instead bounded by unique repeat sets made up of 8 - 13 tandem copies of a degenerate 72 - 76 bp sequence (Steele et al., 1984). In addition to these two long forms, solitary HERV-E LTR's have also been identified in the human genome (Steele etal.,1984). The distribution of HERV-E elements throughout the human genome shows evidence for both a random and non-random dispersal. Southern blots, using as probes both 5' and 3' flanking DNA, detected other HERV-E elements (Steele et al., 1984, 1986 ; Repaske et al., 1983) indicating an association with repetitive DNA regions. This implicates chromosomal duplication mechanisms in the amplification of HERV-E. Conserved Southern blot bands could not, however, be associated with a specific chromosome indicating a random component to the interchromosomal arrangement. 
These findings are consolidated in the results of in situ hybridisation, which showed HERV-E elements in clusters associated with both telomeric and centromeric chromosomal regions (Taruscio & Manuelidis, 1991). The same study also provided evidence for elements integrated in isolation.

An example of a full length HERV-E element has been completely sequenced, and stop codons were found in each of the gag, pol and env genes. However, a subregion of the env gene representing the surface glycoprotein domain is almost entirely included in a large 1.3 kb open reading frame (Repaske et al., 1985). This finding is most significant in light of the expression patterns observed for HERV-E. In true retroviral expression, a major splice event produces an env specific transcript by using a splice donor proximal to the 5' LTR and a splice acceptor just upstream of the env coding region. Major HERV-E transcripts of 3.0 and 1.7 kb observed in placenta, colon mucosa, spleen and liver (Rabson et al., 1983; Gattoni-Celli et al., 1985) are a result of the analogous splicing event. The shorter 1.7 kb transcript is a result of expression from elements containing two deletions in the env region, but otherwise representing the same splicing event (Rabson et al., 1985). Full length transcripts of HERV-E are expressed and have been detected in the human brain (results cited in Taruscio & Manuelidis, 1991; Yeh et al., 1991).

HERV-E is a very stable constituent of the human genome, with a number of attempts failing to detect polymorphism associated with HERV-E integrations (Steele et al., 1984; Taruscio & Manuelidis, 1991; Nakamura et al., 1991; Sugino et al., 1992). The element is also present in the genomes of Old World monkeys and apes (Repaske et al., 1985; Shih et al., 1991), indicating the original HERV-E integration was an ancient event involving a common ancestor at least 25 million years ago.
Several other groups of sequences related to HERV-E have been identified, including: HERV-R, originally known as ERV-3 (O'Connell et al., 1984); RRHERV-I (Kannan et al., 1991); ERV-1 (Bonner et al., 1982); Hs5 (Levy et al., 1990) and NP-2 (Silver et al., 1987). Several of these have major deletions, including deletions involving the 5' LTR.

Class I HERV continued:

A number of other Class I elements and families have been identified, including several with copy numbers over 20. These elements retain the internal homology required to be Class I HERVs but do not show the elevated levels of homology exhibited between the members of the HERV-ERI super family (Wilkinson et al., 1994).

HERV-I (RTVL-I):

This 9 kb element was originally discovered when a segment of sequence from the haptoglobin-related locus showed homology to retroviral sequence and structure. The genome is estimated to contain 25 of these HERV-I elements, consisting of approximately 500 bp LTRs flanking gag, pol and env homology regions (Maeda, 1985; Maeda and Kim, 1990). Sequencing of three HERV-I members, two from humans and the analogous element from chimpanzees, indicated that no functional protein product is possible due to the frequent mutations disrupting the ORFs (Maeda & Kim, 1990). Both of the sequenced human HERV-I members also include insertions of Alu elements into internal sequences. These are two examples of a somewhat common phenomenon in which retroelements appear to insert into sequences of elements previously integrated at a given site.

HERV-I may be most noteworthy for its association with the haptoglobin gene cluster. Three separate HERV-I insertion events have occurred in the haptoglobin gene cluster during primate evolution, and this region is prone to homologous sequence mediated unequal recombination.
This is hypothesised to be the mechanism behind both the amplification of the locus in primates and the subsequent deletion of one of the three copies in the human lineage (Maeda et al., 1986; Maeda & Kim, 1990). This locus remains unstable, with copy number polymorphisms present in the current human population. It is not known if HERV-I plays a role in this instability but it remains a possibility.

HERV-P Families:

The HERV-P family, like many other endogenous retroviral elements, was identified based on homology with retroviral sequences. In this case, however, the homology was to the short (approximately 18 bp) primer binding site complementary to the tRNA for proline. Three otherwise unrelated HERVs were isolated in this way and were named HuERS-P1, P2 and P3 (Harada et al., 1987; Kroger & Horak, 1987). All three members are present in the human genome at around 20 copies. However, this diverse group remains largely uncharacterised, with very little internal sequence known and no expression data yet collected.

ERV-9:

ERV-9 is an endogenous element with non-typical 1.8 kb LTRs flanking a standard ~6 kb internal region with gag, pol and env homologies. As with the majority of other HERV examples, none of the internal sections of sequenced examples contain open reading frames (La Mantia et al., 1991). Transcripts include a unit length 8 kb transcript and two spliced sub-genomic RNAs of 2.0 and 1.5 kb (La Mantia et al., 1991; Lania et al., 1992). Southern blot analysis detects about 40 copies of the full length element in the human genome and an additional 3000-4000 solitary LTRs (Zucchi and Schlessinger, 1992).

The unique ERV-9 LTRs contain two repeat sets, both of which occur in variable numbers depending on the LTR sequenced (Lania et al., 1992). The 41 bp repeat is typically present in 3 copies, several hundred base pairs upstream of the transcription initiation site.
The larger 72 bp repeat, also typically with 3 copies, is located an equivalent distance downstream. In reporter gene assays the ERV-9 LTR is able to drive transcription of a heterologous gene (La Mantia et al., 1991). Analysis indicates that a binding site for the transcription factor Sp1 and the transcript initiation control element Inr are both required for promoter function (La Mantia et al., 1992). The Inr element, which is located from -7 to +6 (relative to transcript initiation), and a TATA box located at -25 have recently been demonstrated to cooperate in achieving optimum transcription efficiency from the ERV-9 promoter. As well, the binding of transcription factors upstream has been shown to act synergistically in elevating expression (Strazzullo et al., 1994).

The behaviour of the ERV-9 LTR as a promoter is of interest primarily due to the large number of solitary LTRs in the genome and the potential for them to promote cellular sequences. One such LTR has been found to drive expression of the ZNF80 zinc-finger gene (Di Cristofano et al., 1995).

Other Single Class I Members:

S71 is a single copy element identified by its hybridisation under low stringency with the genome of simian sarcoma-associated virus (Leib-Mosch et al., 1986). More recently, elements related to the original S71 element have been cloned and shown to possess a full length retroviral reverse transcriptase, unlike the original (Haltmeier et al., 1995). Mutations disrupting this pol region are conserved between family members, indicating these elements were amplified in this defective state and probably relied on another source for the required functional proteins (Haltmeier et al., 1995). RNA expression analysis has been complicated by the late recognition of a HERV-K LTR inserted in the original S71 clone. Detection of expression from other family members has not been reported.
HRES-I:

HRES-I is an ancient single copy endogenous retroviral element which has been detected in both Old and New World monkeys, dating its insertion in the primate lineage to at least 45 million years ago. Despite its age, the element's 684 bp LTRs appear able to drive production of a 6 kb transcript containing two large open reading frames with regions of homology to the gag gene of HTLV-I, HTLV-II, HIV-2 and feline sarcoma virus (Perl et al., 1989). A 28 kDa protein product has been attributed to the larger HRES-I ORF via antibody studies of H9 cells, a human T-cell line (Banki et al., 1992). Recent investigations of HRES-I suggest a role for the 28 kDa protein in autoimmune disorders. Antibodies to this protein are cross reactive with the HTLV-I p19 gag protein (Banki et al., 1992) and with the retroviral gag related portion of the 70 kDa protein component of the U1 small nuclear ribonucleoprotein (Perl et al., 1995).

E. Class II HERVs

Compared to the diversity observed between HERVs in Class I, Class II HERVs appear very homogeneous. They are grouped based on homology to the pol region of mouse mammary tumour virus (MMTV) and share a single primer binding site. This site matches the tRNA for lysine (K) and thus, by convention, the family is named HERV-K (Callahan et al., 1985; Ono et al., 1986).

HERV-K:

Of all endogenous retroviral elements currently identified in the human genome, HERV-K is the closest to a fully functional retrovirus. Originally identified using a probe from the Class II retrovirus MMTV (Callahan et al., 1982, 1985), full length HERV-K members were cloned based on hybridisation to a pol region probe from a Syrian hamster intracisternal A particle (Ono, 1986). HERV-K has a total proviral length of 9.1 or 9.4 kb with LTRs of approximately 970 bp and a primer binding site corresponding to the lysine (K) tRNA (CUU anticodon) (Ono, 1986). The only HERV-K element completely sequenced and published to date is the isolate HERV-K10.
The 970 bp LTRs were found to be nearly identical, indicating this element has inserted relatively recently. A unique feature of HERV-K10 is its extensive homology to infectious retroviruses, with the identification of gag, prt, pol and env genes complete with specific sub domains. Aside from one 290 bp deletion at the pol/env junction, only single point mutations or frame shifts disrupt most coding regions, and the protease ORF remains intact. Approximately 50 copies of the HERV-K element can be detected in the human genome (Ono, 1986), but the number of LTRs is estimated at 25,000 (Leib-Mosch et al., 1993). HERV-K internal and LTR-like sequences have been detected in Old World monkeys, indicating an original integration between 30 and 40 million years ago (Leib-Mosch, 1993; Steinhuber et al., 1995).

Transcripts bearing HERV-K LTR sequences can be widely detected in human tissues and cell lines. This expression, however, is thought to be primarily due to the activity of other promoters including HERV-K sequences in unrelated transcripts (Leib-Mosch, 1993). LTR driven HERV-K expression can be detected in the form of an 8.8 kb full length mRNA in a number of cell lines, human placenta, normal leukocytes and peripheral blood mononuclear cells (Ono et al., 1993; Franklin et al., 1988; Medstrand et al., 1992; Brodsky et al., 1993).

Results from expression studies in teratocarcinoma cell lines point to the high level of similarity between HERV-K and infectious retroviruses. A subset of HERV-K elements is expressed in teratocarcinoma cells, with full length and spliced sub-genomic transcripts observed (Li et al., 1995; Lower et al., 1993). One of the small spliced RNAs (1.8 kb) is doubly spliced to generate an ORF that could code for a 12 kDa protein. A portion of this ORF bears resemblance to the RNA binding domain of the rev, rex, and tat genes of the infectious retroviruses HTLV-1 and HIV-1 (Lower et al., 1993).
As mentioned above, other HERV-K ORFs are uninterrupted for considerable length. To verify that they possess the potential to be translated, an intact gag ORF from a transcribed HERV-K sequence was expressed in E. coli. This ORF contains a protease homology region and the resulting protein showed autoproteolytic processing (Mueller-Lantzsch et al., 1993). Open reading frames have been isolated from expressed HERV-K sequences for each of the retroviral gene homologs (Lower et al., 1993). It may be expected that each of these can produce a functional protein, as was demonstrated for the protease.

Very strong evidence exists to indicate that both gag and env proteins are expressed in cell lines. Antisera against HERV-K gag react specifically with the HTDV (human teratocarcinoma derived virus), an immature viral particle expressed in the GH teratocarcinoma cell line (Boller et al., 1993). The protein specifically detected is 30 kDa in size, which is the predicted size of the major core protein encoded by the HERV-K gag region (Mueller-Lantzsch et al., 1993). In the T47D human breast carcinoma cell line, monoclonal antibodies against recombinant HERV-K env protein immunoprecipitate a 67 kDa glycoprotein. The level of this protein is upregulated by female steroid hormones, and it has been shown previously that the expression of HERV-K elements also increases in the T47D cell line with the same treatment. Antibodies which recognise HERV-K env protein epitopes can be identified in human sera.

Recently, Steinhuber et al. (1995) reported detecting a polymorphism in a DNA sample from a Chinese individual that the authors attributed to a HERV-K amplification event. Further analysis will be required to verify whether this variation is a result of HERV-K retrotransposition, but given what is known about the high level of HERV-K functionality, it is a distinct possibility.

F.
HERV-H

Originally named RTVL-H (retroviral-like with histidine (H) tRNA primer binding site), HERV-H is a large family of defective Class I endogenous retroviral elements. The first HERV-H element was discovered by chance during investigation of the human β-globin gene cluster (Mager and Henthorn, 1984). Since that time, HERV-H has been determined to be present in the human genome in approximately 1000 copies, with an additional 1000 solitary LTRs. Two major populations exist: a deleted form of 5.8 kb present in 800-900 copies, and a relatively intact form of 8.7 kb present in approximately 100 copies (Mager & Freeman, 1987; Hirose et al., 1993; Wilkinson et al., 1993). The deleted form completely lacks an env sequence and, when compared to MLV, shows four major deletions in the pol region (Mager & Freeman, 1987; Wilkinson et al., 1993). No large ORFs were observed in the sequence of the original prototypic element, RTVL-H2 (Mager & Freeman, 1987). Conversely, members of the less numerous intact population do contain a region with homology to retroviral env genes. They also contain the pol regions deleted in the more abundant form, and some have large pol ORFs (Wilkinson et al., 1993; Hirose et al., 1993).

A number of lines of evidence suggest HERV-H amplified to its present numbers by retrotransposition. Inserted elements are flanked by 5 bp direct repeats, a hallmark of integrase activity (Mager & Freeman, 1987). The structural similarities between HERV-H elements and retrotransposons or retroviruses, as well as the distribution throughout the human genome, are all consistent with retrotransposition. This evidence does not, however, rule out other mechanisms. Gene conversion events have been demonstrated to be a secondary mechanism for Alu family evolution (Kass et al., 1995). Duplication events are in part responsible for the current numbers of HERV-E elements present in the genome (Steele et al., 1986).
Both of these mechanisms may have acted on the HERV-H population. However, further evidence for retrotransposition as the route employed by HERV-H comes from analysis of splicing within integrated elements. Two of the features HERV-H shares with true retroviruses are internal splice donor and acceptor sequences. In retroviruses, splicing serves to produce sub-genomic RNA species. Some integrated HERV-H elements bear the marks of analogous splicing events, which implies an RNA intermediate and hence a reverse transcription dependent retrotransposition mechanism (Goodchild et al., 1995).

Analysis of transcripts obtained as cDNA from Ntera 2D1 cells demonstrated LTR driven transcription from at least 13 separate HERV-H elements (Wilkinson et al., 1990). HERV-H transcripts are also effectively polyadenylated in the downstream LTR (Mager, 1989). Analysis of LTR promoter function in reporter gene assays indicated HERV-H LTRs can both promote heterologous gene expression and polyadenylate heterologous transcripts (Feuchter & Mager, 1990; Mager, 1989). Whole LTRs cloned at a distant site can enhance transcription from heterologous promoters (Feuchter & Mager, 1990). However, results involving LTR functions have been found to be very heterogeneous. This variation correlates with variation observed in the internal structure of the HERV-H LTRs isolated to date. Three subclasses of LTR have been described, varying in the structure of their internal direct repeats. As a major focus of this thesis, the differences in sequence and behaviour of these three LTR subclasses will be described later.

Of the cell lines tested for HERV-H expression, the teratocarcinoma lines Ntera 2D1 and Tera 1 show the highest levels. Northern blots also detect transcripts in the embryonal kidney cell line, 293, the bladder carcinoma cell line, 5637, and the cervical carcinoma cell line HeLa, among other lower expressing lines (Wilkinson et al., 1990; Goodchild et al., 1993).
Of the primary tissues tested, high expression was observed only in amnion and chorion of placenta. Expression was not observed in chorionic villi of placenta or in prostate tissue or peripheral blood lymphocytes (Wilkinson et al., 1990). This work predates the discovery of the full length env containing HERV-H elements, and the transcripts observed were primarily 5.6 kb in length, suggesting an unspliced transcript from deleted HERV-H elements. Shorter spliced forms were also observed and the specificity of the splice events was verified (Wilkinson et al., 1990). When specifically sought, expression of env containing transcripts was observed, including an 8.2 kb transcript which represents a good candidate for a unit length transcript. Shorter, presumably spliced, sub-genomic transcripts also contain env sequences. The northern blots performed with HERV-H env probes were exposed 30 times longer than those with other, more common HERV-H region probes, emphasising the relatively low expression level.

3. Impact of Retroelements on the Human Genome

By virtue of its behavioural, functional and distributional characteristics, HERV-H has the potential to have considerable impact on the function and evolution of the human genome. As mentioned earlier, HERV-H is a repetitive sequence which has probably amplified via retrotransposition (Goodchild et al., 1995). Though as yet undemonstrated, HERV-H elements may have retained the ability to transpose in this manner. HERV-H LTRs contain elements responsive to cellular factors that are capable of enhancer, promoter and polyadenylation functions. No HERV-H elements which are coding competent in the gag, pol or env genes have yet been characterised. However, recent investigations have found examples that do not contain the typical deletions. Members of this intact subgroup may be found to be capable of producing viral proteins (Wilkinson et al., 1993; Hirose et al., 1993).
Each of the above-mentioned traits provides a means by which HERV-H can affect the host genome. While many of these effects have not yet been attributed to HERV-H, analogies can be drawn to other sequences in the genome that share these traits.

Rearrangements:

Repetitive sequences have been shown in many cases to affect the stability of the genome by mediating rearrangement events including duplications, deletions and translocations. Alu sequences can serve as a source of homologous sequences for aberrant recombination within genes. Both internal deletions and duplications have been observed in the low density lipoprotein receptor gene, disrupting the gene's function and resulting in individuals with hypercholesterolemia (Hobbs et al., 1986; Horsthemke et al., 1987; Lehrman et al., 1987). A deletion within the adenosine deaminase gene by aberrant homologous recombination between Alu elements has also been described (Markert et al., 1988). Recombination events need not be limited to sequences that are cis linked. Inter-Alu recombination between the X and Y chromosomes occurred in the germ line of the father of an individual found to be an XX male (Rouyer et al., 1987). A translocation resulting in the disruption of the tre oncogene in a Ewing sarcoma derived cell line is also attributed to inter-chromosome Alu mediated recombination (Onno et al., 1992).

No HERV elements have yet been observed to mediate this type of rearrangement. Approximately 10^5 to 10^6 Alu elements are present in the human genome, while the most abundant HERV element, HERV-H, has approximately 10^3 copies. Considering this, HERV mediated rearrangements should occur several orders of magnitude less frequently, and it is likely merely a matter of time before examples are isolated.
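The scaling argument above can be sketched numerically. The short Python example below is only an illustration under stated assumptions: the Alu range is taken as 10^5 to 10^6 copies and HERV-H as 10^3 copies, and the idea that the opportunity for recombination grows with the number of possible element pairs, n(n-1)/2, is a simplifying assumption introduced here, not a model from the cited studies. The names `pair_count` and `magnitude_gaps` are hypothetical helpers.

```python
import math

# Approximate copy numbers; the Alu range is taken as 1e5 to 1e6 (assumption).
ALU_COPY_RANGE = (10**5, 10**6)
HERV_H_COPIES = 10**3


def pair_count(copies: int) -> int:
    """Distinct pairs that can be drawn from `copies` dispersed elements."""
    return copies * (copies - 1) // 2


def magnitude_gaps(abundant: int, rare: int) -> tuple[float, float]:
    """Return (copy-number gap, pair-count gap) in orders of magnitude (log10)."""
    copy_gap = math.log10(abundant / rare)
    # Assumed model: recombination opportunity scales with the number of pairs.
    pair_gap = math.log10(pair_count(abundant) / pair_count(rare))
    return copy_gap, pair_gap


for alu_copies in ALU_COPY_RANGE:
    copy_gap, pair_gap = magnitude_gaps(alu_copies, HERV_H_COPIES)
    print(
        f"Alu copies {alu_copies:.0e}: copy gap {copy_gap:.1f}, "
        f"pair gap {pair_gap:.1f} orders of magnitude"
    )
```

Under these assumptions the gap is two to three orders of magnitude by copy number alone and roughly four to six if pairs of elements are counted, which is consistent with the expectation that HERV-H mediated rearrangements would be much rarer than Alu mediated ones. Element length, sequence divergence and genomic location are ignored in this sketch.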
Insertion events:

While repetitive sequence mediated homologous recombination does not require HERV-H to be in any way active, retrotransposition events require the element to be functional in a number of ways. As described previously, evidence indicates that historically HERV-H has been capable of retrotransposition and has amplified to its present number using this mechanism. Due to its apparent low rate of transposition, the impact of HERV-H insertions on the human genome is probably most relevant on the evolutionary time scale (Goodchild et al., 1995). This observation also extends to all other known HERVs, as no new insertions have been detected involving any other HERV (see Wilkinson et al., 1994). Possibly the most recent HERV insertion event involves a spliced HERV-H element found in 6 of 12 human DNA samples tested (Goodchild et al., 1995). This example will require verification, but appears to indicate that this insertion event is polymorphic in the human population.

One possible result of an integration event is the insertional inactivation of a given gene (reviewed in Amariglio and Rechavi, 1993). In rodent and avian systems, this is the principal mechanism of transformation by slow transforming retroviruses, and numerous examples have been described of more active endogenous elements inserting into genes. In humans, an L1 element has been identified as an insertional mutagen of the myc gene in a breast carcinoma (Morse et al., 1988) and the APC gene in a colorectal carcinoma (Miki et al., 1992). In mice, several endogenous elements exist that are more active than their human counterparts. It is estimated that 5% of the recessive mutations in mice are caused by C-type retrovirus insertions (Stoye et al., 1988).

Effects on adjacent cellular genes:

HERV-H and other endogenous retroviral elements possess sequences, particularly in their LTRs, that are functionally active.
These sequences need not be limited in their range of action to within the HERV element and as such have the potential to act on other sequences in the genome. LTRs have been demonstrated to donate enhancer, promoter and polyadenylation functions to cellular genes. Splice donor and acceptor sites in the internal HERV region can incorporate HERV sequences into cellular mRNAs, where they would otherwise remain intronic and be spliced out.

An interesting example of a HERV lending regulatory sequences to a neighbouring cellular gene can be found in the salivary amylase gene cluster. Both the pancreatic and salivary versions of the amylase gene arose from a single progenitor gene. Duplication of this progenitor was followed by the insertion of a HERV-E element just upstream of the promoter for one copy, but in the opposing orientation (Samuelson et al., 1990; Ting et al., 1992). The HERV-E element has since been shown to provide regulatory sequences responsible for salivary gland specific expression of this gene (Ting et al., 1992). The importance of this event in human evolution is underscored by the fact that, in a convergent process, the murine lineage has also evolved an orally secreted amylase.

As well as having discrete elements acting on cellular genes, the portions of HERV elements bearing active sequences can be incorporated into evolving genes. A possible example of this is the inclusion of a HERV-I LTR in the human cytochrome C1 gene promoter (Suzuki et al., 1990).

HERV-H LTRs have not been found acting over a distance on cellular promoters in the genome. However, in reporter gene assays whole HERV-H LTRs have been demonstrated to have an enhancing effect on test construct promoters when cloned at a distant site. Results vary depending on the LTR tested, but some HERV-H members clearly have the potential to influence transcription from promoters of adjacent cellular genes (Feuchter and Mager, 1990).
Donation of Promoter Function:

As mentioned previously, many HERV families currently present in the human genome possess LTRs capable of promoting transcription. Many families also have a large number of solitary LTRs dispersed throughout the genome. These observations led investigators to hypothesise that HERV LTRs could promote cellular genes directly. This is indeed the case. The widely conserved ZNF80 zinc-finger gene is transcribed from a solitary ERV-9 LTR in humans (Di Cristofano et al., 1995). The human plk gene is a Krüppel-related zinc-finger gene transcribed when a transcript initiating in the 5' LTR of a HERV-R element reads through the 3' LTR into the H-plk ORF (Kato et al., 1990).

A HERV-H element is also involved in promoting a cellular gene by this mechanism. At the PLA-2L locus, a HERV-H type I element is inserted upstream of a multi-exon region, with the 5' LTR acting as promoter for a transcript that reads through the 3' LTR and into the cellular sequences beyond. The resulting transcripts show splice events between a splice donor in the HERV-H internal sequence and splice acceptors from either the first or second downstream exons (Feuchter-Murthy et al., 1993). To date, expression of PLA-2L at the northern level has only been demonstrated in teratocarcinoma cell lines, though a suggestion of very low level expression in some placental tissues has been obtained with RNA-PCR (Feuchter-Murthy et al., 1993).

HERV-H, as a promoter of cellular transcripts, has been specifically investigated by Feuchter et al. (1992) in the Ntera 2D1 cell line, where HERV-H expression has been shown to be high (Wilkinson et al., 1990). Their cDNA hybridisation screening method identified four transcripts apparently promoted by solitary HERV-H LTRs. Two of these transcripts read into non-coding sequence, while a third short clone includes sections of a CpG island that may indicate close proximity to a gene.
The fourth of the clones initiates in a HERV-H LTR and reads into the CDC4L gene. Evidence suggests the insertion event leading to the HERV-H/CDC4L chimeric transcript is unique to the specific Ntera 2D1 cell sub-population used to prepare the cDNA (Feuchter et al., 1992).

Polyadenylation:

One of the key functions of HERV LTRs is to provide a polyadenylation signal to the 3' end of the transcript. As an example, HERV-H LTRs have been specifically tested for polyadenylation function and evaluated for sequences associated with transcript polyadenylation. HERV-H LTRs contain polyadenylation signals, sites and putative support sequences. Both type I and II LTRs were demonstrated to provide polyadenylation to a heterologous gene when assayed in a test construct. More important when considering genome impact, cDNAs were isolated representing cellular transcripts where HERV-H LTRs provided the polyadenylation function (Mager, 1989). The cellular role of these particular transcripts is unclear, as one of the completely sequenced examples lacked an open reading frame. However, this is not always the case. Goodchild et al. (1992) described a gene, termed PLT, where an alternate splice event utilised an exon in which a HERV-H type II LTR provided the polyadenylation signal. This transcript can be detected in placental tissues by northern blot analysis. Examples of other HERV LTRs polyadenylating transcripts have been described for HERV-E (Tomita et al., 1990) and THE-1 (Paulson et al., 1987).

HERV encoded proteins:

An additional avenue that retroviral related sequences may use to affect the human genome is via the proteins for which they code. Both of the highly successful repetitive elements, Alu and THE-1, lack the ability to code for reverse transcriptase (RTase), yet they appear to have utilised its function in amplifying to the high levels observed in humans today. One possible source of the RTase is other elements in the genome with coding potential.
As mentioned earlier, a member of the L1 family has been shown to code for a functional enzyme. Of the HERV elements, HERV-K appears to be the most likely candidate for production of a functional RTase, with HERV-H and other elements also possibly capable.

The reverse flow of information from RNA back to DNA leads to duplication of sequences represented in the original transcript. This is key to the replication of retroelements and can also be applied to cellular transcripts, resulting in processed pseudogene formation. The evidence for such a process is the existence of intronless sequences with high homology to normal cellular genes, but possessing a polyadenylation tail and flanked by short direct repeats resulting from reintegration. Typically the processed pseudogene is non-functional, but some examples, termed retrogenes, have acquired a promoter and are expressed. In humans these include the autosomal human phosphoglycerate kinase gene (Boer et al., 1987; McCarrey & Thomas, 1987) and the human prointerleukin-1β gene (Clark et al., 1986), as well as examples in a number of other species (mouse, rat, chicken and woodchuck) (reviewed in Wilkinson et al., 1994).

The LTR containing THE-1 elements show evidence of retroviral type integrase involvement in their integration, specifically the 5 bp direct target site duplication and LTRs beginning with 5' TG and ending with CA 3' (Paulson et al., 1985). With no apparent coding potential, THE-1 elements must exploit another integrase in trans, and again endogenous retroviral sequences represent a plausible source.

The proposed complementation between endogenous retroviral proteins and nucleic acids derived from other endogenous sources has been firmly established in related examples.
Using a retroviral construct defective in the gag region and deleted for pol and env, Tchenio and Heidmann (1991) provided gag and pol genes in trans and demonstrated they were capable of supporting transposition of the defective retrovirus construct. This experiment verified that while env sequences are required for infectious particle formation, they are not required for retrotransposition. An example of true complementation between endogenous elements can be found in the human mammary carcinoma cell line T47D. In this cell line, retrovirus-like particles associated with reverse transcriptase activity appear to package three different types of coding defective endogenous viral genomes (Seifarth et al., 1995). The source of the RTase and the particle protein is still under investigation, with early results indicating at least two distinct types of particles are present. It is clear, however, that the presence of functional HERV-derived sequences has wide implications for the human genome.

Summary of the Impact of HERVs on the genome:

As predicted by the functions attributed to HERV elements and their wide distribution in the genome, HERV elements have donated functional sequences to adjacent cellular genes. This suggests that the process of evolution has utilised HERV elements as a source of functional units in shaping the genome. It may appear that the majority of cellular transcripts initiated by LTRs, and of those polyadenylated via HERV LTRs, are of no consequence to the genome at this time. They do, however, represent a considerable pool of exploitable building blocks for further evolution. HERVs then have the potential to contribute coding and regulatory sequences directly to the development of new cellular genes. They may also facilitate duplication of existing cellular sequences for this purpose, using mechanisms involving reverse transcription of cellular RNAs. This theme is a strong one in evolution.
While the human genome contains approximately 100,000 genes, these are estimated to have been built by duplicating and rearranging only 3000 exons.

4. Thesis Objectives

The general goal of this thesis was to expand our knowledge of HERV-H LTRs. This interest in HERV-H LTRs was on two levels. First, given the previously described potential impact of HERV-H and its LTRs on the human genome, it was hoped that a better understanding of the functional units within the LTRs would increase the ability to predict or explain their effects in the genome. Secondly, since LTR functions are fundamental to endogenous retroviral biology, further characterising these functions is very relevant to the understanding of the effects HERV-H elements may have in the genome. Specifically, the objectives involved assessing the role of various LTR sequence motifs in LTR function and determining if these motifs are responsible for functional differences observed between subclasses of HERV-H LTRs. Sequence differences exist between subclasses of HERV-H LTRs, primarily in repeat structures present in some types but not others. These repeat structures represented one focus of this investigation. A second focal point was potential binding sites for the transcription factor Sp1, again present in some LTRs but not others. These sites were studied in an attempt to correlate differences in LTR sequence with differences in LTR function and ultimately to shed light on factors affecting the dynamics of the current HERV-H population in the human genome.

Materials and Methods

LTR descriptions and applied PCR methods

Isolation and characterization of all HERV-H LTRs used has been previously described: RTVL-H2 (Mager and Freeman, 1987); PB-3 and H6 (Mager, 1989); N10-14 (Wilkinson et al., 1990); and cPj-LTR (Goodchild et al., 1992). Primers used to PCR amplify the type I repeat set from the RTVL-H2 5' LTR also facilitated cloning by providing 5' non-complementary tails with restriction sites for XbaI.
They were: xv10rep5A- 5' ACGTCTAGATGGCCTGAAGTAACTGA 3' and xv10rep3A- 5' GATTCTAGAGCTAGGATGAGCCCGGAA 3'. The type II repeat set from the cPj-LTR was amplified and cloned using XbaI tails provided by the primers: pjrep5B- 5' CTTTCTAGATGACATTTCACCATTG 3' and pjrep3A- 5' TGTTCTAGACGGTGGAGTTCGGA 3'. Cloning of complete LTRs, manipulated or otherwise, was via PCR amplification utilizing primers: LTRhind- 5' ACTTAAGCTTGTCAGGCCTCTGAGCCCC 3' and LTRpolyA- 5' AACCAAGC M i l l ATTTCACCTGGG 3', both of which contained a 5', non-complementary tail with a HindIII recognition site. Use of these primers ensured that all constructs were of an equivalent length. Many LTRs, such as H6 used here, have been isolated as the source of transcript polyadenylation signals and were therefore limited to the U3 and R regions occurring before the LTR polyadenylation site. The LTRpolyA primer binds at this site and allows longer LTRs to be limited accurately to the same length as the cDNA derived LTRs. PCR primers used exclusively for the construction of LTRs with the repeat regions removed (Repeat Free LTRs) were: NGBglA- 5' ACGAGATCTTCAGTTACTTCAGGCC 3'; NGBglB- 5' TTCAGATCTTGTGAAAGTCCTTTTCC 3'; PB3BglA- 5' GGTAGGTCTGGAATGTCATCAGTTAAGG 3' and PB3BglB- 5' CCTAGATCTAAAACATTGCTCTTAACTTC 3'. PCR reactions used standard conditions. Taq DNA polymerase, 10X reaction buffer (BRL) and 50 mM MgCl2 were used at final concentrations of 0.25 U, 1X and 1.5 mM, respectively, in a 50 µl reaction. 30 pmol of each primer and 10 nmol of each dNTP completed the reaction mix. Cycling temperatures were 95°C, 51°C and 72°C for 1 min each, over 30 cycles.

Transfections and CAT assays

Plasmids were prepared by alkaline lysis of overnight cultures followed by CsCl density centrifugation as described in Sambrook et al. (1989). The cell line 293 is an adenovirus transformed human embryonal kidney cell line obtained from the American Type Culture Collection.
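The restriction-site tails described for the PCR primers above can be checked with a few lines of code. The following is a minimal sketch, not part of the original methods: it assumes the standard recognition sequences for XbaI (TCTAGA), HindIII (AAGCTT) and BglII (AGATCT), and it includes only those primers whose printed sequence is unambiguous. The names `SITES`, `PRIMERS`, `site_offset` and `check_primers` are hypothetical helpers introduced for illustration.

```python
# Standard recognition sequences for the enzymes named in the text
# (assumed here; they are not stated explicitly in the Methods).
SITES = {"XbaI": "TCTAGA", "HindIII": "AAGCTT", "BglII": "AGATCT"}

# Primer sequences (5' to 3') as printed in the Methods, paired with the
# enzyme whose site the 5' tail is described as carrying.
PRIMERS = {
    "xv10rep3A": ("GATTCTAGAGCTAGGATGAGCCCGGAA", "XbaI"),
    "pjrep5B": ("CTTTCTAGATGACATTTCACCATTG", "XbaI"),
    "pjrep3A": ("TGTTCTAGACGGTGGAGTTCGGA", "XbaI"),
    "LTRhind": ("ACTTAAGCTTGTCAGGCCTCTGAGCCCC", "HindIII"),
    "NGBglA": ("ACGAGATCTTCAGTTACTTCAGGCC", "BglII"),
    "PB3BglB": ("CCTAGATCTAAAACATTGCTCTTAACTTC", "BglII"),
}


def site_offset(primer, enzyme):
    """Return the 0-based offset of the enzyme site in the primer, or -1."""
    return primer.upper().find(SITES[enzyme])


def check_primers(primers=PRIMERS):
    """Map each primer name to the offset of its expected restriction site."""
    return {name: site_offset(seq, enz) for name, (seq, enz) in primers.items()}


if __name__ == "__main__":
    for name, offset in check_primers().items():
        status = "site found" if offset >= 0 else "site MISSING"
        print(f"{name}: {status} (offset {offset})")
```

In each of these primers the site sits three or four bases from the 5' end, consistent with a short non-complementary tail ahead of the LTR-binding portion.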
The cell line NTera2D1 is a human teratocarcinoma line which was obtained from Dr. Peter Andrews (Andrews et al., 1984). 293 cells were maintained in Dulbecco's Modified Eagle Medium (DMEM) with 10% horse serum while NTera2D1 cells were maintained in DMEM with 10% fetal bovine serum, all at 37°C and 5% CO2. Transfections utilized 10 µg of supercoiled plasmid DNA in experiments using modified versions of the H6 Type Ia LTR, or 20 µg of supercoiled plasmid DNA in experiments based on other LTR types or using the β-globin promoter. Transfections used the calcium phosphate precipitation method (Graham and van der Eb, 1973) with 1 x 10^6 NTera2D1 cells or 3 x 10^6 293 cells on 10 cm round plates. The first media change was done 16 hrs post-transfection. Cells were harvested 48 hrs post-transfection with lysis by freeze-thaw. Equal amounts of protein were assayed for CAT activity using 0.25 µCi of 14C-labelled chloramphenicol as the acetyl-acceptor from 0.6 mM acetyl coenzyme A. Acetylated forms of chloramphenicol were resolved by thin layer chromatography on Bakerflex TLC plates in 95:5 chloroform:methanol solvent and located by autoradiography (see figure 2). Levels of conversion were quantified by scintillation counting. Conversion was determined by comparing counts from acetylated forms to total counts. The transition from the unacetylated form to acetylated forms of chloramphenicol is proportional to the amount of enzyme present, which is dependent on the relative promoter strength of the test construct. In all CAT reporter gene experiments, the results are presented as the data from 2 to 4 trials. Reporter activities were quantified and compared in each experiment to the activity obtained for the basic or control vector. The activity of the basic or control vector was given a reference value of one. This provided each experiment with an internal control to facilitate pooling data. Statistical measures of significance utilised Student's t-test and the results were presented as p-values included in the text of the Results chapter.

[Figure 2 labels, from top of plate to bottom: 1,3-diacetyl chloramphenicol; 3-acetyl chloramphenicol; 1-acetyl chloramphenicol; chloramphenicol (unacetylated substrate); origin (sample loaded here).]

Figure 2. Example of an autoradiograph of a generic CAT assay TLC plate exposed for 24 hrs. The 3 samples loaded result from assays of cells transfected with constructs of varying promoter strength. The leftmost sample shows the lowest promoter activity while the far right shows the highest. Extracted 14C-chloramphenicol is spotted at the origin with the solvent front migrating as indicated. The resulting spots, representing acetylated (reaction products) and unacetylated (reaction substrate) forms of chloramphenicol, are aligned with the original TLC plate to allow those sections of the plate to be cut out. The chloramphenicol present is quantified by scintillation counting with all acetylated forms (spots shown above the horizontal dividing line) pooled in a single tube. The plate section bearing the unreacted chloramphenicol (the single spot shown below the dividing line) is placed in a second tube and counted.

Plasmid Constructs

Plasmids pSVAOCAT(X), pSVABGCAT(X) and pSVABGCAT(LR) have been previously described (Kadesch and Berg, 1986; Kadesch et al., 1986; Henthorn et al., 1988) and were obtained from Dr. Paula Henthorn and Dr. Tom Kadesch, while pSVAH6fCAT(X) was constructed in our laboratory (Feuchter and Mager, 1990). Plasmids BGT1, BGT1X2a and BGT1X2b have the type I repeat structure from the RTVL-H2 5' LTR amplified between the primers xv10rep5A and xv10rep3A and cloned into the single distant XbaI site of pSVABGCAT(X). BGT2 is the analogous construct containing the type II repeat structure from the cPj LTR amplified between the primers pjrep5B and pjrep3A.
Plasmids constructed to test the effects of the type I and II repeats on the Type Ia H6 LTR were manipulated at the internal NcoI restriction site by blunt end ligation. Inserted at this site were the PCR amplified repeat sets of type I [pSVAH6T1CAT(X)] and type II [pSVAH6T2CAT(X)]. As an insert size control, the 210 bp HinfI fragment from pUC19 was also cloned into this site [pSVAH6210CAT(X)]. H6 LTR manipulations were carried out in pUC19 and subsequently cloned into the Hind III site at -35 bp of pSVAOCAT(X) via amplification with the Hind III tailed primers LTRhind and LTRpolyA. H6 deletion constructs were developed by one of two methods. The first four constructs utilized the internal restriction sites after which they are named: AccI, DraII, NcoI and DraIII. Each removed a 5' portion of the H6 LTR larger than the preceding restriction digest, resulting in a series of progressively shorter LTR fragments. The 3' ends of these LTR fragments are the same as that of the parental LTR. Cloning of the first four fragments was via blunt-end ligation into the Hind III site of pSVAOCAT(X). The final deletion construct, Sp1 (-), was cloned via PCR amplification of a section of the H6 LTR between the Sp1 minus primer- 5' AACCAAGCTTCCCACCAGAGAACAA 3', and the LTRpolyA primer. Cloning into pSVAOCAT(X) was facilitated by the Hind III sites in the primers' non-complementary tails. The repeat free LTR constructs pSVANGrfCAT(X) and pSVAPBrfCAT(X) are CAT vectors containing modified versions of the native N10-14 and PB-3 LTRs which were cloned in a multi-step process (see figure 3). The two portions of the LTR surrounding the repeat structures, front and back, were amplified in independent PCR reactions with the native LTRs serving as templates. The front reaction contained both the LTRhind primer, which binds the extreme 5' end of the LTR, and a Bgl II site-tailed internal primer which binds immediately 5' of the repeat structure: NGBglA or PB3BglA, depending on construct.
The back reaction contained the LTRpolyA primer which binds at the polyadenylation site, the 3' limit of all constructs. This was paired with a Bgl II site-tailed primer which binds immediately 3' of the repeat structure, NGBglB or PB3BglB, depending on construct. For each LTR, the products of a front and a back reaction were digested with Bgl II, mixed, and ligated, to yield multiple products. The target product was essentially a correctly oriented LTR lacking the central repeat region. These correctly ligated front to back products were isolated by electrophoresis in a 5% polyacrylamide gel [25:1 acrylamide:bis] and digested with Hind III for cloning into the Hind III site of pSVAOCAT(X). The resulting plasmids were digested with Bgl II to access the original location of the repeats and clone in size matched control DNA fragments. Two Sau 3A fragments from pUC18 were used; pSVANGrfCAT(X) contains the 105 bp fragment and pSVAPBrfCAT(X) contains the 141 bp fragment.

[Figure 3 flow diagram steps: Start: native LTR with generic repeat set; PCR; Bgl II digest; ligate and size separate; Hind III digest; clone Hind III fragment into CAT vector; select orientation; Bgl II digest; clone Sau 3A control fragment; Finish: LTR with repeats replaced by control fragment.]

Figure 3. Graphic representation of the strategy employed in removing the repeat sets from the HERV-H N10-14 Type I and PB-3 Type II LTRs. The completed constructs with the modified LTRs cloned into the CAT test vector are pSVANGrfCAT(X) and pSVAPBrfCAT(X), respectively. The flow diagram starts with a generic HERV-H LTR. By the end of the diagram the LTR repeat set has been removed and replaced with a size matched control DNA fragment and the entire modified LTR is cloned into the CAT assay vector in the correct orientation. The triangular arrowhead symbols indicate orientation relative to the original LTR. Half-headed arrows represent the primers used in the PCR amplifications. H represents a Hind III restriction site while B represents a Bgl II restriction site. These sites are introduced as non-binding tails in the 5' ends of the PCR primers.

DNA Sequencing

The content of all constructs was verified by DNA sequencing. The method was a modified version of dideoxy chain termination for double-stranded templates (Tabor and Richardson, 1987). Reactions used Sequenase T7 polymerase and included [α-32P]dCTP for autoradiography. Sequence data entry and all sequence comparisons were completed using the software package of the Genetics Computer Group (Devereux et al., 1984).

Gel Mobility Shift Assays

LTR fragments were amplified between the LTRhind and LTRpolyA primers in duplicate 200 µl batches, pooled and extracted in phenol/chloroform/isoamyl alcohol (25:24:1). Four µg of fragment DNA was cut sequentially with HinfI and ApoI for the H6 LTR, with HinfI and FokI used for the N10-14 LTR. 100 µCi of [α-32P]dATP or [α-32P]dCTP was added directly to the completed restriction digest along with 4 µl of a 5 mM mix of the remaining dNTPs and 2.5 U Klenow fragment (USB). After 30 min at room temperature, the reaction was loaded on a 10% polyacrylamide gel (25:1 acrylamide to bis) and run for 2.5 hrs at 250 V. Desired fragments were identified by autoradiography and the bands excised. The gel section was soaked in elution buffer (0.5 M NaOAc, 0.5% SDS, 10 mM EDTA) overnight, spun through glass wool and ethanol precipitated. The precipitated DNA probes were rigorously washed in 70% EtOH, with three changes over 24 hrs, to ensure no SDS contaminants were present to interfere with binding in the assay. Probes were resuspended in 100 µl dH2O. Sp1 protein binding reactions contained 12.5% glycerol, 12 mM Hepes pH 7.9, 4 mM Tris-HCl, 60 mM KCl, 1 mM EDTA, 1 mM DTT, 1 µg poly(dI-dC), 0.05% NP40 and 1 µg bovine serum albumin.
The competitor used was a double-stranded oligonucleotide containing the Sp1 consensus binding site (GATCCCTTGGTGGGGGCGGGGCCTAAGCTGCGCAT, Promega) in a half-step dilution series from 1250 fmol to 40 fmol. 0.5 fpu of purified Sp1 protein (Promega) was added to each binding reaction on ice and incubated for 5 min before 10 fmol of labelled test fragment was added and incubated for 30 min. Reactions were run on a 4% polyacrylamide gel (15:1 acrylamide to bis) with 2.5% glycerol in 0.25X TBE running buffer. Gels were pre-run at 100 V and run for 2.5 hrs at 14 mA and 0°C before vacuum drying and exposure to X-ray film without an intensifying screen. All band intensities were quantified by densitometer scans of each lane.

Results

The majority of the data presented in this chapter has been incorporated into the following publication: Nelson, D.T., Goodchild, N.L. and Mager, D.L. 1996. Gain of Sp1 sites and loss of repressor sequences associated with a young, transcriptionally active subset of HERV-H endogenous Long Terminal Repeats. Virology 220, 213-218.

Features of different HERV-H LTR types

Figure 4a compares the overall structures of the three types of HERV-H LTRs. All LTR types begin with a very similar 5' region extending for the first 85 bp and end with a ~110 bp region of equal conservation. In the type-defining intervening sequence, three variations have been described thus far. The mid-section of the Type I LTR contains two or three copies of the degenerate 49-52 bp "type I" repeat followed by a type I common block of approximately 100 bp. The Type II LTR contains only a single unit of the type I repeat followed by four to five copies of the degenerate 27 to 34 bp "type II" repeat. The Type Ia LTR has a midsection comprised of a single type I repeat followed by a single type II repeat and finished by the ~100 bp type I common block.
Comparison of the sequence of the Type Ia LTR to the other two classes suggested that it arose via a recombination event between a Type I and a Type II LTR (Goodchild et al., 1993) as depicted in figure 4a. Figure 4b shows the Type I and Type II repeats (in bold) from the individual LTRs, RTVL-H2 5' and cPjLTR. Table 1 summarizes some distinguishing characteristics between the LTR subtypes. It is important to note that Type Ia LTRs are the strongest transcriptional promoters and show less restriction in their pattern of expression.

Table 1: Distribution and functional characteristics of different HERV-H LTR subtypes.

LTR Type   Copy No.     Evolutionary Age                  Expression                         Promoter Strength
           in Humans    (Phylogeny range)                 (RNA in cell lines)                in CAT Assays*
Type I     ~1000        30-40 Myr (Old World Primates)    Restricted to Teratocarcinomas     +
Type Ia    ~100         15-20 Myr (Great Apes only)       High in a variety of cell types    ++++
Type II    ~700         30-40 Myr (Old World Primates)    Generally very low                 +/-

Myr = Million years. * - indicates no promoter activity, + to ++++ = weak to very strong.

Type I repeat region:

 59  tctagaTGGCCTGA AGTAACTGAA GA  85   (XbaI primer)
 86  ATCAC.AAAGAAGTGAAAATGCCCTG.CCCCACCTTAACTGATGACATTCCA  134
135  ACCACAAAAGAAGTGAAAATGGCCGGTCCTTGCCTTAAGTGATGACATTACC  185
186  TTGTAAGAGT CCTTTTCCTG GCTCATCCTA tctaga  223   (XbaI primer)
Repeat consensus:  A-CAC-AAAGAAGTGAAAATG-CC-G-CC---CCTTAA-TGATGACATT-C-

Type II repeat region:

121  tctagaGGACA TTTC  136   (XbaI primer)
137  CCATTGTGATTTGTTCCTGCCCCACCCTAACTGA.  170
171  TCAATTGACTTTGTGACAAGACACCCT.CCCCACCCT...TGCA  210
211  ATAATGTACTTTGTGATATTC..CCCCGCCCT...TG..  241
242  TGAATGTACTTTGT.ACGATACACCCT.CCCCACCCTTG..  280
281  AGAAGGTACTTTGTAATATCCT.CTCCGCCCTTG..  313
314  AGAATGTACTTTGTAAGATCCA.CTTCCTGCCTG..  346
347  CAAAAAATTG CTCCGAACTC CACCGtctaga  378   (XbaI primer)
Repeat consensus:  --AATGTACTTTGT-A-ATCCT.CCCC-CCCT...TG.-

Figure 4. (A) Representation of the features of the three HERV-H LTR subtypes. The boxed T represents the TATA box. Striped box represents the 52 bp type I repeat unit. Light grey box represents the 27 to 34 bp type II repeat unit. Dark grey box represents the 100 bp section common to Type I and Ia elements. All unshaded boxes represent sequences relatively conserved between LTR types. The heavy line indicates the regions of the Type I and Type II LTRs that are hypothesized to make up the recombinant Type Ia LTR. Small arrows represent XbaI tailed PCR primers used to amplify the repeat regions. (B) Sequences of the PCR amplified repeat regions from the Type I and Type II LTRs. Lower case letters indicate mismatch bases in primers creating the XbaI site. Bold sequences are the repeats proper, stacked sequentially to show matching bases. Type I repeat consensus sequence is composed of bases maintained in both copies shown. Type II repeat consensus sequence is composed of bases maintained in at least 4 of 6 type II repeats shown. Numbering is from the beginning of each LTR.

Type I and II repeat regions can repress activity of a heterologous promoter.

The structural and functional differences between LTR types have led us to hypothesize that the Type I and II repeat units may be associated with a suppressor function and that the disruption of these repeat units in Type Ia LTRs contributes to their relative promoter strength. As an initial test of this hypothesis, a chloramphenicol acetyltransferase (CAT) reporter gene system was employed. The expression vector used was pSVABGCAT(X), shown in figure 5a, which has the human β-globin gene promoter driving the CAT gene.
In most cells, this promoter is enhancer dependent (Kadesch and Berg, 1986) but it does have some activity without an added enhancer in the human 293 cell line (unpublished observations). Therefore, both potential enhancer and repressor activity of DNA segments inserted into the distant XbaI site can be measured using this vector in 293 cells. The Type I and II repeat regions shown in figure 4b were PCR amplified using primers with XbaI tails and cloned into the XbaI site of the parent plasmid, pSVABGCAT(X). CAT assays were then performed on lysates of 293 cells transfected with the different plasmids. As a positive control, the construct pSVABGCAT(LR), containing the SV40 early enhancer at the XbaI site, was included in each experiment. In the pooling and presentation of data the activity level of the parent vector served as an internal control and was set to one. The activity of each test vector was then given relative to the parent vector. The results of multiple experiments are shown in figure 5b. One set of Type I repeats, in the construct BGT1, had only a slight negative effect on β-globin promoter activity and did not differ significantly from the control (p > 0.40). However, the promoter activity of two independent constructs, BGT1X2a and b, each containing two sets (4 copies) of Type I repeats, was markedly reduced (0.38 of control with p< 0.025 and 0.34 of control with p< 0.01). Thus, the double insert constructs increased detection of weak repression by the Type I repeat set. The independently developed but identical constructs also served to demonstrate the reproducibility of the method. The naturally occurring 6 unit Type II repeat set in the construct BGT2 reduced reporter output by approximately one-half (0.47 of control, p< 0.025). These results suggest that the repeat regions can have a suppressing effect on a distant cellular promoter and that the effect of the native form of the Type II repeat set is greater than the effect of the native form of the Type I repeat set.

Figure 5. CAT assay vector testing the effects of repeats on the activity of a β-globin promoter in 293 cells. (A) Diagram representing the CAT expression vector, pSVABGCAT(X), where the human β-globin promoter is 5' to the CAT gene. An represents the SV40 derived polyadenylation signal. Ori is the origin of replication and Amp is the ampicillin resistance gene, both from pBR322. Sequences to be tested for enhancer or suppressor effects are cloned into the single XbaI site. These include the repeat sets from the Type I and II HERV-H LTRs, as shown in fig. 4B, in constructs BGT1 and BGT2 respectively. BGT1X2a and b are independent clones each containing two copies of the type I repeat set (4 repeat units). The positive control, BGLR, has the SV40 enhancer cloned at this site. (B) Summary of CAT assays following transient transfection into human 293 cells, quantified by scintillation counting. Data from three experiments were pooled by using BGX [pSVABGCAT(X)] as an internal control and setting its conversion level to one for each round of transfections. Results for test constructs are given relative to BGX. Grey bars indicate sample standard deviation.

[Figure 6 diagram labels: (A) Type Ia LTR (H6) with NcoI site; inserts: Type I repeat region, Type II repeat region, 210 bp pUC19 fragment (control); scale bar 50 bp. (B) LTR as promoter, NcoI site.]

Figure 6. CAT assay vector testing the effects of repeats on the promoter activity of a Type Ia HERV-H LTR. (A) Representation of the H6 Type Ia HERV-H LTR. Shading of boxes is as in figure 4A. Type I and II repeat regions are the same fragments shown in Fig. 4B. Both repeat regions and a size matched control fragment of pUC19 DNA were cloned independently into the internal NcoI site of the H6 LTR to make H6-T1, H6-T2 and H6-210 respectively (see Materials and Methods for details). (B) CAT expression vector with H6 Type Ia LTR as a promoter.
H6-T1, H6-T2 and H6-210 have modified LTRs cloned in place of the H6 LTR.

Presence of Type I and II repeat regions decreases activity of a Type Ia LTR.

Further investigation of the effects of the Type I and II LTR repeat sets served a dual purpose. The primary goal was to test the hypothesis that the lack of intact repeat regions in the Type Ia LTR contributes to its strong promoter activity relative to the other two LTR types. The second was to determine if the LTR repeat sets are suppressive in another test system and if they also have a suppressive effect against a HERV-H promoter. To accomplish both of these objectives, the same Type I and II repeat regions, of 171 and 262 bp respectively, used in the above experiment were inserted into the Type Ia LTR termed H6 at an NcoI site as shown in figure 6a (see Materials and Methods for details). This position is analogous to the location of the repeat sets in the other two LTR subtypes. Two key variables to be controlled for were the potential effects of lengthening the LTR internally, and the specific effect of an insertion at this NcoI site. For this, an H6 LTR was also constructed with a 210 bp fragment of pUC19 plasmid DNA inserted at the same site. These recombinant LTRs were then tested for promoter activity using the vector shown in figure 6b. This plasmid, pSVAH6fCAT(X), has the H6 LTR driving expression of the CAT gene and has been used previously to show that this Type Ia LTR has strong promoter activity (Feuchter and Mager, 1990). The H6 LTR was replaced with the recombinant LTR constructs and CAT activity was measured after transfection into the test cell lines. Promoter activity of the Type I LTR N10-14 was also measured in this system for comparison. Results of multiple assays in the human teratocarcinoma cell line NTera2D1 are shown in figure 7. Activity was normalized to the level of the native H6 LTR which served as an internal standard for each round of assays.
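The quantification and normalization just described (conversion taken as acetylated counts over total counts, each round scaled so the reference construct equals one, then pooling across rounds) can be sketched as follows. This is an illustrative sketch only: the function names and the scintillation counts in `ROUNDS` are hypothetical and are not data from this study.

```python
from statistics import mean, stdev


def percent_conversion(acetylated_cpm, unacetylated_cpm):
    """Percent of chloramphenicol converted to acetylated forms."""
    total = acetylated_cpm + unacetylated_cpm
    return 100.0 * acetylated_cpm / total


def relative_activity(rounds, reference="H6"):
    """Normalize each round to the reference construct, then pool.

    rounds: list of dicts mapping construct name to a tuple of
            (acetylated counts, unacetylated counts) for one transfection.
    Returns construct name -> (mean relative activity, sample std deviation).
    """
    pooled = {}
    for counts in rounds:
        ref = percent_conversion(*counts[reference])
        for name, pair in counts.items():
            pooled.setdefault(name, []).append(percent_conversion(*pair) / ref)
    return {
        name: (mean(vals), stdev(vals) if len(vals) > 1 else 0.0)
        for name, vals in pooled.items()
    }


# Hypothetical counts for two rounds of transfection (illustration only).
ROUNDS = [
    {"H6": (4000, 6000), "H6-T1": (1000, 9000)},
    {"H6": (5000, 5000), "H6-T1": (1500, 8500)},
]

if __name__ == "__main__":
    for name, (avg, sd) in relative_activity(ROUNDS).items():
        print(f"{name}: {avg:.3f} +/- {sd:.3f}")
```

Normalizing within each round before pooling means that round-to-round differences in transfection efficiency affect the reference and test constructs alike, which is the purpose of the internal standard.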
Insertion of the Type I repeat region (H6-T1) significantly suppressed activity of the promoter to 0.29 of the unaltered H6 LTR (p< 0.0005) while the Type II repeat region (H6-T2) showed an even greater reduction in activity to 0.20 of the H6 LTR level (p< 0.0005). The decrease in activity observed could not be attributed to the effects of an insertion at the NcoI site or the effects of lengthening the LTR. The insertion of the pUC19 210 bp fragment (H6-210) resulted in no reduction in LTR promoter activity relative to the control H6 LTR. The average for the activity of the H6-210 construct was somewhat higher than that of the native LTR but the standard deviation of the sample was large, resulting in a difference that was not significant (p> 0.05). Assays were also performed using these constructs in the 293 cell line. The autoradiograph of a sample assay is presented in figure 8, with the results in this cell line matching those in the NTera2D1 line. These results support the earlier assertion that both repeat regions have a negative regulatory function which can act on cis-linked promoters. This function is maintained within the context of a Type Ia LTR, consistent with the hypothesis that the absence of the repeat sets contributes to the relative strength of the Type Ia LTR.

Figure 7. Summary of CAT assays on lysates from transient transfections of NTera2D1 cells (n = 4). H6 is the original H6 HERV-H Type Ia LTR driving the CAT gene. N10-14 is the same CAT vector with the N10-14 HERV-H Type I LTR cloned in place of the H6 LTR. H6-T1 is a modified H6 LTR with a type I repeat set cloned into an internal NcoI site. H6-T2 is the analogous modification using a type II repeat set. The H6-210 construct has a pUC DNA insert and is a control for the effects of insertion at the NcoI site, dependent on fragment size alone. All three of the modified LTRs are driving the CAT gene in place of the original H6 LTR. Activity was quantified by scintillation counting. Data were pooled using the conversion level of the parent construct H6 as an internal control set to one for each round of transfection. Grey bars indicate sample standard deviation.

Figure 8. Autoradiograph of a sample CAT assay on lysates from transient transfection of 293 cells. H6 is the original H6 HERV-H Type Ia LTR driving the CAT gene. N10-14 is the same CAT vector with the N10-14 HERV-H Type I LTR cloned in place of the H6 LTR. H6-T1 is a modified H6 LTR with a type I repeat set cloned into an internal NcoI site. H6-T2 is the analogous modification using a type II repeat set. The H6-210 construct has a pUC DNA insert and is a control for the effects of insertion at the NcoI site, dependent on fragment size alone. All three of the modified LTRs also drive the CAT gene in place of the original H6 LTR.

[Figure 9A diagram labels: Type I LTR with Type I repeats removed and control insert; Type II LTR with Type II repeats removed and control insert; scale bar 50 bp.]

(B) N10-14 Type I LTR sequence

TGTCAGGCCT CTGAGCCCAA GCCAAGCCAT CGCATCCCCT GTGACTTGCA CGTATAAGCC  60
CAGATGGCCT GAAGTAACTG [AAGAATCACA AAAGAAGTGA ATATGCCCTG CCCCACCTTA  120
ACTGATGACA TTCCACCACA AAAGAAGTGT AAATGGCCGG TCCTTGCCTT AAGTGATGAC  180
ATTAC....C] TTGTGAAAGT CCTTTTCCTG GCTCATCCTG GCTCAGAAAG CACCCCCACT  236
GAGCACTTGC AACTCCCACT CCTGCCCACC AGAGAACAAA CCGCCTTTGA CTATAATTTT  296
CCTTTACCTA CCCAAATCCT ATAAAACGGC CCCACCCCTA TCTCCCTTTG CTGACTCTCT  356
TTTTGGACTC AGCCCGCCTG CACCCAGGTG AAATAAACAG  396

Control insert: TCCTTT TAAATTAAAA ATGAAGTTTT AAATCAATCT AAAGTATATA TGAGTAAACT TGGTCTGACA GTTACCAATG CTTAATCAGT GAGGCACCTA CTCAGCGAT

(C) PB-3 Type II LTR sequence

TGTCAGGCCT CTGAGCCCAA GCCAAGCATC GCATCCCCTG TGATTTGCAC GTATACATCC  60
AGATGGCCTA AAGTAACTGA AGATCCACAA AAGAAGTAAA AATAGCCTTA ACTGATGACA  120
[TTCCACCATT GTGATTTGTT CCTGCCCCAC CCTAACTGAT CAATGTACTT TGTAATCTCC  180
CCCACCCTTA AGAAGGTACT TTGTAATCTT CCCCACCCTT AAGAAGGTTC TTTGTAATTC  240
TCCCCACCCT TGAGAATGTA CTTTGTGAGA TCCACCCTGC CCACAAAACA] TTGCTCTTAA  300
CTTCACCGCC TAACCCAAAA CCTATAAGAA CTAATGATAA TCCATCACCC TTCGCTGACT  360
CTCTTTTCGG ACTCAGCCCA CCTGCACCCA GGTGAAATAA ACAG  404

Control insert: GATCG GTGCGGGCCT CTTCGCTATT ACGCCAGCTG GCGAAAGGGG GATGTGCTGC AAGGCGATTA AGTTGGGTAA CGCCAGGGTT TTCCCAGTCA CGACGTTGTA AAACGACGGC CAGTGCCAAG CTTGCATGCC TGCAGGTCGA CTCTAGAGGA TC

Figure 9. (A) Representations of the "repeat free" HERV-H Type I and II LTRs. The LTRs were modified by replacing the type specific repeat sets with fragments of control DNA matched in size to the removed portions. In both sections, (B) N10-14 Type I HERV-H LTR sequence, and (C) PB-3 Type II HERV-H LTR sequence, square brackets enclose the sections removed (shown in bold in the original figure, where a black line also indicates the extent of the repeat set). The control DNA insert is listed beneath each LTR sequence. Following the LTR sequence from its beginning, substituting the control insert for the bracketed section, and continuing through the end of the LTR gives the sequence of the modified repeat free LTR. In the N10-14 case, the modified LTR is N10rf, and in the PB-3 case the modified version is PBrf.

Effects of removing the repeat regions from Type I and II LTRs depend on test cell line.
The previous experiments have shown that the repeat structures of the Type I and II HERV-H LTRs suppress transcription from both a Type la LTR promoter and a heterologous promoter, in this case the human PB-globin promoter. The next question addressed was the role of the repeat sets as they occur in the Type I and II HERV-H LTRs. To do this, "repeat free" Type I and II LTRs were constructed where the repeats sets were removed from the LTRs and replaced with size-matched control DNA fragments. Figure 9a shows a representation of the Type I and II repeat free LTRs. The Type I repeat free LTR, N10rf, was constructed starting with the N10-14 HERV-H LTR while the Type II repeat free LTR was based on the Type II HERV-H LTR, PB-3. In the construction process, the regions of the LTR surrounding the type specific repeat sets were PCR amplified to include all non-repeat sequences of the LTR (see materials and methods for details). In the N10rf LTR, the Type I repeats (111 bp) were replaced with 115bp of pUC18 DNA and in the PBrf LTR the Type II repeats (159bp) were replaced with 157bp of pUC18 DNA (sequences shown in figure 9b and 9c, respectively). These modified LTRs were assayed for promoter activity, using a CAT vector arrangement parallel to the pSVAH6fCAT(X) vector shown in figure 6b. The H6 LTR was replaced with the native N10-14 and PB-3 LTRs as well as the modified N10rf and PBrf repeat free LTRs. With this set of constructs, the promoter strength of the native LTRs could be compared to that of the LTRs lacking the repeat regions. Figure 10 shows a summary of CAT assays preformed on lysates from transient transfections of 293 cells. In this cell line, any difference in promoter activity between the N10-14 LTR and the NGrf modified LTR was not resolvable with two repetitions of the experiment (p> 0.40). The CAT assay can be quite variable as was evident from the large sample 55 7 - . 6 -5 -4 -N10-14 N10rf P B - 3 P B r f Construct Figure 10. 
Summary of CAT assays on lysates from transient transfections of the 293 cell line (n = 2). All constructs tested here were parallel to the pSVAH6fCAT(X) vector shown in figure 6B. N10-14 is the same CAT vector with the N10-14 HERV-H Type I LTR cloned in place of the H6 LTR. N10rf is a repeat free version of the N10-14 LTR with the Type I repeat region replaced with a pUC18 DNA fragment. PB-3 is the vector containing the native PB-3 HERV-H Type II LTR and PBrf contains the repeat free version with the Type II repeat set replaced with a fragment of pUC18 DNA. Activity, presented on the Y-axis, is the percent conversion of radiolabeled chloramphenicol to acetylated forms, as quantified by scintillation counting. In this case, data for each construct was pooled directly. Grey bars indicate sample standard deviation.

[Figure 11: bar chart of CAT activity (Y-axis) for each construct (X-axis: N10-14, N10rf, PB-3, PBrf).]

Figure 11. Summary of CAT assays on lysates from transfections of NTera2D1 cells (n = 2). All constructs tested here were parallel to the pSVAH6fCAT(X) vector shown in figure 6B. N10-14 is the same CAT vector with the N10-14 HERV-H Type I LTR cloned in place of the H6 LTR. N10rf is a repeat free version of the N10-14 LTR with the Type I repeat region replaced with a pUC18 DNA fragment. PB-3 is the vector containing the native PB-3 HERV-H Type II LTR and PBrf contains the repeat free version with the Type II repeat set replaced with a fragment of pUC18 DNA. Activity, presented on the Y-axis, is the percent conversion of radiolabeled chloramphenicol to acetylated forms, as quantified by scintillation counting. As with figure 10, data was pooled directly. Grey bars indicate sample standard deviation.

standard deviation in the N10-14 results. The results for the Type II LTR constructs showed much better agreement within sets. The activity of the repeat free PBrf LTR (0.95) was almost five-fold greater than the activity of the native PB-3 LTR (0.21) when assayed in 293 cells (p < 0.005).
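The fold difference quoted above follows directly from the two mean activities. The following is a minimal sketch of that arithmetic, assuming that percent conversion is calculated as acetylated counts divided by total counts (a common way to express CAT activity, but not spelled out in this section). The function names and the example counts are hypothetical and are not taken from the thesis.

```python
def percent_conversion(acetylated_cpm: float, unacetylated_cpm: float) -> float:
    """Percent of chloramphenicol converted to acetylated forms.

    Assumption: activity = 100 * acetylated / (acetylated + unacetylated),
    with both values taken from scintillation counts.
    """
    total = acetylated_cpm + unacetylated_cpm
    if total <= 0:
        raise ValueError("total counts must be positive")
    return 100.0 * acetylated_cpm / total


def fold_change(test_activity: float, reference_activity: float) -> float:
    """Ratio of a test construct's activity to a reference construct's activity."""
    if reference_activity <= 0:
        raise ValueError("reference activity must be positive")
    return test_activity / reference_activity


if __name__ == "__main__":
    # Mean activities reported in the text for 293 cells: PBrf 0.95, PB-3 0.21.
    print(f"PBrf / PB-3 fold change: {fold_change(0.95, 0.21):.2f}")
    # Hypothetical counts, for illustration only.
    print(f"percent conversion: {percent_conversion(50.0, 150.0):.1f}")
```

A ratio of this kind only summarises the means; the significance values in the text come from the replicate data, which are not reproduced here.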
Figure 11 shows the same constructs assayed in the NTera2D1 cell line. In this case, the activity of the native N10-14 LTR (1.75) was four-fold greater than the activity of the N10rf LTR (0.43), which is a significant difference (p < 0.01). The native Type II LTR, PB-3, also showed greater activity (0.65) than the repeat free version, PBrf (0.45), which is also significant (p < 0.05). These experiments were intended to test the hypothesis that the repeat sets, in the context of the Type I and II HERV-H LTRs, suppress transcription from the linked promoter. It was hypothesized that removal of the repeat sets from either LTR type would free the promoter from the suppressive influence and result in higher levels of activity in the CAT assay. This was the case only when the repeat free Type II LTR (PBrf) was tested against the native LTR (PB-3) in 293 cells. In this cell line, removing the repeats from the Type I LTR (N10-14) left the N10rf repeat free LTR with approximately equivalent promoter strength. The Type I repeats were not strongly suppressing transcription in this context. The results in the NTera2D1 cell line indicated that the repeat regions in the Type I LTR support transcription, while the equivalent region in the Type II LTR was weakly supportive of transcription. Taken together, the results from the two cell lines indicated that the function of the repeat regions was very dependent on cell type.

Deletion analysis of the H6 Type Ia LTR identifies positive regulatory regions.

In previous experiments, it was shown that the presence of sets of Type I or II repeats within the backbone of a Type Ia LTR was associated with a reduced promoter
[Figure 12: schematic of the Type Ia LTR (H6) showing the AccI, DraII, NcoI and DraIII sites, the Sp1(-) deletion end point, the TATA box, and the Sp1 minus and LTRpolyA primers, followed by the LTR sequence.]

1 TGTCAGGCCT CTGAGCCCAA GCCAAGCCAT CGCATCCCCT GTGACTTGCA CGTATACGCC 60
61 CAGATGGCCT GAAGTAACTG AAGAATCACA AAAGAACTGA AAAGGCCCTG CCCCGCCTTA 120
121 ACTGATGACA TTCCACCATG GTGATTTGTT CTTGCCCCAC CTTAACTGAG TGATTAACCC 180
181 TGTGAATTTG CTTCTCCTGG CTCAGAAGCT CCCCCACTGA GCACCTTGTG ACCCCCGCCC 240
241 CTGCCCACCA GAGAACAACC CCCTTTGACT GTAATTTTCC ATTACCTTCC CAAATCCTAT 300
301 AAAACGGCCC CACCCCTATC TCCCTTCGCT GACTCTCTTT TCGGACTCAG CCCGCCTGCC 360
361 CCCAGGTGAA ATAAACAG 378

Sp1 minus primer (5' to 3'): AACCaagcttCCCACCAGAGAACAA
LTRpolyA primer (3' to 5'): GGGTCCACTTTATTTttcgaaCCAA

Figure 12. Representation and sequence of the H6 Type Ia LTR. Restriction sites shown are those used in the construction of serial 5' deletions. The 3' ends are the same as the parental H6 LTR construct. Triangles represent putative Sp1 binding sites, underlined in the sequence. Small arrows indicate location and direction of PCR primers listed below their complementary sequence sections. Lower case letters in the primers indicate a HindIII site which may mismatch with LTR sequence. The HindIII sites mark the ends of the PCR generated deletions. The boxed T represents the TATA box, also indicated in the sequence.

activity. However, insertion of these repeat regions did not suppress activity to a level comparable to Type I or II LTRs. This was evident from the results obtained for the Type I LTR shown in figures 7 and 8. Previously, the Type I LTR N10-14 had been reported to be the strongest promoter among all Type I and II LTRs examined (Feuchter and Mager, 1990). It can be seen from both figures 7 and 8 that the promoter activity of the N10-14 LTR was still significantly below that of the Type Ia LTR H6 with the repeat regions inserted.
This finding indicated that lack of an intact set of Type I or II repeats was not solely responsible for the strong promoter function of Type Ia LTRs. It had been suggested that the acquisition of binding sites for the transcription factor Sp1 may have contributed to the promoter strength of Type Ia LTRs (Feuchter and Mager, 1990). The consensus Sp1 binding site in its inverted orientation is RYY(C/A)CGCCY(C/A) (Westin and Schaffner, 1988). Sequence comparisons between the H6 Type Ia LTR and other LTRs identified a unique potential Sp1 binding site at position 237 (underlined in figure 12). A similar CG rich site is present at position 115. To investigate the importance of these regions and to localize other potential positive regulatory regions in the Type Ia LTR, a series of 5' deletions was constructed using the restriction enzyme sites shown in figure 12. The enzymes AccI, DraII, NcoI and DraIII were used to remove successively larger regions from the 5' end of the H6 LTR. These constructs, and the parental H6 construct (Feuchter and Mager, 1990), all terminated three bp 3' to the polyadenylation signal as shown in the lower part of figure 12. One final deletion construct, termed Sp1(-), was constructed by PCR using the Sp1 minus and LTRpolyA primers indicated in figure 12. This deletion shortened the DraIII construct by 15 bp, thus removing the potential Sp1 binding site at position 237. The full H6 LTR in pSVAH6fCAT(X) (figure 6b) was replaced by the deleted forms, which were then tested for promoter activity in NTera2D1 cells. Pooled results from five separate experiments are shown in figure 13. Deletion of the first 104 bp 5' to

Figure 13. Representation of serial deletion constructs of the H6 HERV-H LTR paired with a summary of their relative promoter activities in NTera2D1 cells.
Constructs are all named for the restriction sites used in their 5' truncation except Sp1(-), which was constructed using PCR between the Sp1 minus and LTRhind primers (see figure 12). Promoter activity was assessed using the CAT vector described in figure 6B where each deleted LTR replaces the H6 full length LTR. CAT assay results were quantified by scintillation counting. Data was pooled using the H6 construct as an internal control with activity set at 100% for each of five rounds of transfections.

the DraII site resulted in only a small change in promoter activity (76% of control). However, deletion of the next 32 bp 5' to the NcoI site caused a three-fold decrease in activity. This 32 bp region contained the potential binding site for Sp1 at position 115 (underlined in figure 12). Deletion of the next 91 bp between the NcoI and DraIII sites caused only a slight further decrease in activity, but removal or alteration of the next 15 bp, including the second potential Sp1 site, resulted in a further reduction of approximately four-fold to 4.3% of control.

Comparison of Sp1 binding to the Type Ia and Type I LTRs.

It was next determined whether Sp1 binds sequences in the Type Ia LTR and whether there is a difference in Sp1 binding to analogous regions in the Type I LTR. To do this, restriction fragments were isolated from the Type Ia H6 LTR and the Type I N10-14 LTR, and used in gel mobility shift assays in the presence of purified Sp1 protein. The fragments used are shown in figure 14. The H100 fragment was a 100 bp segment from the H6 LTR containing the first putative Sp1 site at position 115. The N113 fragment contained the analogous region in the N10-14 LTR. These two segments are compared in figure 14b and it can be seen that the N10-14 LTR has an "A" instead of a "G" within the core of the potential Sp1 site.
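The single G to A difference at the first site can be made concrete by scoring each 10-mer against the inverted Sp1 consensus quoted earlier, RYY(C/A)CGCCY(C/A), written below with the IUPAC code M for C/A. This is only an illustrative sketch: counting matching positions is a crude measure that says nothing certain about binding, and it is not the method used in the thesis. The two 10-mers are transcribed from the site 1 region of figures 12 and 14b and may carry extraction errors; the function and constant names are hypothetical.

```python
# IUPAC nucleotide codes needed for the inverted Sp1 consensus.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",  # purine
    "Y": "CT",  # pyrimidine
    "M": "AC",  # C/A in the consensus as written in the text
}

# RYY(C/A)CGCCY(C/A), inverted orientation (Westin and Schaffner, 1988).
SP1_INVERTED_CONSENSUS = "RYYMCGCCYM"


def consensus_matches(site: str, consensus: str = SP1_INVERTED_CONSENSUS) -> int:
    """Count positions of `site` that are compatible with `consensus`."""
    site = site.upper()
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(base in IUPAC[code] for base, code in zip(site, consensus))


if __name__ == "__main__":
    candidates = {
        "H6 site 1 (Type Ia)": "GCCCCGCCTT",
        "N10-14 site 1 (Type I)": "GCCCCACCTT",
    }
    for name, site in candidates.items():
        score = consensus_matches(site)
        print(f"{name}: {site} matches {score}/{len(SP1_INVERTED_CONSENSUS)}")
```

In this sketch the N10-14 site loses one match relative to H6, and the lost position is the G of the central CGCC core.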
The H145 fragment from the H6 LTR contained the second putative Sp1 site at position 237 and the analogous region in N10-14 was contained on the N153 fragment. While the two LTRs were quite similar in the region encompassed by the H145 and N153 fragments, there were several nucleotide differences within the putative Sp1 site (figure 14b). Each fragment was radiolabeled and used in a series of binding reactions with purified Sp1 protein. Experiments included a dilution series of a double-stranded Sp1 consensus binding site oligonucleotide. This specific cold competitor was used to provide a comparative measure of binding affinity between test fragments. Amounts of bound fragment were

[Figure 14A: schematics of the H6 Type Ia LTR with the H100 and H145 fragments and of the N10-14 Type I LTR with the N113 and N153 fragments; TATA boxes and the DraII, NcoI and DraIII sites are marked.]

[Figure 14B: sequences as extracted; alignment match marks and gap positions are not recoverable.]
H6 "H100":
AATCACAAAAGAACTGAAAAGGCCCTGCCCCGCCTTAACTGATGACATTCCACCATGGTGATT
TGTTCTTGCCCCACCTTAACTGAGTGATTAACCCTGTG
N10-14 N113:
AATCACAAAAGAAGTGAATATGCCCTGCCCCACCTTAACTGATGACATTCCACCACAAAAGAAGTGTAAA
TGGCCGGTCCTTGCCTTAAGTGATGACATTACCTTGTGAAAGT
H6 "H145":
AATTTGCTTCTCCTGGCTCAG.AAGCTCCCCCACTGAGCACCTTGTGACCCCCGCCCCTGCCCACAA
C.AACCCCCTTTGACTGTAATTTTCCATTACCTTCCCAAATCCTATAAAACGGCCCCACCCCTATCTCGCTG
N10-14 N153:
CCTTTTCCTGGCTCATCCTGGCTCAGAAAGCACCCCCACTGAGCA.CTTGCAACTCCCACTCCTGCCCACAA
CAAACCGCCTTTGACTATAATTTTCCTTTACCTACCCAAATCCTATAAAACGGCCCCACCCCTATCTCGCTG

Figure 14. (A) Representations of the H6 Type Ia and N10-14 Type I LTRs and the fragments derived from them for use in gel mobility shift assays. Restriction sites shown were used in the deletion analysis described in figure 13. Triangles show putative Sp1 binding sites in the H6 LTR and the corresponding location in the N10-14 LTR.
Fragments were isolated using restriction digests and separated by polyacrylamide gel electrophoresis. (B) Sequence comparison of analogous fragments from the H6 Type Ia and the N10-14 Type I LTR. Restriction sites serve as landmarks and putative Sp1 binding sites are underlined or overlined.

quantified by densitometer scans. Results are shown in figure 15. Parts A and B compare the ability of the H100 and N113 fragments to bind Sp1. H100 did associate with Sp1 and this binding was specific since it could be competed by the unlabeled Sp1 consensus fragment. In contrast, there was no detectable Sp1 binding to the N113 fragment. The lower half of figure 15, parts C and D, compares the H145 and N153 fragments. The H145 fragment bound Sp1, and this association was even stronger than that observed for the H100 fragment (compare parts A and C). The N153 fragment also associated with Sp1 but relatively weakly. Even this weak association was at first surprising since the Sp1 site at position 237 in the H6 LTR is quite disrupted in the N10-14 LTR. However, closer examination of the sequence revealed another possible Sp1 site in N10-14 located 15 bp downstream (underlined in figure 14b) that could be responsible for the weak Sp1 binding. These results indicate that the Type Ia LTR H6 contains at least two strong binding sites for Sp1 that are lacking in the Type I LTR N10-14. As shown previously in Table 1, there are approximately 100 Type Ia LTRs and several hundred Type I LTRs per haploid genome. While these LTRs are similar within a group, they are certainly not identical. For example, it has been found that any two unlinked Type I LTRs are 87-93% identical (Mager, 1989). To ascertain whether the difference in Sp1 binding between H6 and N10-14 may be typical of the entire population of Type Ia and Type I LTRs, segments spanning the two H6 Sp1 sites were compared in all LTRs that have been sequenced. This comparison is shown in Table 2.
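A figure such as "87-93% identical" depends on how identity is counted. As a minimal sketch, assuming identity is taken as the fraction of aligned columns in which both sequences carry the same base (gap columns counted as mismatches, which is one convention among several and not necessarily the one used in the cited work), the calculation could look like this. The function name is hypothetical, and the 9-mers in the example are the H6 and N10-14 site 1 segments listed in Table 2.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Gaps are written as '-'. Columns that are gaps in both sequences are
    skipped; a gap opposite a base counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Site 1 segments: H6 (Type Ia) versus N10-14 (Type I).
    print(f"{percent_identity('CCCCGCCTT', 'CCCCACCTT'):.1f}% identical")
```

Over a segment as short as nine bases a single substitution already moves the figure by about 11 points, which is why whole-LTR comparisons and site-level comparisons are reported separately.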
In the case of Type Ia LTRs, only one other HERV-H element with Type Ia LTRs, termed cH-4, has been isolated (Mager and Goodchild, 1989; Goodchild et al., 1993). Both the 5' and 3' LTRs of this element are identical to the H6 LTR at the first Sp1 site. At the second site, the cH-4 5' LTR has a T in place of a C. However, when a cH-4 5' LTR fragment with this site was tested in a gel shift assay it was found to bind Sp1 at a

[Figure 15 panels: (A) H100 fragment, (B) N113 fragment, (C) H145 fragment, (D) N153 fragment. Rows under each panel: Sp1 protein (+/-), cold competitor (none, 125x, 63x, 31x, 16x, 8x), shift signal (% of total). Shift signal values as extracted, in order of appearance: 0, 42, 9.2, 25, 33, 44, 41; 85, 16, 44, 69, 94, 88; 0, 26, 2.1, 5.9, 8.8, 21, 18.]

Figure 15. Autoradiographs of gel mobility shift assays. Fragments were derived from the H6 Type Ia LTR: H100 - part A and H145 - part C; and the N10-14 Type I LTR: N113 - part B and N153 - part D. Each lane contains one binding reaction with 10 fmol of 32P-labelled LTR fragment. Sp1 is 0.5 fpu of purified human protein; (+) indicates inclusion and (-) indicates exclusion in the reaction. Cold competitor is a double-stranded oligonucleotide containing an Sp1 consensus binding site. Amount of competitor is given as molar excess (e.g., 125x is a 125-fold molar excess of competitor). See Materials and Methods for further details.

level comparable to the binding of the H6 fragment (data not shown). The number of Type I LTRs available for comparison is much larger so, where both the 5' and 3' LTRs of the same element were available, only the 5' LTR is shown in Table 2. At the first site, approximately half of the Type I LTRs are identical to N10-14. The other half are different from both N10-14 and the Type Ia site, and it is unknown whether the site in these LTRs binds Sp1.
At the second site, all the Type I LTRs have sequences identical or very similar to N10-14, with all having an A in place of the critical G in the core of the site. This suggests that none of the Type I LTRs will bind Sp1 at this site. Thus, at least with regard to the second site, our findings with H6 and N10-14 are most probably indicative of the Type I and Ia LTR subsets as a whole.

Table 2: Comparison of Type I and Ia LTRs in regions of the Sp1 sites in the H6 LTR.

LTR          Type  Site 1     Site 2     Reference
H6           Ia    CCCCGCCTT  CCCCGCCCC  Feuchter and Mager, 1990
cH-4 5'      Ia    CCCCGCCTT  CCCTGCCCC  Mager and Goodchild, 1989
cH-4 3'      Ia    CCCCGCCTT  CCCCGCCCC  Mager and Goodchild, 1989
N10-14       I     CCCCACCTT  TCCCACTCC  Feuchter and Mager, 1990
N10-13       I     CCCCACCTT  CCCCACTCC  Wilkinson et al., 1990
RTVL-H2 5'   I     CCCCACCTT  CCCCACTCC  Mager and Freeman, 1987
RGH1         I     CCCCACCTT  CCCCATTCC  Hirose et al., 1993
RGH2 5'      I     CCCTGCCTT  CCCCACTCT  Hirose et al., 1993
PLA2L LTR    I     CCCCACCTT  CCCCACTCC  Feuchter-Murthy et al., 1993
Sol1 LTR     I     CCCCACCTT  CCCCACTCC  Mager, 1989
RTVL-H1 5'   I     TTCTGCCTT  CCCCACTCC  Mager and Henthorn, 1984
RTVL-H3 5'   I     TCCTGCCTT  CCCCACTCC  Mager, 1989
RTVL-H4 5'   I     TCCTGCCTT  CCCCACTCC  Mager, 1989
Sol2 LTR     I     TCCTGCCTT  CCCCACCCC  Mager, 1989
enk LTR      I     TCCTGCTTT  CCCCACACC  Genbank #KOQ489

Interestingly, the possible weak Sp1 site in N10-14 located downstream of the second site is not found in any other Type I LTR that has been analyzed. That is, all have a string of Cs as opposed to the CCGCC found in N10-14 (underlined in figure 14b). This finding may explain why N10-14 is the strongest promoter among the Type I LTRs that have been tested (Feuchter and Mager, 1990).

Discussion

The evidence presented here demonstrates that the repeat structures found within Type I and Type II HERV-H LTRs suppress transcription from cis linked promoters, suggesting that they contain negative regulatory elements (NREs).
NREs recognised by host trans-acting factors in specific cell types have been identified in a number of exogenous and endogenous retroviral LTRs. Examples include the LTR of Moloney murine leukemia virus (MuLV) (Gorman et al., 1985), MuLV-related endogenous LTRs (Ch'ang et al., 1989), mouse mammary tumour virus (MMTV) LTR (Hartig et al., 1993), the LTRs of human T-cell lymphotropic virus (HTLV-1) (Xu et al., 1993) and human immunodeficiency virus (HIV) (reviewed in Ou and Gaynor, 1995). A variety of cellular genes are also controlled in part by negative repressor elements (reviewed in Clark and Docherty, 1993). Suppressor activity over a distance, as described here for the NREs contained in the HERV-H LTR repeat sets, is a hallmark of the silencer type of negative regulatory element (Brand et al., 1985). This class of negative regulation is thought to depend on proteins or protein complexes binding the NREs and interacting with the general transcription factors, or related proteins, to down regulate promoter activity (reviewed in Clark and Docherty, 1993; Johnson, 1995). The observation that HERV-H NREs are active against heterologous promoters is also consistent with this model. When the HERV-H Type II repeat sections are compared to previously described NREs, similarity is found with an NRE from the Duck Hepatitis B Virus C gene (Schneider and Will, 1991). This NRE has been identified in a wide range of viral and cellular promoters and a consensus sequence, CCTCTC(T/C), has been established by Schneider and Will (1991). Sequences in the repeat set from the Type II HERV-H LTR match this consensus only once (see Fig. 4) but do show strong similarity in several other copies. In contrast, the HERV-H Type I repeats do not show similarity to known NREs. While the evidence presented here clearly indicates that both types of HERV-H LTR repeat sets contain silencer type NREs, this is not the limit of the repeat sets' regulatory input.
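A search for exact matches to the NRE consensus CCTCTC(T/C) can be sketched as a regular-expression scan over both strands. This is an illustration under stated assumptions rather than the comparison actually performed for Fig. 4: the function names are hypothetical, the example input is a made-up toy sequence and not an HERV-H repeat, and only perfect matches are reported, so the "strong similarity" of near-matches mentioned above would need a mismatch-tolerant method.

```python
import re

# CCTCTC(T/C) from Schneider and Will (1991); the lookahead lets
# overlapping matches be reported as well.
NRE_CONSENSUS = re.compile(r"(?=(CCTCTC[TC]))")
NRE_LENGTH = 7

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return sequence.upper().translate(_COMPLEMENT)[::-1]


def find_nre_matches(sequence: str) -> list[tuple[int, str, str]]:
    """Return (start, strand, match) for exact consensus matches.

    `start` is the 0-based position of the leftmost base of the match on
    the forward strand, for both '+' and '-' strand hits.
    """
    sequence = sequence.upper()
    hits = [(m.start(), "+", m.group(1)) for m in NRE_CONSENSUS.finditer(sequence)]
    rc = reverse_complement(sequence)
    for m in NRE_CONSENSUS.finditer(rc):
        # Convert the position on the reverse strand to forward coordinates.
        forward_start = len(sequence) - (m.start() + NRE_LENGTH)
        hits.append((forward_start, "-", m.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    toy_sequence = "AACCTCTCTGGTTGGAGAGGAA"  # hypothetical example input
    for start, strand, match in find_nre_matches(toy_sequence):
        print(start, strand, match)
```

Scanning both strands matters because a silencer motif may be recorded in either orientation relative to the LTR sequence as written.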
The experiments involving the creation of "repeat free" LTRs demonstrated that the role of the repeat sets is dependent on cell type. In one cell line, the removal of the Type I repeat set from the Type I HERV-H LTR test construct had no detectable effect, while in another its removal resulted in a loss of LTR promoter strength. This finding implicates sequences in the Type I repeat set as supporting transcription. Similarly, the removal of the repeat set from the Type II HERV-H LTR test construct had no effect on promoter strength in one cell line, while apparently freeing the promoter from suppression in another. These results indicate that viewing the HERV-H repeats as being solely equated with NRE activity is overly simplistic. This is not a surprise when considering other examples of promoter regulatory regions, such as that found in the human insulin gene promoter, that have been shown to include negative acting silencer elements in close proximity to positive acting elements, both interacting to modulate promoter activity (Clark et al., 1995). With regard to endogenous retroviruses or other mobile genetic elements, the existence of host cell negative regulatory mechanisms is perhaps not unexpected due to the potential detrimental effects that transposition may have on the host genome. An extreme example of the effects of unrepressed transposition comes from Drosophila. P-M hybrid dysgenesis and the mutations associated with it are caused by uncontrolled P-element transposition events resulting from lack of P-element repression (Rubin et al., 1982; Kidwell, 1985; Norris and Woodruff, 1992). The development of effective mechanisms for controlling transpositions, at the level of transcription and/or at other levels, may indeed be a prerequisite for the successful coexistence of transposable elements and their hosts.
In the case of HERV-H elements, host mechanisms may have evolved during the course of primate evolution to control the promoter function of Type I and II LTRs. Type Ia LTRs emerged 15-20 million years ago as an apparent recombinant between the two older LTR types and, due to their altered structure, could have been recognised differently by host cell factors. Results presented here suggest that release from repressor elements and acquisition of two functional Sp1 binding sites in the Type Ia LTR have both contributed to its relatively high and less restricted promoter activity. It seems probable that these and perhaps other sequence changes in this youngest subset of the HERV-H family resulted in an increased ability of HERV-H elements with Type Ia LTRs to expand in numbers in the great apes and humans. The acquisition of functionally distinct sequences by "young" subpopulations may be a common theme in transposable element evolution. During rodent evolution, it appears that the dominant lineage of L1 elements has repeatedly acquired transcriptional regulatory sequences that facilitated their expansion over previous versions (Adey et al., 1994). In addition, there is evidence for differential binding of human nuclear proteins to young and old Alu subfamilies (Tomlin et al., 1992). The transcription factor Sp1 is ubiquitously expressed and has been shown to be a strong supporter of transcription in the promoters of a large number of genes. These include examples from various species such as the murine IL-6 promoter (Kang et al., 1996), the rat Clara cell secretory protein gene (Toonen et al., 1997), the human medium chain acyl-CoA dehydrogenase gene (Leone et al., 1995) and the HTLV-1 LTR (Gegonne et al., 1993). Two binding sites for the Sp1 transcription factor were identified in the Type Ia LTR.
An Sp1 site has also been found in the LTRs of the ERV9 family of human endogenous retroviruses, which is expressed in undifferentiated teratocarcinoma or embryonal carcinoma (EC) cells (La Mantia et al., 1992). Interestingly, Sp1 binding sites have been shown to be important in the differences in behaviour observed in the well characterised MuLV related retroviruses. MuLV is not able to productively infect embryonic stem (ES) cells or EC cells. It has been demonstrated that expression of MuLV is down regulated by the binding of the Embryonal LTR-Binding Protein (ELP) to its target sequence in the LTR (Tsukiyama et al., 1989). One of the mutations in the derivative Myeloproliferative sarcoma virus (MPSV) creates an Sp1 binding site not present in MuLV, which is responsible for the expansion of host cell range to both ES and EC cells (Prince et al., 1991; Grez et al., 1991). PCMV, a derivative of MPSV, was isolated due to its ability to grow in the PCC4 EC cell line, which does not support growth of MPSV. Analysis of U3 sequences reveals only two changes that may be causally associated with this increase (Hilberg et al., 1987). The binding site for ELP is altered and the change disrupts recognition of the site by this negative regulatory protein (Tsukiyama et al., 1989). Secondly, one of two 75 bp repeats present in MuLV and MPSV is deleted in PCMV. The parallels between PCMV and the Type Ia HERV-H element are striking. Both have acquired Sp1 binding sites not present in the parental element. Furthermore, both have alterations to sequences normally associated with repression of transcription and have disruption of direct repeats present in the parental element. It is likely that the wide expression pattern of Type Ia LTRs compared to other HERV-H LTRs is at least partially dependent on these sequence differences. However, other differences, such as methylation status, could also play a role in the differential expression as it does with rodent IAP elements (e.g.
see Morgan and Huang, 1984; Falzon and Kuff, 1991). A growing literature points to the importance of endogenous retroviral elements and related transposable elements in the shaping of mammalian genomes. In mice, an ancient retroviral insertion provides an androgen responsive sequence that controls the expression of the sex-limited protein gene (Stavenhagen and Robins, 1988). The rat oncomodulin gene has a promoter provided by an endogenous retroviral IAP element (Banville and Boie, 1989), as does a murine gene expressed in placenta (Chang-Yeh et al., 1991). An interesting example from the human genome involves the salivary amylase gene cluster. An endogenous retroviral insertion has activated a cryptic promoter and currently provides sequences required for normal tissue specific expression (Ting et al., 1992). There are also at least three examples of human endogenous LTRs donating promoters to cellular genes (Kato et al., 1990; Feuchter-Murthy et al., 1993; Di Cristofano et al., 1995). The finding that HERV-H LTRs contain repressors of transcription suggests an additional means by which these LTRs can affect adjacent cellular genes. In the test vector employed here, the repeats were cloned 1.6 kb away from the cellular promoter, demonstrating an effect over a considerable distance. In a similar assay, Type Ia LTRs have been shown to act as enhancers when cloned at a distance from a promoter (Feuchter and Mager, 1990). Thus, both the positive and negative regulatory sequences in the high copy number HERV-H elements have great potential to contribute to the evolution of cellular genes.

References Cited

Adams, C.C., and Workman, J.L. (1995). Binding of disparate transcriptional activators to nucleosomal DNA is inherently cooperative. Mol. Cell. Biol. 15, 1405-1421.
Adey, N.B., Schichman, S.A., Graham, D.K., Peterson, S.N., Edgell, M.H. and Hutchison, C.A. (1994).
Rodent L1 evolution has been driven by a single dominant lineage that has repeatedly acquired new transcriptional regulatory sequences. Mol. Biol. Evol. 11, 778-789.
Amariglio, N., and Rechavi, G. (1993). Insertional mutagenesis by transposable elements in the mammalian genome. Environ. Mol. Mutagen. 21, 212-218.
Andrews, P.W., Damjanov, I., Simon, D., Banting, G.S., Carlin, C., Dracopoli, N.C., and Fogh, J. (1984). Pluripotent embryonal carcinoma clones derived from the human teratocarcinoma cell line TERA-2. Lab. Invest. 50, 147-162.
Banki, K., Maceda, J., Hurley, E., Ablonczy, E., Mattson, D., Szedgedy, L., Hung, C., and Perl, A. (1992). Human T-cell lymphotropic virus (HTLV)-related endogenous sequence, HRES-1, encodes a 28-kDa protein: a possible autoantigen for HTLV-I gag-reactive autoantibodies. Proc. Natl. Acad. Sci. USA 89, 1939-1943.
Banville, D., and Boie, Y. (1989). Retroviral long terminal repeat is the promoter of the gene encoding the tumor-associated calcium-binding protein oncomodulin in the rat. J. Mol. Biol. 207, 481-490.
Bell, S.P., and Stillman, B. (1992). ATP-dependent recognition of eukaryotic origins of DNA replication by a multiprotein complex. Nature 357, 128-134.
Berg, D.E. and Howe, M.M. (Eds). (1989). Mobile DNA. Am. Society for Microbiology, Washington, D.C.
Boeke, J.D., and Corces, V.G. (1989). Transcription and reverse transcription of retrotransposons. Annu. Rev. Microbiol. 43, 403-434.
Boer, P.H., Adra, C.N., Lau, Y.F., and McBurney, M.W. (1987). The testis-specific phosphoglycerate kinase gene is a recruited retrotransposon. Mol. Cell. Biol. 7, 3107-3112.
Boller, K., Konig, H., Sauter, M., Mueller-Lantzsch, N., Lower, R., Lower, J., and Kurth, R. (1993). Evidence that HERV-K is the endogenous retroviral sequence that codes for the human teratocarcinoma derived retrovirus HTDV. Virology 196, 349-353.
Bonner, T., O'Connell, C., and Cohen, M. (1982). Cloned endogenous retroviral sequences from human DNA. Proc. Natl. Acad. Sci.
79, 4709-4713.
Brack-Werner, R., Barton, D., Werner, T., Foellmer, B., Leib-Mosch, C., Francke, U., Erfle, V., and Hehlmann, R. (1989). Human SSAV-related endogenous retroviral element: LTR-like sequence and chromosomal localization to 18q21. Genomics 4, 68-75.
Brand, A.H., Breeden, L., Abraham, J., Sternglanz, R., and Nasmyth, K. (1985). Characterization of a "Silencer" in yeast: A DNA sequence with properties opposite to those of a transcriptional enhancer. Cell 41, 41-48.
Brodsky, I., Foley, B., Haines, D., Johnston, J., Cuddy, K., and Gillespie, D. (1993). Expression of HERV-K provirus in human leukocytes. Blood 81, 2369-2374.
Bruggemeier, U., Kalff, M., Franke, S., Scheidereit, C., and Beato, M. (1991). Ubiquitous transcription factor OTF-1 mediates induction of the MMTV promoter through synergistic interaction with hormone receptors. Cell 64, 565-572.
Buetti, E. (1994). Stably integrated Mouse Mammary Tumour Virus long terminal repeat DNA requires octamer motifs for basal promoter activity. Mol. Cell. Biol. 14, 1191-1203.
Callahan, R., Drohan, W., Tronick, S., and Schlom, J. (1982). Detection and cloning of human DNA sequences related to the mouse mammary tumor virus genome. Proc. Natl. Acad. Sci. USA 79, 5503-5507.
Callahan, R., Chiu, I.-M., Tronick, S., Roe, B., Aaronson, S., and Schlom, J. (1985). A new class of human endogenous retroviral genomes. Science 228, 1208-1211.
Calvert, I., Peng, Z.Q., Kung, H.F., and Raziuddin. (1991). Cloning and characterization of a novel sequence-specific DNA-binding protein recognizing the negative regulatory element (NRE) region of the HIV-1 long terminal repeat. Gene 101, 171-176.
Ch'ang, L.-Y., Yang, W.K., Myer, F.E., and Yang, D.-M. (1989). Negative regulatory element associated with potentially functional promoter and enhancer elements in the long terminal repeats of endogenous murine leukemia virus-related proviral sequences. J. Virology 63, 2746-2757.
Chang-Yeh, A., Mold, D.E., and Huang, R.C.C. (1991).
Identification of a novel murine IAP-promoted placenta-expressed gene. Nucl. Acids Res. 19, 3667-3672.
Clark, A.R., and Docherty, K. (1993). Negative regulation of transcription in eukaryotes. Biochem. J. 296, 521-541.
Clark, A.R., Wilson, M.E., Leibiger, I., Scott, V., and Docherty, K. (1995). A silencer and adjacent positive element interact to modulate the activity of the human insulin promoter. Eur. J. Biochem. 232, 627-632.
Di Cristofano, A., Strazzullo, M., Longo, L., and La Mantia, G. (1995). Characterization and genomic mapping of the ZNF80 locus: expression of this zinc-finger gene is driven by a solitary LTR of ERV9 endogenous retroviral family. Nucl. Acids Res. 23, 2823-2830.
Falzon, M., and Kuff, E.L. (1991). Binding of the transcription factor EBP-80 mediates the methylation response of an intracisternal A-particle long terminal repeat promoter. Mol. Cell. Biol. 11, 117-125.
Feuchter, A., Freeman, D. and Mager, D. (1992). Strategy for detecting cellular transcripts promoted by human endogenous long terminal repeats: identification of a novel gene (CDC4L) with homology to yeast CDC4. Genomics 13, 1237-1246.
Feuchter, A., and Mager, D. (1990). Functional heterogeneity of a large family of human LTR-like promoters and enhancers. Nucl. Acids Res. 18, 1261-1270.
Feuchter, A., and Mager, D. (1992). SV40 large T antigen trans-activates the long terminal repeats of a large family of human endogenous retrovirus-like sequences. Virology 187, 242-250.
Feuchter-Murthy, A.E., Freeman, J.D., and Mager, D.L. (1993). Splicing of a human endogenous retrovirus to a novel phospholipase A2 related gene. Nucl. Acids Res. 21, 135-143.
Fields, C., Grady, D., and Moyzis, R. (1992). The human THE-LTR(O) and MstII interspersed repeats are subfamilies of a single widely distributed highly variable repeat family. Genomics 13, 431-436.
Franklin, G., Chretien, S., Hanson, I., Rochefort, H., May, F., and Westley, B. (1988).
Expression of human sequences related to those of mouse mammary tumour virus. J. Virol. 62, 1203-1210. Fraser, C , Humphries, R, and Mager, D. (1988). Chromosomal distribution of the RTVL-H family of human endogenous retrovirus-like sequences. Genomics 2, 280-287. Frenkel, B., Montecino, M., Stein, J.L., Lian, J.B., and Stein, G.S. (1994). A composite intragenic silencer domain exhibits negative and positive transcriptional control of the bone-specific osteocalcin gene: promoter and cell type requirements. Proc. Natl. Acad. Sci. 91, 10923-10927. Gattoni-Celli, S., Kirsch, K., Kalled, S., and Isselbacher, K. (1986). Expression of Type-C related human endogenous retoviral sequences in human colon tumours and colon cancer cell lines. Proc. Natl. Acad. Sci. 83, 6127-6131. 77 Gegonne, A., Bosselut, R., Bailly, R.A., and Ghysdael, J . (1993). Synergistic activation of the HTLV-1 LTR ETS-responsive region by transcription factors ETS1 and Sp1. EMBO J12, 1169-1198. Goodchild, N.L., Freeman, J.D., and Mager, D.L. (1995). Spliced H E R V - H endogenous retroviral sequences in human genomic DNA: evidence for amplification via retrotransposition. Virology 206, 164-173. Goodchild, N.L, Wilkinson, D., and Mager, D. (1993). Recent evolutionary expansion of a subfamily of RTVL-H human endogenous retrovirus-like elements. Virology 196, 778-788. Goodchild, N.L., Wilkinson, D.A., and Mager, D.L. (1992). A human endogenous retrovirus provides a polyadenylation signal to a novel, alternatively spliced transcript in normal placenta. Gene 121, 287-294. Gorman, C.M. , Rigby, P.W.J., and Lane, D P . (1985). Negative regulation of viral enhancers in undifferentiated embryonic stem cells. Cell 42, 519-526. Graham, F.L., and Van der Eb, A .J . (1973). A new technique for the assay of infectivity of human adenovirus 5 DNA. Virology 52, 456-467. Gray, S., Szymanski, P., and Levine, M. (1994). Short range repression permits multiple enhancers to function autonomously within a complex promter. 
Gene. Dev. 8, 1829-1838. Grez, M., Zornig, M., Nowock, J . , and Ziegler, M. (1991). A single point mutation activates the Moloney Murine Leukemia Virus long terminal repeat in embryonal stem cells. J. Virol. 65, 4691-4698. Hagen, G. , Muller, S., Beato, M. and Suske, G. (1992). Cloning by recognition site screening of two novel GT box binding proteins: a family of Sp1 related genes. Nucl. Acids Res. 20, 5519-5525. Haltmeier, M., Siefarth, W., Blusch, J . , Erfle, V., Hehlmann, R., and Leib-Mosch, C. (1995). Identification of S71 related human endogenous retroviral sequences with full-length pol genes. Virology 209, 550-560. Harada, F., Tsukada, N., and Kato, N. (1987). Isolation of three kinds of human endogenous retrovirus-like sequences using t R N A P r o as a probe. Nucl. Acids Res. 15, 9153-9162. 78 Hartig, E., Nierlich, B., Mink, S., Nebl, G., and Cato, A.C.B. (1993). Regulation of expression of Mouse Mammary Tumour Virus through sequences located in the hormone response element: Involvement of cell-cell contact and a negative regulatory factor. J.Virol. 67, 813-821. Henthorn, P., Zervos, P., Raducha, M., Harris, H., and Kadesch, T. (1988). Expression of a human placental alkaline phosphatase gene in transfected cells: Use as a reporter for studies of gene expression. Proc. Natl. Acad. Sci. USA. 85, 6342-6346. Hilberg, F., Stocking, C., Ostertag, W., and Grez, W. (1987). Functional analysis of a retroviral host-range mutant: Altered long terminal repeat sequences allow expression in embryonal carcinoma cells. Proc. Natl. Acad. Sci. 84, 5232-5236. Hirose, Y., Takamatsu, M., and Harada, F. (1993). Presence of env genes in members of the RTVL-H family of human endogenous retrovirus-like elements. Virology192, 52-61. Hobbs, H.H., Brown, M.S., Goldstein, J.L., and Russel, S.W. (1986). Deletion of exons encoding cysteine rich repeat of low density lipoprotein receptor alters its binding specificity in a subject with familial hypercholesterolemia. J. Biol. Chem. 
261, 13114-13120. Horsthemke, B., Beisiegel, U., Dunning, A., Havinga, J.R., Williamson, R., and Humphries, S. (1987). Unequal crossing-over between two Alu-repetative DNA sequences in the low-density-lipoprotein receptor gene. Eur. J. Biochem. 164, 77-81. Johansen, T., Holm, T., and Bjorklid, R., (1989). Members of the RTVL-H family of human endogenous retrovirus-like elements are expressed in placenta. Gene 79, 259-267. Johnson, A.D. (1995). The price of repression. Cell81, 655-658. Johnson, P. M., Lyden, T. W., and Mwenda, J. M., (1990). Endogenous retroviral expression in the human placenta. Amer. J. Reproduct. Immunol. 23, 115-120. Kadesch, T., and Berg, P. (1986). Effects of the position of the simian virus 40 enhancer on expression of multiple transcription units in a single plasmid. Mol. Cell. Biol. 6, 2593-2601. Kang, S.H., Brown, D.A., Kitajima, I., Xu, X., Heidenreich, O., Gryaznov, S., and Nerenberg, M. (1996). Binding and functional effects of transcription factor Sp1 on the murine interleukin-6 promoter. J. Biol. Chem. 271, 7330-7335. 79 Kadesch, T., Zervos, P., and Ruezinsky, D. (1986). Functional analysis of the murine IgH enhancer: Evidence for negative control of cell type specificity. Nucl. Acids. Res. 14, 8209-8221. Kannan, P., Buettner, R., Pratt, D., and Tainsky, M. (1991). Identification of a retinoic acid inducible endogenous retroviral transcript in the human teratocarcinoma-derived cell line PA-1. J. Virol. 65, 6343-6348. Kass, D.H., Batzer, M.A., and Deininger, P.L. (1995). Gene conversion as a secondary mechanism of short interspersed element (SINE) evolution. Mol. Cell. Biol. 15, 19-25. Kato, N., Shimotohno, K., Van Leeuwen, D., and Cohen, M. (1990). Human proviral mRNAs down regulated in choriocarcinoma encode a zinc finger protein related to Kruppel. Mol. Cell. Biol. 10, 4401-4405. Kazazian Jr., H., Wong, C , Youssoufian, H., Scott, A., Phillips, D., Antonarakis, S., (1988). 
Haemophilia A resulting from de novo insertion of L1 sequences represents a novel mechanism for mutation in man. Nature 332, 164-166. Kingsley, C , Winoto, A. (1992). Cloning of GT box-binding proteins: A novel Sp1 multigene family regulating T-cell receptor gene expression. Mol. Cell. Bio. 12, 4251-4261. Kroger, B., and Horak, I. (1987). Isolation of novel human retrovirus related sequences by hybridization to synthetic oligonucleotides complementary to the tRNAP r o primer binding site. J. Virol. 61, 2071-2075. La Mantia, g., Maglione, D., Pengue, G., Di Cristofano, A., Simeone, A., Lanfrancone, L., and Lania, L., (1991). Identification and characterization of novel human endogenous retroviral sequences preferentially expressed in undifferentiated enbryonal carcinoma cells. Nucl. Acids Res. 19,1513-1520. La Mantia, G., Majello, B., Di Cristofano, A., Strazzullo, M., Minchiotti, G., and Lania, L. (1992). Identification of regulatory elements within the minimal promoter region of the human endogenous ERV9 proviruses: accurate transcription initiation is controlled by an Inr-like element. Nucl. Acids Res. 20, 4129-4136. Lania, L., Di Cristifano, A., Strazzullo, M., Pengue, G., Majello, B., and La Mantia, G. (1992). Structure and functional organisation of the human endogenous retroviral ERV-9 sequences. Virology 191, 464-468. Larsson, E., Kato, N., and Cohen, M. (1989). Human endogenous proviruses. Curr. Top. Microbiol. Immunol. 148, 115-132. 80 Lehrman, M. A., Goldstein, J. L, Russell, D. W., and Brown, M. S., (1987). Duplication of seven exons in LDL receptor gene caused by Alu-Alu recombination in a subject with familial hypercholesterolemia. Cell 48, 827-835. Leib-Mosch, C., Brack, R., Werner, T., Erfle, V., and Hehlmann, R. (1986). Isolation of an SSAV-related endogenous sequence from human DNA. Virology 155, 666-677. Leib-Mosch, C., Bachmann, M., Geigl, E. M., Brack-Werner, R., Werner, T., Erfle, V., and Hehlmann, R., (1992). 
Expression of S71-related sequences in human cells. Haematology and Blood Transfusion 35, 256-259. Leib-Mosch, C., Haltmeier, M.,Werner, T., Geigl, E. M., Brack-Werner, R., Francke, U., Erfle, V., and Hehlmann, R. (1993). Genomic distribution and transcription of solitary HERV-K LTRs. Genomics 18, 261-269. Leone, T.C., Cresci, S., Carter, M.C., Zhang, Z., Lala, D.S., Strauss, A.W. and Kelly, DP. (1995). The human medium chain acyl-CoA dehydrogenase gene promoter consists of a complex arrangment of nuclear receptor response elements and Sp1 binding sites. J. Biol. Chem. 270, 16308-16314. Levy, L.S., Lobelle-Rich, P.A., Elder, J.H.,Payne, S.,and Montelaro, R.C. (1990). An unusual retrovirus-like sequence identified in human DNA. J. Gen. Virol. 71,1613-1618. Li, M.D., Bronson, D.L., Lemke, T.D., and Faras, A.J. (1995). Restricted expression of new HERV-K members in human teratocarcinoma cell lines. Virology 208, 733-741. Liu, A., and Abraham, B., (1991). Expression of a hybrid human endogenous reetrovirus and calbindin gene in a prostrate cell line. Cancer Res. 51, 4107-4110. Lower, R., Boiler, K., Hasenmaier, B., Korbmacher, C , Muller-Lantzsch, N., Lower, J., and Kurth, R., (1993). Identification of human endogenous retroviruses with complex mRNA expression and particle formation. Proc. Natl. Acad. Sci. USA 90, 4480-4484. MacDonald, P., Ingram, P., and Struhl, G. (1986). Isolation, structure and expression of even-skipped:: A second pair-rule gene of Drosophilia containing a homeo box. Cell 47, 721-734. Maeda, N., (1985). Nucleotide sequence of the haptoglobin and haptoglobin-related gene pair. J. Bio. Chem. 260, 6698-6703. Maeda, N., McEvoy, S., Harris, H., Huisman, T., and Smithies, O., (1986). Polymorphisms in the human haptoglobin gene cluster: chromosomes with multiplt haptoglobin-related (Hpr) genes. Proc. Natl. Acad. Sci. USA 83, 7395-7404. 81 Maeda, N., and Kim, H. 
S., (1990), Three independent insertions of retrovirus-like sequences in the haptoglobin gene cluster of primates. Genomics 8, 671-683. Mager, D., and Goodchild, N. (1989). Homologous recombination between the LTRs of a human retrovirus-like element causes a 5-kb deletion in two siblings. Am. J. Hum. Genet. 45, 848-854. Mager, D.L. (1989). Polyadenylation function and sequence variability of the long terminal repeats of the human endogenous retrovirus-like family RTVL-H. Virology 173, 591-599. Mager, D.L., and Freeman, D. (1987). Human endogenous retroviruslike genome with type C pol sequences and gag sequences related to human T-cell lymphotropic viruses. J. Virol. 61, 4060-4066. Mager, D.L., and Henthorn, P. S. (1984). Identification of a retrovirus-like repetitive element in human DNA. Proc. Natl. Acad. Sci. USA 81, 7510-7514. Mager, D., Henthorn, P., and Smithies, O., (1985). A Chinese G +(A B)° thalassemia deletion: comparison to other deletions in the B-globin cluster and sequence analysis of the breakpoints. Nucleic Acids Res. 13,6559-6575. Majello, B., De Luca, P., Hagen, G., Suske, G., and Lania, L. (1994). Different members of the Sp1 multigene family exert opposite transcriptional regulation of the long terminal repeat of HIV-1. Nucl. Acids Res. 22, 4914-4921. Marked, M. L., Hutton, J. J., Wiginton, D. A., States, J. C , and Kaufman, R. E., (1988). Adenosine deaminase (ADA) deficiency due to deletion of the ADA gene promoter and first exon by homologous recombination between two Alu elements. J. Clin. Invest. 81, 1323-1327. Martin, M., Bryan, T, Rasheed, S., and Khan, A. (1981). Identification and cloning of endogenous retroviral sequences present in human DNA. Proc. Natl. Acad. Sci. 78, 4892-4896. McCarrey, J.R., and Thomas, K., (1987). Human testis-specific PGK gene lacks introns and posseses charateristics of a processed pseudogene. Nature 326, 501-505. Medstrand, P., Linderskog, M., and Blomberg, J., (1992). 
Expression of human endogenouos retroviral sequences in peripheral blood mononuclear cells of healthy individuals. J. Gen. Virology 73, 2463-2468. Miki, Y., Nishisho, I., Horii, A., Miyoshi, Y., Utsunomiya, J., Kinzler, K., Vogelstein, B., and Nakamura, Y. (1992). Disruption of the APC gene by the retrotransposal insertion of L1 sequence in a colon cancer. Cancer Res. 52, 643-645. 82 Minakami, R., Kurose, K., Etoh, K., Furuhata, Y., Hattori, M., and Sakaki, Y. (1992). Identification of an internal cis-element essential for the human L1 transcription and a nuclear factor(s) binding to the element. Nucl. Acids Res. 20, 3139-3145. Morgan, R.A., and Huang, R.C. (1984). Correlation of undermethylation of intracisternal A-particle genes with expression in murine plasmacytomas but not in NIH/3T3 embryo fibroblasts. Cancer Res. 44, 5234-5241. Morse, B., Rotherg, P., South, V., Spandorfer, J., and Astrin, S., (1988). Insertional mutagenesis of the myc locus by a LINE-1 sequence in a human breast carcinoma. Nature 333, 87-90. Mueller-Lantzsch, N., Sauter, M., Weiskircher, A., Kramer, K., Best. B., Buck, M., and Grasser, F., (1993). The human endogenous retroviral element K10 (HERV-K10) encodes for a full length gag homologous 73kD protein and a functional protease. AIDS Res. Human Retrovir. 9, 343-351. Nakamura,.N.,Sugino, H.Takahara, K., Jin, C , Fukushige, S., and Matsubara, K. (1991). Endogenous retroviral LTR DNA sequences as markers for individual human chromosomes. Cytogenet. Cell. Genet. 57, 18-22. Norris, E.S., and Woodruff, R.C. (1992). Visible mutations induced by P-M hybrid dysgenesis in Drosophila Melanogaster result predominantly from P element insertions. Mutat. Res. 269, 63-72. O'Connell, C , O'Brien, S., Nash, W., and Cohen, M., (1984). ERV3, a full-length human endogenous provirus: chromosomal localization and evolutionary relationships. Virology 138, 225-235. Ono, M., (1986). 
Molecular cloning and long terminal repeat sequences of human endogenous retrovirus genes related to types A and B retrovirus genes. J. Virology 58, 937-944. Ono, M., Yasunaga, T., Miyata, T., and Ushikubo, H., (1986). Nucleotide sequence of human endogenous retrovirus genome related to the mouse mammary tumor virus genome. J. Virology 60, 589-598. Ono, M., Kawakami, M., and Takezawa, T., (1987). A novel human nonviral retroposon derived from an endogenous retrovirus. Nucl. Acids Res. 15, 8725-8737. Ou, S.-H.l., and Gaynor, R.B. (1994). Intracellular factors involved in gene expression of human retroviruses, in: "The Retroviridae, Vol4' (J. Levy, ed.) Plenum Press, NY. pp. 97-184. 83 Paulson, K., Deka, N., Schmid, C , Misra, R., Schindler, C , Rush, M., Kadyk, L, and Leinwand, L, (1985). A transposon-like element in human DNA. Nature 316, 359-361. Paulson, K.E., Matera, A., Deka, N., and Schmid, C.W., (1987). Transcription of a human transposon-like sequence is usually directed by other promoters. Nucl. Acids Res. 15, 5199-5215. Perl, A., Rosenblatt, J., Chen, I., DiVincenzo, J., Bever, R., Poiesz, J., and Abraham, G. (1989). Detection and cloning of new HTLV-related endogenous sequences in man. Nucl. Acids Res. 17, 6841-6854. Perl, A., Colombo, E., Dai, H., Agarwal, R., Mark, K.A., Banki, K., Poiesz, J., Phillips, P.E., Hoch, S.O., Reveille, J.D., and Abraham, G. (1995). Antibody reactive to HRES-1 endogenous retroviral element identifies a subset of patients with lupus erythematosus and overlap syndromes. Correleation with anti-nuclear antibodies and HLA class II alleles. Arthritis Rheum. 38, 1660-1671. Prince, V.E. and Rigby, P.W.J. (1991). Derivatives of Moloney Murine Sarcoma Virus capable of being transcribed in embryonal carcinoma stem cells have gained a funtional Sp1 binding site. J. Virol. 65, 1803-1811. Rabson, A., Steele, P., Garon, C , and Martin, M., (1983). mRNA transcripts related to full-length endogenous retroviral DNA in human cells. 
Nature 306, 604-607. Rabson, A., Hamagishi, Y., Steele, P., Tykocinse, M., and Martin, M., (1985). Characterization of human endogenous retroviral envelope RNA transcripts. J. Virology 56, 176-182. Renkawitz, R. (1990). Transcriptional repression in eukaryotes. Trend. Genet. 6, 192-197. Repaske, R., Steele, P., O'Neill, R., Rabson, A., and Martin, M. (1985). Nucleotide sequence of a full-length human endogenous retroviral segment. J. Virolorgy 54, 764-772. Repaske, R., O'Neill, R., Steele, P. and Martin, M. (1983). Characterization and partial nucleotide sequence of endogenous type C retrovirus segments in human chromosomal DNA. Proc. Natl. Acad. Sci. USA. 80, 678-682. Robinson, G. L, Henderson, E., Massari, M.E., Murre, C , and Stein, R. (1995). C-jun inhibits insulin control element-mediated transcription by affecting the transactivation potential of the E2A gene products. Mol. Cell. Biol. 15, 1398-1404. 84 Rouyer, F., Simmler, M.C., Page, D.C., and Weissenbach, J. (1987). A sex chromosome rearrangment in a human XX male caused by Alu-Alu recombination. Ce//51, 417-425. Rubin, G.M., Kidwell, M.G., and Bingham, P.M. (1982). The molecular basis of P-M hybrid dysgenesis: the nature of induced mutations. Cell 29, 987-994. Sambrook, J., Fritsch, E.F., and Maniatis, T. (1989). Molecular cloning: A laboratory manual, 2nd ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY. Samuelson, L, Wiebauer, K., Snow, C , and Meisler, M., (1990). Retroviral and pseudogene insertion sites reveal the lineage of human salivary and pancreatic amylase genes from a single gene during primate evolution. Mol.Cell. Biol. 10, 2513-2520. Schneider, R., and Will, H., (1991). Regulatory sequences of duck hepatitis B virus C gene transcription. J. Virol. 65, 5693-5701. Seifarth, W., Skladny, H., Kreig-Schneider, F., Reichtert, A., Hehlmann, R., and Leib-Mosch, C. (1995). 
Retrovirus-like particles released from the human breast cancer cell line T47-D display type B and C related endogenous retroviral sequences. J Virol. 69, 6408-6416. Shore, D., and Nasmyth, K. (1987). Purification and cloning of a DNA binding protein from yeast that binds to both silencer and activator elements. Cell 51, 721-732. Shih, A., Coutavas, E.E., and Rush, M.G. (1991). Evolutionary implications of primate endogenous retroviruses. Virology\82, 495-502. Silver, J., Rabson, A., Bryan, T., Willey, R., and Martin, M. (1987).Human retroviral sequences on the Y chromosome. Mol. Cell. Biol. 7, 1559-1562. Small, S., Kraut, R., Hoey, T., Warrior, R., a.nri Levine, M. (1991). Transcriptional regulation of a pair-rule stripe in Drosophilia. Gene. Dev. 5, 827-839. Stavenhagen, J.B., and Robins, D.M. (1988). An ancient provirus has imposed androgen regulation on the adjacent mouse sex-limited protein gene. Cell 55, 247-254. Steele, P., Rabson, A., Bryan, T., Martin, M., (1984). Distinctive termini characterize two families of human endogenous retroviral sequences. Science 225, 943-947. Steele, P., Martin, M., Rabson, A., Bryan, t., and O'Brien, S., (1986). Amplification and chromosomal dispersion of human endogenous retroviral sequences. J. Virology 59, 545-550. 85 Steinhuber, S., Brack, M.f Hunsmann, G., Schwelberger, H., Dierich, MP., and Vogetseder, W. (1995) Distribution of human endogenous retrovirus HERV-K genomes in humans and different primates. Hum. Genet. 96, 188-192. Stoye, J.P., Fenner, S., Greenoak, G.E., Moran, C., and Coffin, J.M. (1988). Role of endogenous retroviruses as mutagenes: the hairless mutation of mice. Cell 54, 383-391. Strazzullo, M., Majello, B., Lania, L., and La Mantia, G. (1994) Mutational analysis of the human endogenous ERV-9 provirus promoter region. Virology 200, 686-695. Sugino, H., Oshimura, S., and Matsubara, K. (1992). Banding profiles of LTR of human endogenous retrovirus HERV-A in 24 chromosomes in somatic cell hybrids. 
Genomics 13, 461-464. Suzuki, H., Hosokawa, Y., Toda, H., Nishiimi, M., and Ozawa., (1990). Common protein-binding sites in the 5'-flanking regions of human genes for cytochrome C1 and ubiquinone-binding protein. J. Biol. Chem. 256, 8159-8163. Swergold, G.D. (1990) Identification, characterization, and cell specificity of a human LINE-1 promoter. Mol. Cell. Biol. 10, 6718-6729. Taruscio, D., and Manuelidis, L. (1991). Integration site preferences of endogenous retroviruses. Chromosoma'\0'\, 141-156. Tchenio, T., and Heidmann, T., (1991). Defective retroviruses can disperse in the human genome by intracellular transposition. J. Virology 65, 2113-2118. Ting, C.-N., Rosenberg, MP., Snow, CM. , Samuelson, L.C., and Meisler, M.H. (1992). Endogenous retroviral sequences are required for tissue-specific expression of a human salivary amylase gene. Genes and Dev. 6, 1457-1465. Tomita, N., Horii, A., Doi, S., Yokouchi, H., Ogawa, M., Mori, T., and Matsubara, K. (1990). Transcription of human endogenous retroviral long terminal repeat (LTR) sequence in a lung cancer cell line. Bioch. Biophys. Res. Comm. 166, 1-10. Tomlin, N.V., Bozhkov, V.M., Bradbury, E.M. and Schmid, C.W. (1992). Differential binding of human nuclear proteins to Alu subfamilies. Nucl. Acids Res. 20, 2941-2945. Toonen, R.F., Gowan, S., and Bingle, CD. (1996). The lung enriched transcription factor TTF-1 and the ubiquitiously expressed proteins Sp1 and Sp3 interact with elements located in the minimal promoter of the rat Clara secretory protein gene. J. Biochem. 316, 467-473. 86 Tsukiyama, T., Niwa, O., and Yokoro, K. (1989). Mechanism of suppression of the long terminal repeat of Moloney Leukemia Virus in mouse embryonal carcinoma cells. Mol. Cell. Biol. 9, 4670-4676. Ullu, E., and Tschudi, C. (1984). Alu sequences are processed 7SL RNA genes. Nature 312, 171-172. Varmus, H., and Brown, P. (1989). Retroviruses, in: Mobile DNA, Berg, D.E., and Howe, M.M. (Eds). 
American Society for Microbiology, Washinton D.C. pp53-108. Westin, G., and Schaffner, W. (1988). Heavy metal ions in transcription factors from HeLa cells: Sp1 but not octamer transcription factor requires zinc for DNA binding and for activator function. Nucleic Acids Res. 16, 5771-5781. Wilkinson, D. A., (1993). Expression of the RTVL-H family of human endogenous retrovirus-like sequences. Ph.D. Thesis, University of British Columbia. Wilkinson, D.A., Freeman, J.D., Goodchild, N.L., Kelleher, C.A., and Mager, D.L. (1990). Autonomous expression of RTVL-H endogenous retroviruslike elements in human cells. J. Virol. 64, 2157-2167. Wilkinson, D.A., Goodchild, N.L., Saxton, T.M., Wood, S., and Mager, D.L. (1993). Evidence for a functional subclass of the RTVL-H family of human endogenous retrovirus-like sequences. J. Virol. 67, 2981-2989. Wilkinson, D.A., Mager, D.L., and Leong, J.C. (1994). Endogenous human retroviruses, in: "The Retroviridae, Vol3' (J. Levy, ed.) Plenum Press, NY. pp. 465-535. Xu, X., Brown, D.A., Kitajima, A., Bilakovics, J., Fey, L.W., and Nerenberg, M.I. (1994). Transcriptional suppression of the human T-cell leukemia virus type I long terminal repeat occurs by an unconventional interaction of a CREB factor with the R region. Mol.Cell. Biol. 14, 5371-5383. Yeh, K. -W., Yen, C. -P., Liu, J. - C , Feng, Y. -N., Wu, F. Y. -H., Yang, W. -K., and Wu, C. W., (1991). Isolation of a cDNA clone of human endogenous retrovirus (HERV) from human cancer cell line. FASEB J. 5, A884. Zeichner, S.L., Kim, J.Y., and Alwine, J.C. (1991). Linker-scanning mutational analysis of the transcriptional activity of the human immunodeficiency virus type 1 long terminal repeat. Virology 65, 2436-2444. Zucchi, I., and Schlessinger, D. (1992). Distribution of moderately repetative sequences pTR5 and LF1 in Xq24-q28 human DNA and their use in assembling YAC contigs. Genomics 12, 264. 

