UBC Theses and Dissertations



A Drosophila tRNA gene family Newton, Craig Hunter 1989

A DROSOPHILA tRNA GENE FAMILY

By Craig Hunter Newton
B.Sc., McGill University, 1980
M.Sc., University of British Columbia, 1984

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE FACULTY OF GRADUATE STUDIES, Department of Biochemistry

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
© C. H. Newton 1989

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

The University of British Columbia, Vancouver, Canada

Abstract

This thesis describes a tRNA^Arg gene family in the fruit fly D. melanogaster. The study was initiated in order to better understand the gene organization of a subset of this family. Of a total of 10 tRNA^Arg gene copies that comprise this family, four genes are arranged tandemly on repeated sequences 200 bp and 600 bp in length. This organization suggests these four genes have undergone recent gene duplication. To investigate the significance of such events, these tRNA^Arg genes were compared to other members of this gene family with regard to their structure, in vitro function, and organization between different D. melanogaster strains and sibling species. The results show that the four repeated genes differ in sequence at a single nucleotide (C13) relative to six additional gene copies. Five of these additional genes are identical in sequence and one differs at two nucleotides (A16, A37). The gene family is organized at four different chromosomal sites.
Six of the 10 gene copies occur at polytene region 12E1-2 on the X chromosome. These include the four repeated genes (R12.1-R12.4) and two additional gene copies (R12.5-R12.6). A single gene occurs at another X-linked locus located at 19F (R19.1). The three remaining gene copies occur on chromosome 3R as a gene doublet at 85C (R85.1-R85.2) and as a single gene at 83AB (R83.1). All 10 genes in this family are active as templates for in vitro transcription in homologous Drosophila extracts. The predicted 5' initiation sites are all very similar and occur at conserved nucleotides 4-5 bp upstream from the mature 5' ends of the genes. Six of the ten gene copies are transcribed efficiently in vitro. The four repeated gene copies, however, have novel transcription properties: they are much less efficient templates in Drosophila extracts (2-5 fold) and are inhibited at KCl concentrations where other gene copies transcribe optimally. These properties do not result from the single nucleotide difference in coding sequence or an inability to form stable pre-initiation complexes. Instead they appear to result from the upstream 5' flanking sequence. These novel transcription properties are not observed in heterologous transcription systems containing human cell extracts. Instead, the repeated genes are transcribed efficiently relative to other members of the gene family and no longer exhibit sensitivity to KCl. Comparison of the repeated tRNA^Arg gene locus between several D. melanogaster strains indicates that this is the predominant form in the majority of wild and laboratory stocks. However, a fraction of populations (5/45) contain a variant locus that consists of only three gene copies. These have apparently lost, or not gained, one of the repeats found in the majority of populations. Similar comparisons between D. melanogaster sibling species (D. simulans, D. teissieri, D. erecta, and D. yakuba) show that only the D. melanogaster lines contain the repeated genes.
Thus in the time since the divergence from its closest related sibling species (D. simulans), D. melanogaster lines have acquired three new gene copies by duplication of a putative ancestral single gene. Analysis of the four, three, and single gene loci isolated from D. melanogaster (pDt27R, p27ry2) and D. simulans (p27simC), respectively, at the nucleotide level suggests a model to account for the evolution of this gene cluster. One additional sequence associated with this gene family consists of a half tRNA^Arg gene composed of the 3' 37 bp. This half gene contains a 3' CCA sequence and is flanked on one side by a region that is > 90% homologous to the LTR of the Drosophila retrotransposon mdg 1. It is not clear if this half gene is itself part of a related retrotransposon or has originated by recombination or aberrant reverse transcription with mdg 1. The 3' ends of tRNA^Arg have been proposed to function as primers for at least two classes of retrotransposon, mdg 1 and 412 (Yuki et al., 1986).

TABLE OF CONTENTS

Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgements
Dedication
Abbreviations

INTRODUCTION
1. Background
2. Redundant gene families encode tRNA
3. Organization of tRNA gene families
3.1 Genomic organization in D. melanogaster
3.2 tRNA gene organization in other eucaryotes
3.3 Exceptional organization of tRNA gene families
3.4 Other classes of genes associated with tRNA genes
4. Structure of tRNA gene families
4.1 Homogeneity of tRNA gene families
4.2 tRNA pseudogenes
4.3 Sequences flanking members of tRNA gene families
4.4 The evolution of tRNA gene families
5. Function of tRNA gene families
5.1 Structure/function relationships in tRNA gene expression
5.2 In vitro activity of gene copies within a tRNA gene family
5.3 Significance of variable tRNA gene activity within a family
6. Thesis study: a tRNA^Arg gene family in D. melanogaster

MATERIALS AND METHODS
Materials
- Enzymes
- Nucleotides
- Oligonucleotides
- Autoradiography and photography
- Media components
- Electrophoresis reagents
- Organic reagents
- Microbial strains
- Drosophila strains and sibling species
- Plasmid DNAs
- Transfer RNAs
Methods
1. Preparation of plasmid DNA
2. Preparation of single stranded DNA from pEMBL plasmids
3. Preparation of Drosophila genomic DNA
4. General nucleic acid techniques
- Precipitation of nucleic acids with ethanol
- Elution of DNA fragments from agarose gels
- De-phosphorylation with Calf Intestinal Phosphatase
- Filling in 5' single stranded ends with DNA Polymerase I (Klenow enzyme)
- Ligations with T4 DNA Ligase
- Digestion with Exonuclease III
5. Radioactive labelling of nucleic acids with 32P
- Nick translation of DNA fragments with DNA Polymerase I
- Labeling of single strand RNA with T7 RNA Polymerase
- 3' end labeling of tRNA with T4 RNA Ligase
- 5' end labeling of synthetic oligonucleotides with T4 Polynucleotide Kinase
6. Filter hybridizations
- [32P] DNA/DNA
- [32P] RNA/DNA
7. Cloning of size fractionated DNA into pEMBL mini-libraries
8. DNA sequencing with pEMBL vectors
9. In vitro transcription of tRNA genes
10. Analysis of in vitro transcription products
- RNase T1 fingerprinting
- 5' end analysis by primer extension with Reverse Transcriptase
11. Plasmid constructions
- pArg
- pA27
- pR12.4
- pR12.2
- pR12.5
- pR85.1
- pR85.2
- pR83.1
- pR19.1
- pR5'/3'
- pR12.4 T13

RESULTS AND DISCUSSION
Part I - tRNA^Arg gene family in D. melanogaster
1. Genomic organization of tRNA gene family
1.1 In situ hybridization analysis
1.2 Genomic Southern analysis
2. Molecular analysis of tRNA^Arg gene family
2.1 Identification and isolation of recombinant clones
- pDt67R, pDt66R, pDt72R
- pDt17R, pDt85C
- pR12.6
2.2 Organization and structure of the tRNA^Arg gene family
- 12E1-2 region
- 19F region
- 85C region
- 83AB region
2.3 Comparison of tRNA flanking sequences
3. Summary of tRNA^Arg gene family in D. melanogaster
Part II - Evolution of pDt27R gene cluster
1. Analysis of the pDt27R locus in D. melanogaster and sibling species
2. Genomic Southern analysis of pDt27R homologous loci
2.1 Survey of D. melanogaster strains
2.2 Survey of D. melanogaster sibling species
3. Molecular analysis of variant pDt27R loci in D. melanogaster and D. simulans
3.1 Nucleotide sequence of p27ry2 and p27simC
3.2 Junctions of repeated sequences in pDt27R, p27ry2 and p27simC
3.3 Divergence of repeated sequences in pDt27R from p27simC
4. A model for evolution of p27ry2 and pDt27R tRNA^Arg gene clusters
5. Relation of D. simulans and D. melanogaster to other Drosophila sibling species
Part III - Functional studies of the tRNA^Arg gene family
1. In vitro transcription of tRNA^Arg gene family
1.1 Gene products
1.2 RNase T1 fingerprinting of in vitro transcripts
1.3 Initiation and termination of in vitro transcripts
1.4 Transcription efficiency of different gene copies
1.5 Novel properties of pDt27R genes
- Coding and flanking sequence dependence
- Template pre-incubation assays
- Extract dependence: Drosophila versus HeLa cell extracts
1.5 Summary of in vitro transcription of tRNA^Arg gene family
2. In vivo expression: identity of gene products
Part IV - A tRNA^Arg pseudogene or retrotransposon?

CONCLUSIONS AND PERSPECTIVES
REFERENCES
APPENDIX

LIST OF TABLES

Table 1. Summary of plasmids containing tRNA^Arg genes
Table 2. D. melanogaster strains and sibling species used to study homologous pDt27R loci
Table 3. Pairwise comparison of repeated sequences in pDt27R with single copy sequences in p27simC
Table 4. Summary of in vitro transcription efficiency of tRNA^Arg gene family

LIST OF FIGURES

Figure 1. Summary of structure of plasmid pDt27R
Figure 2. In situ hybridization of tRNA^Arg genes in D. melanogaster
Figure 3. Genomic Southern analysis of tRNA^Arg genes in D. melanogaster
Figure 4. Summary of tRNA^Arg gene family
Figure 5. Cloverleaf structure of the predicted tRNA^Arg gene products
Figure 6. Comparison of the tRNA^Arg gene flanking sequences
Figure 7. Summary of genomic Southern analysis of different D. melanogaster strains and Drosophila sibling species
Figure 8. Restriction maps of Bam HI sites in pDt27R, p27ry2, and p27simC
Figure 9. Sequence comparison of tRNA^Arg gene clusters in pDt27R, p27ry2, and p27simC
Figure 10. Junction sequences of duplicated regions in pDt27R and p27simC
Figure 11. Sequence divergence between repeated regions in D. melanogaster and D. simulans
Figure 12. Model for the evolution of the pDt27R locus
Figure 13. In vitro transcription products of the tRNA^Arg gene family in Drosophila cell extracts
Figure 14. RNase T1 fingerprints of in vitro transcription products
Figure 15. Primer extension analysis of tRNA^Arg transcripts synthesized in vitro
Figure 16. Transcription efficiency of tRNA^Arg templates
Figure 17. KCl optima for in vitro transcription
Figure 18. Stable complex formation of pDt27R genes
Figure 19. Comparison of Drosophila and human cell (HeLa) extracts on in vitro transcription of pDt27R gene
Figure 20. Comparison of tRNA^Arg genes in HeLa cell extracts as function of KCl
Figure 21. Hybridization of purified tRNA4^Arg to tRNA^Arg plasmid DNAs
Figure 22. Structure of tRNA^Arg gene in pDt72R
Figure 23. Comparison of pDt72R to sequence of mdg 1 LTR

ACKNOWLEDGMENTS

I wish to thank Gordon Tener, Shizu Hayashi, Ian Gillam, Jeffrey Leung, Don Sinclair, Nir Seto, Roland Russnak, Marlys Kochinsky, Rob Kay, Ron Mackay, Ferydoun Sajjadi, George Speigelman, Tony Griffiths, Hugh Brock, Dave Holm, Caroline Astell, Ross McGillivray, and Joan McPherson for generously giving their time and help to me with this venture. I am particularly indebted to the financial support provided by Gordon Tener, who patiently oversaw this research. I thank Shizu and Ian, respectively, for doing the in situ hybridization analysis and providing the purified tRNAs, which together helped glue all this work together. I wish to also thank Patrick Dennis and members of his laboratory, Lawrence Shimmin, Willa Downing, Peter Durovic, Phalgun Joshi, Bruce May, Janet Ye and Deidre De Jong Wong, for putting up with me during the gestation period of this thesis. Special thanks must go to Lawrence Shimmin for his infinite patience with those of us who must be shown the same thing several times in the use of a microcomputer or fixing a motor scooter.
DEDICATION

This thesis is dedicated to the memory of John Ramsey Hunter Wells

ABBREVIATIONS

A: adenosine
A260: absorbance at 260 nm (or other wavelengths)
ATP: adenosine-5' triphosphate
bp: basepairs
BSA: bovine serum albumin
BPB: bromophenol blue
C: cytidine
Cp: cytidine-2'(3') phosphate
cpm: counts per minute (Cerenkov radiation)
DNA: deoxyribonucleic acid
dNTP: 2' deoxyribonucleoside triphosphate (where N = any of the four nucleosides A, G, C, or T)
DTT: 1,4-dithiothreitol
DMSO: dimethyl sulfoxide
DNase I: bovine pancreatic deoxyribonuclease
EDTA: ethylenediaminetetraacetic acid
EtBr: ethidium bromide
G: guanosine
g: gravity
HEPES: N-2-hydroxyethylpiperazine-N'-2-ethanesulfonic acid
ICR: internal control region
kbp: kilobasepairs
LTR: long terminal repeat
MYR: million years
NTP: ribonucleoside triphosphate (where N = any of the four nucleosides A, C, G, or U)
PBS: primer binding site
PEG: polyethylene glycol (MW 6-8000)
pfu: plaque forming units
Pol III: RNA polymerase III
RNA: ribonucleic acid
RNase: pancreatic ribonuclease A
rRNA: ribosomal RNA
S: Svedberg units
SDS: sodium dodecylsulphate
SSPE: 20X = 3 M NaCl, 0.2 M NaH2PO4·H2O, 0.02 M EDTA, pH 7.4
T: thymidine
TAE: 10X = 0.4 M Tris-acetate, 0.02 M EDTA, pH 8.0
TE: 10 mM Tris-HCl (pH 7.5), 0.1 mM EDTA
TBE: 10X = 0.9 M Tris-borate, 0.9 M boric acid, 0.01 M EDTA, pH 8.3
Tris-Cl: tris(hydroxymethyl)aminomethane neutralized with HCl
tRNA: transfer RNA
U: uridine
UV: ultraviolet
XC: xylene cyanol
YT: 0.8% bactotryptone, 0.5% yeast extract, 0.5% NaCl
YTamp: YT media containing 0.1 mg/ml ampicillin

INTRODUCTION

1. Background

The presence of gene families encoding related proteins and structural RNAs (e.g., rRNA) demonstrates how gene duplication has played an important role in the evolution of eucaryotic genomes (Li, 1983). Two evolutionary outcomes appear to follow from gene duplication events.
On one hand, many protein coding gene families are thought to have arisen by duplication events where the once identical duplicated gene copies subsequently evolved independently, eventually giving rise to present day genes that vary in structure and have specialized functions based around a common theme (e.g., the hemoglobin gene family). In contrast, other gene families such as those encoding structural RNAs (e.g., rRNA, tRNA) show much less difference in structure between large numbers of different gene copies and, because of this, are thought to have evolved in concert without accumulating significant sequence divergence (reviewed by Dover, 1982). A rationale for these homogeneous gene families is that they allow synthesis of greater quantities of more or less functionally equivalent gene products.

One obvious difference between gene families that evolve independently and those that evolve in concert is how the gene families are organized in the genome. For example, rRNA genes are organized into one or a few clusters containing hundreds of tandemly repeated units of gene coding and spacer sequence (Long and Dawid, 1980). It has been suggested that unequal exchange and/or gene conversion events between these clusters would be sufficient to maintain the observed homogeneity between different gene copies (Petes, 1980, Coen and Dover, 1983). Independently evolving gene copies, on the other hand, are generally not organized in such tandem arrays and, with some exceptions, are presumably not subject to similar mechanisms of genetic turnover and consequent co-evolution.

Transfer RNA (tRNA) is another cellular component that is encoded by redundant gene families. The evolution and potential function of these families are more complex, however, because they are not organized in tandem arrays but instead are organized more like the independently evolving protein gene families.
This organization raises the question whether redundant gene copies encoding individual tRNAs function solely to increase the synthetic potential of their corresponding gene products, or rather, whether different gene copies have unique functional roles that presently are not apparent. As a basis for better understanding the evolution and function of tRNA genes, this study describes a tRNA gene family in the fruit fly Drosophila melanogaster. This organism is ideally suited to the study of tRNA gene organization because different gene copies can be localized by in situ hybridization to polytene salivary chromosomes. In addition, its small genome size and relatively low tRNA gene redundancy simplify the analysis. The following pages will review briefly the structure, organization, and function of eucaryotic tRNA genes, drawing heavily from the extensive studies in Drosophila.

2. Redundant gene families encode tRNA

The first evidence that tRNAs are encoded by redundant gene families came from kinetic hybridization studies between total tRNA (4S RNA) and homologous sequences in genomic DNA. A wide range of eucaryotes studied (e.g., yeast through human) all showed that the number of total tRNA genes per haploid genome was much greater than the complexity of the 4S RNA (Long and Dawid, 1980). For example, in Drosophila a total of 600-750 tRNA genes encode the 60-100 different tRNA species detectable in this organism (Ritossa et al., 1966, Weber and Berger, 1976, White et al., 1973). This suggested each individual tRNA was encoded by a gene family composed on average of 10-12 copies. A rough generalization is that the average size of each tRNA gene family, or degree of redundancy, is usually larger in complex eucaryotes such as Drosophila and humans than in simple eucaryotes like yeast (Long and Dawid, 1980).
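The average redundancy implied by these totals can be checked with a short calculation. The sketch below is illustrative only and is not part of the original analysis: it divides the gene-count ranges quoted in this section by the range of tRNA species (60-100) to bound the mean family size. The function name `copies_per_species` and the dictionary layout are assumptions made for this example.

```python
def copies_per_species(gene_range, species_range):
    """Bound the mean number of gene copies per tRNA species.

    The lowest estimate pairs the fewest genes with the most species;
    the highest estimate pairs the most genes with the fewest species.
    """
    genes_lo, genes_hi = gene_range
    species_lo, species_hi = species_range
    return genes_lo / species_hi, genes_hi / species_lo


# Gene and species counts as quoted in the text (hypothetical layout).
counts = {
    "Drosophila": ((600, 750), (60, 100)),
    "human": ((1200, 1300), (60, 100)),
    "yeast": ((350, 350), (60, 100)),
}

for organism, (genes, species) in counts.items():
    lo, hi = copies_per_species(genes, species)
    print(f"{organism}: {lo:.1f} to {hi:.1f} gene copies per tRNA species")
```

For Drosophila the bounds work out to 6.0 to 12.5 copies per species, so the figure of 10-12 copies quoted above appears to correspond to the lower species count of about 60.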
For example, while the total number of tRNA species is approximately the same between Drosophila, yeast, and humans (60-100, White et al., 1973, Lin and Agris, 1980), the total number of human tRNA genes is approximately double that in Drosophila (ca. 1200-1300 copies, Hatlen and Attardi, 1971). Conversely, in yeast the total number of tRNA genes is approximately half that in Drosophila (ca. 350, Guthrie and Abelson, 1982).

3. The organization of tRNA gene families

3.1 Genomic organization of tRNA genes in Drosophila

The organization of tRNA gene families was first suggested from cytological studies in Drosophila (Steffanson and Wimber, 1971, Elder et al., 1980). In situ hybridization of radiolabeled total 4S RNA (containing all species of tRNA) to polytene salivary chromosomes resulted in labeling of all the major chromosome arms at approximately 60-75 different chromosomal sites. Each site can usually be assigned to a single band on the polytene map of the salivary chromosomes and is named accordingly (e.g., 12E1-2). The only obvious difference between the four major chromosome arms of the Drosophila genome was that the X chromosome contained significantly fewer tRNA loci than the autosomal chromosomes. For example, chromosomes 2 and 3 each contain approximately 30-40 tRNA loci while the X chromosome contains less than a dozen. In addition, the small fourth chromosome contains no tRNA loci, and the Y chromosome in males cannot be assayed this way because it is not polytenized. Therefore tRNA genes are not located at one or a few chromosomal sites, as is the case for other redundant gene families (e.g., rRNA genes), but instead are dispersed randomly in the genome.

The use of purified tRNA preparations allowed individual members of a single tRNA gene family to be localized (Hayashi et al., 1980, 1982, Kubli, 1982). To date more than 25 different Drosophila tRNA gene families have been studied by in situ hybridization.
Some generalizations from these results are as follows. Gene copies homologous to a single tRNA species are generally found at 2-4 different chromosomal sites. These can occur on the same or different chromosomes, and there is no obvious relationship between the type of tRNA and the locations of its corresponding loci. These dispersed loci often show quite different signal intensities, suggesting that different numbers of homologous genes occur at different sites. In addition, a particular chromosomal site identified with one tRNA may also be identified with one or more different tRNAs. This is not due to cross-hybridization but results from apparent clustering of different kinds of tRNA genes. Usually no more than one chromosomal site is shared; any other sites will be unique between different tRNAs. These cytological studies with purified tRNAs predicted that members of a tRNA gene family occur at several chromosomal sites in clusters of the same and different types of tRNA gene. No apparent order to this organization has been deduced except for the paucity or lack of sites on certain chromosomes, and the fact that with only one exception (serine tRNAs, Hayashi et al., 1980), genes for different tRNAs accepting the same amino acid (isoacceptors) do not share the same chromosomal sites.

These cytological studies have been confirmed and extended at the molecular level by cloning and DNA sequencing of genomic fragments containing tRNA genes. In particular, the organization of some Drosophila tRNA gene clusters has been analysed in detail. A classic example is a tRNA gene cluster derived from polytene region 42A (Yen and Davidson, 1980). This region is one of the most intensely labeled sites after in situ hybridization with total 4S RNA and therefore was predicted to contain a large number of tRNA genes. The molecular analysis showed that of
The molecular analysis showed that of 5 a l m o s t 100 k b p a n a l y s e d , a 4 6 k b p c e n t r a l p o r t i o n c o n t a i n e d e i g h t t R N A 5 A s n g e n e s , f o u r t R N A 2 A r S g e n e s , f i v e t R N A 2 L v s g e n e s , a n d a s i n g l e t R N A I , e g e n e . T h e i n d i v i d u a l g e n e c o p i e s w e r e s c a t t e r e d i n s u b c l u s t e r s w i t h i n t h i s 4 6 k b p s e g m e n t a n d s h o w e d n o o b v i o u s p a t t e r n i n r e g a r d to g e n e t y p e , s p a c i n g o r t r a n s c r i p t i o n a l o r i e n t a t i o n . T h e d i s t a n c e s b e t w e e n d i f f e r e n t g e n e s v a r i e d f r o m as l i t t l e as a f e w h u n d r e d b a s e p a i r s to as m u c h as 10 k b p a n d i n a f e w i n s t a n c e s w e r e i n t e r s p e r s e d w i t h s e q u e n c e s tha t h y b r i d i z e d p o l y - A R N A . A d d i t i o n a l c o p i e s o f t h e t R N A gene f a m i l i e s e n c o d e d i n t h i s c l u s t e r w e r e i s o l a t e d f r o m o t h e r c h r o m o s o m a l s i t e s . F o r e x a m p l e , at t he 5 0 A B p o l y t e n e r e g i o n , f i v e a d d i t i o n a l t R N A ^ e g e n e s w e r e i d e n t i f i e d i n D N A f r a g m e n t that a l s o c o n t a i n e d t w o t R N A ^ u g e n e s ( R o b i n s o n a n d D a v i d s o n , 1 9 8 1 ) . S i m i l a r l y , a d d i t i o n a l t R N A 2 A r 8 genes h a v e b e e n i s o l a t e d f r o m a g e n e c l u s t e r at p o l y t e n e r e g i o n 8 4 F ( D u d l e r et a l . , 1 9 8 0 ) . O t h e r t R N A g e n e c l u s t e r s w i t h s m a l l e r n u m b e r s o f s i m i l a r l y o r g a n i z e d g e n e s h a v e b e e n i s o l a t e d f r o m p o l y t e n e r e g i o n s 9 0 B C ( D e l o t t o a n d S c h e d l , 1 9 8 4 , A d d i s o n et a l . , 1 9 8 2 ) , 1 2 E 1 - 2 ( C r i b b s et a l . , 1 9 8 7 ) , and 5 6 E F ( H o s b a c h et a l . , 1980 ) . 
Not all Drosophila tRNA genes appear to occur in clusters, however. In situ hybridization detects several sites to which only a single tRNA species is localized (Kubli, 1982) and similarly, molecular studies have shown that some tRNA genes occur alone in regions of cloned DNA (Sharp et al., 1981, Glew et al., 1986). It is likely, however, that studies of tRNA gene organization underestimate the extent of gene clustering, owing to limits on the sizes of cloned regions that are usually analysed and to the fact that at most only half of the chromatographically resolvable tRNA species have been purified to homogeneity and localized by in situ hybridization.

Therefore, unlike other Drosophila redundant gene families (e.g., rRNA or 5S genes, reviewed by Spradling and Rubin, 1981), members of a tRNA gene family are not located together in the genome but are dispersed either alone or in gene clusters. No apparent pattern has yet emerged in this genome organization except for the fact that some tRNA gene clusters apparently encode tRNAs that accept amino acids with the same polar side chains (see DeLotto and Schedl, 1984). In addition, the clustering does not follow the patterns of evolutionary relatedness that exist between those isoaccepting tRNAs that accept the same amino acid but recognize different anticodons.
Only in one case do isoaccepting tRNA genes localize to the same polytene region (Cribbs et al., 1987). This example may be unique because the two isoacceptors differ only by three nucleotides. The only other digression from a totally random organization is that the X chromosome contains only a few tRNA gene loci and the small fourth chromosome contains none. The latter may simply reflect its small size relative to the other chromosome arms.

3.2 tRNA gene organization in other eucaryotes

The organization of tRNA genes in yeast has received extensive study (reviewed by Guthrie and Abelson, 1982). Without the advantage of the cytological studies possible in Drosophila, it is more difficult to determine the detailed organization of yeast tRNA genes. However, genetic mapping of tRNA suppressor loci and molecular studies of cloned DNA show that tRNA gene families in yeast are also dispersed randomly throughout the genome. One difference, however, is that no tRNA gene clusters have been isolated from yeast. To date, yeast tRNA genes have been isolated almost exclusively as single gene copies within cloned DNA fragments and do not show the close association with genes of the same or different type that is seen in Drosophila (e.g., Baker et al., 1982, Bull et al., 1987). Exceptions to this rule are a pair of tRNA3^Arg and tRNA^Asp genes in Saccharomyces cerevisiae and a pair of tRNA^Ser and tRNAi^Met genes in Schizosaccharomyces pombe. In both organisms these genes are separated by only a few nucleotides and are transcribed as dimeric precursors to the apparent advantage of the trailing tRNA species (Hottinger-Werlen et al., 1985).

In mammals, where neither cytological nor genetic evidence is yet available, the organization of tRNA genes has been determined solely from molecular studies (reviewed by Sharp et al., 1985).
The approximately 60-90 different species of human tRNA (Lin and Agris, 1980) are encoded by 1200-1300 genes (Hatlen and Attardi, 1971) and thus correspond to 10-20 gene copies per tRNA species. Several studies suggest these genes also are dispersed randomly in the genome. For example, the human tRNAi^Met (Santos and Zasloff, 1981) and tRNA^Val (Arnold et al., 1986) gene families are each found on at least 12-13 different sized restriction fragments. To date mammalian tRNA genes have not been assigned to different chromosomes either by somatic cell hybrid analysis (Naylor et al., 1983) or by in situ hybridization to single copy metaphase chromosomes. Thus while their genomic organization has not been shown directly, these cloning studies suggest that, as in yeast and Drosophila, tRNA genes in humans are also probably randomly distributed. Analysis of cloned DNAs shows that mammalian tRNA genes also occur in clusters. For example, a 13.8 kbp segment of human DNA contains two pairs of genes encoding tRNA^Lys and tRNA^Phe (Doran et al., 1987). This and other examples (Chang et al., 1986, Ma et al., 1984, Pirtle et al., 1986, Looney and Harding, 1983, Makowski et al., 1983) demonstrate that mammalian tRNA gene families are organized in a manner similar to that seen in Drosophila.

3.3 Exceptional organization of tRNA gene families

In some cases tRNA genes are not organized singly or in heterogeneous clusters dispersed in the genome. The classic example is a 3.18 kbp segment of Xenopus laevis DNA that contains 8 tRNA genes organized irregularly within it, much like the gene clusters observed in other eucaryotes (Muller and Clarkson, 1980). The unusual feature of this segment is that it is tandemly repeated approximately 150 times per haploid genome at only one or a few chromosomal locations. Other tRNA genes may be similarly arranged in X. laevis and likely account for the exceptionally high redundancy of tRNA genes in this organism (ca.
8000, Long and Dawid, 1980). Another example is the silk gland specific tRNA^Ala genes of Bombyx mori. This tRNA is selectively expressed during silk synthesis in the larval stages of development (Garel, 1982). Its corresponding genes have distinct functional properties (see below, Young et al., 1986) and are arranged as a tandem cluster of approximately 30 gene copies (Underwood et al., 1988). These and other examples (Sharp et al., 1985) may represent specialized cases where tRNA gene copy number has been amplified to meet increased demand for tRNA products at certain stages of development and in extreme cases of tissue specific protein synthesis.

3.4 Other classes of genes associated with tRNA genes

In certain cases tRNA genes are found associated with genes encoding different structural RNAs and proteins. For instance, members of the Drosophila tRNA^Glu gene family are located adjacent to the 3' end of the 5S RNA gene cluster located at polytene region 56EF (Indick and Tartof, 1982). At least three additional tRNA species (tRNA2^Lys, tRNA3^Met, tRNA3^Gly) have also been localized to this site (Hayashi et al., 1980). Another stable RNA, U6 snRNA, is encoded by 1-3 gene copies at a tRNA^Asp locus found at polytene region 96A (Saluz et al., 1988). All three classes of these RNAs are transcribed by RNA polymerase III (see below).

Another example of a close association with tRNA genes is the Ty3 retrotransposons of S. cerevisiae (Hansen et al., 1988). At least two of these mobile retroviral-like elements are found within 20 bp of the 5' ends of either a tRNA^ys or tRNA^Leu gene. In addition, the associated sigma elements that correspond to isolated LTRs of Ty3 are also found near tRNA genes. One interpretation of these data is that the tRNA genes are involved as targets for the site specific transposition of these elements.

Some tRNA genes also occur near or within protein coding genes transcribed by RNA polymerase II.
For example, at the Drosophila 42A gene cluster it was noted that polyA-containing sequences were interspersed between tRNA coding sequences (Yen and Davidson, 1980). In addition, two pairs of Drosophila tRNA^Tyr genes are located within the putative 5' control regions of two developmentally regulated genes transcribed by RNA polymerase II (Suter and Kubli, 1988). Another member of this gene family is located within an intron of the decapentaplegic gene. The Drosophila gene no-ocelli contains within it a total of 5 tRNA^Gly genes (Meng et al., 1988). In addition, there is increasing evidence for a functional relationship between Pol II and Pol III promoters (Chang and Clayton, 1989, Carbon et al., 1987, Murphy et al., 1987) that may eventually shed light on how the RNA polymerases of eucaryotes have evolved and possibly also on their functional interaction in certain genes today (Chung et al., 1987, Mattaj et al., 1988).

4. Structure of tRNA gene families

4.1 Homogeneity

The in situ hybridization studies (Kubli, 1982) suggested that individual members of tRNA gene families were sufficiently similar in sequence to cross-hybridize under the conditions employed. This relative homogeneity of gene family structure has largely been confirmed by DNA sequencing studies of a large number of cloned eucaryotic tRNA genes. For example, eight different members of a Drosophila tRNA^Tyr gene family are derived from three different chromosomal sites (85A, 28C, 22F) and each predicts a mature tRNA that is identical to the known sequence of this tRNA (Suter and Kubli, 1988). These 8 tRNA^Tyr genes likely constitute the whole gene family and therefore are one example where all members, less post-transcriptional modifications (reviewed by Bjork et al., 1988), give rise to identical gene products. The homogeneity observed in these 8 tRNA^Tyr coding regions does not include the total transcription unit, however.
For example, the sequences that occur a few nucleotides upstream and downstream from the mature coding region are included in the primary transcripts of tRNA genes (Sharp et al., 1985) but differ significantly between different gene copies. In addition, each gene copy contains an intron at an identical position within the tRNA coding region, but both minor and major differences in sequence and structure are observed between the introns of different gene copies (Choffat et al., 1988). For example, five introns are 20-21 bp in length and differ in sequence only at 1-2 positions. A sixth intron is also 21 bp long but differs completely in sequence from those above. The two remaining introns present in this gene family are 48 and 113 bp long and each is composed of unique sequence relative to the other introns in this gene family.

There are numerous other examples of tRNA gene families where at least some of the mature coding regions are identical, but in practically every case these analyses do not include every member of the particular gene family (Yen and Davidson, 1980, Robinson and Davidson, 1981, Addison et al., 1982, Lofquist and Sharp, 1986, Meng et al., 1988). As more genes are sequenced it is becoming apparent that not all members of a tRNA gene family are identical (Leung et al., 1984, Cribbs et al., 1987, Defranco et al., 1982, Hosbach et al., 1980, Sharp et al., 1981, Bull et al., 1987, Doran et al., 1986, Arnold et al., 1986, Pirtle et al., 1986, Ma et al., 1984, Gouillard and Clarkson, 1986). In these examples tRNA-like genes were isolated that differ by 1-6 nucleotides from known tRNA sequences and/or other tRNA gene copies. They retain, however, identical anticodon sequences, and therefore are at least theoretically equivalent to authentic genes in their protein decoding functions.
In the yeast tRNA^Phe gene family, two of 8 otherwise identical gene copies contained the same 1 bp change, which suggests that these differences may not be accumulating on a random basis (Bull et al., 1987). Also, these variant genes sometimes have different functional properties as templates in vitro (Addison et al., 1982, Leung et al., 1984).

Whether these isocoding variant genes or 'allogenes' (Leung et al., 1984) have any functional significance remains to be shown. By definition their in vivo products have not been isolated, and in at least two cases there is some evidence to suggest that they are not expressed at all (Larson et al., 1984, Pirtle et al., 1986). However, their frequency in gene families is probably underestimated because in only a few cases (Suter and Kubli, 1988, Bull et al., 1987, Cribbs et al., 1987, Leung et al., in preparation) have all gene copies of a particular gene family been analysed.

4.2 tRNA pseudogenes

One last category of sequences can be included in a tRNA gene family, if only on the basis of their origin. These include tRNA genes that have incurred obvious structural defects in the coding region that are likely to result in an inactive gene copy. Examples include human tRNAi^Met (Zasloff et al., 1982) and tRNA^Gly genes (Pirtle et al., 1986), which contain single nucleotide changes that are likely to cause the lack of template activity observed in vitro (see Sharp et al., 1985 and below). More extensive derangements occur in a Drosophila tRNA^His pseudogene, where an 8 bp coding segment differs from that of an authentic gene copy located nearby (Cooley et al., 1984). Similarly, a rat tRNA gene cluster contained tRNA^Glu and tRNA^Gly genes that were also lacking portions of their coding regions and likewise were inactive in vitro (Makowski et al., 1983, Shibuya et al., 1982).
In some cases tRNA pseudogenes have been described that consist of either 3' fragments (Sharp et al., 1981, Reilly et al., 1982) or 5' fragments of intact genes (Pratt et al., 1985), with the remaining portion of the gene absent. In the case of the mouse tRNA^Phe pseudogene (Reilly et al., 1982), the presence of a CCA sequence adjacent to the 3' end of the gene fragment was interpreted as reflecting the reverse transcription of tRNA sequences and their reintegration into the genome. The CCA sequence is added post-transcriptionally to eucaryotic tRNAs and is therefore not expected at this position in the DNA. The presence of these pseudogenes emphasizes that estimates of tRNA gene number or location in the genome (e.g. by in situ hybridization) may often not solely reflect authentic tRNA genes, and that analysis at the molecular level is the only unambiguous method of determining the organization and composition of tRNA gene families.

4.3 Sequences flanking members of tRNA gene families

A potentially significant feature of tRNA gene families is that the high conservation between different gene copies includes only the mature tRNA coding region. Regions immediately adjacent to the coding region, which as noted above are also included in the gene transcription unit, and flanking sequences beyond them frequently show little or no homology between different gene copies. For instance, in examples where most if not all gene copies have been analysed, such as Drosophila tRNA^Tyr (Suter and Kubli, 1988) and yeast tRNA^Phe (Bull et al., 1987), the flanking sequences show little homology to one another. A universal exception is the presence of a tract of dTn (n = 4 or more) residues in the non-coding strand following the 3' end of the tRNA coding region. This sequence functions as a termination signal in all classes of genes transcribed by Pol III (see review by Geiduschek and Tocchini-Valentini, 1988).
In a few cases homologies in the 5' flanking sequences of gene family members have also been detected. One example is the yeast tRNA3^Leu gene family (Raymond and Johnson, 1983). Four copies of this family contain a conserved 15 nucleotide sequence adjacent to the 5' end of the genes. This sequence has been shown to be important for their in vitro and in vivo activity (Raymond et al., 1985, Raymond and Johnson, 1987) and is also conserved at equivalent positions in certain other classes of yeast tRNA genes. Similarly, in the silkworm Bombyx mori, the silk gland specific tRNA genes also share short stretches (25-35 bp) of highly conserved 5' and 3' flanking sequences in the 10 gene copies that have been analysed (Young et al., 1986). These conserved sequences are also important for function but are part of the spacer sequence of a tandem gene cluster (Underwood et al., 1988) and therefore are distinct from conserved sequences found in other dispersed gene copies.

Other examples of short conserved sequences preceding tRNA coding regions include mouse tRNA^Asp genes (Looney and Harding, 1983), tRNA^His genes (Morry and Harding, 1986), human tRNA^Gly genes (Pirtle et al., 1986) and certain yeast genes (Baker et al., 1982). Their significance and origin are not clear at present. By analogy with the yeast tRNA3^Leu genes, they may represent functional regulatory elements that are conserved or, alternatively, may reflect short sequences that mark the recent evolution of these particular genes (see below). In either case, what is clear is that, with the exception of the 3' poly dT tracts, the majority of tRNA genes either within or between different tRNA gene families do not share sequences outside the genes that have obvious similarities. This may be functionally significant because these flanking sequences close to the genes are thought to play a regulatory role in the expression of tRNA genes (see below).
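The 3' poly dT tract noted in this section as the one universally shared flanking feature is simple to locate computationally. The following is a minimal sketch only, assuming the non-coding strand is available as a plain string and the 0-based index of the first nucleotide after the mature coding region is known. The function name `find_terminator` and the example sequence are hypothetical and are not taken from any gene described here.

```python
import re


def find_terminator(sequence, coding_end, min_run=4):
    """Locate the first run of at least `min_run` T residues at or after
    `coding_end` (0-based index just past the mature coding region).

    Returns (start_index, run_length), or None if no such tract exists.
    """
    pattern = re.compile("T{%d,}" % min_run)
    match = pattern.search(sequence.upper(), coding_end)
    if match is None:
        return None
    return match.start(), match.end() - match.start()


# Hypothetical example: a 10 nt stand-in for a coding region followed by
# a short 3' flank that contains a T5 tract.
coding = "ACGTACGTAC"
flank = "GATCTTTTTGCA"
hit = find_terminator(coding + flank, coding_end=len(coding))
if hit is not None:
    start, length = hit
    print("T tract of length", length, "begins",
          start - len(coding), "nt past the coding region")
```

Searching only from `coding_end` onward keeps T-rich stretches inside the coding region from being reported as terminators. The threshold of four residues follows the n = 4 or more given in the text.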
4.4 The evolution of tRNA gene families

In some examples of cloned tRNA genes, more extensive homologies are found in both the 5' and 3' flanking sequences of certain gene copies. For instance, a cluster of five Drosophila tRNA^Glu genes derived from chromosomal region 62A (Hosbach et al., 1980) exhibits a pattern of 5' and 3' flanking sequence homologies that suggests these genes arose by duplication of a gene doublet followed by an unequal crossing over event that converted one of these doublets into a gene triplet. In another case, two Drosophila tRNA^Gly genes are contained on direct repeats of 1.1-2.0 kbp (Hershey and Davidson, 1980). In these and other examples (Sharp et al., 1981, Ma et al., 1986) the flanking sequence similarities are thought to reflect the fact that these DNA segments have recently duplicated and have not yet lost their homology by the random accumulation of flanking sequence nucleotide substitutions. In all cases the gene coding sequences have remained identical while the flanking sequences, although highly homologous, are distinguishable by varying degrees of nucleotide sequence divergence.

These examples support the idea that tRNA gene families have evolved by successive gene duplications followed by flanking sequence divergence and coding region conservation. However, it is not yet clear how members of tRNA gene families become dispersed in the genome. To date, all cases of tRNA gene duplication have involved gene copies located at the same chromosomal locus. The more frequent observation that gene copies within a family share little if any flanking sequence homology implies that the majority of gene families are ancient. Other possible mechanisms for the emergence of multiple gene copies include reverse transposition events similar to those proposed for the generation of pseudogenes of protein coding genes and retrotransposons (Li, 1983, Weiner et al., 1986).
With the exception of the mouse tRNA^Phe pseudogene (Reilly et al., 1982), there is no evidence yet that tRNA gene copies have arisen by this route.

What is clear about the evolution of tRNA gene families is that tRNA coding regions remain identical or at least closely related between different gene copies over long evolutionary periods. This high similarity between redundant gene copies has led to the proposal that tRNA gene families are maintained by genetic processes of homogenization such as gene conversion (Dover, 1982, Munz et al., 1982, Cribbs et al., 1987). While gene conversion-like events between non-allelic tRNA gene copies have been shown to occur in yeast (Munz et al., 1982), it is not clear how significant this is in the evolution and maintenance of tRNA gene copies within a gene family. These small RNA molecules have extremely high information content in their primary structure, and it is not inconceivable that strict functional conservation exists for every nucleotide in their mature structure (Rich and RajBhandary, 1976). This is reflected by the extreme conservation exhibited between equivalent tRNAs from diverse species (reviewed by Cedergren et al., 1981). That such conservation is exhibited between many or all members of a redundant gene family could also imply that each gene copy is subject to strict selection. In turn, this suggests that the term 'redundant' may be inappropriate for describing members of a tRNA gene family.

5. The function of gene copies within a tRNA gene family

An important question concerning tRNA gene families is whether each member is active and thus capable of at least potentially contributing to the total 4S RNA pool. In other redundant gene families there is evidence that not all gene copies are active as gene templates.
For instance, in the Drosophila gene families encoding 18S and 28S rRNAs as many as 65% of the gene copies contain insertions in the 28S coding region and do not give rise to transcripts in vivo (Jamrich and Miller, 1984). Also, in the redundant 5S ribosomal RNA gene family, 19 of 23 genes derived from the Drosophila 5S DNA cluster contain a single point mutation in the coding region and are transcriptionally inactive in vitro (Sharp et al., 1984). This suggests that functional heterogeneity may be common in redundant gene families.

In the case of tRNA genes it is technically difficult to assess whether all genes are active in vivo because the products of different gene copies are often indistinguishable. In at least one example attempts have been made to assess the in vivo contribution of gene copies at specific chromosomal loci (Dunn et al., 1979b, Larsen et al., 1984), but it is difficult to interpret these results unambiguously. Alternatively, the availability of cloned gene templates and the development of cell free assays (Dingermann et al., 1981, Rajput et al., 1982) allow the comparison of tRNA gene family members in vitro. Certain aspects of in vitro activity are also manifested in vivo, which suggests that these results may be at least approximately comparable (Raymond et al., 1985, Schaack and Soll, 1985, Huibregtse et al., 1987).

5.1 Structure-function relationships in tRNA genes

tRNA genes are one of a diverse set of genes (class III genes) transcribed by RNA polymerase III (Pol III). Other class III genes include those encoding 5S ribosomal RNA, a variety of small nuclear and cytoplasmic RNAs; 7SK, 7SL, U6, 4.
5S RNAs, and certain small viral RNAs (e.g. adenovirus VAI and VAII RNAs, Epstein Barr EBER I and EBER II RNAs). All these Pol III genes have both common and unique features in regard to the gene promoter sequences and ancillary protein transcription factors required for their transcription (reviewed by Sollner-Webb, 1988, Geiduschek and Tocchini-Valentini, 1988).

The structure-function relationships in tRNA gene transcription have been reviewed extensively (Sharp et al., 1985, Geiduschek and Tocchini-Valentini, 1988). All tRNA genes contain within their mature coding regions highly conserved sequences that have dual functions, both in determining the final three dimensional structure of the tRNA (reviewed by Rich and RajBhandary, 1976) and in recognition by protein transcription factors that facilitate recognition and transcription of the gene template by Pol III (Lassar et al., 1983, Burke and Soll, 1985). These transcription factors, TFIIIC and TFIIIB, recognize two separated blocks of sequence (A and B Internal Control Regions, ICRs) that occur approximately between positions 8-19 and 52-62, respectively, of the mature coding region (Hofstetter et al., 1981, Sharp et al., 1981).
These factors appear to be universally conserved in all eucaryotes and apparently function interchangeably between tRNA genes of different species (Lassar et al., 1983, Sharp et al., 1985). Their interaction with the ICR sequences results in the formation of stable pre-initiation complexes that do not dissociate during multiple rounds of transcription. Experimentally, formation of stable complexes is observed as the inhibition of transcription of a gene that is added to a transcription reaction that has been briefly preincubated with a different gene template. If the transcription factors are in limiting quantities, then formation of a stable complex on one template sequesters them and blocks formation of similar complexes on the second template (Schaack et al., 1983, Sharp et al., 1983). Alternatively, formation of stable complexes can be observed directly by DNA protection from deoxyribonucleases (Newman et al., 1983). Disruption of the ICR sequences either by deletion or nucleotide substitution results in an inactive gene template both in vitro and in vivo (Folk and Hofstetter, 1983, Schaack and Soll, 1985, Huibregtse et al., 1987).

In addition to the ICRs, sequences outside the gene coding region also influence in vitro transcription of tRNA genes. For example, in Drosophila the transcription of several different tRNA genes has been shown to depend on the 5' upstream sequences (Schaack et al., 1984, Dingermann et al., 1982, Lofquist and Sharp, 1986, Sajjadi et al., 1987).
Deletion or mutagenesis of these 5' flanking sequences either improves in vitro transcription rates (Defranco et al., 1982, Dingermann et al., 1982) or, more frequently, lowers or abolishes in vitro transcription (e.g. Schaack et al., 1984). These effects do not appear to result from changes in the ability of the template to form stable complexes, although exceptions may exist (Cooley et al., 1984, Morry and Harding, 1986). For example, progressive 5' deletions of the Drosophila tRNA2^Arg gene affect transcription efficiency more dramatically than the ability of the gene to form stable complexes, although the latter is disrupted as sequences closer to the ICRs are removed (Schaack et al., 1984). Similar results were observed with other Drosophila tRNA genes (Lofquist and Sharp, 1986), yeast tRNA genes (Raymond and Johnson, 1987) and human or mouse tRNA genes (Arnold and Gross, 1987, Rooney and Harding, 1988). A generalization that emerges is that the ICRs are important for stable complex formation and hence transcription competence, while the 5' flanking regions constitute a third promoter element that modulates the efficiency with which a gene may be transcribed.

Also associated with this 5' flanking sequence modulation are differences in the sensitivity of in vitro transcription to the salt concentration of the reaction (Lofquist and Sharp, 1986, Young et al., 1986). This has been interpreted as reflecting the ability of Pol III to interact with these 5' flanking sequences. However, it is also possible that additional protein transcription factors may be involved (Young et al., 1986, Marschalek and Dingermann, 1988).
Whatever the mechanism of these modulatory sequences, they are apparently less conserved than those required for stable complex formation, because the 5' sequence dependence is not observed or is less dramatic in experiments using gene templates and cell extracts from different species (Schaack and Soll, 1985, Sprague et al., 1984).

5.2 In vitro activity of gene copies within a tRNA gene family

In studies where more than one copy of a tRNA gene family has been compared in vitro, the results suggest that different gene copies can have wide ranges of transcriptional activity. For example, in the Drosophila tRNA2^Arg gene family (Dingermann et al., 1982), of four identical gene copies assayed, one had high template activity (pYH48) while the remaining three genes, p11F, p35D, and p17D, each had much lower activities (12-15% and <1%, respectively). Similarly, different members of the tRNA5^Asn and tRNA2^Lys gene families (Lofquist and Sharp, 1986, Defranco et al., 1984) also exhibit large differences in in vitro transcriptional activity. In both these examples the gene coding regions, and hence the ICR sequences, are identical within different copies of the respective gene families, and the differences in activity were shown to be conferred by the respective upstream flanking sequences. Similar results between identical or closely related gene copies have been observed in yeast (Bull et al., 1987), Xenopus (Gouillard and Clarkson, 1986), and humans (Doran et al., 1987). These differences in activity within a tRNA gene family are also suggested by in vivo studies of tRNA^Tyr nonsense suppressors in both S. cerevisiae (Rothstein et al., 1977) and the nematode Caenorhabditis elegans (Kondo et al., 1988).

5.3 Significance of variable tRNA gene activity within a gene family

Why different copies of a gene family encoding identical gene products should exhibit such wide differences in transcriptional activity is not known.
It may simply be the fortuitous result of the apparent low conservation in these 5' flanking sequences that appear to modulate transcription rates. However, the differences in modulation observed between cell extracts from different species suggest that some function of the 5' flanking sequence has evolved in a species specific manner.

One of the most obvious differences encountered in studies of tRNA is the large difference in cellular abundance that occurs between tRNAs that are esterified to the same or different amino acids. In unicellular organisms such as E. coli or yeast there is a strong correlation between the most frequently used codons and the abundance of the corresponding tRNA (reviewed by Ikemura, 1985, deBoer and Kastelein, 1986). The degree of this bias in codon choice is greater for genes that are highly expressed than for those that are expressed at low levels. Also, different organisms have different favoured codons, which gives rise to genome 'dialects' that, for instance, distinguish genes that are highly expressed in yeast from those that are highly expressed in E. coli. Therefore in unicellular organisms, where one cytoplasm must meet the needs of all protein synthesis, a relationship has evolved between the abundance of tRNAs and the codon choice patterns of different protein messages. This presumably has been selected for by the metabolic cost of inefficient protein synthesis caused by codon choices that are not matched to the available tRNA levels.

Similar biases in codon usage are found in genes of multicellular eucaryotes. However, different genes show biases for different codons and there does not appear to be any one genome 'dialect'. In multicellular organisms differentiation often results in cells with vastly different requirements for protein synthesis, and it is thought that, in contrast to unicellular organisms, the tRNA levels are tailored to meet the needs of different cell types.
The best known example is in the silk gland of the silkworm B. mori, where several tRNAs whose codons show high bias in silk protein mRNAs are co-ordinately induced at the developmental stage when silk is synthesized (Garel, 1982). The genes for at least one of these silk gland specific tRNAs (tRNA^Ala) differ from the constitutive tRNA^Ala by only 1 nucleotide in the anticodon stem and have distinct 5' flanking sequence dependent in vitro transcription properties (Young et al., 1986). Although this is an extreme example due to the nature of silk synthesis in the silk gland, it may serve as a prototype for differential tRNA transcription in all tissues of differentiated multicellular organisms.

The observation of differential in vitro transcriptional activity within a tRNA gene family may therefore be related to this tailoring of tRNA abundance levels. However, while several studies show the induction of novel tRNAs in a tissue specific manner (e.g. Lin et al., 1980, Hedgcoth et al., 1984, Lin and Agris, 1980), there is no evidence of similar tissue specific or developmentally regulated expression of different gene copies within a tRNA gene family. Such studies are nevertheless important in understanding the function of tRNA gene families, and in particular whether they function as a whole or whether individual components of the family have specific functions. Resolving these questions will in turn help us understand the evolution of these gene families and the significance of their maintenance within an organism's genome.

6. Thesis background: pDt27R and the tRNA^Arg gene family in Drosophila

This study began with the fortuitous isolation of a group of D. melanogaster tRNA^Arg genes on the basis of their linkage to different tRNA genes for which a purified tRNA probe existed (Newton 1984, M.Sc. Dissertation, University of British Columbia).
The plasmid pDt27R was originally isolated using a tRNA4/7^Ser probe and was shown to be derived from the single major X chromosome tRNA cluster located at polytene region 12E1-2 (Dunn et al., 1979a). The insert of this plasmid was shown to contain two identical tRNA4^Ser genes (Cribbs et al., 1987) and an additional cluster of four tRNA^Arg genes (Newton, this thesis). These genes were originally detected as a series of repeating Bam HI restriction sites that were subsequently identified as tRNA^Arg genes on the basis of their 5' TCG anticodon sequence and the fact that they exhibited 84% identity to the only other known Drosophila arginine isoacceptor, tRNA2^Arg (Silverman et al., 1979, Sprinzl et al., 1987). The four tRNA coding regions were identical and each contained a Bam HI restriction site. The organization of these genes and the product they encode are summarized in Figure 1A and 1C.

The interesting feature of these tRNA^Arg genes was that each gene coding region was contained within one of four tandemly repeated sequences (R1-R4) that were almost identical in spacer sequence. Two of these repeats were 200 bp in length (R1 and R2) and the remaining two were 600 bp in length (R3 and R4). The two size classes differed mainly in the amount of 5' flanking sequence associated with each tRNA coding region. In addition, one of the repeats (R1) showed a significantly higher number of flanking sequence polymorphisms relative to the homologous regions in the other repeats. These homologies are shown diagrammatically in Figure 1B. One interpretation of this organization is that these genes arose by recent gene duplication. The different sizes and degrees of homology between these repeats suggest that these duplications occurred in multiple steps. The biological basis or rationale for these and other proposed tRNA gene duplications (Hosbach et al., 1980) is not clear.

Figure 1. Summary of the recombinant plasmid pDt27R. (A) shows the restriction map of the 6.5 kbp Drosophila Hind III fragment in pDt27R (Newton, 1984). The positions of 6 tRNA genes are indicated by open boxes (serine tRNA genes) and closed boxes (tRNA^Arg genes). For clarity the boxes are not drawn to scale. Expanded beneath the restriction map is a summary of the nucleotide sequence of the region encoding the four repeated tRNA^Arg genes (R12.1-R12.4). Each tRNA^Arg gene is contained within pairs of repeats 200 bp and 600 bp in length (open bars). The repeats are composed of identical gene coding sequences (filled boxes) flanked by different lengths of almost identical flanking sequence (open bars). The direction of tRNA transcription for each gene is indicated by arrows. (B) The regions of the 200 bp and 600 bp repeats are aligned vertically according to the overlap of their flanking sequences. The cross-hatched open boxes flanking the R12.1 gene (and above in A) indicate the decreased level of identity observed in this repeat relative to the other repeats (R12.2-4). (C) shows the cloverleaf structure of the tRNA^Arg predicted from the sequence of the four identical genes in pDt27R. The 5' phosphate (p) and 3' hydroxyl (OH) ends of the predicted tRNA are indicated. The 5' TCG anticodon sequence is boxed. The first nucleotide of the Bam HI restriction site present in all four genes is indicated at position 36. The unique cytosine residue at position 13 in the D loop is also shown.

[Figure 1 graphic: restriction map of pDt27R; genes R12.4, R12.3, R12.2, R12.1; 600 bp repeats and 200 bp repeats; alignment coordinates -454, -30, +80; cloverleaf structure of tRNA^Arg.]
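The way these genes were first noticed, as a series of regularly spaced Bam HI sites, can be illustrated with a short sketch. It assumes only the standard Bam HI recognition sequence GGATCC. The function names and the toy sequence are hypothetical, and the spacings are scaled down tenfold from the 200 bp and 600 bp repeats described above so that the example stays readable.

```python
BAMHI_SITE = "GGATCC"  # standard Bam HI recognition sequence


def site_positions(sequence, site=BAMHI_SITE):
    """Return the 0-based start index of every occurrence of `site`."""
    seq = sequence.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def site_spacings(positions):
    """Return the distances between consecutive site positions."""
    return [b - a for a, b in zip(positions, positions[1:])]


# Hypothetical toy sequence: three sites separated by 20 nt and 60 nt,
# standing in for one 200 bp and one 600 bp repeat unit.
toy = "A" * 4 + BAMHI_SITE + "A" * 14 + BAMHI_SITE + "A" * 54 + BAMHI_SITE
positions = site_positions(toy)
print(positions, site_spacings(positions))
```

Because each repeat carries one site at a fixed position within the coding region, the spacing between consecutive sites equals the length of the intervening repeat unit, which is why a regular restriction pattern can point to tandemly repeated genes.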
The aim of this thesis was to characterize the entire gene family encoding this tRNA^Arg species in order to provide a context in which the apparent duplication of the tRNA^Arg genes in pDt27R might be better understood. Using homologous probes derived from pDt27R, additional genomic tRNA^Arg gene copies were isolated and characterized by DNA sequencing. These and one other clone provided by J. Leung (Ph.D. Thesis, 1988) constitute the whole gene family as judged by genomic Southern analysis and in situ hybridization experiments (S. Hayashi, unpublished results). To determine whether all the members of this gene family are transcriptionally active, they were compared in vitro using Drosophila tissue culture cell extracts. Lastly, the four duplicated genes in pDt27R were compared to the homologous loci from related Drosophila sibling species in order to facilitate a more detailed reconstruction of the recent evolution of this unusual gene cluster. In addition to the question of tRNA gene duplication, it was hoped that a complete description of this tRNA gene family might lead to a better appreciation of the function and evolution of tRNA gene families in general.

MATERIALS AND METHODS

MATERIALS

Enzymes- Restriction endonucleases and DNA modifying enzymes were obtained from the following commercial suppliers and used according to the manufacturers' recommendations: Bethesda Research Laboratories (BRL), Pharmacia (Ph), New England Biolabs (NEB), Promega (Pr), Boehringer Mannheim (BM). Creatine phosphokinase was obtained from Sigma. RNase inhibitor was obtained from BM or Promega. Drosophila Schneider cell S100 extracts were obtained from L. Duncan (University of British Columbia) and HeLa cell nuclear extracts were obtained from M. Blundell (University of British Columbia).
Nucleotides - Deoxyribonucleoside triphosphates, dideoxyribonucleoside triphosphates, ribonucleoside triphosphates, and α-phosphorothioate deoxyribonucleoside triphosphates were obtained from P.L. Biochemicals. 32P-labeled nucleotides were obtained from Amersham and New England Nuclear (Du Pont).

Oligonucleotides - Synthetic oligonucleotides were synthesized by T. Atkinson on an Applied Biosystems 380B DNA synthesizer. The following oligonucleotides were used:
FP1 - 5'-d(TCACGACGTTGTAAAAC)-3'
RP1 - 5'-d(TCACACAGGAAACAGCT)-3'
Arg1 - 5'-d(TTATCCATTAGGCCACACGG)-3'
Arg2 - 5'-d(GACCGTGACAGGACTCG)-3'
Crude oligonucleotides were purified by gel electrophoresis and quantified by measuring the A260 (Maniatis et al., 1982).

Autoradiography and photography equipment - Kodak XRP-1 film and Du Pont Cronex intensifying screens were used for autoradiography of radiolabeled nucleic acids. Polaroid 667 film was used for fluorescence photography of EtBr stained nucleic acids.

Media components - Agar, yeast extract, and Bacto-tryptone were from Difco. Ampicillin was from Sigma.

Electrophoresis and chromatography supplies - Ultra-pure agarose was from BRL. DNA grade agarose was from Bio-Rad. Acrylamide and N,N,N',N'-tetramethylethylenediamine (TEMED) were from Kodak. Methylene bis-acrylamide was from MBB. Urea was from Sigma. Cellulose acetate membrane strips were obtained from Schleicher and Schuell. DEAE-cellulose plates were from Macherey-Nagel and AcA-54 resin was from Pharmacia.

Organic reagents - Formamide (analytical grade, BDH) was de-ionized with Bio-Rad AG501-X8-D mixed bed resin and stored in aliquots at -20°C. Phenol (Mallinckrodt) was redistilled and stored in aliquots at -20°C.

Microbial strains - Recombinant DNA molecules were propagated in the following E. coli hosts:
E. coli JM101 - Δ(lac-pro), supE, thi, strA, sbcB15, endA, hspR4, F' traD36, proAB, lacIq, lacZΔM15 (Yanisch-Perron et al., 1985).
E. coli DH1 - F-, recA1, endA1, gyrA96, thi-1, hsdR17 (rk-, mk+), supE44, relA1, λ- (Hanahan, 1983).
Bacterial cultures were grown at 37°C in 1-2X YT media (Miller, 1972) containing 0.1 mg/ml ampicillin (YTamp) and agar (1.5%) as required. Bacterial transformations with plasmid DNAs were performed as described (Maniatis et al., 1982).

Drosophila strains - Drosophila melanogaster strains, Drosophila sibling species, and their sources are listed in Table 2. The flies were maintained at room temperature on Soybean-Yeast extract-Glucose media (G.M. Tener, personal communication) in pint urine bottles or 30 ml glass vials.

Plasmid DNAs - The plasmids pDt27R, pDt67R, pDt66R and pDt72R contain Drosophila genomic Hind III fragments in pBR322 (Dunn et al., 1979) and were obtained from R.C. Miller Jr. (University of British Columbia). The plasmid vector pEMBL8- and helper phage IR1 (Dente et al., 1983) were obtained from M. Zoller in the laboratory of M. Smith (University of British Columbia). The plasmid pYH48 was obtained from D. Söll (Yale University).

Transfer RNAs - Total Drosophila tRNA contained in the 4S fraction was obtained from V. Dartnell (University of British Columbia). Purified Drosophila tRNA4^Arg was obtained from I.C. Gillam (University of British Columbia).

METHODS

1. Preparation of plasmid DNAs

Plasmid DNAs were prepared from 250-500 ml E. coli cultures grown in 2X YTamp. Nucleic acids were isolated by alkaline lysis (Maniatis et al., 1982) and further purified by banding in CsCl-EtBr gradients. For CsCl gradients, ethanol pellets from alkaline lysis crude lysates (per 250 ml culture) were redissolved in 7.6 ml TE. To this solution was added 0.2 ml 0.5 M EDTA (pH 8.0), 8.4 g of powdered CsCl, and 0.6 ml EtBr (10 mg/ml). An optional step was to remove the resulting precipitate by centrifugation in an SS34 rotor for 5 minutes at 10,000 rpm.
The solution was loaded into two 5.1 ml Quick-Seal polyallomer tubes (Beckman) and centrifuged in a VTi65 rotor at 65,000 rpm for 4-5 hours or at 50,000 rpm overnight at 20°C. The plasmid DNA bands were removed with an 18-21 gauge syringe, diluted twice with water, extracted three times with n-butanol (saturated with CsCl and water), and then precipitated twice with 2.5 volumes of ethanol (95%). The final ethanol precipitation was performed in 2.5 M ammonium acetate (pH 7.0). After washing the pellet with ethanol, the DNA was redissolved in TE and quantified by spectrophotometry at 260 nm as described by Maniatis et al. (1982).

Small scale plasmid (pEMBL) preparations were made by alkaline lysis from 2 ml YTamp cultures grown to saturation. To remove contaminating RNA the pellet was redissolved in 0.1 ml TE and treated for 1 hour with 5 µl RNase A (5 mg/ml, boiled for 10 minutes). This was mixed with 0.06 ml 20% polyethylene glycol 8000/2.5 M ammonium acetate (pH 7.0) and chilled on ice for 15-30 minutes. After centrifugation for 10 minutes in a microfuge (12,000 x g, room temperature) the supernatant was removed with a fine tipped Pasteur pipette. The pellet was washed with 80% ethanol, chilled at -70°C for 5 minutes, spun in the microfuge for 1 minute, dried in vacuo, and redissolved in 0.05 ml TE. This plasmid DNA was suitable for double stranded DNA sequencing by the dideoxy terminator method (Sanger et al., 1977 and below).

2. Preparation of single stranded DNA from pEMBL plasmids

A single E. coli colony harbouring a pEMBL plasmid was picked from a plate spread the previous night and used to inoculate a 2 ml culture of 2X YTamp. The culture was shaken vigorously at 37°C until the A600 was approximately 0.1 to 0.2 (1-2 hours). The culture was then infected with 0.01 ml of helper bacteriophage IR1 supernatant (10^- 10*0 pfu) and grown until the A600 reached 0.5 to 0.6 (5-6 hours).
The culture was cleared by centrifuging for 1 minute in a microfuge and 1.0 ml of phage supernatant was transferred to a new tube. The phage were collected by mixing with 0.3 ml 2.5 M ammonium acetate/20% PEG and chilling on ice for 15-30 minutes. After centrifugation in a microfuge at 4°C for 10 minutes the supernatants were withdrawn with a flame drawn Pasteur pipette. The last droplets of supernatant were collected by a brief spin and removed. The phage pellet was redissolved in 0.2 ml TE and extracted once with an equal volume of TE equilibrated phenol and then twice more with an equal volume of phenol:chloroform (1:1). The aqueous phase was mixed with 0.5 volumes of ammonium acetate and precipitated with 2.5 volumes of ethanol. After washing with ethanol the pellet was redissolved in 50 µl of TE.

3. Preparation of Drosophila genomic DNA

Genomic DNA was prepared from adult flies that had been frozen at -70°C. Approximately 0.1-0.2 g (100-200 flies) were ground to a paste with a flame rounded glass rod in the bottom of a 1.8 ml microfuge tube containing 0.3 ml 100 mM Tris-HCl (pH 9.4), 200 mM NaCl, 10 mM EDTA, and 0.5% SDS (on ice). The paste was diluted with 0.5 ml of the same buffer and placed at 70°C for 20 minutes. This was mixed with 0.15 ml 8 M potassium acetate, chilled on ice for 30 minutes, and spun in a microfuge for 10 minutes at 4°C. The supernatant was extracted twice with 0.8 ml of phenol:chloroform (1:1) and precipitated at room temperature with two volumes of ethanol. The pellet was redissolved in 0.4 ml TE, treated with RNase A, re-extracted once with phenol:chloroform, and precipitated with 2.5 volumes of ethanol. The pellet was redissolved in 0.1-0.2 ml TE and quantified by electrophoresis in 0.5% agarose gels. The yield varied from 50-100 µg, and the DNA could be stored at 4°C for more than a year without significant degradation. This DNA was suitable for restriction endonuclease analysis and cloning from size fractionated DNAs (see below).

4. General nucleic acid techniques

Restriction endonuclease digestions, DNA fragment subcloning, agarose and polyacrylamide electrophoresis, Southern blotting, and other standard techniques were performed essentially as described by Maniatis et al. (1982). Minor modifications and choices of standard techniques are described briefly below.

- Precipitation of nucleic acids with ethanol: DNA or RNA was precipitated from aqueous solutions by adding one half volume of 7.5 M ammonium acetate (final 2.5 M) and 2.5 to 3 volumes of ethanol, and chilling at -70°C for 30 minutes or -20°C overnight. The precipitates were collected by centrifugation in a microfuge for 10 minutes at room temperature or in a Sorvall SS34 rotor at 12,000 x g at 4°C for 20 minutes. DNA pellets were then washed with 80% ethanol and dried in vacuo.

- Elution of DNA fragments from agarose gels: After staining with EtBr the agarose gel (DNA grade, Bio-Rad), or slices thereof, were protected from visible light and given only minimal exposure to UV radiation (preferably at longer wavelengths, 302-360 nm). DNA fragments were eluted from gel slices by electroelution into dialysis tubing using 1X TAE buffer. After the eluate was removed, the tubing was washed with a small volume of TE, the two were combined, and the solution was concentrated by several extractions with n-butanol. This concentrated eluate was extracted once with phenol:chloroform and precipitated with ethanol as above.

- Dephosphorylation with calf intestinal phosphatase (CIP): The 5' ends of DNA fragments (5 pmoles) were dephosphorylated with CIP (BM, 25,000 units/ml) by adding 10 units of enzyme directly to the restriction buffer and incubating for 30 minutes at 37°C (for 5' overhanging ends) or 50°C (for 3' overhanging or blunt ends). Dephosphorylated DNAs were extracted once with phenol:chloroform and recovered by precipitation with ethanol.
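The "final 2.5 M" figure quoted for the ethanol precipitation step follows from a simple dilution. A minimal sketch of that arithmetic is given here, assuming the volumes are additive and the sample itself contributes no ammonium acetate; the function name is illustrative and not part of the original protocol.

```python
def final_concentration(stock_molar, stock_volumes, sample_volumes=1.0):
    """Concentration after mixing a stock solution into a sample.

    Volumes are relative, so any consistent unit can be used.
    Assumes additive volumes and no solute in the sample itself.
    """
    total_volumes = stock_volumes + sample_volumes
    return stock_molar * stock_volumes / total_volumes


# One half volume of 7.5 M ammonium acetate added to one volume of sample.
acetate_molar = final_concentration(7.5, 0.5, 1.0)
print(f"Ammonium acetate before ethanol is added: {acetate_molar:.2f} M")
```

The same helper applies to any step in which a concentrated stock is added to a sample, such as mixing a reaction with an equal volume of 2X buffer.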
- Filling in 3' ends with the Klenow fragment of DNA polymerase I: Approximately 1-2 pmoles of 3' ends were incubated at 37°C for 10 minutes in 10-20 µl reactions of 50 mM Tris-HCl (pH 7.5), 10 mM MgCl2, 10 mM DTT, 100 µg/ml BSA, 0.1 mM dNTPs, and 1-2 units of Klenow enzyme. The reactions were stopped with EDTA (10 mM final) and heated at 75°C for 10 minutes. If necessary, excess dNTPs were removed by precipitation with ethanol and ammonium acetate.

- Ligations with T4 DNA ligase: DNAs (0.1-0.2 µg) were mixed in 10-20 µl 50 mM Tris-Cl (pH 8.0), 10 mM MgCl2, 0.1 mg/ml BSA, 10 mM DTT, 1 mM ATP, and 0.5-1 unit T4 DNA ligase and incubated at 15°C overnight.

- Treatment of DNA with exonuclease III: DNA to be digested was suspended at 100-150 pmoles/ml in 50 mM Tris-Cl (pH 8.0), 10 mM MgCl2, 10 mM DTT, 0.1 mg/ml BSA containing 2-5000 units/ml exonuclease III (BM). After incubation at 37°C for the appropriate time (30 seconds to 5 minutes) the DNA solution was added to an equal volume of 2X S1 buffer (10X = 2 M NaCl, 0.5 M NaOAc [pH 4.5], 10 mM ZnSO4, 5% glycerol) containing 250 units/ml S1 nuclease. After digestion at room temperature for 30 minutes the reactions were terminated by phenol:chloroform extraction and precipitated with ethanol and ammonium acetate. For religation the digested DNAs were treated with Klenow enzyme as described above.

5. Labelling of nucleic acid probes

- Nick translation of DNA fragments with DNA polymerase I (Rigby et al., 1977): DNA fragments were purified from agarose gels and nick translated in 50 µl 50 mM Tris-Cl (pH 7.5), 5 mM MgCl2, 0.1 mg/ml BSA, 5 mM DTT, 0.2 mM CaCl2, 20 µM each unlabeled dNTP, 1.8 µM [α-32P] dNTP (500 Ci/mmole, NEN), 1 ng/ml DNase I (freshly diluted in 10 mM Tris-Cl pH 7.5, 5 mM MgCl2, 1 mg/ml BSA), 10 µg/ml DNA, and 10 units DNA polymerase I, and incubated at 15°C for 2.5 hours. The reactions were stopped with EDTA and SDS to 10 mM and 1%, respectively, heated to 68°C for 10 minutes and mixed with 25 µg carrier tRNA.
Unincorporated nucleotides were then removed by two precipitations with ethanol as described above. Prior to addition to the hybridization solution the labelled DNA was denatured by boiling for 10 minutes and quickly chilled on ice.

- Labelling single strand RNA with T7 RNA polymerase (Melton et al., 1984): pArg plasmid DNA (1 µg) was linearized with Hind III and purified by phenol extraction and precipitation with ethanol. The DNA was redissolved in diethyl pyrocarbonate treated water (Maniatis et al., 1982) and transcribed in 20 µl 50 mM Tris-Cl (pH 7.5), 6 mM MgCl2, 2 mM spermidine, 5 mM DTT, 0.5 mM each unlabeled rNTP, 25 µM [α-32P] rNTP (200 Ci/mmole, Amersham), 1000 units/ml RNasin (Promega), and 70 units T7 RNA polymerase (BM). After incubation for 1 hour at 37°C the DNA was digested with 100 units RNase-free DNase (BRL) for 10 minutes at 37°C and the RNA was purified by phenol:chloroform extraction. Carrier tRNA (25 µg) was added and the mixture was precipitated with ethanol.

- 3' end labelling of tRNA with T4 RNA ligase (England and Uhlenbeck, 1978): [32P]pCp was synthesized in 10 µl containing 1.2 nM Cp (2'+3'), 50 mM Tris-Cl (pH 8.0), 10 mM MgCl2, 10 mM DTT, 100 µCi [γ-32P] ATP (3000 Ci/mmole, NEN) and 5 units of T4 polynucleotide kinase. The reaction was incubated at 37°C for 1 hour and stopped by heating to 100°C for 1 minute. For 3' end labelling of tRNA, 5 µl [32P]pCp was added to 50 pmoles tRNA (1.5 µg) in 20 µl containing 50 mM Hepes (pH 7.5), 15 mM MgCl2, 5 mM DTT, 0.1 mg/ml BSA, 10% DMSO, 20 µM ATP, and 5-10 units T4 RNA ligase (P.L. Biochemicals). After incubation overnight at 4°C the reaction was stopped with 25 mM EDTA, 1% SDS and heated at 65°C for 10 minutes. The labelled tRNA was mixed with 10 mg E. coli tRNA, purified by chromatography over AcA54 resin (10 cm x 0.5 cm) in 200 mM NaCl, 10 mM Tris, 2 mM EDTA, 0.1% SDS, and precipitated with ethanol.
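The 3' end labelling step equates 50 pmoles of tRNA with 1.5 µg. As a rough consistency check, the molar mass implied by that pair of numbers can be worked out as below. This is only a sketch: the function names are hypothetical, and the resulting value is what the quoted figures imply, not a measured constant.

```python
def implied_molar_mass(mass_ug, amount_pmol):
    """Molar mass (g/mol) implied by a mass in micrograms and an amount in picomoles."""
    grams = mass_ug * 1e-6
    moles = amount_pmol * 1e-12
    return grams / moles


def mass_in_ug(amount_pmol, molar_mass):
    """Mass in micrograms of a given amount (pmol) at a given molar mass (g/mol)."""
    return amount_pmol * 1e-12 * molar_mass * 1e6


# 50 pmoles of tRNA taken as 1.5 micrograms, as stated in the protocol.
trna_molar_mass = implied_molar_mass(1.5, 50)
print(f"Implied tRNA molar mass: {trna_molar_mass:.0f} g/mol")
```

The second helper runs the conversion in the other direction, which is convenient when a protocol specifies picomoles but the stock is quantified by mass.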
- 5' end labelling of synthetic oligonucleotides with T4 polynucleotide kinase (Maniatis et al., 1982): 100 µCi [γ-32P] ATP (3000 Ci/mmole, NEN) and 10 pmoles oligonucleotide in a 10 µl volume containing 50 mM Tris-Cl (pH 8.0), 10 mM MgCl2, 10 mM DTT, and 10 units T4 polynucleotide kinase were incubated for 1 hour at 37°C, and the reaction was stopped by heating at 100°C for 1 minute. The labelled oligonucleotide was either used directly or purified by two precipitations with ethanol in the presence of 10 µg carrier E. coli tRNA.

6. Filter hybridizations

- [32P] DNA/DNA hybridizations were performed at 65°C in 5X SSPE, 5X Denhardt's solution, 0.5% SDS, and 0.1-0.2 mg/ml sonicated, denatured salmon sperm DNA. Nitrocellulose (Schleicher and Schuell BA54) or nylon filters (Hybond-N, Amersham) containing bound DNAs were prehybridized for 1-12 hours and then fresh hybridization solution was added containing 1-5 x 10^6 cpm/ml (Cerenkov counts) of denatured 32P-labelled probe. Hybridization was continued for 12-24 hours and the filters were washed in 2X-0.2X SSPE containing 0.5% SDS at 65°C.

- [32P] RNA/DNA hybridizations were performed at 37-42°C in 50% de-ionized formamide, 5X SSPE, 5X Denhardt's solution, 0.5% SDS and 50-100 µg/ml E. coli tRNA. High stringency hybridizations were at 42°C. Hybridization times and washes were as above.

The excess moisture was removed from the filters with Kimwipes and they were wrapped in Saran wrap and exposed to autoradiography. If an intensifying screen was necessary the films were exposed at -70°C.

7. Cloning of size fractionated DNAs into pEMBL mini-libraries

Drosophila genomic DNA (25-50 µg) was digested to completion with restriction endonucleases, purified by phenol:chloroform extraction, and precipitated with ethanol as above. The DNA fragments were redissolved in 1X TAE buffer and fractionated by electrophoresis in 0.7-1.0% agarose gels containing 1X TAE buffer.
After visualizing the DNA by staining in 1 µg/ml EtBr, the gel was protected from visible light. The appropriate size range of DNA was cut out in a gel band and purified as described above. The DNA fragments were then ligated into pEMBL vectors with increasing molar ratios of insert to vector. Ligation mixtures containing the optimal molar ratio of DNAs (e.g. the maximum number of amp^r colonies) were used to transform E. coli DH1 (Hanahan, 1984) and the resulting ampicillin resistant colonies were selected on YTamp plates. Colonies (500-5000, 200-500/plate) grown to a diameter of less than 1 mm were chilled at 4°C for several hours and then transferred to 98 cm nitrocellulose (S&S) or Hybond-N (Amersham) filter circles (Grunstein and Hogness, 1975). Pin pricks were used to orient the filters to these master plates. The filters were transferred to fresh YTamp plates and the colonies were grown until 1-2 mm in diameter. The master plates were regrown until the colonies were visible and stored at 4°C wrapped in cellophane. The filters were then denatured and neutralized by treatment for 5 minutes with 0.5 N NaOH/1.5 M NaCl and 1.0 M Tris-Cl/1.5 M NaCl (pH 7.5), respectively. The filters were air dried and baked for 2 hours at 80°C (nitrocellulose filters) or exposed to UV on a transilluminator (260 nm) for 3 minutes (nylon filters). The filters were then scrubbed in 2X SSPE until no cell debris remained, air dried, and then screened by filter hybridization (above).

8. DNA sequencing with single strand or double strand templates

All DNA sequencing was done by the dideoxy terminator method of Sanger et al. (1977) with synthetic oligonucleotide primers (FP1 and RP1) and either single stranded or double stranded pEMBL recombinant DNAs as templates. Single stranded pEMBL8- DNAs were prepared as described above. For sequencing double stranded templates, plasmid DNA (2-3 µg) was denatured with 0.2 N NaOH for 5 minutes at room temperature.
The DNA was neutralized by addition of one-half volume of 7.5 M ammonium acetate (pH 7.0) and precipitated with 3 volumes of ethanol. The DNA pellets were washed with 80% ethanol, dried in vacuo, and redissolved in 5 µl of water. This solution was mixed with 2 µl 10X HIN.LS buffer (100 mM NaCl, 500 mM Tris-Cl (pH 7.5), 100 mM MgCl2) and 1 µl 17-mer (FP1 or RP1, 5 µg/ml), hybridized at 37°C for 15 minutes and then chilled on ice. To this was added 1 µl of 15 µM dATP, 1 µl 0.1 M DTT, 1-1.5 µl of [α-32P] dATP (3000 Ci/mmole, NEN), and 4.5 µl of a solution containing 1 mg/ml BSA, 100 mM potassium phosphate (pH 7.5) and 500 units/ml Klenow enzyme. This mix was divided into four 3.5 µl aliquots to which were added 1.5 µl of each dideoxy/deoxyribonucleotide mix (Sanger et al., 1977). The four reactions were placed at 37°C for 5-10 minutes and then chased with 1 µl of 2 mM each dNTP for an additional 5-10 minutes. After addition of 5 µl 98% formamide dye mix (XC and BPB at 0.1%) the reactions were heated at 80°C for 3 minutes. Aliquots (0.5-1.0 µl) were loaded onto 5-8% polyacrylamide:bis (29:1) gels and fractionated by electrophoresis as described (Maxam and Gilbert, 1980). Often the 8% gels were wedged from top to bottom (0.2-0.6 mm) using different numbers of spacers.

Single stranded pEMBL DNA templates were treated similarly except that the amount of DNA was reduced (0.5 µg) and the template DNA:primer mixture was heated to 65°C and slowly cooled to room temperature instead of being treated with NaOH. Subsequent steps were the same as for double stranded templates. DNA sequences were usually obtained from sets of overlapping deletions generated with exonuclease III and S1 nuclease. These strategies were described previously (Newton, M.Sc., 1984; Henikoff, 1984). All sequences were compiled using the DBUTIL programs of Staden (1981).

9. In vitro transcription of tRNA genes

Transcription reactions were performed as described by St. Louis and Spiegelman (1985).
Standard transcription reactions were performed in 25 µl containing 9-12.5 µl of Drosophila Schneider cell S100 extract or HeLa cell nuclear extract (both approximately 6 mg/ml total protein), 10% glycerol, 20 mM Tris-Cl (pH 7.9), 5 mM MgCl2, 3 mM DTT, 100 mM KCl, 2.5 µg/ml α-amanitin, 6 units/ml creatine phosphokinase, 5 mM creatine phosphate, 0.6 mM each unlabeled ribonucleoside triphosphate, and 25 µM [α-32P]-labelled ribonucleoside triphosphate (3-5 Ci/mmole). The template DNA concentration was varied from 0.1-5.0 µg/ml (5-250 ng/reaction) in the presence of sufficient pEMBL DNA to maintain a constant 20 µg/ml. The reactions were incubated at 24°C for 90 minutes and terminated by addition of 25 µl of Stop solution (0.3 M NaCl, 0.5% SDS, 1 mg/ml proteinase K, 50 µg/ml E. coli tRNA). After 15 minutes at 37°C the reactions were extracted with phenol:chloroform and precipitated with ethanol. The nucleic acids were redissolved in 5 µl of Urea buffer (7 M urea, 5 mM EDTA, 0.1% BPB/XC), heated at 65°C for 5 minutes, and fractionated by electrophoresis in 8-10% polyacrylamide-urea gels (Maxam and Gilbert, 1977). After autoradiography, the bands corresponding to the transcription products were excised from the gel and quantified by counting Cerenkov radiation. In some experiments the KCl concentration of the reaction was varied from 45 mM to 125 mM. In these cases the lower volume of S100 extract was used (9.0 µl). Also, in some cases HeLa cell extracts were substituted for Drosophila extracts but the conditions were otherwise identical. The template preincubation assays are described in the respective figure legends but otherwise used the conditions described above.

10. Analysis of in vitro transcription products

- RNase T1 fingerprinting: [α-32P] GTP-labelled transcripts synthesized in vitro were excised from gels and eluted overnight at 37°C in a small volume of 0.3 M NaCl, 10 mM Tris-Cl (pH 7.5), 1 mM EDTA, 0.1% SDS.
After extraction with phenol:chloroform and precipitation with ethanol, the labelled RNA (5000-20,000 cpm Cerenkov radiation) was digested with 10 units RNase T1 (BM) in 5 µl TE for 30 minutes at 37°C. The products were desiccated in vacuo and redissolved in 2 µl 98% de-ionized formamide. The mix of oligoribonucleotides and 1 µl of dye markers (0.33% each XC, orange G, acid fuchsin) were spotted at the origin of a cellulose acetate strip (3 cm x 35 cm) that had been soaked in 5% HOAc, 7 M urea, 5 mM EDTA (pH 3.5), blotted dry, and covered with Saran wrap. The oligoribonucleotides were then fractionated in the first dimension by high voltage electrophoresis (Shandon) at 3000 V until the XC had migrated 8-10 cm. The products were then transferred by capillary blotting with water to a DEAE cellulose plate (20 cm x 20 cm) that had previously been developed at 65°C in 1 mM EDTA. The origin was washed with water and dried, and the plate was then developed in the second dimension with 20 mM KOH homomix (Krupp and Gross, 1983) at 65°C until the front had reached the top. The DEAE cellulose plate was then dried and exposed to autoradiography.

- 5' end analysis by primer extension with reverse transcriptase: In vitro transcripts synthesized in reactions containing all unlabeled ribonucleotides (625 µM) were purified and redissolved in 8 µl. To this was added 1 µl (3 pmoles) of Arg1 20-mer labelled at its 5' end with [γ-32P] ATP. The RNA and primer were heated at 80°C for 1 minute, brought to 0.3 M NaCl with 1 µl 3 M NaCl, placed at 65°C for 10 minutes, and slowly cooled to room temperature. The reaction was then brought up to 50 µl in 100 mM KCl, 10 mM MgCl2, 100 mM Tris-Cl (pH 8.3 at 42°C), 10 mM DTT, 4% DMSO, 0.5 mM each dNTP, 500 units/ml RNasin (Promega) and 50 units/ml AMV reverse transcriptase.
After incubation for 1 hour at 42-55°C, the reaction was stopped with EDTA (10 mM), treated with RNase A (50 µg/ml) for 15 minutes at 37°C, and then extracted with phenol:chloroform and precipitated with ethanol. The extension products were analysed on 10-12% urea-polyacrylamide gels and visualized by autoradiography. The sizes of the extension products were determined from dideoxy sequencing ladders generated with the same end-labelled primer and denatured plasmid DNAs containing a tRNA^Arg template.

11. Plasmid constructions

These constructions, with the exception of pArg and pΔ27, are summarized in Figure 4 and are named according to the tRNA^Arg gene they contain.

- pArg: This subclone of pDt27R (Figure 1) was originally constructed by J. Leung in M13mp9 (1988). It consists of an 81 bp Hae III/Dde I restriction fragment made blunt ended with Klenow enzyme. This fragment contains the coding region of R12.2 (tRNA^Arg) from the Hae III site at position 11 to a Dde I site 3 bp outside the 3' end of the gene. For this work the insert was removed from M13mp9 by digestion with Hind III/Eco RI and the resulting fragment was ligated into the corresponding sites of the T7/SP6 transcription vector SPT18 (BM).

- pΔ27: A 3.2 kbp Hind III/Pst I fragment of pDt27R contained in the corresponding sites of M13mp9 was linearized at the Hind III site and treated with exonuclease III. Aliquots were taken at 15, 30 and 45 seconds and treated with S1 nuclease as described above. The deleted DNAs were then digested with Pst I and the released blunt end/Pst I end inserts were recloned into the Sma I/Pst I sites of pEMBL8-. The extent of deletion was determined by dideoxy sequencing with the RP universal primer. A subclone was selected in which 669 bp had been removed from the original Hind III site of pDt27R. This fragment lacks both tRNA^Ser genes but retains all the duplicated tRNA^Arg regions of pDt27R.
- pR12.4: The Ava II fragments of pDt27R were first made blunt with Klenow enzyme and then inserted into the Sma I site of pEMBL8-. A recombinant was selected that contains a 597 bp Ava II fragment. This fragment contains the R12.4 gene contained in the 600 bp repeat R4. It consists of 154 bp of 5' flanking sequence and 370 bp of 3' flanking sequence. The 3' flank is composed of 80 bp of unique flank joined to the 5' flank of the downstream R3 repeat.

- pR12.2: This plasmid contains a 205 bp Dra I fragment of pDt27R inserted directly into the Sma I site of pEMBL8-. This fragment contains a gene (R12.2) found in one of the 200 bp repeats (R2) of pDt27R. The 5' flank is composed of the 30 bp of endogenous flank linked to the upstream 76 bp of 3' flank in the adjacent R1 repeat. The 3' flank of pR12.2 ends 22 bp downstream from the gene, just after the poly dT termination signal, where it joins the Sma I site of pEMBL.

- pR12.5: This plasmid contains a derivative of the 850 bp Hha I fragment of pDt17R. This fragment contained the R12.5 tRNA^Arg gene and a tRNA^Ser gene located 276 bp downstream. The fragment was first made blunt by treatment with Klenow enzyme and then redigested with Eco RI to generate a 702 bp fragment that was ligated into the Eco RI/Sma I sites of pEMBL8+. The resulting recombinant was linearized by digestion at the Sal I and Pst I sites in the polylinker of pEMBL (downstream from the tRNA genes). The linearized DNA was treated with exonuclease III to remove 180 bp that contained the tRNA^Ser gene coding sequence. The remaining 522 bp insert was recircularized and contains only the R12.5 coding sequence.

- pR85.1: This plasmid contains the 1.05 kbp Pst I fragment of pDt85C ligated directly into the Pst I site of pEMBL8+. This clone contains approximately 430 bp of 5' flanking sequence, the R85.1 tRNA^Arg coding region and 539 bp of 3' flanking sequence.
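The fragment sizes quoted for these constructs can be cross-checked with simple arithmetic. The sketch below uses only the lengths stated above; the helper names are hypothetical, and the coding-region length it derives is not stated in the text but is what the pR12.4 figures imply.

```python
def implied_gene_length(fragment_bp, flank5_bp, flank3_bp):
    """Length left for the coding region once both flanks are subtracted."""
    return fragment_bp - flank5_bp - flank3_bp


def insert_after_deletion(insert_bp, deleted_bp):
    """Insert length remaining after an exonuclease III deletion."""
    return insert_bp - deleted_bp


# pR12.4: 597 bp Ava II fragment with 154 bp 5' flank and 370 bp 3' flank.
gene_bp = implied_gene_length(597, 154, 370)

# pR12.5: 702 bp insert with 180 bp removed by exonuclease III.
pr12_5_bp = insert_after_deletion(702, 180)

# pR85.1: about 430 bp 5' flank + coding region + 539 bp 3' flank,
# for comparison with the stated 1.05 kbp Pst I fragment.
pr85_1_bp = 430 + gene_bp + 539

print(gene_bp, pr12_5_bp, pr85_1_bp)
```

Applying the pR12.4-derived coding length to pR85.1 is an assumption that the two genes are the same size; under that assumption the pR85.1 total lands close to the quoted 1.05 kbp.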
- pR85.2: This plasmid contains the 1.7 kbp Eco RI fragment of pDt85C ligated into the Eco RI site of pEMBL8+. The R85.2 gene in this fragment contains 413 bp of 5' flanking sequence and approximately 11.3 kbp of 3' flanking sequence.

- pR83.1: This plasmid contains the 1.35 kbp Hind II fragment of pDt66R inserted into the Sma I site of pEMBL8-. The R83.1 gene is flanked by 141 bp of 5' sequence and approximately 700 bp of 3' sequence. Preceding the 5' flanking sequence is an additional 400 bp Hind II/Hind III region of pBR322 that was subcloned along with the D. melanogaster sequences.

- pR19.1: This plasmid contains the 1.0 kbp Hind III/Pst I fragment of pDt67R ligated directly into the corresponding sites of pEMBL8+. The R19.1 tRNA^Arg gene is flanked by approximately 580 bp of 5' sequence and 450 bp of 3' sequence.

- pR5'3': This plasmid is a fusion of the pR12.4 and pR85.2 genes at their common Bam HI site located within the tRNA coding region. pR12.4 was digested with Bam HI, treated with CIP, and then purified by phenol extraction and ethanol precipitation. This releases a fragment containing the 5' flanking sequence and the 5' half of the tRNA^Arg coding sequence. The 3' half of the gene and adjacent flanking sequence remain attached to the vector. pR85.2 was then digested with Eco RI and treated with CIP. This DNA was then redigested with Bam HI to release a Bam HI fragment that contains the 5' flank and 5' half of the gene coding region. The pR12.4 and pR85.2 DNAs were then mixed and ligated via their Bam HI sites for transformation into E. coli. The only combination of ligatable fragments and vectors that will give rise to ampicillin resistant colonies is the 5' flank and gene half from pR85.2 inserted into the 3' gene half and flanking sequence of pR12.4. The 5' half of this fusion coding region contains the T13 nucleotide that distinguishes pR85.2 from the original pR12.4 plasmid.
This construction was confirmed by sequencing both ends of the fused insert to show that the 3' flank was derived from pR12.4 and that the 5' flank was derived from pR85.2 (data not shown).

- pR12.4T13: The C13 nucleotide of pR12.4 was changed to a T13 residue using the 20-mer oligonucleotide Arg1 (see Materials) as described by Zoller and Smith (1984). Briefly, Arg1 was phosphorylated with ATP and T4 polynucleotide kinase. Approximately 0.5 µg single stranded pR12.4 DNA and 50 pmoles phosphorylated Arg1 were mixed in 10 µl 100 mM Tris-Cl pH 7.5, 20 mM NaCl, 20 mM MgCl2, heated to 65°C, and slowly cooled to room temperature. The mixture was brought to 20 µl containing 10 mM DTT, 0.1 mg/ml BSA, 0.5 mM ATP, 0.5 mM each dNTP, 3 units T4 DNA ligase, and 2.5 units Klenow enzyme. After incubation overnight at 15°C this DNA was used to transform E. coli DH1. Resulting ampicillin resistant colonies were transferred to Hybond-N filters and screened with 5' 32P-labelled Arg1 at 42°C in 6X SSC, 5X Denhardt's solution, 0.1% SDS for differential colony hybridization. After washing at 37°C in 6X SSPE, 0.1% SDS, approximately 2% of the recombinant clones hybridized Arg1 more intensely than background colonies. DNA sequencing of one of these stronger hybridizing colonies with the tRNA^Arg specific 17-mer primer Arg2 showed that it contained the correct C13 to T13 transition (data not shown).

RESULTS AND DISCUSSION

Part 1: The structure of the tRNA^Arg gene family in Drosophila

The first part of this study describes the structural analysis of the tRNA^Arg gene family in Oregon R strains of Drosophila melanogaster. The four tRNA^Arg genes found originally in pDt27R were likely to constitute only part of a larger gene family. Additional copies of this gene family were identified by hybridization with a homologous probe (pArg) prepared from the tRNA^Arg gene coding regions of pDt27R.
The chromosomal sites to which this probe was homologous were identified by in situ hybridization. The restriction fragments containing these homologous sequences were then identified in plasmids that had previously been cloned with different tRNA probes (Dunn et al., 1979a) or, where necessary, were cloned directly using pArg as a homologous probe.

1. Genomic organization of the tRNA^Arg gene family

1.1 In situ hybridization with pArg

The overall chromosomal organization of this family was determined by in situ hybridization of pArg to squashes of polytene salivary chromosomes (courtesy of Dr. S. Hayashi). The results in Figure 2 show that in addition to the 12E1-2 site from which pDt27R was originally derived, three different chromosomal sites hybridize this probe. A second X chromosome locus is located at polytene region 19F near the chromocentre and two autosomal sites occur on chromosome 3R at polytene regions 85C and 83AB. Differences in signal intensity between these sites relative to the minimum of four pDt27R gene copies located at 12E1-2 suggest that the 85C site contains approximately 2-3 gene copies and that the 19F and 83AB sites contain only 1 gene copy each (S. Hayashi, personal communication). This shows that the four tRNA^Arg genes found in pDt27R are indeed part of a larger family of identical or closely related gene copies.

Figure 2. Chromosomal organization of tRNA^Arg genes in D. melanogaster. Squashes of salivary chromosomes from third instar D. melanogaster larvae (gt^w3/y sc In(1) gt^x11) were hybridized in 70% formamide buffer at 35°C (Hayashi et al., 1980) with [125I] CTP labelled T7 RNA polymerase transcripts of the tRNA^Arg probe pArg. Sites of hybridization were visualized by autoradiography and are assigned to the polytene regions indicated by arrows (courtesy of Dr. S. Hayashi).
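The copy-number reasoning from the in situ signals can be tallied explicitly. The following is a minimal sketch, assuming the relative signal estimates quoted above (a minimum of four copies at 12E1-2, approximately 2-3 at 85C, and one each at 19F and 83AB); the dictionary and function names are illustrative only and do not come from the original analysis.

```python
# Estimated tRNA-Arg gene copies per polytene site, as (low, high) bounds.
# The numbers are the estimates quoted in the text; the names are hypothetical.
SITE_COPY_ESTIMATES = {
    "12E1-2": (4, 4),  # four genes in pDt27R, described as a minimum
    "85C": (2, 3),     # "approximately 2-3 gene copies"
    "19F": (1, 1),
    "83AB": (1, 1),
}


def total_copy_range(estimates):
    """Return (low, high) totals summed over all sites."""
    low = sum(lo for lo, _ in estimates.values())
    high = sum(hi for _, hi in estimates.values())
    return low, high


if __name__ == "__main__":
    low, high = total_copy_range(SITE_COPY_ESTIMATES)
    print(f"Family size suggested by in situ signal: {low}-{high} copies")
```

Because the 12E1-2 value is stated as a minimum and the 85C value is approximate, the totals should be read as a rough range rather than an exact gene count.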
1.2 Genomic Southern analysis

To identify the genomic DNA restriction fragments that carry these gene copies, Southern analysis of Oregon-R Drosophila genomic DNA was performed using the pArg probe. The results show that a total of 5 different sized Hind III fragments hybridize the probe (Figure 3). The strongest hybridization signal arises from a fragment that is 6.5 kbp in length and therefore likely corresponds to the equivalent sized insert of pDt27R containing four tRNA^Arg gene copies. The remaining fragments are 9.5 kbp, 8.1 kbp, 5.5 kbp, and 3.0 kbp in size. The signal intensity of the 5.5 kbp fragment is approximately one half that of the pDt27R fragment, and this fragment is therefore predicted to contain at least two gene copies. The remaining 9.5 kbp, 8.1 kbp, and 3.0 kbp fragments are weaker still and therefore probably contain only single gene copies. These data show that at least 5 different genomic Hind III fragments hybridize with the tRNA^Arg probe. One likely corresponds to the previously cloned insert of pDt27R while the remaining four fragments contain additional members of this gene family.

The only other known tRNA^Arg related sequences in the Drosophila genome are the genes encoding tRNA2^Arg. Although closely related in primary sequence (84%), these two arginine isoacceptors do not cross-hybridize under the stringent conditions of in situ hybridization (Hayashi et al., 1980), which suggests that the bands seen in Figure 3 are specific to genes that are identical or closely related to those found in pDt27R.

One additional Hind III fragment is not visible in Figure 3. This fragment also contains tRNA^Arg sequences but was missed in this analysis because of its small size (359 bp). This fragment was isolated by J. Leung (1988) and is described below.

Figure 3. Genomic Southern analysis of tRNA^Arg coding sequences in D. melanogaster. Approximately 5 µg of D.
melanogaster (Oregon R) genomic DNA was digested with restriction endonucleases and fractionated by electrophoresis through 1.0% agarose gels. After denaturation and transfer to nylon membranes, the bound DNA fragments were hybridized to 32P-labelled pArg transcripts and exposed to autoradiography. Digestions were performed with Hind III (lane a), Eco RI (lane b), Pst I (lane c) and Bam HI (lane d). The size standards on the right were λ DNA digested with Hind III and Eco RI. The sizes of the Hind III fragments that hybridize pArg are indicated on the left with arrows.

2. Molecular analysis of the tRNA^Arg gene family

2.1 Identification and isolation of recombinant clones containing the tRNA^Arg gene family

Recombinant plasmids containing each of the Hind III fragments visible on genomic Southerns were either identified from previously cloned Drosophila Hind III fragments (Dunn et al., 1979a) or were cloned directly using tRNA^Arg probes. The chromosomal location of each recombinant insert was identified by in situ hybridization (S. Hayashi, unpublished results) and the corresponding genomic fragments were confirmed by Southern blotting (data not shown). A summary of these plasmids is given in Table 1. The regions of the inserts that contained the tRNA^Arg coding region were determined by DNA sequencing and are shown in Appendix I.

-pDt66R, pDt67R, and pDt72R: Two of the Drosophila Hind III fragments identified in Figure 3 were isolated previously on the basis of hybridization to a tRNA5^Lys preparation (Dunn et al., 1979a). These Hind III fragments were subsequently shown by in situ hybridization not to be derived from known tRNA5^Lys loci (DeFranco et al., 1982) but instead to be derived from two of the tRNA^Arg chromosomal loci identified in Figure 2 (S. Hayashi, unpublished results). pDt66R contains the 8.1 kbp Hind III fragment seen in Figure 3 and is derived from the pArg site at polytene region 83AB.
pDt67R contains the 3.0 kbp genomic Hind III fragment and is derived from the second X chromosome site located at 19F. A third Hind III fragment contained in pDt72R was also isolated with this lysine tRNA preparation. It is 6.5 kbp in length and therefore should co-migrate on genomic Southerns with the similar sized insert in pDt27R. This fragment cross-hybridizes with Drosophila repetitive sequences (see Part IV) and could not be localized to a single polytene region in situ (S. Hayashi, unpublished observations) or to a single band on genomic Southerns (data not shown). The reason these plasmids were isolated with a tRNA5^Lys probe was subsequently explained by the fact that this tRNA preparation also contained a contaminant tRNA whose partial sequence closely matched that of the tRNA^Arg sequence predicted from pDt27R (Cribbs 1982a, Cribbs et al., 1982). An unusual feature of this tRNA, and in part the reason it went undetected in the tRNA5^Lys preparation, was that it uniformly lacked 5 nucleotides from the 3' end and thus could no longer be aminoacylated with [14C]-arginine in vitro.

Table 1. Summary of plasmids containing the tRNA^Arg gene family

Plasmid    Hind III fragment  Chromosome  Polytene site  tRNA^Arg gene  Polymorphic sites
                                                                        13   16   37
pDt27R*    6.5 kbp            X           12E1-2         R12.1          C    T    G
                                                         R12.2          C    T    G
                                                         R12.3          C    T    G
                                                         R12.4          C    T    G
pR12.6     0.38 kbp           X           12E1-2         R12.6          T    T    G
pDt17R*    9.5 kbp            X           12E1-2         R12.5          T    T    G
pDt67R     3.0 kbp            X           19F            R19.1          T    T    G
pDt85C     5.5 kbp            3R          85C            R85.1          T    T    G
                                                         R85.2          T    T    G
pDt66R     8.1 kbp            3R          83AB           R83.1          T    A    A
pDt72R**   6.5 kbp            ?           ?              R?.1           -    -    G

* These plasmids also contain genes encoding tRNA^Ser and are described in detail by Leung (PhD Thesis 1988) and by Cribbs et al. (1987).
** This plasmid could not be localized to a specific genomic fragment or chromosomal site due to cross-hybridization with repeated sequences in D. melanogaster genomes.

-pDt85C and pDt17R: Two Hind III fragments detectable in Figure 3 are 9.5 kbp and 5.5 kbp in length.
These were cloned directly using plasmid libraries of genomic Hind III restriction fragments enriched for the respective size ranges. The libraries were screened with pArg and the resulting positive clones, pDt85C and pDt17R, were isolated. They contained inserts of the predicted length, 5.5 kbp and 9.5 kbp, and hybridized to fragments of the same length on genomic Southerns (not shown). In situ hybridization showed that the 5.5 kbp insert in pDt85C is derived from the 85C pArg locus while the 9.5 kbp insert of pDt17R is derived from the same locus as pDt27R, polytene region 12E1-2 (S. Hayashi, unpublished). It was subsequently recognized that, like pDt27R, the 9.5 kbp insert of pDt17R had previously been cloned with serine tRNA probes (Dunn et al., 1979a). The fragment described here is given the same name (pDt17R) but it should be noted that this is an independent isolate and may contain strain specific differences that distinguish it from previous isolates (Dunn et al., 1979a, Cribbs 1982a). The serine tRNA genes in this plasmid have been described elsewhere (Cribbs et al., 1987, Leung 1988).

-pR12.6: One final Hind III fragment was obtained from a λ clone derived from the 12E1-2 region using pDt27R as a probe (Leung, 1988 Ph.D thesis). A 359 bp Hind III fragment is located approximately 10 kbp upstream from the genes in pDt27R (see Figure 4). This fragment hybridized with the tRNA^Arg-specific oligonucleotide Arg2 and was subsequently shown by DNA sequencing to contain a tRNA^Arg gene. Due to its small size this fragment was not identified in Figure 3; it was kindly provided by J. Leung.
The plasmids described above probably represent the entire tRNA^Arg gene family per haploid genome because they are derived from each of the four chromosomal sites that are detected by in situ hybridization with pArg (Figure 2), and they correspond to all of the genomic Hind III fragments that were detected by genomic Southern analysis (Figure 3). Additional members of this gene family may occur only if they are undetectable by these two types of hybridization experiment, for example if they lie in heterochromatic regions of salivary chromosomes or have inverted gene orientations that preclude hybridization with nucleic acid probes (Yen and Davidson, 1980). It was not tested whether additional small genomic fragments are detectable by Southern analysis, but their existence would not be consistent with the number of tRNA^Arg hybridizing fragments detected after digestion with different restriction enzymes (lanes b, c, and d, Figure 3). Therefore it is concluded that the plasmids described here constitute most, if not all, members of this tRNA^Arg gene family in Oregon R strains of D. melanogaster.

2.2 Organization and structure of the tRNA^Arg gene family

Figure 4 shows a summary of the structural analysis of the 7 Hind III inserts contained in these plasmids. In addition to the four tRNA^Arg coding regions found originally in pDt27R, six additional coding regions were identified in these Hind III fragments. All the gene copies are either identical or closely related in their mature coding regions (see below). None of the genes contains introns or the 3' terminal CCA sequence that is normally added post-transcriptionally to mature eucaryotic tRNAs. With the exception of the pDt27R and pDt17R inserts derived from 12E1-2, these plasmids contain no other tRNA genes that could be detected by hybridization with radiolabeled total 4S RNA (data not shown).
The tRNA^Arg genes are named by a prefix denoting their amino acid class (i.e. R), followed by the segment number of their polytene site (1-100), and finally by a decimal index that distinguishes the genes located at that site (i.e. R12.1-R12.6, R85.1-R85.2, etc.).

-The 12E1-2 polytene region (R12.1-R12.6): In addition to the four genes in pDt27R (R12.1-R12.4), two other Hind III fragments were derived from the 12E1-2 site. Chromosomal walking of the 12E1-2 region showed that a tRNA^Arg gene (R12.6) is located approximately 10 kbp upstream from the cluster of serine and arginine genes contained in pDt27R (Leung, 1988). This gene is identical in mature coding sequence to the pDt27R genes except for a single C-T substitution at position 13. Another tRNA^Arg gene, R12.5, is located within the 9.5 kbp insert of pDt17R. This gene is identical to the R12.6 gene and contains the same C-T substitution at position 13. The pDt17R insert has not been localized relative to pDt27R within the 12E1-2 region but must occur at least 15-20 kbp upstream or downstream (J. Leung, personal communication). As in pDt27R, the tRNA^Arg gene in pDt17R is interspersed with serine tRNA genes that are also carried on this Hind III fragment (filled bars in Figure 4).

Figure 4. The tRNA^Arg gene family in D. melanogaster. The Drosophila Hind III fragments that contain tRNA^Arg sequences are indicated by bold lines. Restriction endonuclease sites for Hind III (H), Eco RI (E), Pst I (P), and Bam HI (B) are indicated above the line. The locations of tRNA^Arg coding sequences are indicated by open boxes (not to scale). The direction of transcription is indicated by the arrows (5'-3' direction). Where present, additional serine tRNA genes are indicated by small cross bars. The filled bars beneath the restriction map correspond to the subclones used for in vitro transcription studies.
The names of each starting plasmid, the subclones derived thereof, and the chromosomal site from which they are derived (courtesy of Dr. S. Hayashi) are indicated on the right of each restriction map. The 10 kbp that separate the 6.5 kbp Hind III fragment in pDt27R from the 0.36 kbp Hind III fragment (pR12.6) are indicated by a thin slashed line (J. Leung, 1988).

[Figure 4: restriction maps of the Hind III inserts of pDt17R (12DE, X; subclone pR12.5), pDt27R (12DE, X; subclones pR12.6, pR12.4, pR12.2), pDt67R (19F, X; subclone pR19.1), pDt66R (83AB, 3R; subclone pR83.1), pDt85C (85C, 3R; subclones pR85.1, pR85.2), and pDt72R (multiple sites).]

Combined with the work of Cribbs et al. (1987) and Leung (1988), these results show that the single major tRNA gene cluster at 12E1-2 is composed of at least 14 genes. There are a total of 6 tRNA^Arg genes and probably 8 additional genes coding for two serine isoacceptors (tRNA4^Ser, tRNA7^Ser) and gene variants thereof. None of the 25 or more other purified tRNAs that have been tested by in situ hybridization localizes to the 12E1-2 region (Hayashi et al., 1980, 1982, Kubli, 1982).

-The 19F polytene region (R19.1): The second tRNA^Arg locus on the X chromosome is represented by the single 3.0 kbp Hind III fragment contained in pDt67R. This plasmid contains a single tRNA^Arg gene (R19.1) that is identical to the R12.5 and R12.6 genes. As judged by hybridization with 4S RNA, pDt67R contains no other tRNA genes, and no other tRNAs have been definitively localized to the 19F site by in situ hybridization. A possible exception, however, is the detection of weak signals using tRNA^Tyr probes (Suter and Kubli, 1988).

-The 85C polytene region (R85.1-R85.2): A single 5.5 kbp Hind III fragment in the plasmid pDt85C is derived from the 85C site on chromosome 3R. DNA sequencing and hybridization analysis showed that this fragment contains two identical tRNA^Arg gene copies.
They are identical in their mature coding sequence to the three X chromosome genes R19.1, R12.5, and R12.6, and likewise differ from the four pDt27R genes by 1 nucleotide. They are separated by 1.05 kbp and occur in the same transcriptional orientation. No other tRNA genes are present in this 5.5 kbp Hind III fragment and no other tRNA genes have been localized to this polytene region to date.

-The 83AB polytene region (R83.1): The plasmid derived from the 83AB polytene region (pDt66R) contains a single gene (R83.1) near one end of the 6.7 kbp Hind III fragment. This gene differs from all the above genes by two substitutions, at position 16 (T-A) and position 37 (G-A). The latter change accounts for the absence of the Bam HI restriction site that is present in all other gene copies. No other tRNA genes are present on this plasmid and to date no other tRNAs are predicted to occur at the 83AB site.

The mature coding regions of all ten gene copies are summarized in Figure 5. Of a total of 10 gene copies in this family, five could give rise to identical products, four could give rise to products that differ by 1 nucleotide (C13, pDt27R), and a single gene could give rise to a product that differs at 2 nucleotides (A16, A37, pDt66R). Whether all these genes are expressed in vivo remains to be shown.

The significance of the polymorphisms in these potential gene products is not clear. The C13 polymorphism in the four pDt27R genes (R12.1-R12.4) occurs within the A box of the RNA polymerase III ICR but still conforms to the ICR consensus sequence (Sharp et al., 1985) and therefore is not expected to affect gene template activity. This change does, however, alter the possible basepairing in the D-loop (positions 13:22) and changes the length of the predicted basepaired stem from 4 bp to 3 bp. In turn this may alter the overall three dimensional structure of the tRNA product. The changes in the 2 nucleotide variant R83.1 gene (A16, A37) are more complex.
The G37 to A37 transition changes the nature of the base at a site that frequently is modified post-transcriptionally. In turn, modification at position 37 is suspected to play a role in translation efficiency (reviewed by Bjork et al., 1987), which raises the possibility that the tRNA product from the R83.1 gene is functionally distinct from the products of other gene copies that have the guanylate residue at this position. The second variant nucleotide (T16-A16) also has potential consequences both in terms of gene template activity and tRNA function. The A16 transversion occurs within the Pol III ICR A box and replaces a conserved pyrimidine residue with a purine residue that is not found in any arginine tRNA and only rarely in all other known tRNAs (Sprinzl et al., 1987). It is not known what effect this change may have on gene template activity in vivo, but as a generalization, mutations away from the ICR consensus sequence (Sharp et al., 1985) lead to reductions in the stability of pre-initiation complexes and in the overall rate of template transcription (Koski et al., 1980, Folk and Hofstetter, 1983). However, it is shown below (Part III) that this template is still active in vitro.

Figure 5. The tRNA products of the tRNA^Arg gene family. The predicted gene products of all 10 tRNA^Arg genes are summarized in the cloverleaf structure shown. The backbone sequence of the cloverleaf corresponds to the mature coding sequence of the five identical gene copies (R12.5, R12.6, R19.1, R85.1, R85.2). The single nucleotide change that distinguishes the four pDt27R genes (R12.1-R12.4) is indicated by an arrow outside the cloverleaf at position 13. The two nucleotide changes that distinguish the single gene in pDt66R (R83.1) are also indicated outside the backbone at positions 16 and 37. The anticodon sequence is boxed.

[Figure 5: cloverleaf structure of the predicted tRNA^Arg product, with the variant nucleotides C13, A16, and A37 marked outside the backbone.]
The A16 replaces a uridine residue present in most other tRNAs that is also frequently modified post-transcriptionally to dihydrouridine. However, this modification does not appear to be essential for activity in vivo in at least one tRNA (Lo and Roy, 1982) and it is therefore not clear what consequence, if any, might be associated with this variant R83.1 arginine tRNA gene.

In addition to potential functional differences in the genes or their products, these structural heterogeneities raise the possibility that the products of this isocoding gene family account for more than one of the five species that can be resolved by chromatography of Drosophila arginine tRNAs (see Figure 21 and White et al., 1973). The single high abundance arginine tRNA species (tRNA2^Arg) has previously been shown to be distinct from this gene family (Silverman et al., 1979), which leaves the four remaining less abundant species as potential products of this gene family. Whether each of the three tRNA^Arg species predicted from the gene coding sequences accounts for one or more of these minor species must await their purification and sequence characterization. This should also answer questions concerning possible differences in post-transcriptional modification and their potential functional consequences. To date only one of these minor species, tRNA4^Arg, has been purified (I.C. Gillam, unpublished results) and results presented below suggest that this tRNA corresponds to at least one of the gene products encoded by this gene family (see Figure 21).

2.3 Comparison of gene flanking sequences

The sequences immediately adjacent to the 10 gene coding regions of this gene family are shown in Figure 6. The flanking sequences are numbered with -1 corresponding to the first nucleotide adjacent to the 5' end of the mature tRNA and +1 corresponding to the first nucleotide adjacent to the 3' nucleotide to which the 3' CCA sequence is eventually added post-transcriptionally.
The four pDt27R flanking sequences are summarized by a single sequence (R12.4) due to the high similarity between all four gene copies (see Figure 11). With the exception of the pDt27R genes, the most obvious feature of these flanking sequences is that they share only limited identity outside the mature tRNA coding region. In the 3' flanking sequences the only feature strictly conserved in all gene copies is the poly dT tract found 12-20 bp downstream from the 3' end of the gene. These tracts are present in virtually all eucaryotic tRNA genes and act as termination signals for RNA polymerase III (reviewed by Geiduschek and Tocchini-Valentini, 1988). The 3' flanking sequences are quite AT rich (60-86%) and in some cases give rise to two or more potential poly dT termination sites.

In the 5' flanking regions the sequence of highest similarity observed between all 10 gene copies occurs immediately adjacent to the 5' end of the mature coding region. From position -6 to -1 each gene copy shares a short degenerate sequence motif centred on a dAA dinucleotide. The central dAA dinucleotide of this motif is conserved exactly in all gene copies except R83.1, where it is shifted upstream by 1 bp. This degenerate motif is present in the primary transcripts of these genes in vitro (see Part III) and thus may be considered part of the gene coding region. In contrast, however, the sequences in the 3' flank between the 3' end of the mature tRNA and the poly dT tracts are also included in the primary transcript but do not show equivalent similarities between different gene copies.

Figure 6. Comparison of the tRNA^Arg gene flanking sequences. The 5' and 3' sequences are shown for all copies of the tRNA^Arg gene family. The mature coding regions are abbreviated within the box (centre) and include the terminal nucleotides plus the polymorphic nucleotides that distinguish different gene copies (boxed).
The 5' flanking sequences are numbered from -1 to -50 beginning with the first nucleotide upstream from the 5' end of the mature tRNA. The 3' flanking sequences are numbered +1 to +50 beginning with the first nucleotide downstream of the gene coding region. The poly dT tracts in the 3' flanking sequences are indicated with a heavy underline. Small regions of 5' homology adjacent to the mature coding regions in the 5' flanking sequences are boxed and additional upstream homologies are underlined. The vertical arrows in the 5' flanking sequence indicate the predicted major sites of transcription initiation (see Figure 15). The filled circles show the position of minor sites. The vertical bar in the R12.4 5' flanking sequence indicates the position of the truncation in the R12.1 and R12.2 genes. All four flanking sequences of the repeated genes R12.1-R12.4 are shown in Figure 11.

[Figure 6: aligned 5' flanking (-50 to -1) and 3' flanking (+1 to +50) sequences of pR85.2, pR19.1, pR12.5, pR83.1, pR85.1, pR12.4, and pR12.6, with the abbreviated coding region showing the polymorphic positions 13, 16, and 37.]

Additional blocks of identical sequence 5-7 bp in length are found further upstream from several of the gene copies, but these occur only between pairs of genes and show no common features. However, a degenerate sequence motif (5' TNNCT, where N is any nucleotide) that was shown to be functionally important in the 5' modulation of a Drosophila valine tRNA gene (Sajjadi et al., 1987, Sajjadi and Spiegelman, 1988) is found in several of these gene flanking sequences between positions -29 and -40. In addition to the valine genes, this region of the 5' flanking sequence has been shown to contain sequences necessary for the efficient in vitro transcription of several other eucaryotic tRNA genes (Geiduschek and Tocchini-Valentini, 1988). It remains to be shown whether the TNNCT sequence motif plays an equivalent role in the modulation of this tRNA gene family.

The only upstream flanking sequences that show extended similarity are the first 20 nucleotides of the R19.1 and R12.5 gene 5' flanks. Beyond this point, the similarity declines to the degree of unrelatedness evident between other gene flanking sequences. The 3' flanking sequences of these two genes share only a general predominance of dA or dT residues in this region (72%-86%). These mostly unrelated flanking sequences contrast sharply with the high similarity seen in the tRNA coding regions.
This suggests that if the different members of this gene family once arose from one another as units of coding and flanking sequence, as is suggested by the pDt27R genes (see Part II), then they arose so long ago that few remnants of similarity remain in the flanking regions. One such remnant may be the short region upstream from the R12.5 and R19.1 genes. In conclusion, with the exception of the 3' poly dT tracts, no obvious candidates for conserved gene family specific control sequences can be identified in these flanking sequences.

3. Summary of gene family organization and structure

The structure and organization of these genes appear typical for Drosophila tRNAs. They are a family composed of 10 identical and closely related gene copies that occur either alone (i.e. R19.1, R83.1) or in clusters (12E1-2, 85C) at four different chromosomal locations. One exceptional feature of this family is the high concentration of genes encoded on the X chromosome. Combined, these X-linked genes account for the majority (7/10) of the total gene family (R12.1-R12.6, R19.1). This is striking because, relative to the other major chromosomes, the X chromosome contains very few tRNA gene loci. To date, the only other X-linked tRNA genes that have been identified with certainty are those encoding the family of serine tRNA genes that are interspersed with the tRNA^Arg genes at the 12E1-2 site (Cribbs et al., 1987). It would appear that the 12E1-2 locus may be the single major site for tRNA genes on the entire X chromosome.

As will be shown below, three of the four tRNA^Arg genes in pDt27R (R12.2-R12.4) probably arose recently by gene duplication. Prior to this predicted amplification the gene family was composed of five identical gene copies and two variant copies differing by 1 and 2 bp (R12.1 and R83.1 respectively). The bias in X chromosome linkage was therefore less in these putative ancestors (4/7) but is still substantial compared to most other known Drosophila tRNAs.
The fact that the majority of gene copies in this gene family are located on the X chromosome raises the question of how gene copy number differences between the sexes might affect tRNA^Arg levels and whether this has any biological significance (Birchler et al., 1982).

In terms of the proposed recent duplication of the pDt27R genes, it can now be seen that these genes are structurally distinct from all other members of this gene family. Although they differ by only a single nucleotide, this change has potential structural consequences in the tRNA product. By comparison, the tissue specific and constitutive tRNA^Ala species of B. mori also differ by only a single nucleotide and yet their genes have distinct transcriptional properties that are probably related to their tissue specific expression (Young et al., 1986).

The heterogeneity of potential gene products in this gene family further illustrates the point that tRNA populations may be much more complex than can be discerned by hybridization analysis alone. At present it is not clear whether any functional significance is associated with potential minor variant tRNAs. Their presence in this and several other tRNA gene families (Leung et al., 1984) therefore either reflects unknown functional properties or, alternatively, is a consequence of the evolution of these gene families. For example, if tRNA gene families do co-evolve by sequence rectification (Dover, 1982, Munz et al., 1982, Cribbs et al., 1987) then these mechanisms are restricted to certain members of the gene family. In this example 5 of the 10 gene copies are identical and potentially co-evolve. The four 1 bp variants and the single 2 bp variant are apparently not subject to these processes and have acquired and maintained the sequence variations observed. In the case of the 1 bp variant genes in pDt27R, it will be shown below that the variant sequence has also probably recently been duplicated from one to four gene copies per haploid genome.
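The composition of the gene family summarized in Part I can be tabulated directly from Table 1. The sketch below, in Python, groups the ten localized genes by their nucleotides at positions 13, 16, and 37 and counts the X-linked copies before and after the proposed duplication. The helper name parse_gene and the data layout are illustrative assumptions, and the gene in pDt72R is omitted because it could not be localized.

```python
import re
from collections import Counter

# Gene name -> (chromosome, nucleotides at positions 13, 16, 37), taken from Table 1.
GENES = {
    "R12.1": ("X", "CTG"), "R12.2": ("X", "CTG"),
    "R12.3": ("X", "CTG"), "R12.4": ("X", "CTG"),
    "R12.5": ("X", "TTG"), "R12.6": ("X", "TTG"),
    "R19.1": ("X", "TTG"),
    "R85.1": ("3R", "TTG"), "R85.2": ("3R", "TTG"),
    "R83.1": ("3R", "TAA"),
}


def parse_gene(name):
    """Split a name such as 'R12.4' into (amino acid, polytene segment, index)."""
    match = re.fullmatch(r"([A-Z])(\d+)\.(\d+)", name)
    if match is None:
        raise ValueError(f"unrecognized gene name: {name}")
    return match.group(1), int(match.group(2)), int(match.group(3))


# Number of gene copies sharing each combination of polymorphic nucleotides.
variants = Counter(sites for _, sites in GENES.values())
x_linked = sum(1 for chrom, _ in GENES.values() if chrom == "X")

# Putative ancestral family: drop the three proposed recent duplicates.
recent = {"R12.2", "R12.3", "R12.4"}
ancestral = {gene: info for gene, info in GENES.items() if gene not in recent}
x_linked_ancestral = sum(1 for chrom, _ in ancestral.values() if chrom == "X")

print(dict(variants))
print(f"X-linked: {x_linked}/{len(GENES)}")
print(f"X-linked before duplication: {x_linked_ancestral}/{len(ancestral)}")
```

Keying the table on gene name keeps the polytene segment recoverable through parse_gene, so the same structure could be regrouped by chromosomal site if needed.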
Part II: Evolution of the pDt27R gene cluster

1. Analysis of the pDt27R locus in D. melanogaster strains and sibling species

The tandemly repeated organization of the four pDt27R genes is in marked contrast to the other members of this gene family and to most other Drosophila and eucaryotic tRNA genes. One interpretation of the organization of the pDt27R genes is that the high flanking sequence homology reflects their recent evolution by gene duplication (Hosbach et al., 1980). Alternatively, these genes could be maintained in this organization by frequent unequal exchanges or gene conversions, analogous to the tandem repeats of ribosomal RNA genes. Preliminary genomic Southern blot experiments indicated that this tandem structure was not an artifact of cloning in E. coli but also existed in Oregon R D. melanogaster genomic DNA.

Using mutation rates calculated from other Drosophila DNA sequences (0.5-1.7% substitutions per MYR, Zwiebel et al., 1982, Stephens and Nei, 1985, Caccone et al., 1988), and the number of polymorphisms that occur in the repeated flanking sequences between the four 200 bp and 600 bp pDt27R repeats (Table 3), it is possible to estimate that these predicted gene duplications occurred within the last 5-10 MYR. The most recent divergence between D. melanogaster and its closest sibling species D. simulans is thought to have taken place in this time period. Other Drosophila sibling species (i.e. D. yakuba, D. erecta, D. teissieri) diverged much earlier (approximately 30 MYR, references above). To test the hypothesis that these pDt27R genes have resulted from recent gene duplications, and to obtain more detail concerning the mechanisms involved, the homologous loci from a variety of D. melanogaster strains and sibling species were compared by genomic Southern blot experiments in order to assess how common this tandem organization is within different D. melanogaster populations, and whether it also exists in sibling species that diverged from D.
melanogaster within the last 5-30 MYR.

2. Genomic Southern analysis of pDt27R homologous loci

The probe used for these analyses was a deletion variant of pDt27R (pΔ27) that lacked the two tRNA^Ser genes located upstream and was truncated at the Pst I site located approximately 1.3 kbp downstream from the R12.1 tRNA^Arg gene (Figure 1). This probe is largely specific for the repeated sequences and adjacent single copy sequences in pDt27R. Genomic DNAs to be screened were digested with the restriction endonuclease Bam HI. This enzyme cuts once within each pDt27R tRNA^Arg coding region (Figure 5) and generates a doublet of 200 bp fragments and a 600 bp fragment that are diagnostic of the repeated organization in pDt27R. Two additional fragments, 1.3 kbp and 2.0 kbp in length, result from Bam HI sites contained in the outermost tRNA^Arg genes (R12.4, R12.1) and sites located in unique flanking sequence outside the duplicated gene cluster. The Bam HI sites located within the tRNA coding regions should be highly conserved both within D. melanogaster and the related sibling species and thus provide useful phylogenetic markers. The flanking Bam HI site of the 2.0 kbp fragment is also expected to be highly conserved because in pDt27R it actually consists of two sites separated by 1 bp (Newton, 1984).

The limitation of using these Bam HI sites for assessing the structure of homologous loci in other strains and species is that only the presence or absence of the 200 bp and 600 bp repeats can be determined unambiguously. Differences in the actual number of these repeats can only be detected by a change in the signal intensities of the corresponding restriction fragments. However, the presence of the 1.3 kbp and 2.0 kbp fragments provides internal standards which allow ready comparison of differences in fragment signal intensity and thus give a rough idea of Bam HI fragment copy number.

2.1. Survey of D.
melanogaster strains

The D. melanogaster strains that were tested are listed in Table 2. These strains comprise a world-wide collection of wild-type flies obtained from the Umea stock centre in Sweden. In addition, several common laboratory wild-type stocks (i.e. Oregon R, Samarkand, Canton-S, etc.) and a diverse set of mutant D. melanogaster strains were included. The latter were chosen because they were isolated in the early 1920s to 30s and had been maintained in their original genetic backgrounds (D. Holm, personal communication). Genomic DNAs from these flies were digested with Bam HI, fractionated by electrophoresis in agarose gels, and then blotted to filters for probing with pΔ27.

The results of this analysis are included in Table 2 and are summarized in Figure 7. Lanes (a-c) contain genomic DNAs from the Oregon R, Urbana S, and Samarkand wild-type D. melanogaster strains. Each of these strains hybridized the same four genomic fragments predicted from pDt27R (2.0 kbp, 1.3 kbp, 0.6 kbp, and 0.2 kbp). Of the approximately 50 different strains of D. melanogaster listed in Table 2, only five differed from this pattern. The remainder matched the pattern of the four gene cluster seen in pDt27R and are therefore predicted to carry identical duplicated gene clusters. The five variant D. melanogaster strains (rosy2, yellow2, W760, W420, and Lausanne-S; Table 2) were all identical to one another, and each differed from pDt27R by the absence of the 600 bp Bam HI fragment. This is equivalent to the loss of either the R12.4 or the R12.3 gene and results in a gene triplet contained on three identical 200 bp repeats (see Figure 8). Examples of two of these variant pDt27R loci are shown in lanes (d-e). The faint band with mobility slightly larger than 0.6 kbp is derived from one of the tRNAArg genes located at the 85C region and results from hybridization with the small percentage of tRNAArg coding sequence contained in pΔ27.

Table 2. Drosophila melanogaster strains and sibling species

Strains                            Source               tRNAArg genes (a)

Oregon-pJ                          Tener (UBC)          ++++
Oregon-R                           Ted Grigliatti (UBC) ++++
Island                             Holm (UBC)           ++++
B2                                 "                    ++++
Canton-S (USA)                     Pasadena (b)         ++++
Samarkand-S                        "                    ++++
Swedish-C                          "                    ++++
Urbana-S                           "                    ++++
HikoneAS                           "                    ++++
HikoneAW                           "                    ++++
scute^                             Holm (UBC) (c)       ++++
forked                             "                    ++++
white (w/w)                        "                    ++++
prune                              "                    ++++
Algeria (W10, Algeria)             Umea (d)             ++++
Alma Ata (W20, USSR)               "                    ++++
Ashtarok (W60, USSR)               "                    ++++
Berlin (W90, Germany)              "                    ++++
Boa Esperanca (W120, Brazil)       "                    ++++
Champtiers (W H O, France)         "                    ++++
Curituba (W180, Brazil)            "                    ++++
Fairfield (W200, Australia)        "                    ++++
Falsterbo (W310, Sweden)           "                    ++++
Formosa (W330, Taiwan)             "                    ++++
Frunze (W340, USSR)                "                    ++++
Groningen (W400, Netherlands)      "                    ++++
Haceteppe (W440, Turkey)           "                    ++++
Hampton Hill (W450, Britain)       "                    ++++
Hikone (W460, Japan)               "                    ++++
Hodejice (W480, Czechoslovakia)    "                    ++++
Israel (W500, Israel)              "                    ++++
Karsnas (W520, Sweden)             "                    ++++
Krasnodar (W560, USSR)             "                    ++++
Kreta-75 (W570, Crete)             "                    ++++
Naantali (W640, Finland)           "                    ++++
Oslo (W690, Norway)                "                    ++++
Poringland (W720, Britain)         "                    ++++
Slankman (W820, Jugoslavia)        "                    ++++
Valencia (W1000, Spain)            "                    ++++
Wien (W1030, Austria)              "                    ++++

Lausanne-S                         Pasadena             +++
Gruta (W420, Argentina)            Umea                 +++
San Miguel (W760, Argentina)       "                    +++
rosy2                              Holm (UBC) (c)       +++
yellow2                            "                    +++

D. simulans
  Lima                             Holm (UBC)           +
  C                                "                    +
  yellow, white                    "                    +
  South Africa                     Pasadena             +
  Kushla-F                         "                    +
  Morrow Bey                       "                    +
  Guatemala                        "                    +
D. mauritiana                      Bowling Green (e)    +
D. teissieri                       "                    +
D. erecta                          "                    +
D. yakuba                          "                    +

a. The number of tRNAArg genes (+'s) at the loci homologous to pDt27R was determined by digestion of genomic DNAs with Bam HI and genomic Southern hybridization with pΔ27 (Figure 7). Strains are given four +'s on the basis of hybridization with the 2.0 kbp, 1.3 kbp, 0.6 kbp, and 0.2 kbp Bam HI fragments. Three +'s are given when only the 2.0 kbp, 1.3 kbp, and 0.2 kbp bands were visible. A single + indicates that only the 2.0 kbp and 1.3 kbp fragments (or derivatives thereof) were present.
b. These strains were obtained from the Pasadena Stock Centre, Pasadena, California, USA.
c. The D. melanogaster mutant strains were chosen on the basis of their initial isolation in the early to late 1920s or 30s and their subsequent propagation in the same genetic background (D. Holm, UBC, personal communication).
d. A global collection of wild D. melanogaster strains was obtained from the Umea Stock Centre, Umea, Sweden. In brackets are the stock centre accession numbers and the country of origin.
e. The D. melanogaster sibling species were obtained from the Bowling Green Stock Centre, Bowling Green, Ohio, USA.

Figure 7. Genomic Southern analysis of the pDt27R locus in different D. melanogaster populations and closely related sibling species. Genomic DNAs were digested with the restriction endonuclease Bam HI and fractionated by electrophoresis through 1.2% agarose gels. The fragments were transferred to nitrocellulose filters, hybridized with a 32P-labelled probe (pΔ27) containing the repeated tRNAArg gene region of pDt27R, and exposed to autoradiography. The size markers (right) are pBR322 DNA digested with Hinf I and λ DNA digested with Hind III. The arrows (left) indicate the positions of the genomic Bam HI fragments that are predicted to hybridize to the tRNAArg region of pDt27R. The sources of genomic DNAs are indicated above the lanes. Lanes (a-e) contain samples of D. melanogaster genomic DNAs, lanes (f-h) contain samples of D. simulans genomic DNA, and lanes (i-k) contain the more distantly related sibling species D. erecta, D. teissieri, and D. yakuba, respectively. With the exception of lane (a), approximately equal quantities of DNA (5 µg) were loaded into each lane. In lane (a) approximately half as much total DNA was loaded onto the gel.

[Figure 7 autoradiograph. Lane labels, left to right: Oregon-R, Urbana-S, Samarkand, rosy2, Lausanne-S, S. Africa, Morro Bay, Guatemala, D. erecta, D. teissieri, D. yakuba.]

This survey shows that D. melanogaster strains collected from around the globe contain a tRNAArg locus that is highly similar, if not identical, to that found in pDt27R. This is not surprising considering that this species of Drosophila is highly cosmopolitan and single populations can easily spread around the world. The fact that five strains contain only three genes can be interpreted in two ways. These strains could be deletion variants that have lost a 600 bp repeat by unequal exchange, or they could represent intermediate populations that have not yet acquired the 600 bp repeat present in the majority of strains. Whether different fly populations can acquire the repeats independently or solely by vertical inheritance is not clear. One observation, however, was that one of the five three-gene variants (W420) contained a Hind III site polymorphism that was not shared by the other three-gene variant populations: instead of a 6.5 kbp pDt27R-like fragment, this strain gave rise to an approximately 15 kbp Hind III fragment on genomic Southerns (data not shown). This suggests either that these strains do not share closely related X chromosomes or that the polymorphism occurred more recently. More detailed analysis of polymorphisms associated with this locus will be necessary to distinguish between these possibilities.

2.2 Survey of D. melanogaster sibling species

Similar experiments with genomic DNAs from the Drosophila sibling species are shown in Figure 7, lanes (f-k). This includes samples of three different D.
simulans isolates (lanes f-h), and one isolate each of the more distantly related sibling species D. erecta, D. teissieri, and D. yakuba (lanes i-k, respectively). Figure 7 and Table 2 show that all the sibling species are missing both the 600 bp and 200 bp fragments present in D. melanogaster. The flanking 1.3 kbp and 2.0 kbp bands are still present and suggest that only a single Bam HI site, and hence a single tRNAArg coding region, occurs in these homologous DNAs. Slight differences in the mobility of the fragments equivalent to the 1.3 kbp and 2.0 kbp pDt27R fragments are seen in the D. simulans strains (lanes f-h), and these become more pronounced in the more distantly related siblings (lanes i-k) until, in the most distantly related sibling, D. yakuba (approximately 30 MYR), the 1.3 kbp fragment is absent altogether and is replaced by at least two smaller, weakly hybridizing bands. These changes are to be expected as nucleotide substitutions, insertions, and deletions accrue by random sequence drift in the sequences flanking the tRNAArg coding sequence. Studies by Leung (1988) confirm that the coding regions of these genes in D. erecta and D. yakuba have remained almost identical to those in D. melanogaster and that each still contains the internal Bam HI site predicted from Figure 7.

3. Molecular analysis of variant pDt27R loci in D. melanogaster and D. simulans

3.1 Nucleotide sequence of p27ry2 and p27simC

The 5.5-5.9 kbp Hind III fragments analogous to pDt27R were isolated from one of the five variant D. melanogaster strains (rosy2) and from a representative D. simulans strain (strain C) listed in Table 2. These fragments were cloned into pEMBL and the resultant plasmids are designated p27ry2 and p27simC, respectively. A partial restriction map of their Hind III inserts is shown in Figure 8. The left hand portions of these two Hind III fragments were sequenced, and in Figure 9 they are aligned with those from pDt27R.
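The logic by which the Bam HI band patterns are read as gene numbers (Section 2 and Table 2, footnote a) can be sketched as follows. This is only an illustration under stated assumptions: the site coordinates, the function names `locus_sites`, `digest` and `score`, and the choice of which flank carries the 2.0 kbp fragment are hypothetical; only the fragment sizes and the scoring rule come from the text.

```python
# Toy model of the Bam HI digests used in the Southern survey.
# Coordinates are illustrative assumptions; only the fragment sizes
# (2.0, 1.3, 0.6 and 0.2 kbp) and the scoring rule come from the text.

def locus_sites(repeat_spacings, left_flank=2000, right_flank=1300):
    """Return Bam HI cut positions (bp) for a hypothetical locus.

    repeat_spacings lists the distance between the Bam HI sites of
    successive genes, e.g. [600, 200, 200] for a four-gene cluster.
    Which flank is 2.0 kbp and which is 1.3 kbp is an assumption.
    """
    sites = [0, left_flank]                # outer flanking site, then first gene
    for spacing in repeat_spacings:
        sites.append(sites[-1] + spacing)  # one site per additional gene
    sites.append(sites[-1] + right_flank)  # outer flanking site on the far side
    return sites


def digest(sites):
    """Fragment lengths produced by cutting at every listed site."""
    ordered = sorted(sites)
    return [b - a for a, b in zip(ordered, ordered[1:])]


def score(fragments):
    """Apply the scoring rule of Table 2, footnote a, to a fragment list."""
    bands = set(fragments)                 # a blot shows band sizes, not copy number
    if not {2000, 1300} <= bands:
        return None                        # flanking bands missing: not scorable
    has_200, has_600 = 200 in bands, 600 in bands
    if has_200 and has_600:
        return "++++"
    if has_200:
        return "+++"
    if not has_600:
        return "+"
    return None                            # 600 bp band alone is not covered by the rule


four_gene = digest(locus_sites([600, 200, 200]))
three_gene = digest(locus_sites([200, 200]))
one_gene = digest(locus_sites([]))

for name, fragments in [("four genes", four_gene),
                        ("three genes", three_gene),
                        ("one gene", one_gene)]:
    print(name, fragments, score(fragments))
```

Because `score` works on the set of band sizes, it reproduces the limitation noted in Section 2: the number of 200 bp repeats cannot be read from band presence alone and would have to be inferred from signal intensity.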
As predicted, the p27ry2 clone contains three tRNAArg coding regions on three identically sized 200 bp repeats, while the p27simC clone contains a single tRNAArg gene surrounded by unique flanking sequence. All tRNA coding regions in both p27ry2 and p27simC are identical to those in pDt27R. The non-coding regions, especially in the more distantly related D. simulans sequence, contain small numbers of nucleotide substitutions and deletion/insertion events that distinguish the different sequences. The alignment and positioning of homologous positions in p27ry2 and p27simC with pDt27R is based on minimizing these differences over all possible comparisons (see Table 3 for pairwise comparisons). When this is done, the polymorphisms that distinguish the pDt27R repeats recur in p27ry2 and show that the three genes in this D. melanogaster strain correspond to the closely spaced triplet of genes R12.1, R12.2, and R12.3 of pDt27R. For example, the leftmost repeat (R1) in both pDt27R and p27ry2 contains approximately five times more flanking sequence polymorphisms when compared to the equivalent sequence in the 2-3 repeats (R2-R4) further upstream (Table 3). Therefore the only difference between these three and four gene polymorphic D. melanogaster loci is that the 600 bp repeat containing the R12.4 (or R12.3) gene is not present in p27ry2.

The sequence of p27simC contains only a single tRNAArg gene downstream from the pair of serine tRNA genes found in both species (Cribbs et al., 1987). Curiously, the 5' and 3' flanking sequences of this single D. simulans tRNAArg gene show highest homology not to any single gene within the D. melanogaster repeats, but instead to the 5' flank of one set of genes (R12.2, R12.3, R12.4) and the 3' flank of another gene (R12.1). Therefore, in the alignment shown in Figure 9, a break has arbitrarily been introduced mid-way in the D. simulans tRNAArg coding region in order to maximize the flanking sequence homologies observed. This difference in homology will be discussed more fully below.

Figure 8. Partial restriction map of pDt27R, p27ry2 and p27simC. (A) shows the 6.5 kbp Hind III fragment derived from pDt27R. (B) shows the homologous fragment (p27ry2) derived from the rosy2 strain of D. melanogaster. (C) shows the homologous fragment (p27simC) derived from D. simulans strain C (see Table 2). The tRNAArg coding regions are indicated by open boxes and the serine tRNA genes are indicated by filled boxes (not to scale).

Figure 9. Sequence comparison of tRNAArg gene clusters in pDt27R, p27ry2 and p27simC. The sequence of pDt27R beginning at position 600 relative to the leftmost Hind III site (see Figure 1) is shown in full on the upper line (5' to 3' from left to right). The sequences of p27ry2 and p27simC are shown below, with dashes indicating identity to the pDt27R sequence. Nucleotides absent from the p27ry2 and p27simC sequences are indicated by asterisks (*). Additional nucleotides in p27ry2 and p27simC are shown below each line and occur on the 3' side of the nucleotide immediately above. The locations of the tRNAArg coding sequences are boxed in heavy lines. The transcriptional orientation of the genes is 5' to 3' from left to right. In the repeated flanking sequences outside the gene coding regions, the 8 bp repeats that flank the 600 bp repeat containing R12.4 are boxed. At the 3' 8 bp repeat, the downstream adjacent 5 bp are also boxed. The locations of these 8 bp repeats in the flanking sequences of the other genes are also indicated by boxes. The three adenylate residues that occur at the junctions of the 200 bp repeats are boxed. The single tRNAArg coding sequence in p27simC is arbitrarily divided at the Bam HI site in order to maximize identity between the pDt27R and p27simC flanking sequences.

[Figure 9: aligned sequences of pDt27R, p27ry2 and p27simC, pDt27R positions 600 to 2760.]

Table 3. Pairwise comparisons of duplicated tRNAArg gene flanking sequences.

A. Comparison of 30 bp of 5' flanking sequence (positions -30 to -1) and 99 bp of 3' flanking sequence (positions +1 to +99) common to the 200 bp repeated regions of pDt27R, p27ry2 and p27simC (a).

                      R1   R'1   R2   R'2   R3   R'3   R4
3' flanks       S1     4     5   12    12   12    12   13
(+1 to +99)     R1           1   12    13   13    13   13
                R'1              14    14   14    14   15
                R2                      0    2     2    2
                R'2                          2     2    3
                R3                                 0    3
                R'3                                     3

5' flanks       S1     5     -    2     -    1     -    1
(-30 to -1)     R1                3     -    3     -    4
                R2                           1     -    1
                R3

a. Sequences compared are shown in Figure 11. Numbers correspond to nucleotide substitutions or deletion/insertion events between pairs of sequences. R1-R4 correspond to the four 200 bp repeated regions of pDt27R. R'1-R'3 correspond to the three 200 bp repeats of p27ry2. S1 corresponds to the single homologous 200 bp region of p27simC (see Figure 8). The 5' flanks of p27ry2 are not included because they give results identical to those of pDt27R.

B. Comparison of 454 bp (positions -454 to -1) of 5' flanking sequences in pDt27R, p27ry2 and p27simC (b).

                      R3   R'3   R4
                S1    30    29   32
                R3           3   13
                R'3               n

b. Sequences compared are shown in Figure 9. Repeat designations and numbers are the same as above.

3.2 Junctions of repeated sequences in pDt27R, p27ry2 and p27simC

The sequences that mark the boundaries of the duplicated regions in D. melanogaster are compared to the single copy sequence of D. simulans in Figure 10A. The 200 bp repeats in D.
melanogaster correspond to positions -30 in the 5' flank and +100 in the 3' flank of the single p27simC gene. At the -30 junctions of the 200 bp repeats containing genes R12.1 and R12.2, three adenylate residues (boxed) occur that are not present in the single copy p27simC sequence. These additional residues may have arisen during the duplication event. The sequences at these -30 and +100 junctions do not show any obvious symmetries such as small direct or inverted repeats. Thus the only obvious feature of the junctions between the three 200 bp repeats in p27ry2 or pDt27R is the adenylate residues, which occur precisely at the duplication termini.

The 600 bp repeat in pDt27R is quite different, however (Figure 10B). In this case the duplicated region is bounded by an 8 bp direct repeat, 5' TAGCCCAA. The precise junction of this 600 bp repeat with the adjacent 200 bp repeat is difficult to determine because the resulting novel joint apparently contains a duplication of the 5 bp internal segment of the 8 bp repeat (5' CCCAA). Alternatively, this apparent 5 bp duplication could also arise by inclusion of 4 nt that are directly repeated immediately adjacent to the 3' 8 bp repeat, followed by the insertion of an additional single adenylate residue. The resulting sequences are the same in both cases, which makes the exact position of the junction ambiguous. It is clear, however, that these 8 bp direct repeats mark the boundaries of the 600 bp repeat in pDt27R and therefore may have played a role in its duplication by some mechanism involving homologous recombination or slippage during DNA replication.

3.3 Divergence of repeated sequence in pDt27R from p27simC

When two DNA sequences arise by duplication, it is expected that they are initially identical and subsequently begin to diverge from one another by the random accumulation of nucleotide substitutions and small deletions or insertions.
The number of these changes should be approximately proportional to the mutation rate and the length of time that the duplicated regions have existed. In addition, if the divergence accumulates randomly, then the amount of divergence should be the same in both or all of the duplicated regions. As was noted previously in pDt27R (Newton, 1984) and confirmed here again with p27ry2, the amount of divergence is not equal between the three or four duplicated regions in these DNAs.

Figure 10. Duplication junctions of the repeated regions in D. melanogaster. (A) The top two sequences show the 5' flanking sequence from the single D. simulans gene (p27simC) and the two D. melanogaster genes (5' R12.3 and R12.4) contained in the 600 bp repeats around position -30. The middle two sequences show the junctions of the 200 bp repeats of the D. melanogaster genes R12.3 and R12.2 corresponding to positions -30/+99 (3'/5' R12.2-3). The three adenylate residues at these junctions are boxed. The bottom two sequences show the position of the 3' ends of the 200 bp repeats around position +99 from D. melanogaster (3' R12.1) and D. simulans (3' p27simC). The numbering of the flanking sequences relative to the mature tRNA coding regions is as in Figure 6 and corresponds to the sequence from D. simulans (p27simC). (B) The top two sequences compare the 5' flanking sequence of p27simC around position -449 and the equivalent sequence in the pDt27R R12.4 gene with the sequence at the 3' end of the 600 bp repeat containing R12.4 (position +73). The 8 bp direct repeats that flank this 600 bp repeat are boxed. Five nucleotides that occur at the junction of the R12.4 and R12.3 600 bp repeats are also boxed. The lower two sequences show this junction at the equivalent positions in the R12.1 and R12.2 200 bp repeats, and in the single p27simC gene.

[Figure 10: aligned junction sequences, panels (A) and (B).]

Figure 11. Sequence divergence between D. melanogaster and D. simulans at the pDt27R tRNAArg locus. The unique tRNAArg flanking sequences of p27simC are compared with the repeated flanking sequences in pDt27R. All five sequences are shared from position -30 in the 5' flank to position +99 in the 3' flank. The p27simC sequence is shown in full at the top, and identities with the pDt27R sequences are indicated below by dashes. Differences between p27simC and the pDt27R repeats caused by nucleotide substitutions are indicated, and deletions/insertions are shown by asterisks. The identical tRNAArg coding sequences are shown as open boxes, where S1 signifies the D. simulans gene and R1-R4 the four repeated D. melanogaster genes in pDt27R. The direction of transcription is indicated by the arrow, and the poly dT tract of the p27simC gene is underlined.

[Figure 11: aligned 5' (-30 to -1) and 3' (+1 to +99) flanking sequences of S1 and R1-R4.]
This is shown in Figure 11, which compares the 30 bp of duplicated 5' flank and 99 bp of duplicated 3' flank that are shared between all four pDt27R repeats and the single copy sequence from p27simC. The equivalent regions from p27ry2 are not shown because they are virtually identical to the R1, R2, and R3 sequences from pDt27R. When these repeated D. melanogaster sequences are compared to the D. simulans sequence, a curious pattern emerges. The 5' flanks of the D. melanogaster R2-R4 repeats contain fewer nucleotide changes from the corresponding D. simulans sequence, i.e. less divergence, than do the equivalent 3' flanks of the same repeats. The exact opposite pattern is observed for the R1 repeat: the 5' flank is most dissimilar to D. simulans while the 3' flank is most similar. This flank-specific variable divergence accounts for the fact that the p27simC sequence aligns most closely with no single D. melanogaster gene, as is indicated in Figure 9. It would seem, therefore, that these duplicated regions in D. melanogaster are not all evolving randomly but show different rates of nucleotide substitution relative to the equivalent sequence in D. simulans. This is at least partly explained below in a model that attempts to trace the evolutionary history of these duplicated loci in D. melanogaster.

4. A model for the evolution of the p27ry2 and pDt27R tRNAArg gene clusters

A model that describes the evolution of the pDt27R gene cluster and its derivatives (i.e. p27ry2) is presented in Figure 12. For simplification, the model assumes that the duplicated loci described in pDt27R and p27ry2 represent duplication products that have not been rearranged by secondary events. This assumption is supported by the fact that the Southern analysis (Figure 7) showed that these loci were very similar, if not identical, in all the strains tested. This, however, does not exclude the possibility that earlier events may have rearranged the duplication products.
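The flank-specific pattern described in Section 3.3 can be checked directly from the S1 rows of Table 3A. The short sketch below converts those counts to per-flank percentages; the variable and function names are illustrative, and treating each deletion/insertion event as one difference follows the table footnote rather than any stated formula.

```python
# Differences between the single D. simulans copy (S1) and each pDt27R
# repeat, transcribed from the S1 rows of Table 3A. Each substitution or
# deletion/insertion event counts as one difference, as in the footnote.
FLANK_LENGTH = {"5'": 30, "3'": 99}          # bp compared in each flank
DIFFERENCES = {
    "5'": {"R1": 5, "R2": 2, "R3": 1, "R4": 1},
    "3'": {"R1": 4, "R2": 12, "R3": 12, "R4": 13},
}


def percent_divergence(flank, repeat):
    """Differences from S1 per 100 bp of the given flank."""
    return 100.0 * DIFFERENCES[flank][repeat] / FLANK_LENGTH[flank]


for repeat in ("R1", "R2", "R3", "R4"):
    five = percent_divergence("5'", repeat)
    three = percent_divergence("3'", repeat)
    print(f"{repeat}: 5' flank {five:4.1f}%   3' flank {three:4.1f}%")

# Repeat with the most divergent 5' flank and the least divergent 3' flank.
most_divergent_5 = max(DIFFERENCES["5'"], key=lambda r: percent_divergence("5'", r))
least_divergent_3 = min(DIFFERENCES["3'"], key=lambda r: percent_divergence("3'", r))
print(most_divergent_5, least_divergent_3)
```

With only 30 bp in the 5' flank, a single difference changes the percentage by more than three points, so these values indicate a trend rather than a precise rate.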
Figure 12. Model for the evolution of the pDt27R tRNAArg locus. (A) The proposed ancestral pDt27R D. melanogaster locus prior to its divergence from its closest sibling species, D. simulans. The single tRNAArg coding region is shown as a filled box. The adjacent flanking sequences are shown by open boxes to positions -30 and +99 and by a thick dark line to position -449 to -454. (B) A proposed two gene intermediate that arose soon after the divergence of D. melanogaster from D. simulans (approximately 5-10 MYR). The endpoints of the duplication event are at position -30 in the 5' flank and at position +96-98 in the 3' flank. The junction consists of three adenylate residues, shown in the slash connecting the repeats. The cross-hatched boxes correspond to the internal flanking sequences that diverge more quickly than the flanking regions beyond. (C) The three gene intermediate found in a minority of D. melanogaster strains (i.e. p27ry2, see Table 2) is proposed to have arisen quite recently (0.5-1.5 MYR). It differs from the two gene intermediate by the addition of a third 200 bp repeat (R3) with an identical junction containing three adenylate residues. (D) The final event in the evolution of the gene cluster present in most D. melanogaster strains tested (Table 2) was the duplication of a 600 bp region (R4) containing the R12.4 gene. This region is flanked by 8 bp repeats (arrows) and contains an apparent 5 bp duplication of the target sequence at the junction.

[Figure 12 diagram, panels (A) to (D). Annotations on the diagram: divergence of D. melanogaster and D. simulans (approximately 5 MYR); duplication of the 200 bp flanking unit (-30 5' and +100 3') with AAA inserted at the junction; differential accumulation of flanking sequence polymorphism; second 200 bp duplication of the R2 repeat, i.e. by slippage repair or unequal exchange; duplication of the 600 bp region flanking the R3 repeat by recombination between the flanking direct repeats.]

The model proposes that a single tRNAArg gene was present at the pDt27R homologous locus in the common ancestors of D. melanogaster and its sibling species (Figure 12A). Soon after the divergence of D. melanogaster and D. simulans, the ancestral single gene copy became duplicated on a pair of identical 200 bp repeats, only in the D. melanogaster lineage (Figure 12B). This hypothetical two gene intermediate is then proposed to have been maintained in D. melanogaster populations until quite recently. During this time (approximately 5-10 MYR) the flanking sequences of the initially identical repeats became distinguishable through the accumulation of nucleotide substitutions caused by random sequence drift. This accumulation was not entirely random, however; more changes occurred in the flanking sequences between the gene coding regions (hatched) than in the duplicated flanking sequences outside the gene coding regions (open). The tRNA coding regions, like those of most other tRNA genes, remained identical during this time. The majority of the sequence differences between the repeated regions in pDt27R and p27ry2 are proposed to have occurred in this two gene intermediate.

The second step in this model (Figure 12C) was the generation of a third 200 bp repeat (R3). The high similarity (98% identical) between this third repeat R3 and one of the pre-existing doublet repeats (R2) suggests that R3 arose directly from R2, probably within the last 1-2 million years. This putative three gene intermediate would be identical to the locus described in p27ry2 (Figure 8).

The last and most recent step in the model is the generation of the fourth gene, contained in the 600 bp repeat R4 (Figure 12D). This duplication was different from the previous events both in the size of the region duplicated (600 bp versus 200 bp) and in the novel joints generated (Figure 10).
In this case the duplicated sequence was larger and carries direct 8 bp repeats at its termini. The amount of divergence between the resulting two 600 bp regions is very low (<1%, Table 3) and suggests that this four gene locus arose within the last million years. Why and how this four gene arrangement became fixed in the majority of D. melanogaster strains tested is unknown.

This model proposes that within the time span since the divergence of D. simulans and D. melanogaster (3-10 MYR; Zweibel et al., 1982; Cohn et al., 1984; Stephens and Nei, 1985; Caccone et al., 1988) a single member of this tRNAArg gene family in D. melanogaster has been duplicated from one to four copies per haploid genome. In order to account for the irregular pattern of sequence polymorphisms observed between the duplicated regions in p27ry2 and pDt27R, the model proposes that the duplications occurred in successive stages, using specific repeats as the templates for each subsequent duplication. The successive nature of these duplications has a precedent in the structure of highly amplified DNAs in vertebrate tissue cell cultures selected for drug resistance (reviewed by Schimke, 1984; Stark and Wahl, 1984). Also, to account for the different levels of divergence from the single copy D. simulans sequence (Figure 9), the model proposes first, that a two gene intermediate accumulated the majority of the flanking sequence polymorphisms present, and second, that this accumulation was not random but occurred at a higher rate in the intergenic flanking sequences than in the extragenic flanking sequences. Such differences in the rate at which DNA sequences evolve have been predicted both from hybridization studies of total genomic sequences (Zweibel et al., 1982; Caccone et al., 1988) and from molecular studies of specific genomic regions (Martin and Meyerowitz, 1986).
Together these features of the model account for the flanking sequence differences that are observed in the three and four gene repeats found in D. melanogaster today. One aspect of this model that is not clear is the mechanism by which these duplications occurred. Two different kinds of event occurred, as judged by the size of the duplicated region and the novel joints created. In the case of the 600 bp repeat R4, direct repeats mark the boundaries of the duplicated region (Figure 10), which suggests their involvement in the duplication mechanism. This has extensive precedent in examples of procaryotic gene amplification (Whoriskey et al., 1987) and is proposed to result either from homologous recombination between the direct repeats or from slippage-mispairing during DNA synthesis (Moore, 1983). The mechanism of the 200 bp duplications is less clear. In contrast to the 600 bp repeats, no obvious sequence symmetries occur at the junctions of the 200 bp duplicated regions. The only clue to the mechanism is that three additional adenylate residues have been added at the junctions. In other examples of eucaryotic genome rearrangements that involve non-homologous sequences, additional nucleotides are sometimes added at junctions to create direct repeats (Roth et al., 1985). This is not the case here, which suggests that these sequences may have been added by ligation events similar to those observed during immunoglobulin rearrangements (Alt and Baltimore, 1982). The tandem duplication of a Drosophila metallothionein gene similarly occurred without flanking sequence direct repeats and also added nucleotides at the junction (Otto et al., 1986). Though the particular boundary and junction sequences differ, the metallothionein duplication and the tRNAArg gene duplication described here have similarities that may reflect a common mechanism for the initial generation of a tandem duplication.
In this evolutionary history it is proposed that the three identical 200 bp repeats, with identical junctions, arose by two different events separated by several million years. There are three possible ways to explain this seemingly unlikely occurrence. All require the generation of an initial doublet of 200 bp repeats. Once established, a third repeat (or more) could have arisen (a) by an event identical to the first, (b) by unequal exchange between mispaired chromatids, or (c) by slippage-repair during DNA synthesis. At present it is not possible to distinguish between these possibilities, but the exact conservation of both 5' and 3' flanking sequence polymorphisms between R3 and R2 would favour a slippage-repair type mechanism.

5. Relation of D. simulans and D. melanogaster to other Drosophila sibling species

The Southern analysis of Drosophila sibling species (Figure 7, lanes i-k) suggested that the more distantly related D. melanogaster sibling species (D. erecta, D. yakuba, D. teissieri) all contain a single tRNAArg gene at the homologous pDt27R locus. In the case of D. erecta and D. yakuba, molecular analysis of this single gene copy showed that the mature tRNA coding region does not contain the C13 polymorphism that distinguishes the pDt27R genes from all the other gene family copies in D. melanogaster (Leung, 1988, Ph.D. Thesis). This would suggest that sometime after the divergence of the D. melanogaster and D. simulans lines from the remaining sibling species (approximately 30 MYR ago), this ancestral single gene copy incurred a T13 to C13 substitution relative to other members of this gene family. Later, when D. melanogaster and D. simulans diverged from one another, this variant gene copy became amplified from one to four copies per genome in D. melanogaster lines. The rationale, or apparent selective advantage, for the successive duplication of this tRNA gene copy is not yet known.
In the case of the Drosophila metallothionein gene, the gene duplication can be explained by the fact that the flies were selected for cadmium resistance after growth for several generations in media containing cadmium (Otto et al., 1986). However, the advantage of three more copies of a variant tRNAArg gene is not clear. An obvious possibility is that more gene product could be synthesized, but the selective advantage of this is unknown. Note that for two of the four duplicated gene copies, only 30 bp of 5' flanking sequence were included in the 200 bp repeats. This raises the possibility that not all of these amplified genes are active, because of the now apparent requirement of 5' flanking modulatory sequences for the efficient expression of many tRNA genes. In the case of the 200 bp repeats, the 30 bp of 5' flanking sequence may not be sufficient to confer optimal template activity. The following section of this thesis, therefore, addresses whether all four of these genes are potential templates in vivo by assaying their activity in vitro, and also whether any differences in activity are evident that might help explain the duplication of these genes relative to other members of this gene family.

Part III - Functional studies of the tRNAArg gene family

An important question regarding the net function of a tRNA gene family is whether all gene copies are active as templates capable of synthesizing RNA products. In redundant gene families encoding Drosophila ribosomal RNAs (i.e. 5S, 18S, 28S) only a portion of the total gene copies may be active (Jamrich and Miller, 1984, Sharp et al., 1984). Therefore it was of interest to determine whether all these tRNAArg gene copies are at least potentially capable of contributing to the 4S RNA pool in vivo, and in particular, whether any differences are observed in the duplicated 1 bp variant genes found in pDt27R.
As was discussed in the Introduction, it is technically difficult to distinguish the products of different gene copies in vivo when they are identical or differ by only 1-2 nucleotides. An alternative approach was to see whether individual gene copies are active in vitro using homologous Drosophila cell free extracts.

1. In vitro transcription of the tRNAArg gene family

The cell extracts used for transcription in vitro were derived from embryonic Drosophila Schneider 2 cells (Rajput et al., 1982). Extracts from these cells have been shown to transcribe a variety of Drosophila tRNA genes with high efficiency (Leung et al., 1984, St. Louis and Speigelman, 1984, Lofquist and Sharp, 1986). The plasmid constructions used to compare the activity of each gene family member are described in the Methods and are indicated in Figure 4. They consist of single tRNAArg genes flanked by at least 100 bp of 5' and 3' flanking sequences contained in closed circular plasmid vectors (pEMBL 8-/+). In each case the plasmid is named after the gene it contains (i.e. pR12.5, pR12.6 etc). The four clustered pDt27R genes were not all subcloned individually. One template (pR12.4) contains a single gene copy derived from a 600 bp repeat (R4, Figure 1). A second template (pR12.2) is derived from a 200 bp repeat (R2, Figure 1). These two templates are representative of the two pairs of 600 bp and 200 bp repeats containing identical tRNAArg genes. The transcription properties of the two untested genes (R12.1 and R12.3) are assumed to be equivalent because of their almost identical DNA flanking sequences.

1.1 Gene products of the tRNAArg gene family

Figure 13 shows the in vitro products of identical transcription reactions containing equimolar amounts of different tRNAArg gene templates. The results indicate that each member of the gene family gives rise to a set of discrete products, albeit in some cases with greatly differing efficiencies (see below).
The smallest transcription product (M) is the same size for all gene copies and therefore probably corresponds to the mature tRNAArg transcript, which is expected to be identical in size for every gene template. Each gene also gives rise to one, and in some cases two (pR12.5, pR85.2, pR85.1), higher molecular weight transcription products (P) that are much more abundant as judged by labelling intensity and vary in size between the different gene copies. By analogy with other tRNA genes transcribed in vitro, these should correspond to the precursor (pre-) transcription products and contain heterogeneous lengths of 5' and 3' flanking sequences depending on the sites of transcription initiation and termination, respectively. Also included in Figure 13 are the transcription products of a control tRNA2Arg gene contained in the plasmid pYH48 (Silverman et al., 1979). This gene is the most efficiently transcribed member of this related tRNA gene family using Drosophila Kc cell extracts (Dingermann et al., 1984) and is also transcribed efficiently in Schneider cell extracts (Lofquist and Sharp, 1986). In addition to these mature and pre-tRNAArg transcription products, fainter bands with intermediate electrophoretic mobility are also visible. These probably correspond to lesser amounts of partially processed intermediates, and they also differ in size depending on the gene template in question. The major difference between members of this gene family evident from Figure 13 is that the transcription products of the duplicated pR12.2 and pR12.4 genes are much less abundant than those of the other gene copies and the control pYH48 gene. This is examined in more detail in following sections.

Figure 13. In vitro transcription products of the tRNAArg gene family. Equimolar amounts (1.3 nM) of plasmid DNAs containing single copies of the tRNAArg gene family were transcribed in 25 µl reactions containing 100 mM KCl, 12.5 µl Drosophila Schneider cell extract, and [α-32P]GTP. The labeled transcripts were fractionated by electrophoresis in denaturing 10% polyacrylamide gels and visualized by autoradiography. The predicted primary (P) and mature (M) gene transcripts are indicated by arrows on the right. The template DNAs for each transcription were pYH48 (lane a), pR12.6 (lane b), pR12.5 (lane c), pR12.2 (lane d), pR12.4 (lane e), pR12.4T13 (lane f), pR5'/3' (lane g), pR85.2 (lane h), pR85.1 (lane i), pR83.1 (lane j), and pR19.1 (lane k).

1.2 RNase T1 Fingerprinting

To confirm that the P and M bands correspond to the precursor and mature transcription products of the same gene, they were eluted from gels and subjected to RNase T1 fingerprint analysis (Figure 14). In the case of the pR12.5 transcripts, the T1 fingerprint of the major P band is identical to that of the M band except for the presence of one additional labeled oligoribonucleotide (Panel A, spot X). This presumably corresponds to the 5' leader T1 oligoribonucleotide (5' AACUGp) predicted from the DNA sequence (Figure 6). None of the 3' trailer sequence oligonucleotides will be labelled in this experiment (labeled with [α-32P]GTP) and they are therefore not visible in Panel A. The fingerprints are consistent with the two products differing only by ribonucleotides removed during transcript processing, and thus support their predicted precursor-product relationship. Panel C in Figure 14 shows the T1 fingerprint of the P band derived from the 1 bp variant pR12.4 gene. Note that due to the low abundance of these transcription products in Drosophila cell extracts, this particular experiment used the P band from a transcription reaction that contained a heterologous cell extract in which transcription efficiency was much higher (human HeLa cells, see below).
This transcription product also gives rise to a similar fingerprint but includes additional oligoribonucleotides that are expected from the different 3' trailer sequence (Figure 6), and in particular, a large oligoribonucleotide (Y) whose shifted charge mobility and size suggest that it corresponds to the single ribonucleotide that should be different in this transcription product (5' rCCUAAUGp to 5' rCCCAAUGp). As predicted from the DNA sequence, the 5' leader oligoribonucleotides of the pR12.4 and pR12.5 genes are identical (Figure 6) and consequently give rise to identical spots in Panels A and C. This further supports the above conclusion that the X spot missing in Panel B corresponds to this 5' leader oligoribonucleotide. These results strongly support the conclusion that the P and M products in Figure 13 correspond to precursor and mature tRNAArg transcription products.

Figure 14. RNase T1 fingerprints of in vitro transcription products. [α-32P]GTP labeled transcripts were eluted from gel slices and digested to completion with RNase T1. After electrophoresis at pH 3.5 on cellulose acetate in the first dimension (horizontal) and homochromatography on DEAE cellulose plates in the second dimension (vertical), the plates were exposed for autoradiography. Plate (A) shows the fingerprint of the pR12.5 major precursor transcript (P in Figure 13) and plate (B) shows the fingerprint of the corresponding pR12.5 mature transcript (M). Plate (C) shows the fingerprint of the pR12.4 gene P transcript (synthesized in HeLa cell extracts, see Figure 19). The (b) indicated in each panel shows the position of the bromophenol blue dye marker. X indicates the position of an oligoribonucleotide in the pR12.5 P transcript that is absent in the pR12.5 M transcript. In panel C, Y indicates a R12.4 oligoribonucleotide whose size and difference in mobility on cellulose acetate are consistent with the single T13 to C13 substitution predicted in this gene product.
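As a rough illustration of how the predicted T1 oligoribonucleotides discussed above can be enumerated, the following Python sketch cleaves an RNA sequence on the 3' side of every G residue, which is the specificity of RNase T1. The function name and the short test sequences are hypothetical: each test sequence simply joins the 5' leader oligoribonucleotide (AACUG) to the oligoribonucleotide containing position 13, and is not a real transcript sequence.

```python
def t1_digest(rna):
    """Split an RNA string after every G, mimicking complete RNase T1 digestion."""
    fragments = []
    current = ""
    for base in rna.upper():
        current += base
        if base == "G":
            # RNase T1 cleaves on the 3' side of G, so G ends the fragment
            fragments.append(current)
            current = ""
    if current:
        # 3'-terminal fragment that does not end in G
        fragments.append(current)
    return fragments


# Hypothetical fragments: 5' leader oligo followed by the position-13 oligo.
t13_fragment = "AACUG" + "CCUAAUG"  # T13 form (U in the RNA)
c13_fragment = "AACUG" + "CCCAAUG"  # C13 variant form

t13_oligos = t1_digest(t13_fragment)
c13_oligos = t1_digest(c13_fragment)

# Oligoribonucleotides that would distinguish the C13 product from the T13 product
unique_to_c13 = [oligo for oligo in c13_oligos if oligo not in t13_oligos]

print(t13_oligos)
print(c13_oligos)
print(unique_to_c13)
```

Only the oligoribonucleotide spanning position 13 differs between the two digests, which is the basis for using spot Y to tell the products apart.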
In addition, the single nucleotide difference predicted from the DNA sequences of the pR12.4 and pR12.5 genes is evident in the RNase T1 fingerprints of their products and could conveniently provide an assay for the expression of these genes.

1.3 Initiation and termination of in vitro transcripts

To determine the length of the 5' leader sequences of these precursor transcripts, and hence the inferred Pol III initiation sites, the in vitro transcripts were analysed by primer extension with reverse transcriptase (Figure 15). The total products of an in vitro transcription reaction were hybridized with a 20-mer synthetic oligonucleotide (Argl, see Methods) complementary to positions 3-22 of the predicted mature tRNAArg. After extension with reverse transcriptase and deoxyribonucleotides, this primer gave rise to 22 nucleotide extension products that should correspond to the mature tRNAArg and to longer extension products (23-29 nucleotides) that correspond to precursor transcripts that initiate further upstream (lanes c-j). Control transcription reactions containing only pEMBL vector DNA also gave rise to a 22 nt extension product, which suggests that significant amounts of endogenous mature tRNAArg are present in the Drosophila cell extract (lane a). No additional bands arise when transcripts synthesized from pYH48 (tRNA2Arg) are extended with Argl (lane b), and prior treatment with RNase A results in loss of all extension products (data not shown).

Figure 15. Primer extension analysis of tRNAArg transcripts synthesized in vitro. Total nucleic acids from 50 µl transcription reactions containing 25 µl Drosophila Schneider cell extract, 625 µM 'cold' deoxyribonucleoside triphosphates, and 0.2 µg template DNA (lanes a-i) were purified and hybridized to 5' 32P-labelled 20-mer (Argl).
After extension with AMV Reverse Transcriptase and treatment with RNase A, the extension products were fractionated by electrophoresis through a 12% urea-polyacrylamide gel alongside a sequencing ladder using the same end labeled primer and a cloned gene template (pR12.4, not shown). Extension products in lanes (a-j) were from transcription reactions containing the following template DNAs: pEMBL alone (lane a), pYH48 (lane b), pR12.5 (lane c), pR85.2 (lane d), pR19.1 (lane e), pR85.1 (lane f), pR83.1 (lane g), pR12.2 (lane h), pR12.4 (lane i), and pR5'/3' (lane j). The sizes of the extension products are indicated on the left. On the right is shown a diagram indicating the extension products (vertical arrows) derived from mature and precursor tRNAArg transcripts hybridized to the 20-mer primer Argl (bold vertical arrow).

In lanes (c-j) of Figure 15 all but one member of this gene family (pR83.1) gave rise to the same predominant 26 nucleotide extension product. This corresponds to a 5' initiation site at position -4 relative to the mature 5' end of the tRNA. The only other gene copy not included in Figure 15 (pR12.6) gave identical results (data not shown). Other less abundant extension products that are a few nucleotides longer or shorter are also visible and differ in size between different gene copies. The longest extension product is 29 nucleotides long (pR12.5, lane c) and corresponds to a minor initiation site at position -7. These major and minor transcript start sites are summarized above the flanking sequence shown in Figure 6. The length of these 5' leader regions shows that the major transcription initiation site occurs at the first of two conserved adenylate residues in the 5' flanking sequence of every gene family member except one (R83.1, see Figure 6).
In the pR83.1 gene the position of this pair of adenylate residues is shifted upstream by 1 bp and, correspondingly, so is the position of a major transcription initiation site (Figure 15, lane g). This suggests that these adenylate residues may serve some functional purpose in locating the transcription initiation site in members of this gene family. The pR83.1 gene also gives rise to significant amounts of transcripts that appear to initiate at position -1 and slightly fewer at position -2. The only other major transcripts which initiate at sites different from the conserved pair of adenylate residues are from the pR12.4 gene (Figure 15, lane i). In this case transcription initiation appears to occur equally at the conserved adenylate (position -4) and at a guanylate residue located 2 nt upstream (position -6). The pR12.2 gene, which is identical in sequence in this region, shows much less initiation at this upstream site. The reason for this difference is unknown. The gene coding regions are identical and the 5' flanking sequences differ by only 1 nucleotide in the 30 bp that are common to these two templates. The only significant difference between these genes is that the 5' flanking sequence is not truncated at position -30 in pR12.4, which raises the possibility that upstream sequences can also influence the site of transcription initiation. In almost every case the major and minor initiation sites in these flanking sequences, especially those upstream of the major -4 site (and therefore less likely to be the products of degradation or incomplete extension), occur at purine residues located in the non-coding strand of the 5' flank. This is consistent with the properties of RNA polymerase III transcription initiation in a wide variety of examples (Geiduschek and Tocchini-Valentini, 1988).
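The arithmetic that relates the length of a primer extension product to the inferred 5' end of a transcript can be sketched as follows. The sketch assumes, as stated above, that the 20-mer primer anneals to positions 3-22 of the mature tRNA, so that a 22 nucleotide product ends at position +1, and that there is no position 0 (the first 5' flanking nucleotide is -1). The function and constant names are hypothetical.

```python
# Length of the extension product that ends exactly at the mature 5' end (+1);
# the primer is complementary to positions 3-22 of the mature tRNA.
MATURE_PRODUCT_LENGTH = 22


def five_prime_end(extension_length):
    """Return the position of the transcript 5' end for an extension product length."""
    if extension_length < 1:
        raise ValueError("extension length must be a positive number of nucleotides")
    offset = extension_length - MATURE_PRODUCT_LENGTH
    if offset <= 0:
        # Product stops at or inside the mature coding region (+1, +2, ...)
        return 1 - offset
    # Product extends into the 5' leader; positions are numbered -1, -2, ...
    return -offset


for length in (22, 23, 24, 26, 29):
    print(length, "nt ->", five_prime_end(length))
```

Under these assumptions the predominant 26 nucleotide product maps to position -4 and the 29 nucleotide product of pR12.5 to position -7, in agreement with the start sites given in the text.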
Though not tested directly (i.e. by S1 mapping), the sites of transcription termination in all gene copies likely occur at the first poly dT tracts that follow the coding regions. This is because the sizes of the precursor products (P) visible in Figure 13 are consistent with the predominant transcript start site and the length of the 3' trailer sequences predicted from the 3' flanking sequences (Figure 6). Transcripts from the pR12.5 gene also include a discrete product approximately 10-15 nt longer than the major P product. A second poly dT tract occurs in the pR12.5 3' flank at this position, which suggests that this is a minor readthrough transcription product that terminates at the second poly dT tract. Why only this gene copy gives rise to readthrough transcripts is unknown, but it may be because this gene has the shortest first poly dT tract (n = 5), which allows a small amount of transcription readthrough. Both the pR85.1 and pR85.2 genes also give rise to minor P products that are a few nucleotides longer than the more abundant P transcripts. The origin of these bands is not clear.

1.4 Transcription efficiency of different gene copies

The most obvious difference between members of this gene family is in the signal intensity of the transcription products (Figure 13). Because the transcripts all have approximately the same specific activity (i.e. number of G or U (T) residues) and were synthesized from equimolar amounts of gene template, the difference in signal intensity should correspond to differences in the total amount of transcripts synthesized. The most extreme differences are in the products of the duplicated pDt27R genes (pR12.2, pR12.4, lanes d-f), which are hardly visible relative to the other gene copies derived from the same (lanes b, c) or different chromosomal sites (lanes h-k). These other gene family members give rise to transcription products that are comparable to or exceed the abundance of those from a tRNA2Arg gene contained in the plasmid pYH48.
To quantify these differences in signal intensity, the transcription efficiency (pmoles of transcript per hour) of each gene was determined as a function of the concentration of template DNA (Figure 16) at the optimal KCl concentration determined for each gene (Figure 17). The KCl concentration was varied because preliminary experiments showed that the pDt27R genes were particularly sensitive to this salt and were markedly inhibited at the standard KCl concentration (100 mM, Figure 17). All the remaining genes in this family are relatively unaffected by KCl concentration and show optimal or near optimal transcription rates in the 80-100 mM range. The amount of transcription products synthesized by each template was determined by quantifying the Cerenkov radiation emitted by gel slices containing the abundant precursor transcripts. The number of transcripts was then calculated from their specific activity (i.e. number of G or U residues per transcript and concentration of labeled ribonucleotide in the transcription reaction) and plotted as a function of input template DNA (Figure 16). The total amount of accumulated transcripts increases linearly at low template concentration and then plateaus at higher concentrations (0.5-1.0 nM). Double reciprocal Lineweaver-Burk type plots of these data were then used to estimate the apparent Vmax (pmoles/hour) for each gene template (St. Louis and Speigelman, 1985). These results are summarized in Table 4 and are expressed as a ratio of the rate for the control tRNA2Arg gene (pYH48) transcribed in the same experiment. As suggested initially in Figure 13, 6 of the 10 copies in this gene family are moderately to highly active in these Drosophila cell extracts relative to the pYH48 gene. The four identical pDt27R genes are all several fold less active, based on the low efficiencies of the pR12.2 and pR12.4 templates. In the more active group, two genes appear to be most efficient (pR12.5 and pR85.2) while the remainder (pR12.6, pR19.1, pR85.1, pR83.1) vary in activity at slightly more or less than the efficiency of the control gene in pYH48.

Figure 16. Transcription efficiency of tRNAArg gene templates. Reference template pYH48 and each of the tRNAArg gene templates were transcribed in one experiment using the same batch of Drosophila Schneider cell extract (#402). Each template was transcribed in 6 parallel 25 µl reactions containing 9 µl of extract, [α-32P]GTP, optimal KCl concentration (see Figure 17 and Table 4), and input template DNA concentrations that varied from 0.1-2.0 µg/ml. The total DNA concentration in each reaction was maintained at a constant 20 µg/ml with pEMBL DNA. The transcription products were then fractionated by electrophoresis as in Figure 13. The radioactivity incorporated into specific primary transcripts (P bands in Figure 13) was quantified by Cerenkov radiation from excised gel slices. Using the number of GTP residues in each primary transcript and the specific activity of [α-32P]GTP in the transcription reaction, the pmoles of transcript synthesized per hour were calculated and are plotted as a function of the concentration of input template DNA. The tRNAArg gene templates and reference template (pYH48) are indicated in the box on the left.

[Figure 16 plot: x-axis, template DNA (nM); templates plotted: pYH48, pR85.2, pR12.4, pR12.5, pR12.6, pR85.1, pR83.1, pR19.1.]

Table 4. Comparison of in vitro transcription efficiency between members of the gene family.

Gene template    Relative transcription efficiency (a)    Optimal KCl (mM) (b)
pYH48            1.00                                      85
pR12.4           0.37 (c)                                  65
pR12.5           1.77                                      95
pR12.6           1.00                                      85
pR19.1           0.98                                      Q5
pR83.1           0.87                                      85
pR85.1           0.77                                      85
pR85.2           1.80                                      75
pR5'/3'          1.12 (d)                                  -

a) Data from Figure 16 were replotted (1/S versus 1/V) and the y-intercepts (1/Vmax) were used to estimate apparent maximum transcription rates (pmoles transcript/hour). The transcription efficiency of each tRNAArg gene is expressed as a ratio of the apparent maximum transcription rate of a reference template (pYH48) determined in the same experiment. The correlation coefficients (R values) for all but one of the templates were 0.99-1.00. pR83.1 gave an R value of 0.98, which was probably due to anomalously low activity at low template concentrations (see Figure 16).
b) The optimal KCl concentration for each template was determined from Figure 17. The data for pYH48 are not included in this figure but gave a broad profile with a maximum around the 85 mM range (data not shown).
c) Identical results were obtained with the pR12.2 gene (data not shown).
d) The transcription efficiency of pR5'/3' relative to pYH48 was determined in separate experiments in transcription reactions containing 100 mM KCl.

Figure 17. Effect of KCl on in vitro transcription efficiency. Each tRNAArg gene was transcribed in sets of 8 reactions (as in Figure 16) where the final added KCl was varied from 0-70 mM. The KCl contribution from the Drosophila Schneider cell extract was estimated at approximately 45 mM. The added and endogenous KCl in the transcription reactions are summed on the abscissa. The primary products (P) of each transcription were quantified as before and the results are expressed on the ordinate as a percentage of the maximal transcription efficiency observed over the range of KCl concentrations tested. The individual gene templates are indicated in each panel.

1.5 Novel properties of the pDt27R genes

The pR12.2 and pR12.4 templates derived from pDt27R show the most dramatic differences in transcription efficiency of all the members in this gene family. They are at least 2-5 fold less active than any other gene copy and have a much lower KCl optimum. The basis of this difference in transcription properties was therefore examined in more detail.
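The quantification described in section 1.4 (conversion of incorporated radioactivity to pmoles of transcript, followed by a double reciprocal estimate of apparent Vmax) can be sketched as follows. This is a minimal illustration and not the analysis actually performed: the function names are hypothetical, the counts, specific activity, template concentrations and rates are invented values generated from a simple saturation curve, an ordinary least-squares line is assumed for the fit of 1/V against 1/S, and counting efficiency is ignored.

```python
def transcripts_pmol_per_hour(cpm, cpm_per_pmol_gtp, g_per_transcript, hours=1.0):
    """Convert counts in a gel slice to pmoles of transcript synthesized per hour."""
    pmol_gtp = cpm / cpm_per_pmol_gtp      # pmoles of labeled GTP incorporated
    return pmol_gtp / g_per_transcript / hours


def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def apparent_vmax(template_conc, rates):
    """Estimate apparent Vmax from the y-intercept of a 1/V versus 1/S plot."""
    inv_s = [1.0 / s for s in template_conc]
    inv_v = [1.0 / v for v in rates]
    _, intercept = linear_fit(inv_s, inv_v)
    return 1.0 / intercept                 # y-intercept is 1/Vmax


def saturation(s, vmax, k):
    """Invented saturation curve used only to generate example data."""
    return vmax * s / (k + s)


# Invented example: six template concentrations (nM) per template.
conc = [0.1, 0.2, 0.5, 1.0, 1.5, 2.0]
reference_rates = [saturation(s, 0.20, 0.3) for s in conc]   # stands in for pYH48
gene_rates = [saturation(s, 0.36, 0.3) for s in conc]        # stands in for a test gene

reference_vmax = apparent_vmax(conc, reference_rates)
gene_vmax = apparent_vmax(conc, gene_rates)
relative_efficiency = gene_vmax / reference_vmax

print(round(reference_vmax, 3), round(gene_vmax, 3), round(relative_efficiency, 2))
```

Expressing each apparent Vmax as a ratio to the reference template measured in the same experiment, as in Table 4, removes the dependence on the activity of the particular extract batch.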
- Gene coding and flanking sequence dependence: The only differences in DNA structure that distinguish the pDt27R genes (R12.1-R12.4) from other gene copies are first, the 1 bp change at position 13 (C13 versus T13) and second, the adjacent 5' and 3' flanking sequences. To examine the effect of the single coding sequence polymorphism, the C13 of pR12.4 was converted by site specific mutagenesis to the T13 present in all other gene copies (pR12.4T13) and transcribed in standard 100 mM KCl reactions (Figure 13, lane f). The results are not significantly different from the original pR12.4 template and suggest that this coding sequence polymorphism does not play a role in the distinctive transcription properties of these genes. To test the dependence on the 5' flanking sequences, a template was constructed consisting of the 5' and 3' halves of the pR85.2 and pR12.4 genes, respectively, fused at their common Bam HI sites (see Methods for details of this construction). The result is an intact gene with the T13 coding region nucleotide and 5' flanking sequence of pR85.2 and the 3' coding region and flanking sequence of pR12.4. Transcription of this fusion hybrid (pR5'/3') in standard reactions results in high template activity that is no longer salt sensitive (Figure 13), which suggests that the 5' flanking sequence confers these distinctive properties on the pR12.4 gene. pR12.2 and pR12.4 are equivalent with regard to both template activity and salt optima (Figures 13 and 17). These two templates differ, however, in the amount of unique 5' flanking sequence preceding each coding region. The duplication of these genes discussed in Part II included 454 bp of 5' flanking sequence in the case of the pR12.4 template and only 30 bp of 5' flanking sequence in the case of the pR12.2 template.
Therefore if they both have equivalent 5' flanking sequence dependent transcription properties, as is suggested by the above flank switching experiments, then these properties are likely to be conferred by the short 30 bp flank region common to both.

- Template pre-incubation assays: The experiments above suggest that the 5' flanking sequences confer the distinctive salt sensitivity and low transcriptional efficiency of the pDt27R genes. One possible mechanism is a reduction in the ability of the gene to form stable pre-initiation complexes with the Pol III transcription factors TFIIIC and TFIIIB or, alternatively, in the stability of the complexes that do form. Such 5' flanking sequence dependence has been seen with other Drosophila (Cooley et al., 1984) and eucaryotic tRNA genes (Raymond and Johnson, 1987, Morry and Harding, 1986, Rooney and Harding, 1988, Sajjadi and Speigelman, 1989). To measure the extent and stability of pre-initiation complexes formed on the pDt27R genes, they were transcribed in pre-incubation assays (Sharp et al., 1983) containing pR12.5 or pYH48 as reference templates. Transcription reactions were pre-incubated for 20 minutes with increasing concentrations of pR12.4 DNA and then challenged by addition of a fixed saturating concentration of reference template (either pYH48 or pR12.5). The results in Figure 18 show that pre-incubation with the pDt27R derived pR12.4 template results in the inhibition of transcription of both the pYH48 and pR12.5 templates. This suggests that the pR12.4 template is fully capable of sequestering limiting transcription factors into stable complexes. When compared with the ability of either pR12.5 or pYH48 to form stable complexes, the pDt27R genes appear to be even stronger competitors than these more highly transcribed gene templates. This is also seen with Drosophila tRNA4Val genes (Sajjadi and Speigelman, 1989). These assays were performed at the KCl concentration (100 mM) where transcription of pR12.4 is mostly inhibited (Figure 17), and they show that the ability to form stable complexes is independent of transcriptional activity and of the salt concentration in the transcription reaction. In turn this suggests that the salt sensitivity does not result from an inability to form stable complexes with transcription factors.

Figure 18. Template pre-incubation assays for stable complex formation. Formation of stable pre-initiation complexes was tested by pre-incubating increasing concentrations of competitor template DNA (0-400 ng) for 20 minutes at 24°C in 50 µl transcription reactions containing 25 µl Schneider cell extract, [α-32P]UTP, 100 mM KCl, and pEMBL DNA to maintain a constant total DNA concentration of 20 µg/ml. The reactions were then challenged by addition of 200 ng of reference template and incubated for an additional 60 minutes. The mixture of transcription products from the two templates was then separated by electrophoresis and visualized by autoradiography. In the autoradiogram at the top, panel A shows the reference pYH48 transcripts in the presence of increasing concentrations of competitor pR12.4 DNA. In panel B the reference template is pR12.5 with the same concentrations of competitor pR12.4. In panel C the reference pR12.5 template is tested against pYH48 competitor DNA and panel D shows the converse, where pYH48 reference DNA is tested against pR12.5 competitor DNA. In all sets of six reactions, the concentration of competitor DNA was 0, 50, 100, 200, 300, and 400 ng of plasmid DNA. The inhibition of the reference DNA transcripts as a function of increasing concentrations of competitor DNA was quantified by Cerenkov radiation. The results are plotted below (inset, A, B, C, and D, as above) as a percentage of the reference DNA transcripts in the presence of no competitor DNA.

[Figure 18 plot: x-axis, competitor template DNA (nM).]
- Extract dependence: Another question was whether the source of cell extract had any effect on the transcriptional properties of the pDt27R genes. Several studies show that 5' flanking sequence modulation is most evident in homologous transcription reactions containing genes and cell extracts from the same species (Dingermann et al., 1982, Schaack and Soll, 1985). In one set of experiments, transcription reactions were performed at the inhibitory KCl concentration (100 mM) in parallel reactions containing increasing fractions (µg/µg total protein) of human HeLa cell extract mixed with Drosophila extract (Figure 19). The results show that in Drosophila extract alone (lane i) the transcription products are not detectable. However, replacement of the Drosophila extract with HeLa cell extract (lanes a-h) in reactions containing identical amounts of template DNA leads to at least a 100 fold stimulation of pR12.4 transcription. To show that the stimulation is specific to the pR12.4 gene and does not result from a higher Pol III activity in these HeLa extracts,

Figure 19 Comparison of Drosophila and human (HeLa) cell extracts on in vitro transcription of pR12.4. The autoradiogram shows the transcription products synthesized in 25 µl reactions containing 100 mM KCl, 0.1 µg pR12.4 DNA, [α-32P]GTP, and varying proportions of Drosophila Schneider cell and HeLa cell extract. The transcripts were fractionated by electrophoresis and visualized by autoradiography as described before. Lane a shows the products of reactions containing only HeLa cell extract. Lanes b-h contain increasing ratios of Drosophila extract mixed with HeLa cell extract (µg per µg total protein): 0.04 (lane b), 0.10 (lane c), 0.20 (lane d), 0.30 (lane e), 0.40 (lane f), 0.60 (lane g), and 0.80 (lane h). Lane i contains transcription reactions with Drosophila extract alone.
The results were quantified by counting Cerenkov radiation of the band indicated with an arrow and are plotted as a percentage of the transcripts present in HeLa extract alone (lane a). (Plot x-axis: Ratio of Dme/Hsa extracts (micrograms protein).)

Figure 20 Comparison of KCl sensitivity of pR12.4 and pR12.5 templates in transcription reactions using crude HeLa cell extract. (A) The autoradiogram shows the transcription products of 25 µl reactions containing 9 µl HeLa cell extract, 0.1 µg template DNA, [α-32P]GTP, and total KCl from approximately 45-115 mM (as in Figure 17). Lanes (a-h) show transcription products of pR12.4 template DNA at 45, 55, 65, 75, 85, 95, 105, and 115 mM KCl respectively. Lane (i) shows the transcription products of pEMBL vector DNA alone. Lanes (j-q) are those of pR12.5 template DNA at the same KCl concentrations as before. The two major transcription products (arrows) from each template were quantified by counting Cerenkov radiation and are plotted in (B) as a function of KCl concentration. The filled diamonds correspond to reactions containing pR12.5 DNA and the open boxes correspond to reactions containing pR12.4 DNA. (Plot x-axis: Approximate KCl concentration (mM).)

the transcription rates at varying KCl concentrations were compared between pR12.4 and the normally more efficient pR12.5 template (Figure 20). The results show that in HeLa cell extract alone the two templates are now almost equivalent in activity and no longer exhibit the dramatic differences in KCl sensitivity seen in Drosophila extracts. However, the total accumulation of pR12.4 transcripts is still slightly less than that of pR12.5 and also plateaus earlier with increasing KCl concentration.
This agrees with studies of the tRNA2Arg gene of pYH48 in Drosophila and yeast extracts (Schaack and Soll, 1985), which show that heterologous transcription systems exhibit similar but less drastic differences in flanking sequence dependence. The reasons for this observed extract dependence are not known but have been attributed to the presence of additional factors necessary for specific tRNA gene transcription that are not conserved between species to the same extent as the transcription factors necessary for stable complex formation (Geiduschek and Tocchini-Valentini, 1988). In this case the pDt27R genes may be repressed in Drosophila Schneider cell extracts by specific factors that either are not present or have significantly diverged in HeLa cell extracts.

1.5 Summary of in vitro transcription data.

These results show that all members of this gene family are active in vitro and can potentially contribute in vivo to the total 4S RNA pool. In Drosophila extracts each template accurately initiates transcription at a conserved adenylate residue located 4-5 nucleotides upstream from the 5' end of the mature tRNA and terminates transcription at the first tract of dT residues in the 3' flank. In one case (pR12.5) small amounts of readthrough transcript were also detected. The primary transcripts are then correctly processed to yield identically sized mature transcription products. Although all gene copies are active in vitro, they exhibit wide differences in template transcriptional efficiency. The majority of templates have efficiencies equivalent to or greater than that of the tRNA2Arg reference gene in pYH48 (Table 4). The recently duplicated pDt27R genes, however, are at least 2-5 fold less active than other gene family members and, in addition, are markedly inhibited by KCl concentrations at which the other gene copies are transcribed near optimally. Both these properties appear to be conferred by the respective 5' gene flanking sequence.
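The termination rule summarized above (termination at the first tract of dT residues in the 3' flank) can be illustrated with a short sketch. The minimum tract length of four residues is an assumption made here for illustration and is not a value given in the text; the function name and the example flank are hypothetical and are not sequences from this study.

```python
import re


def first_dt_tract(flank, min_len=4):
    """Locate the first run of T residues in a 3' flanking sequence.

    flank is the DNA strand with the same sense as the transcript, written
    5' to 3' and starting immediately after the 3' end of the tRNA coding
    region. Returns (offset, length) of the first tract of at least min_len
    T residues, or None if no such tract is present.
    """
    match = re.search("T{%d,}" % min_len, flank.upper())
    if match is None:
        return None
    return match.start(), match.end() - match.start()


# Hypothetical 3' flank: a 14 nucleotide spacer followed by six T residues.
example_flank = "ACGATCAGCATGAC" + "TTTTTT" + "GCA"
print(first_dt_tract(example_flank))
```

The returned offset is the number of nucleotides between the 3' end of the coding region and the start of the tract, which is the spacer that would be retained in the primary transcript.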
Because the 200 bp (pR12.2) and 600 bp (pR12.4) gene copies behave similarly in vitro and do not appear to be influenced by the 3' flanking sequence, this 5' modulatory region is predicted to occur within the first 30 bp of 5' flanking sequence. This demonstrates that the 30 bp of 5' flanking sequence included in the 200 bp duplication events were sufficient to maintain the original potential regulatory sequences associated with these genes. With regard to KCl sensitivity and transcription efficiency, these results are similar to those of at least two other studies. One of three Drosophila tRNAAsn genes also showed marked differences in KCl optima that were not related to the ability of the gene to form stable complexes or to transcriptional efficiency (Lofquist and Sharp, 1986). Also, in a comparison of constitutive and silk gland specific tRNAAla genes, it was observed that the tissue specific gene showed wide variations in transcription efficiency depending on the extract preparation and that the KCl optimum of this gene was much lower than that of the constitutive tRNAAla gene (Young et al., 1986). Thus in this latter example, differences in salt sensitivity and transcription efficiency in vitro reflected the fact that in vivo these genes were expressed in a tissue specific manner. The exact correlation between these properties is not yet known. However, it is tempting to speculate that the pDt27R genes might also be regulated in a tissue specific or developmental manner, especially in light of their recent duplication. But until the products of these genes can be identified (see below) and followed in vivo, such speculation should be deferred.

2. In vivo expression: Identity of the gene products

The last question in this section addresses the in vivo identity of the tRNA products encoded by this gene family. Previous work by White et al. (1973) on the numbers and abundance of Drosophila arginine accepting tRNAs showed a relatively simple pattern.
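The codon assignments discussed in this section follow from the standard rules for anticodon:codon pairing and 'wobble'. As a minimal sketch, assuming Crick's wobble rules (including inosine pairing with U, C, and A) and ignoring any effect of other modified bases, the codons read by a given anticodon can be enumerated as follows. The function and table names are illustrative only, and 5'UCU is included only as an example of a single anticodon that would read both AGA and AGG.

```python
# Watson-Crick partners used for the first two codon positions.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Crick's wobble rules: the 5' base of the anticodon (key) can pair with
# any of the listed bases at the 3' (third) position of the codon.
WOBBLE = {"G": "CU", "C": "G", "A": "U", "U": "AG", "I": "ACU"}


def codons_read(anticodon):
    """Return the codons (5' to 3') read by an anticodon written 5' to 3'."""
    first, second, third = anticodon.upper()
    # Pairing is antiparallel: codon position 1 pairs with anticodon
    # position 3, and codon position 2 with anticodon position 2.
    prefix = WATSON_CRICK[third] + WATSON_CRICK[second]
    return sorted(prefix + base for base in WOBBLE[first])


for anticodon in ("ICG", "UCG", "UCU"):
    print(anticodon, codons_read(anticodon))
```

Under these assumptions the sketch gives CGA, CGC, and CGU for 5'ICG, and CGA and CGG for 5'UCG, so that the two anticodons together cover all four 5'CGN codons.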
A single major isoacceptor and 3-4 minor isoacceptors could be resolved by high resolution RPC-5 chromatography (Figure 21A). The major species corresponds to the tRNA2Arg encoded in pYH48. The minor isoacceptors presumably include the tRNA(s) predicted by the gene family described here and any other arginine accepting species that might also be present. According to the genetic code and the rules for anticodon:codon 'wobbling', the major arginine isoacceptor tRNA2Arg (5'ICG) should recognize the arginine codons 5'CGC, CGU, and CGA. The 5'T(U)CG anticodon sequence of the tRNAArg family described here predicts that they all should recognize the 5'CGA and CGG codons. Thus all four of the 5'CGN family of arginine codons may be translated by these two families of tRNA. The remaining two arginine codons, 5'AGA and AGG, may also be translated by a single anticodon sequence. Therefore, in addition to the three gene products predicted from the tRNAArg (5'CGU) family, at least one of the minor species should correspond to the 5'AGA or AGG class of arginine isoacceptors, and together these could account for all five of the species detected in vivo. To show precisely the correspondence between RPC-5 peaks and arginine accepting tRNAs will require their individual purification and ribonucleotide sequence determination. At the time of this study only one of the minor Drosophila arginine tRNAs had been purified (I.C. Gillam, unpublished results). This species, tRNA4Arg, corresponds to one of the peaks that elutes after the major tRNA2Arg species on RPC-5

Figure 21 Hybridization of purified tRNA4Arg to plasmid DNAs. The upper panel shows a RPC-5 profile of total [14C] labeled arginine accepting tRNAs from adult D. melanogaster flies (taken from White et al., 1973 with permission). One of the minor peaks (#4) was obtained from I.C. Gillam.
Below is shown a filter containing dot blots of increasing concentrations (5 ng-25 µg) of bound, denatured plasmid DNAs. The filter was hybridized in a 50% formamide, 5X SSPE solution at 42°C with purified tRNA4Arg that was 3' end labeled with [32P]pCp. The blots were then washed in 0.2X SSPE at 65°C and exposed for autoradiography. The plasmid DNAs used were (a) pDt0.3 (tRNA4Val, Rajput et al., 1982), (b) pYH48 (tRNA2Arg), (c) pR12.2, (d) pR83.1, and (e) pEMBL. (Upper panel: RPC-5 elution profile, ADULT (Arg); axes: Fraction No., NaCl (M). Lower panel: dot blot rows a-e.)

chromatography columns. To investigate whether this tRNA is homologous to the gene family described here, tRNA4Arg was end labelled with 32P and hybridized to filter bound plasmid DNAs containing tRNAArg coding sequences (pR12.2 and pR83.1). The results (Figure 21B) show that hybrids form between pR12.2 and tRNA4Arg under stringent hybridization and washing conditions (see Methods). No hybrids are formed to vector DNA or to plasmid DNAs containing either a related (tRNA2Arg, pYH48) or an unrelated tRNA gene (tRNA4Val, pDt0.3, Rajput et al., 1982). This suggests that tRNA4Arg is very similar to the gene products in pR12.2 but does not exclude the possibility that it is a related species with a different anticodon. Interestingly, of all the arginine species assayed through Drosophila development, it was noted that tRNA4Arg was one species whose abundance changed slightly between developmental stages in 4S RNA isolated from whole organisms (White et al., 1973). It will be interesting to see if tRNA4Arg contains the C13 unique to the pDt27R genes (i.e. Panel C in Figure 14) and whether this change in abundance is specific to a particular tissue, as is the case for certain tRNAs in other organisms (e.g. tRNAAla in B.
mori, Young et al., 1986).

Part IV - A tRNAArg pseudogene or retrotransposon?

One additional plasmid (pDt72R) described in Figure 4 also contains sequences that hybridize to the tRNAArg probe. Like pDt66R (R83.1), this plasmid does not contain a Bam HI site and was therefore predicted to contain another variant gene coding sequence. In addition, this plasmid did not give rise to in vitro transcripts using Drosophila extracts (data not shown), which suggested that it was an inactive 'pseudogene'. Lastly, attempts to localize this 6.5 kbp Hind III fragment to a specific polytene region by in situ hybridization indicated that the fragment contained, or cross-hybridized with, repetitive sequence located at several different sites in the genome (S. Hayashi, unpublished results). This was also observed in Southern blotting experiments, where 15-20 different sized restriction fragments hybridized to the fragment (data not shown). To further characterize pDt72R, the region of homology with tRNAArg was sequenced and is shown in Figure 22 (see also Appendix I). The results confirm that tRNAArg homologous sequence is present in pDt72R but does not constitute a complete tRNA gene. A single 37 bp sequence is identical to the 3' half (positions 37-73) of the intact tRNAArg genes found elsewhere in the genome. In addition, at the 3' end the DNA sequence of the non-coding strand reads 5'CCA, which is equivalent to the 3' sequence added post-transcriptionally to all eucaryotic and archaebacterial tRNAs. No characteristic Pol III poly dT termination sequence is located downstream from the 3' end of this half-gene (Appendix I). Together these observations suggest that this sequence did not originate from an authentic tRNAArg gene.
Figure 22 Structure of tRNAArg sequence in pDt72R. The sequence of pDt72R that is homologous to the predicted tRNAArg is enclosed relative to the intact genes from other tRNAArg plasmids. The 3'CCA sequence is in outline print and the adjacent 5' and 3' flanking sequences of pDt72R are in lower case with their 5' and 3' orientations indicated. The first dT residue in the 5' flanking sequence replaces the G36 nucleotide of the intact genes and accounts for the loss of the Bam HI restriction site.

The association of tRNA-like sequences with repetitive sequences has precedent in the structure of eucaryotic retrotransposons and retroviruses; cellular tRNAs are thought to act as replication primers for these repetitive elements and are homologous to small regions within the retrotransposon or retroviral genome (Dahlberg, 1980, Yuki et al., 1986, Varmus, 1988). Recently it was reported that the predicted primer binding sites (PBS) of the Drosophila retrotransposons mdg 1 and 412 are homologous to the 3' 15 nucleotides of a tRNA identical to the tRNAArg species described here (Yuki et al., 1986). These two families of retrotransposons are divergent in nucleotide sequence but are evolutionarily related, based on their identical PBS sites and homology in the predicted reverse transcriptase protein sequence. To test the possibility that the tRNAArg sequences in pDt72R are related to a Drosophila retrotransposon, the sequences around the half gene in pDt72R were compared to the available sequences of 412 (Will et al., 1981) and mdg 1 (Kulguskin et al., 1981). No homology was detected between pDt72R and the 412 LTR sequences. In contrast, the pDt72R sequence immediately adjacent to the 3' end of the tRNAArg half gene is identical to the last 6 nucleotides of the mdg 1 5' LTR (Figure 23).
After a deletion of 31 bp the pDt72R sequence resumes homology with the upstream mdg 1 LTR sequence and continues up to the Eco RI site, which is the end of the region of pDt72R for which sequences were determined. The overall difference in nucleotide sequence between mdg 1 and this portion of pDt72R is only about 7.5%, which shows that these two DNAs are closely related to one another. The next question concerns the identity of the DNA contained in pDt72R. It has obvious similarity to the LTR of mdg 1 but this does not necessarily mean that pDt72R is a retrotransposon itself. The repetitive nature of pDt72R could arise by cross-hybridization with authentic mdg 1 elements. In fact, the genomic Southern hybridization experiments in Part I (Figure 3) suggest that 37 bp tRNAArg half gene sequences are probably not significantly more numerous than the number of half genes predicted from digestion of the 10 intact tRNAArg gene copies with BamHI (Figure 3, lane d). This suggests that the 37 bp tRNAArg half gene in pDt72R is present at only one or a few copies per haploid genome. Combined with the rather

Figure 23 Comparison of pDt72R to the D. melanogaster retrotransposon mdg 1. (A) The sequence of the 5' mdg 1 long terminal repeat (LTR I) is shown (5'-3' from left to right) with position 1 corresponding to the first nucleotide of the 4 bp direct repeat that occurs at the junction of genomic and mdg 1 sequence (Kulguskin et al., 1981). The sequence of pDt72R begins at the Eco RI site located at position 180 in the mdg 1 LTR and is aligned below the mdg 1 sequence. Identities between the two sequences are shown by dashes and deletions are shown by asterisks. The primer binding site (PBS) of mdg 1 is boxed and compared to the sequence of the tRNAArg fragment found in pDt72R. The arrows indicate the inverted repeats that occur at both ends of each LTR.
The thin underlined sequences show the 11 bp direct repeats that flanked the deleted sequence in pDt72R. The thick underline indicates the putative 'TATA' box of the LTR. (B) The relative position of the PBS within a retrotransposon is indicated in the diagram (not to scale). The open boxes correspond to LTR I and LTR II and the connecting thin line indicates the internal gene coding regions. The position of the PBS is shown by the small arrow and would correspond to a tRNA primer in the 5' to 3' orientation adjacent to the 3' end of LTR I. (Panel A: aligned mdg 1 LTR and pDt72R sequences, numbered 1-480, with the Eco RI site marked.)

high degree of sequence divergence in the LTR-like portion of pDt72R, this suggests that pDt72R either is not a retrotransposon or is one that is inactive in D. melanogaster and is not present in the high copy number seen for authentic mdg 1 elements. A clearer understanding of pDt72R will require comparison of the sequence on the 5' side of the tRNA region. Homology to mdg 1 on both sides of this PBS-like sequence would support the idea that pDt72R corresponds to an intact, but low copy number or defective, retrotransposon. However, the unusual structure of the PBS is one argument against pDt72R being part of an authentic retrotransposon.
In 412, mdg 1, and most other eucaryotic retrotransposons or retroviruses, the PBS consists of the terminal 11-18 nucleotides of a corresponding tRNA (Dahlberg, 1980, and references in Kikuchi et al., 1986). The PBS-like sequence in pDt72R, however, although identical in position relative to the LTR of mdg 1, is 40 nt long (including the 3'CCA) and includes the entire 3' half of the gene. The only other tRNA half gene of this kind was a mouse tRNAPhe pseudogene, which also contained the 3'CCA (Reilly et al., 1982). Other unusual Drosophila PBS sequences include one for the mdg 3 retrotransposon, which predicts a leucine tRNA that is lacking the terminal 5 nucleotides (Saigo, 1986). The predicted primer of the copia family of retrotransposons is more bizarre; it consists of the 5' half of a tRNAMet that has been specifically cleaved at position 39 (Kikuchi et al., 1986) such that the 3' end of this tRNA fragment is homologous to a 15 nucleotide region adjacent to the copia 5' LTR. Thus, while the structure of PBSs in different retrotransposons can be quite variable, they all tend to be much shorter than the sequence in pDt72R. With incomplete data on the homology of pDt72R to mdg 1 it is difficult to account for the origin of the half gene in pDt72R. If no further homology exists between these two sequences, then the origin of pDt72R could be explained by recombination between an intact tRNAArg gene and an mdg 1 element, which, followed by further DNA scrambling (i.e. truncation of the tRNA gene), possibly could yield a half tRNA gene fused to an mdg 1 5' LTR sequence. Alternatively, if more extensive homology exists between pDt72R and mdg 1, then the unusual structure of the PBS site can be interpreted in at least two ways.
The fact that the 5' end of the tRNA half gene (position 37) corresponds exactly to one of the two processing sites for intron containing tRNAs raises the possibility that a tRNAArg molecule, which does not contain an intron, was aberrantly cleaved at this position and was subsequently utilized as a primer during mdg 1 transposition. During the putative reverse transcription of mdg 1 RNA intermediates, this now smaller tRNA primer may have become mis-incorporated into the DNA genome of this particular element. This would account for the predicted low copy number and position of the tRNAArg truncation. Alternatively, the half gene may represent an ancestral form of an mdg 1 element. This also might account for the extended size of the PBS site and the sequence divergence in the LTR region. Until additional mdg 1 sequence is available for comparison to pDt72R it is not possible to distinguish between these possibilities. This example does demonstrate, however, that not all sequences which hybridize to tRNA probes will actually represent intact genes. It also demonstrates the involvement of tRNAs in at least one of several diverse non-protein synthetic functions (e.g. Schon et al., 1986, Ferber and Ciechanover, 1987).

CONCLUSION and PERSPECTIVES

The goal at the outset of this study was to try to understand the duplicated organization of a set of tRNAArg genes found in the plasmid pDt27R. This was attempted first by analyzing additional members of this gene family and comparing their properties in vitro, and second, by comparing the duplicated loci from closely related species. While an explanation for these duplication events is still not clear, several intriguing facts are now available. By comparison with closely related sibling species it is proposed that the genes have been duplicated in successive stages over a period of several million years.
Most recently it is proposed that the genes duplicated from two to four gene copies by at least two different mechanisms that account for the size difference and novel junctions of the duplication units. In addition, it appears that the rates of flanking sequence divergence were not equivalent in the duplicated sequences. This suggests that duplication of these genes did not occur by chance but rather was subject to unknown selective pressures that resulted in multiple events occurring at the same locus. The difference in flanking sequence divergence is curious and raises the possibility that mechanisms exist to selectively differentiate the flanking sequences, and in the case of tRNA genes, potential regulatory sequences, of newly duplicated gene copies. By comparison with other members of this gene family, it can now be seen that these duplicated genes are part of a larger gene family and that they are distinct from other members of this family both in the structure of their tRNA products and in the in vitro transcription properties of the gene templates. These results are consistent with the idea that a specific function may be associated with these genes and that this might be the basis for their duplication. These differences between the duplicated genes and other members of the gene family suggest that not all members of this gene family are redundant in a functional sense.
This raises the possibility that the multicopy organization of tRNA genes in general serves not solely to raise the amount of the tRNA gene product, but rather to provide the cell with more flexibility in the amount of tRNA, or of specific variants of a tRNA, needed for different cell types or stages in development. In turn this raises the possibility that tRNA genes do not necessarily co-evolve by mechanisms of 'molecular drive' (Dover, 1982) in order to remain identical in their mature coding sequences, but instead may evolve independently based on the required function of individual gene copies. However, note that Drosophila strains that contain chromosomal deficiencies over the 12E1-2 region are still viable (D. Sinclair, personal communication). This suggests that the absence of any special function of the pDt27R genes is not lethal. Whether such deficiencies would be maintained in wild populations is not known. To answer these questions it will be necessary to measure the expression of tRNA gene copies individually in vivo.
This may be attempted either by germ-line transformation (Spradling and Rubin, 1982) of cloned tRNA genes that have been altered in vitro to distinguish them from endogenous gene copies or, alternatively, by utilizing unique sequences in primary transcripts to assay the expression of individual endogenous gene copies (e.g. introns of the Drosophila tRNATyr gene family, Suter and Kubli, 1988). In this intronless tRNAArg gene family a related approach might be to use the unique 3' sequences in the primary transcripts as gene specific markers. In the case of the duplicated tRNAArg genes, 14 nucleotides occur between the 3' end of the gene and the beginning of the poly dT termination signal. With 1 to 2 nucleotides from the conserved regions on either side, an oligonucleotide 17-18 nucleotides long might be specific to this one gene type and thus be used for primer extension experiments on nuclear primary transcripts. However, these regions tend to be AT rich and are shorter in other gene copies (e.g. pR12.5, 10 nucleotides) and therefore this approach would not be applicable to all members of the gene family.

REFERENCES:

Addison, W.R., Astell, C.R., Delaney, A.D., Gillam, I.C., Hayashi, S., Miller Jr., R.C., Rajput, B., Smith, M., Taylor, D.M., and Tener, G.M. (1982) The structures of genes hybridizing with tRNA4Val from Drosophila melanogaster. J. Biol. Chem. 257, 670-673.

Alt, F.W. and Baltimore, D.
(1982) Joining of immunoglobulin heavy chain gene segments: implications from a chromosome with evidence of three D-JH fusions. Proc. Natl. Acad. Sci. USA 79, 4118-4122.

Arnold, G.J. and Gross, H.J. (1987) Unrelated leader sequences can efficiently promote human tRNA gene transcription. Gene 51, 237-246.

Arnold, G.J., Schmatzler, C., Thoman, U., van Tof, H., and Gross, H.J. (1986) The human tRNAVal gene family: organization, nucleotide sequences and homologous transcription of three single copy genes. Gene 44, 165-174.

Baker, R.E., Eigel, A., Vogel, D., and Feldman, H. (1982) Nucleotide sequences of yeast genes for tRNA2Ser, tRNA2Arg, and tRNA1Val: homology blocks occur in vicinity of different tRNA genes. EMBO J. 1, 291-295.

Birchler, J.A., Owenby, R.K., and Jacobson, K.B. (1982) Dosage compensation of serine-4 transfer RNA in Drosophila melanogaster. Genetics 102, 525-537.

Bjork, G.R., Ericson, J.U., Gustafsson, C.E.D., Hagervall, T.G., Jonsson, Y.H., and Wikstrom, P.M. (1987) Transfer RNA modification. Ann. Rev. Biochem. 56, 263-287.

Bull, P., Thorikay, M., Moenne, A., Wilkens, M., Sanchez, H., Valenzuela, P., and Venegas, A. (1987) The yeast tRNAPhe gene family: structures and transcriptional activities reveal member differences not expected by intragenic promoters. DNA 6, 353-362.

Burke, D.B. and Soll, D. (1985) Functional analysis of fractionated Drosophila Kc Cell extract cell tRNA gene transcription components. J. Biol. Chem. 260, 816-823.

Caccone, A., Amato, G.D., and Powell, J.R. (1988) Rates and patterns of scnDNA and mtDNA divergence within the Drosophila melanogaster subgroup. Genetics 118, 671-683.

Carbon, P., Murgo, S., Ebel, J-P., Krol, A., Tebb, G., and Mattaj, I.W. (1987) A common octamer motif binding protein is involved in the transcription of U6 snRNA by RNA polymerase III and U2 snRNA by RNA polymerase II. Cell 51, 71-79.

Cedergren, R.J., Sankoff, D., LaRue, B., and Grosjean, H.
(1981) The evolving tRNA molecule. Crit. Rev. Biochem. 11, 35-104.

Chang, Y-N., Pirtle, I.L., and Pirtle, R.M. (1986) Nucleotide sequence and transcription of a human tRNA gene cluster with four genes. Gene 48, 165-174.

Chang, D.D. and Clayton, D.A. (1989) Mouse RNase MRP RNA is encoded by a nuclear gene and contains a decamer sequence complementary to a conserved region of mitochondrial RNA substrate. Cell 56, 131-139.

Choffat, Y., Suter, B., Behra, R., and Kubli, E. (1988) Pseudouridine modification in the tRNATyr anticodon is dependent on the presence, but independent of the size and sequence, of the intron in eucaryotic tRNATyr genes. Mol. Cell. Biol. 8, 3332-3337.

Chung, J., Sussman, D.J., Zeller, R., and Leder, P. (1987) The c-myc gene encodes superimposed RNA polymerase II and III promoters. Cell 51, 1001-1008.

Coen, E.S. and Dover, G.A. (1983) Unequal exchanges and the co-evolution of X and Y rDNA arrays in Drosophila melanogaster. Cell 33, 849-855.

Cooley, L., Schaack, J., Burke, J.D., Thomas, B., and Soll, D. (1984) Transcription factor binding is limited by the 5' flanking regions of Drosophila tRNAHis gene and a tRNAHis pseudogene. Mol. Cell. Biol. 4, 2714-2722.

Cribbs, D.L. (1982a) Ph.D. dissertation. University of British Columbia.

Cribbs, D.L., Gillam, I.C., Tener, G.M. (1982b) The structure of tRNA5Lys from Drosophila melanogaster. Nucleic Acids Res. 10, 6393-6399.

Cribbs, D.L., Leung, J., Newton, C.H., Hayashi, S., Miller Jr., R.C., and Tener, G.M. (1987) Extensive microheterogeneity of serine tRNA genes from Drosophila melanogaster. J. Mol. Biol. 197, 397-404.

Dahlberg, J.E. (1980) tRNAs as primers for reverse transcriptases. In Transfer RNA: Biological Aspects (eds. Soll, D., Abelson, J.N., and Schimmel, P.R.) Cold Spring Harbour Laboratory, Cold Spring Harbour, NY pp. 507-516.

de Boer, H.A. and Kastelein, R.A. (1986) Biased codon usage: an exploration of its role in the optimization of translation.
In Maximizing Gene Expression (Ed. Reznikoff, W. and Gold, L.) Butterworths, Stoneham, MA pp. 225-277.

Defranco, D., Sharp, S., and Soll, D. (1984) Identification of regulatory sequences contained in the 5' flanking sequence of Drosophila tRNA2Lys genes. J. Biol. Chem. 259, 12424-12429.

Defranco, D., Burke, K.B., Hayashi, S., Tener, G.M., Miller Jr., R.C., and Soll, D. (1982) Genes for tRNA5Lys from Drosophila melanogaster. Nucleic Acids Res. 10, 5799-5808.

Dente, L., Cesareni, G., and Cortese, R. (1983) pEMBL: a new family of single stranded plasmids. Nucleic Acids Res. 11, 1645-1656.

DeLotto, R. and Schedl, P. (1984) A Drosophila melanogaster transfer RNA gene cluster at the cytogenetic locus 90BC. J. Mol. Biol. 179, 587-605.

Dingermann, T., Sharp, S., Appel, B., Defranco, D., Mount, S., Helerman, R., Pongs, O., and Soll, D. (1981) Transcription of a cloned tRNA and 5S RNA genes in a Drosophila cell free extract. Nucleic Acids Res. 9, 3907-3918.

Dingermann, T., Burke, D.B., Sharp, S., Schaack, J., and Soll, D. (1982) The 5' flanking sequences of Drosophila tRNAArg genes control their in vitro transcription in a Drosophila cell extract. J. Biol. Chem. 257, 14738-14744.

Doran, J.L., Wei, X., and Roy, K.L. (1987) Analysis of a human gene cluster coding for tRNAPhe and tRNALys. Gene 56, 231-243.

Dover, G. (1982) Molecular Drive: a cohesive mode of species evolution. Nature 299, 111-117.

Dudler, R., Egg, A.H., Kubli, E., Artavanis-Tskonas, S., Gehring, W.J., Steward, R., and Schedl, P. (1980) Transfer RNA genes of Drosophila melanogaster. Nucleic Acids Res. 8, 2921-2936.

Dunn, R., Delaney, A.D., Gillam, I.C., Hayashi, S., Tener, G.M., Grigliatti, T., Misra, V., Spurr, M.G., Taylor, D.M., and Miller Jr., R.C. (1979a) Isolation and characterization of recombinant DNA plasmids carrying Drosophila tRNA genes. Gene 7, 197-215.
Dunn, R., Hayashi, S., Gillam, I.C., Delaney, A.D., Tener, G.M., Grigliatti, T.A., Kaufman, T.C., and Suzuki, D.T. (1979b) Genes coding for valine transfer ribonucleic acid-3b in Drosophila melanogaster. J. Mol. Biol. 128. 277-287. Elder, R., Szabo, P., and Uhlenbeck, O. (1980). 4S gene organization in Drosophila  melanogaster . In Transfer RNA:Biological Aspects (eds. Soli, D., Abelson, J.N., Schimmel, P.R.) Cold Spring Harbour Laboratory, Cold Spring Harbour, NY pp. 317-323. England, T.E., and Uhlenbeck, O.C. (1978) 3" terminal labelling of RNA with T4 RNA ligase. Nature 275. 560-561. Ferber, S. and Ciechanover, A. (1987) Role of arginine -tRNA in protein degradation by the ubiquitin pathway. Nature 326. 808-811. Folk, W.R. and Hofstetter, H. (1983) A detailed mutational analysis of the eucaryotic t R N A ! M e t gene promoter. Cell 3_3_, 585-593. Frendewey, D., Dingermann, T., Cooley, L., and Soli, D. (1985) Processing of precursor tRNAs in Drosophila: processing of the 3' end involves an endonucleolytic clevage and occurs after 5' end maturation. J. Biol. Chem. 260 , 449-454. 142 Garel, J.P. (1982) The silkworm, a model for the molecular and cellular biologists. Trends Biochem. Sci. 3_, 105-108. Geiduschek E.P. and Tocchini-Valentini, G.P. (1988) Transcription by RNA polymerase III. Ann. Rev. Biochem. 5J_ , 873-914. Glew, L., Lo, R., Reece, T., Nichols, M. , Soil, D., and Bell, J. (1986) The nucleotide sequence, localization and transcriptional properties of a t R N A L e u (CUG) gene from Drosophila melanogaster . Gene 44, 307-314. Gouilloud, E. and Clarkson, S.G., (1986) A dispersed tyrosine tRNA gene from Xenopus  laevis with high transcriptional activity in Vitro J. Biol.Chem. 261. 486-494. Grunstein, M. and Hogness, D. (1975) Colony hybridization: a method for isolation of cloned DNAs that contain a specific gene. Proc. Natl. Acad. Sci. USA 7JL 3961-3965. Guthrie, C. and Abelson, J. 
(1982) Organization and expression of tRNA genes in Saccharomyces cerevisiae in The molecular biology of yeast S accharomyces: metabolism and gene expression (Ed. Strathern J.N, Jones E.W., and Broach, J.R. ) Cold Spring Harbour Laboratory, Cold Spring Harbour. NY pp. 487-528. Hanahan, D. (1983). Studies on transformation of Escherichia coli J. Mol. Biol. 166. 557-580. Hansen, L.J., Chalker, D.L., and Sandmeyer, S.B. (1988) Ty3, a yeast retrotransposon associated with tRNA genes, has homology to animal retroviruses. Mol. Cel. Biol. 8_, 5245-5256. Hatlen, L. and Attardi, G. (1971) Proportion of the HeLa cell genome complementary to transfer RNA and 5S RNA. J. Mol. Biol. i £ , 535-553. Hattori, M. and Sakaki, Y. (1986) Dideoxy sequencing method using denatured plasmid templates. Anal. Biochem. 152. 232-238. 143 Hayashi, S., Gillam, I.C., Delaney, A.D., Dunn, R., Tener, G.M., Grigliatti, T.A., and Suzuki, D.T. (1980) Hybridization of tRNAs of Drosophila melanogaster to polytene chromosomes. Chromosoma 7_6_, 65-84. Hayashi, S., Gillam, L C , Grigliatti, T.A., and Tener, G.M. (1982) Localization of tRNA genes of Drosophila melanogaster by in situ hybridization. Chromosoma 8_6_, 279-292. Hedgcoth, C , Hayenga, K., Harrison, M., and Ortwerth, B.J. (1984) Lysine tRNAs from rat liver: lysine tRNA sequences are highly conserved. Nucleic Acids Res. 12. 2535-2541. Hershey, P.D. and Davidson, N. (1980) Two Drosophila melanogaster tRNA^iy genes are contained in a direct duplication at chromosomal locus 56F. Nucleic Acids Res. i , 4899-4910. Hofstetter, H. , Kressman, A., and Birnsteil, M.L. (1981) A split promoter for a eucaryotic tRNA gene. Cell 24, 573-585. Hosbach, H.A. , Silberklang, M . , and McCarthy, B.J. (1980) Evolution of a D ^ melanogaster glutamate tRNA gene cluster. Cell 2JL , 169-178. Hottinger-Werben, A., Schaack, J., Mao, J.I. Nichols, M. , and Soli, D. (1985) Dimeric tRNA gene arrangement in S_. pombe allows increased expression of downstream gene. 
Nucleic Acids Res. i i , 8739-8747. Huibregtse, J.M., Evans, C.F., and Engelke, D.R. (1987) Comparison of tRNA gene transcription complexes formed in vitro and in nuclei. Mol. Cell. Biol. 7_i 3212-3220. Ikemura, T. (1985) Codon usage and tRNA content in unicellular and multicellular organisms. Mol. Biol. Evol. 2 , 13-34. Indik, Z.K. and Tartof, I. (1982) Glutamate tRNA genes are adjacent to 5S RNA genes in Drosophila and reveal a conserved upstream sequence (the ACT-TA box). Nucleic Acids Res. 10 , 4159-4172. 144 Inouye, S., Saigo, K., Yamada, K., and Kuchino, Y. (1986) Identification and nucleotide sequence determination of a potential primer tRNA for reverse transcription of a Drosophila retrotransposon, 297. Nucleic Acids. Res. 14 , 3031-3043. Jamrich, M. and Miller, O.L. (1984) The rare transcripts of interrupted rRNA genes in Drosophila melanogaster are processed or degraded during synthesis. EMBO J. 3_, 1541-1545. Kikuchi, Y., Ando, Y., and Shiba, T. (1986) Unusal priming mechanism of RNA directed DNA synthesis in copia retrovirus-like particles of Drosophila. Nature 312. 824-826. Kondo, K., Hodgkin, J., and Waterston, R.H. (1988) Differential expression of five t R N A ^ r p (TJAG) amber suppressors in Caenorhabditis elegans. Mol. Cell. Biol. 8_, 3627-3635. Koski, R.A., Clarkson, S.G., Kurjan, J., Hall, B.D., and Smith, M. (1980) Mutations of the yeast SUP4 tRNA^yr locus: transcription of the mutant gene in vitro Cell 22. 415-425. Kubli, E. (1982) The genetics of transfer RNA in Drosophila. Adv. Genet. 21 , 123-172. Kurjan, J., Hall, B.D., Gillam, S., and Smith, M. (1980) Mutations at the yeast SUP4 t R N A ^ y r locus: DNA sequence changes in mutants lacking suppressor activity. Cell 20., 701-709. Kulgushkin, V .V. , Ilyin.Y.V., and Georgiev, G.P. (1981) Mobile dispersed genetic element MDG 1 of Drosophila melanogaster: nucleotide sequence of long terminal repeats. Nucleic Acids Res. 9_ , 3451-3464. Kunkel, G.R., Maser, R.L, Calvet, J.P. and Pederson, T. 
(1986) U6 small nuclear RNA is transcribed by RNA polymerase III. Proc. Nat'l. Acad. Sci. USA 8_3_ , 8575-8579 145 Krupp, G. and Gross, H. (1983) Sequence analysis of in vitro 32p_i a u CiC (j RNA. In The modified nucleosides of transfer RNA II. (eds. Agris, P.F. and Kopper, R.A.) Alan R. Liss Inc., New York. NY pp. 11-58. Larsen, T .M. , Miller Jr., R . C , Speigelman, G.B., Hayashi, S., Tener, G.M., Sinclair, D.A.R., and Grigliatti, T.A., (1982) RNA:DNA hybridization analysis of tRNA3bVal in Drosophila melanogaster . Mol. Gen. Genet. 185. 390-396. Lassar, A. B., Martin, P.L., and Roeder, R.G. (1983) Transcription of class III genes: formation of preinitiation complexes . Science 222 , 740-748. Leung, J. (1988) Ph.D dissertation. University of British Columbia. Leung, J., Addison, W.R., Delaney, A.D., MacKay, R.M., Miller Jr., R.M., Speigelman, G.B., Grigliatti, T.A., and Tener, G.M. (1984) Drosophila melanogaster t R N A 3 D V a l genes and their allogenes. Gene 3_4 , 207-217. Li , W-H. (1983) Evolution of duplicate genes and pseudogenes. In Evolution of genes and proteins, (eds. Nei, M. and Koehn, R.K.) Sinauer Associates Inc., Sunderland, MA. pp. 15-37. Lin, F.-K., Furr, T.D., Chang, S.H., Horwitz, J., Agris, P.F., and Ortwerth, B.J. (1980) The nucleotide sequence of two bovine lens phenylalanine tRNAs. J. Biol. Chem. 255. 6020-6023. Lin, V.K. and Agris, P.F. (1980) Alterations in tRNA isoaccepting species during erythroid differentiation of the Fiend leukemia cell. Nucleic Acids Res. 8_, 3467-3489. Lo, R . Y . C , Bell, J.B., and Roy, K.L. (1982) Dihydrouridine-deficient tRNAs in Saccharomyces cerevisiae. Nucleic Acids Res. 1Q_, 889-902. Lofquist, A. and Sharp, S, (1986) The 5' flanking sequences of D r o s o p h i l a  melanogaster t R N A 5 A s n genes differentially arrest RNA polymerase III. J. Biol. Chem. 26J. , 14600-14606. 146 Long, E. and Dawid, I. (1980) Repeated genes in eucaryotes. Ann. rev. Biochem. 49. 727-764. Looney, J.E. and Harding, J.D. 
(1983) Structure and evolution of a mouse tRNA gene cluster encoding t R N A A s P , t R N A G 1 y , and t R N A G l u and an unlinked, solitary gene encoding t R N A A s P . Nucleic Acids Res. 11, 8761-8776. Ma, D.P., Lund, E., Dahlberg, J.E., and Roe, B.A., (1984) Nucleotide sequences of two regions of the human genome containing t R N A A s n genes. Gene 2JL , 257-262. Makowski, D.R., Haas, R.A., Dolan, K.P., and Grunberger, D. (1983) Molecular cloning, sequence analysis and in vitro expression of a rat tRNA gene cluster. Nucleic Acids Res. H , 8609-8620. Maniatis, T., Fritsch, E.F., and Sambrook, J., (1982) Molecular cloning. A laboratory manual. Cold Spring Harbour Laboratory, Cold Spring Harbour, NY. Martin, C.H., and Meyerowitz, E.M. (1986) Characterization of the boundries between adjacent rapidly and slowly evolving genomic regions in Drosophila . Proc. Natl. Acad. Sci. USA £3_, 8654-8658. Marschalek, R. and Dingermann, T. (1988) Identification of a protein factor binding to the 5' flanking region of a tRNA gene and being involved in modulation of tRNA gene transcription in vivo in Saccharomyces cerevisiae. Nucleic Acids Res. 16_, 6737-6752. Maxam, A . M . and Gilbert, W. (1980). Sequencing end-labeled DNA with base-specific chemical cleavages. In Methods in Enzymology. 65 . (eds. Grossman, L. and Moldave K.) Academic Press, New York, NY pp. 499-580. Mattaj, I.W., Dathan, N.A., Parry, H.D., Carbon, P., and Krol, A. (1988) Changing the RNA polymerase specificity of U snRNA gene promoters. Cell 5_5_, 435-442. Melton, D.A., Kreig, P.A., Rebagliati, A.J., Maniatis, T., Zinn, K., and Green, M.R. (1984) Efficient in vitro synthesis of biologically active RNA and RNA hybridization 147 probes from plasmids containing bacteriophage SP6 promoter. Nucleic. Acids Res. 12, 7035-7056. Meng, Y.B., Stevens, R.D., Chia, S., McGill, S., and Ashburner, M. (1988) Five glycyl tRNA genes within the noc gene complex of Drosophila melanogaster. Nucleic Acids Res. 16_, 7189. 
Miller, J . H. (1972) Experiments in molecular genetics. Cold Spring Harbour Laboratory, Cold Spring Harbour, NY. Moore, G.P. (1983) Slipped-mispairing and the evolution of introns. Trends. Biochem. Sci. 4, 411-414. Morry, M.J. and Harding, J.D. (1986) Modulation of transcriptional activity and stable complex formation by 5' flanking regions of mouse t R N A ^ l s genes. Mol. Cell. Biol. 6_, 105-115. Muller, F. and Clarkson, S.G. (1980) Nucleotide sequence of genes coding for t R N A p h e and t R N A T y r from a repeating unit of X. laevis DNA. Cell 12, 345-353. Munz, P., Amstutz, J., Kohli, J., and Leopold, U. (1982) Recombination between dispersed serine tRNA genes in Schizosaccharomyces pombe . Nature 300 , 225-231. Murphy, S., Di Liegro, C , and Melli, M. (1987) The in vitro transcription of the 7SK RNA gene by RNA polymerase II is dependant only on the presence of an upstream promoter. Cell 5_1_, 81-87. Naylor, S.L., Sakaguchi, A.Y., Shows, T.B., Grzeschik, K.-H., Holmes, M., and Zasloff, M. (1983) Two non-allelic t R N A j M e t genes are located in the p23-ql2 region of human chromosome 6. Proc. Natl. Acad. Sci. USA 8J1 , 5027-5031. Newman, A.J., Ogden, R . C , and Abelson, J. (1983) tRNA gene transcription in yeast: effects of base substitutions in the intragenic promoter. Cell 35., 117-125. 148 Otto, E., Young, J.E., and Maroni, G. (1986) Structure and expression of a tandem duplication of the Drosophila metallothionein gene. Proc. Natl. Acad, Sci. USA 83. 6025-6029. Petes, T .D. (1980) Unequal meiotic recombination within tandem arrays of yeast ribosomal DNA genes. Cell L9_, 765-774. Pirtle, I.L., Shortridge, R.D., and Pirtle, R.M. (1986) Nucleotide sequence and transcription of a human glycl tRNAccc^ly gene and nearby pseudogene. Gene 11 , 155-167. Rajput, B., Duncan, L., Demille, D., Miller Jr., R . C , and Speigelman, G.B. (1982) Transcription of cloned transfer RNA genes from P_,_ melanogaster in a homologous cell free extract. Nucleic Acids Res. 
1Q_ , 6541-6550. Raymond, G.J. and Johnson, J.D. (1983) The role of non-coding DNA sequences in transcription and processing of a yeast tRNA. Nucleic Acids Res. 11, 5969-5988. Raymond, G.J., and Johnson, J.D., (1987) The 5' flanking sequences of yeast t R N A 3 ^ e u genes enhances the rate of transcription from stable pre-initiation complexes. Nucleic Acids Res. L i , 9881-9894. Raymond, K . C , Raymond, G.J., and Johnson, J.D. (1985) In vivo modulation of yeast tRNA gene expression by 5' flanking sequence. EMBO J. 4_» 2649-2656. Reilly, J.G, Ogden, R., and Rossi, J.J. (1982) Isolation of a mouse pseudo tRNA gene encoding CCA - possible example of reverse flow of genetic information. Nature 300. 287-289. Rich, A. and RajBhandary, U.L. (1976) Transfer RNA: Molecular structure, sequence and properties. Ann. Rev. Biochem. 45_ , 806-861. Rigby, P.W., Dieckmann, J .M., Rhodes, C , and Berg, P. (1976). Labelling deoxyribonucleic acid to high specific activity in vitro by nick translation with DNA polymerase I. J. Mol. Biol. Ill, 237-251. 149 Ritossa, F .M. , Atwood, K.C. , Lindsley, D.L., and Speigelman, S. (1966). On the redundancy of DNA complementary to amino acid transfer RNA and its absence from the nucleolus organizer region of Drosophila melanogaster Genetics 54. 663-' 676. Robinson, R.R. and Davidson, N. (1981) Analysis of a Drosophila tRNA gene cluster: two t R N A L e u genes contain intervening sequences. Cell 21 , 251-259. Rooney, R.J. and Harding, J.D. (1988) Transcriptional activity and factor binding are stimulated by separate and distinct sequences in the 5' flanking sequences of a mouse t R N A A s P gene. Nucleic Acids Res. 16., 2509-2521. Roth, D.B., Porter, T .N. , and Wilson, J.H. (1985) Mechanisms of nonhomologous recombination in mammalian cells. Mol. Cell. Biol. 5_, 2599-2607. Rothstein, R.J., Esposito, R.E., and Esposito, M.S. (1977) The effect of ochre suppression on meiosis and ascospore formation in Saccharomvces Genetics 8_5_, 35-54. 
Rubin, G.M. and Spradling, A.C. (1982) Genetic transformation of Drosophila with transposable element vectors. Science 218 , 348-353. Saigo, K. (1986) A potential primer for reverse transcription of mgd 3 , a Drosophila copia -like element, is a leucine tRNA lacking its 3' terminal 5 bases. Nucleic Acids Res. 14 , 4370. Sajjadi, F.G., Miller Jr., R . C , and Speigelman, G.B. (1987) Identification of sequences in the 5' flanking region of a Drosophila melanogaster t R N A 4 ^ a ' gene that modulate its transcription in vitro. Molec. Gen. Genet. 206. 279-284. Sajjadi, F.G. and Speigelman, G.B. (1989) The modulatory element TNNCT affects transcription of a Drosophila tRNA4^al gene without affecting transcription complex stability. Nucleic Acids Res. 17. 755-766. 150 Saluz, H., Dudler.R., Schmidt, T., and Kubli, E. (1988) Localization and estimated copy number of Drosophila melanogaster U l , U4, U5, and U6 sn RNA genes. Nucleic Acids Res. 16_, 1203. Sanger, F., Nicklen, S., and Coulson, A.R. (1977) DNA sequencing with chain terminating inhibitors. Proc. Natl. Acad. Sci. USA J4 , 5463-5467. Santos, T. and Zasloff, M. (1981) A comparative analysis of human chromosomal segments bearing nonallelic dispersed t R N A j M e t genes. Cell 23. , 699-709. Schaack, J. and Soil, D., (1985) Transcription of a t R N A A r S gene in yeast extract; 5' flanking sequence dependence for transcription in a heterologous system. Nucleic Acids Res. 11 ,2803- 2813. Schaack, J., Sharp, S., Dingermann, T., Burke, D. J., Cooley, L., and Soil, D. (1984) The extent of a eucaryotic tRNA gene; 5' and 3' flanking sequence dependence for transcription and stable complex formation. J. Biol. Chem. 259 , 1461-1467. Schaack, J., Sharp, S., Dingermann, T., and Soil, D. (1983) Transcription of eucaryotic tRNA genes II. Formation of stable complexes. J. Biol. Chem. 258 , 2447-2453. Schimke, R.T. (1984) Gene amplification in cultured animal cells. Cell 3_7_, 705-713. Schon, A.. 
, Krupp, G., Lowe, S., B-., Kannangara, C.G., and Soli, D. (1986) The RNA required in the first step of chlorophyll biosynthesis is a chloroplast glutamate tRNA. Nature 322, 281-284. Sharp, S., DeFranco, D., Silberklang, M. , Hosbach, H.A., Schmidt, T., Kubli, E., Gergen, J.P., Wensink, P.C., and Soli, D. (1981) The initiator tRNA genes of Drosophila  melanogaster: evidence for a tRNA pseudogene. Nucleic Acids Res. 9_ , 5867-5882. Sharp, S., Schaack, J., Cooley, L., Burke, D.J., Soli, D. (1985) Structure and transcription of eucaryotic tRNA genes Crit. Rev. Biochem. (ed. Fasman, G.D.) CRC Press, Florida 19_, 107-144. 151 Sharp, S., DeFranco, D., Dingermann, T., Farrell, P., and Soli, D. (1981) Internal control regions for transcription of eucaryotic tRNA genes. Proc. Natl. Acad. Sci. USA 2ft, 6657-6661. Sharp, S., Dingermann, T., Schaack, J . , and Soli, D. (1983) Transcription of eucaryotic tRNA genes in vitro 1: analysis of control regions using a competetion assay. J. Biol. Chem. 25ft , 2440-2446. Sharp, S., Garcia, A., Cooley, L., and Soli, D. (1984) Transcriptionally active and inactive gene repeats within the D. melanogaster 5S RNA gene cluster. Nucleic Acids. Res. 12, 7617-7632. Shibuya, K., Noguchi, S., Nishimura, T., and Sekiyu, S. (1982) Characterization of a rat tRNA gene cluster containing the genes for t R N A A s P , t R N A G 1 y , and t R N A G l u , and pseudogenes. Nucleic Acids Res. 1Q_ , 4441-4448. Silverman, S.O., Schmidt, O., Soil, D. Hoveman, B. (1979). The nucleotide sequence of a cloned arginine tRNA gene and its in vitro transcription in Xenopus germinal vesicle extracts. J. Biol. Chem. 254. 10290-10294. Sollner-Webb, B. (1988) Surprises in polymerase III transcription. Cell. 5_2, 153-154. Southern, E . M . (1975). Detection of specific sequences among DNA fragments separated by gel electrophoresis. J. Mol. Biol. 9_ft, 503-517. Spradling, A.C. and Rubin, G.M. (1981) Drosophila genome organization: conserved and dynamic aspects. Ann. Rev. 
Genet. _L5_, 219-264. Sprinzel, M. , Hartman, T., Meissner, F., Moll, J . , and Vorderwulbecke, T. (1987). Compilation of tRNA sequences and sequences of tRNA genes. Nucleic Acids Res. 1!, r53-rl88. St. Louis, D.C. and Speigelman, G.B. (1985) Steady-state kinetic analysis of cloned t R N A ^ e r genes from Drosophila melanogaster. Eur. J. Biochem. 148. 305-313. Stark, G.R. and Wahl, G.M. (1984) Gene amplification. Ann. Rev. Biochem. 5_3_, 447-491. 152 Steffenson, D.M., and Wimber, D.E. (1971) Localization of tRNA genes in salivary chromosomes of Drosophila by RNA:DNA hybridization. Genetics 6_9_, 163-178. Stephens, J.C., and Nei, M. (1985) Phylogenetic analysis of polymorphic DNA sequences at the ADH locus in Drosophila melanogaster and its sibling species. J.Mol. Evol. 22, 289-300. Suter, B. and Kubli, E. (1988) t R N A T y r genes of Drosophila melanogaster: Expression of single-copy genes studied by SI mapping. Mol. Cell. Biol. 8_ , 3322-3331. Underwood, D . C , Knickerbocker, H., Gardner, G., Condliffe, D.P., and Sprague, K.U. (1988) Silkgland specific tRNAs are tightly clustered in the silkworm genome. Mol. Cell. Biol. 8_, 5504-5512. Varmus, H. (1988) Retroviruses. Science 240. , 1427-1434. Weber, L. and Berger, E. (1976) Base sequence complexity of the stable RNA species of Drosophila melanogaster . Biochemistry 1_5_, 5511-5519. Weiner, A . M . , Deininger, P.L., and Efstradiadis, A. (1986) Nonviral retroposons: genes, pseudogenes, and transposable elements generated by the reverse flow of genetic information. Ann. Rev. Biochem.. 5_5_, 631-661. White, B.N., Tener, G.M., Holden, J., and Suzuki, D.T. (1973) Analysis of tRNAs during development of Drosophila . Dev. Biol. 3JL , 185-195. Whoriskey, S.K., Ngheim, V.-H. , Leong, P.-M., Mason, J.-M., and Miller, J.H. (1987) Genetic rearrangements and gene amplification in Escherichia c o l i : DNA sequences at the junctures of amplified gene fusions. Genes and Dev. 1, 227-237. Will, B. M . , Bayev, A.A. 
APPENDIX

The nucleotide sequences of the plasmids containing the tRNAArg genes are shown. The sequences were determined as described in the Methods. In most cases the gene coding regions were determined on both strands from multiple determinations. The pDt72R half gene was determined on only one strand, but with unambiguous gel readings. The gene coding regions are underlined and in all cases show the non-coding strand (= tRNA sequence) from 5' to 3'.
The R19.1 gene in pDt67R begins at position 586. The R12.5 gene in pDt17R begins at position 223; the serine gene downstream begins at position 572. The R83.1 gene in pDt66R begins at position 142. The R85.1 gene in pDt85C begins at position 245; the R85.2 gene begins at position 1299. The R12.6 gene begins at position 153. The half gene sequence in pDt72R begins at position 350.

[Sequence listing illegible in the scanned source; presumably pDt67R, the only plasmid named above without a legible listing.]

pDt17R
20 40 60 80 100 120
ATTCCCGATTTTACGCAGTGCGTGCGTGTTTAACGCATGATATTACGTGGCTCTTCTATTAGCGACGCACTATTAACTGTTAAACTATTGACCATATAGCTGCCATTTGATTGGGGTGTC
140 160 180 200 220 240
TTATGTGCGGTAATGACTTCTTTTCCTTTTGTATCCTTTTATGCTCGAGTCCCTGCTCAGCTAGTTGCTTTTCTTGGCAACTTAAGCCACGTTTAAACAACTGACCGTGTGGCCTAATGG
260 280 300 320 340 360
ATAAGGCGTCGGACTTCGGATCCGAAGATTGCAGGTTCGAATCCTGTCACGGTCGTACAAATCCCTTTTTGTTATCTTCCAAACTTTTTGGCTTTCATTTTTGAACAGTTTACAGCTAAA
380 400 420 440 460 480
TGCGGTGTGTGTATATATTTGGGTTTTCTAATTGCTTAGACATTTCTAGTATGTTAATCCTTTTATTATCCTTCAATGGATATTTCAATATTGGCAATAATTATTGTAGCATCATTTGAT
500 520 540 560 580 600
AGTTACAAATTATGTAAATTTTAGCGACAGTGGAAAAGTAAAAGTGCTCGGACTTTCCAAGTACGTAATTTAACACCAGCTATAACAAGAAGCAGTCGTGGCCGAGTGGTTAAGGCGTCT
620 640 660 680 700 720
GACTCGAAATCAGATTCCCTTTGGGAGCGTAGGTTCGAATCTACCGGCTGCGAATCGAATCCAATTTTTTACACTTTGCATGAGCTACCATATTTTTATGTGCGCCTCAATTAAACTTGA
730
TGACAAACCAAAGTCC

pDt66R
20 40 60 80 100 120
AAGCTTATGCCGGCAAGGGTGGTTTTTACTGCCACATCCTGGGAGGTGGAGGTGTCAGGATGCGAAAGTGTGGTGAAAGTATGTCCTGGGAGTACCCATTTTGAGCTTTATAGGGCAGGG
140 160 180 200 220 240
ACAAACGGGACGTTTCAACCGGACCGTGTGGCCTAAAGGATAAGGCGTCGGACTTCGAATCCGAAGATTGCAGGTTCGAGTCCTGTCACGGTCGATTTAACGGCTGAATGCATTTTTTGC
260 280 300 320 340 360
ACCGACTGGCTTGTTTAAACTTCTCTAGTTACTACCCTCGTATGGTTTGCGTACTCAGAAACTCATCTTGTTTGTTTTGAAACACAATAAAAACCAGTTTCTTTCGACTCTGTGGCAACG
380 400 420 440 460 480
CAAAATGCAAAATCAATTAGGTATGGAAAAAAAGACAACTGAGCAACCGAAAACCGATTGGATCGACCACTGGTAATGAATATCCAGCAGATATCTCGTCTTCTCCAATGCTTCTAAATA
500 520 540 560 580 600
TCAAATGTCCACTATAAGTTATTTTTTGAGCCAGTAAAACTTGTGAATTAGCTCTAATACGTGCCATTCA-TTCCGTAAGCTGAGTTTTTTGGTTGAAGCTGAGCAATACAGTGGCTTGAA
620 640 660 680 700 720
TGTTAGTTAAACGTATCACTTTAGATATATGCTTGTCAATATATGTATCTTAATTTCTCCTAATGAACATGTTTTTTTAAAACATCTTGGGGCGTGGATAAAAGACCGTTTCATTAAGTT
740 760 780 800 820 840
TTTTTTCTGGCTGATTTAGCTTAGAAAACTTAAAACTTAAAAACAACATTTCTCATTCGAGATTAGCAAATGTAATTTATCAGAAACTTGTCTAATTATCCAGCCTTATATGATGAATGA
850 860 870 880 890 900 910 920
TGGCGTCCATTCAGTTCAAAAAGAGCAAGCCTATAAGTCGAGTCGAAATAAATATGTGTGTAAGTCCCCGAAAGACTTTGAACCTAGT

pDt85C
20 40 60 80 100 120
AATGCAATGCGGTGATTGATGGCTAGGTGGTGGTATTCAATGGCQTAAAGAATTTTAATATTAGGAAATCAAGGAATTTCCCTTAATAAAGATTTTTATATTTACTGTTTTTAACGTAGA
140 160 180 200 220 240
GTTCTAGATTTATAATCTCAAATGGGGTATTCGGATAAAAAACGATTTAGCAACTGTTAACGGTGAACTAAATTATTTGTAAAGCCCGTCTTTATGTTAGTCATATTTTTCAGAGTTGCC
260 280 300 320 340 360
AACTGACCGTGTGGCCTAATGGATAAGGCGTCGGACTTCGGATCCGAAGATTGCAGGTTCGAGTCCTGTCACGGTCGTCGTAAAATTAACTTTTTTCTTTTGTATCCAGAATTTTTTTTA
380 400 420 440 460 480
TTTATTTTATGAAAATGAGATTTGAGGGCATTTGGTTGCATTTATCACACTTTGTAAGTCTGTATCTCACCTTCTTGAAGCGCCTTCGCTGCGATGAAGAGTGCGTCGATGCAGTTTCGA
500 520 540 560 580 600
GCAGATTGCTGTTGTTATCCCCCCCATCCCTTTTGCCCAGTGCCTGGAGAAGATCGTCAACGTTGCGTATATAGGGGCGATCTCGATCCCCGCCGCGTTCCCTTTCCTTTTCGCCACCCC
620 640 660 680 700 720
CTCCACCGCCGCCCATGCAGCTGGTTATGCTCCGATTGCCCACACTATTCTACGCGATCGGGATCTGGTCTTTTGCTTGCCCAGCCAGTAAACCTGATTTGCCAAATTTATTTCGGCGGC
740 760 780 800 820 840
TGCCGCAGCCTTTCTTCTGCCCAATTGTTGTTGCTCCTGCACCTGTTGATGCTGCTGCTGCTGTTGTTGCTGCTGCTGCTGTTGTTCACTGGCCAGATATTCGCCGGCTATATGAACCTT
860 880 900 920 940 960
GGGTGCACTAACCAACTGCAGGCCAGTGGCGCCGAATTGCTGAGGQAATTCGTGCAACTCGAGGGGGAAACGGGGATCCATGGTCTGACCCGGTTGACGCGGGTACATATAGCCATCGCG
980 1000 1020 1040 1060 1080
TAGAAACATCTTGCTGGTGGTATAGACCCTATCCGCGATTTGGTACATCCACTCGAGCCGGGACCCTCGCCGCCGCCAATAATTCCATTTCGGACATAGGCTGTCGGGCGATAGCGAACG
1100 1120 1140 1160 1180 1200
AGTGGGTTATATACATTCGATTTGGGGGGTTTCGTTTTGAAACTGCGACGTCGCGTGCTGGGCTTCATATTTTTGTTGTATTTAATTTTCGTTTTTGTTCGTTTGTATTTGCACCCCGCG
1220 1240 1260 1280 1300 1320
ATAACAGAATGTTTGCTCTACAGAAAATCCACCCTTGTGCTGTAGGTTGCTTGCACACGTATCAAATGTTTTCGAGTTTAAGCGTGCTTGGAATAAGCGACCGTGTGGCCTAATGGATAA
1340 1360 1380 1400 1420 1440
GGCGTCGGACTTCGGATCCGAAGATTGCAGGTTCGAGTCCTGTCACGGTCGAAACCAAAGTATTATTTTTTTCTTTTTATTTTTTTTTTGTGAGAAACTTATGTTTTGTTCTTTAAAAAA
1460 1480 1500 1520 1540 1560
TTCAATTTGTTGCAAACTAAAACCATAAAATAAATCAACAAAAACAACAAAAATTTTAAGGTTCACTTGGCTAATTTTACATAAAATCTACACTGTCTTTAGACTGACGATAGAATGTTT
1580 1600 1620 1640 1660 1680
TATGATTACATGTAAAATTAGATCACGCAAAATTATATTTTTTCCTTTGGAAATATATAAAAAAAAAAAAA-AATAAAACCAGATTAGTGTTTCAAGGGTGTTGAGTTAATTTAGAGACTT
1700 1720 1740 1760 1780 1800
TTTGTTCTTTTTCAAATGTAATTTTCGACCGCGTAAAAGTATGCTACAGATATGGTTGAATTATTTTTCATGGCGTTCTTGTTTAACAGCGCACAAAAGCAGGCAACATTTGTGTTTTTA
1820 1840 1860 1880 1900 1920
AAAACGCCTGAGAGAGGCCAATAAAAAAGAGATAAACCCTATAAACTATATTGAAAATAACCGATTTTTTAAGGGTTTTACGCGACCAACACGTTCGGCACACTGCACGTTCGATTTTCC
1940 1960 1980 2000 2020 2040
ACTGCAATTGTTTGATTTTATTGCAACAACTTTTTCTGCTTTTCTGTTGTTTGCTGGAAGGCACGCACTGCCAATCCAATTAGCCGGGGTTCACCTTTCGAGCCGTCGTCTTTGCGCAAG
2050 2060
GTAAATTGTTTTTCGAACCCAAAGTGG

pR12.6
20 40 60 80 100 120
AAGCTTCGTTTCGCGTTGAAACTGAATTTTTTGCAATTCAACCCTTCCCACTTATTATAGTTTTCGTTCTGTTCTCACTAGCAAATGTTCTCACTCCAGTTTCTCTCGCCTCTCCCTCTT
140 160 180 200 220 240
TATATTTGTTGTTACGGCCTGGTAATCCAACTGACCGTGTGGCCTAATGGATAAGGCGTCGGACTTCGGATCCGAAGATTGCAGGTTCGAGTCCTGTCACGGTCGACCGCTCTATACTTT
255 270 285 300 315 330 345
TTTTTTAATATTCATATTTTTCCTTGAGCTATGAATATTACAGCTTTTATTAATTGGCCAAGTCAATTGCTGCAAAAAATATTTATTAGTTCTTTAAGGAACTAGAAGCT

pDt72R
20 40 60 80 100 120
GTCCCTCGCAGTCQTTCGGGCAGCTTTTCTTAAAAGCAGGCAGGCTTTTCGQATGGGGAGTTGGAGTTTTGTTAATTGTTAACGATTTTATTTGTTTTTCTGAGGGATTTTTTTCTTAAG
140 160 180 200 220 240
ATATTCGACACAGTGGTCTAGCTCAGCTTGTAGTTTTTTAGCTTTCGACCATTGGGGCGGTGGAGTGTTCGTATATAATAGCCACTTTATGTGCGCTATTCTCTTTTTTGCTTTATTTCG
260 280 300 320 340 360
AACTCCGCCCCTGTTTTTACCAATCGCACTAACACTCTACTCACTCAGATCGCTGTTTCTCCATAATTGCCGCCGACTTGGTGGATAAGCATCGAATGCCAGGCATGCTGATCCGAAGAT
380 400 420 440 460 480
TGCAGGTTCGAGTCCTGTCACGGTCGCCATGTAGTTAAATAGGAGTCAAAACGCAATCGGTCATATTTATCTTTATTTGGTTTTTATTCAGCATGTGTACATTCTTGATCTATTCCTGAT
500 520 540 560 580 600
TCATAACTGAACTTGGTGTGGTCTACAGGCAGATCGCTCGTGACAGCCAAGTGCTGACACTAGCAAATTCTGCAATATGTATGTATGAAGTTCATACTCGATAACCAATGGGAGTCGAGT
610 620
GCGACCCCTAAAGCGCACATGCTGAATT
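The gene start positions quoted in this appendix are 1-based offsets into each plasmid listing. As a minimal sketch of how such a position maps onto a listing, the Python below slices the R12.5 coding region out of an excerpt of the pDt17R sequence (positions 221 to 300). The function names, the excerpt boundaries, and the 73-nucleotide gene length are assumptions made for illustration; the appendix itself states only the start positions.

```python
# Excerpt of the pDt17R listing, positions 221-300 (1-based), copied from the
# appendix rows that end at 240 and begin at 241.
EXCERPT_START = 221
EXCERPT = (
    "CTGACCGTGTGGCCTAATGG"  # positions 221-240
    "ATAAGGCGTCGGACTTCGGATCCGAAGATTGCAGGTTCGAATCCTGTCACGGTCGTACAA"  # 241-300
)

R12_5_START = 223  # start position stated in the appendix
ASSUMED_GENE_LENGTH = 73  # assumption for illustration, not stated in the appendix


def slice_1based(seq: str, seq_start: int, start: int, length: int) -> str:
    """Return `length` bases beginning at 1-based position `start`.

    `seq_start` is the 1-based position of the first base held in `seq`.
    """
    offset = start - seq_start
    if offset < 0 or offset + length > len(seq):
        raise ValueError("requested region lies outside the supplied sequence")
    return seq[offset:offset + length]


def dna_to_rna(seq: str) -> str:
    """Write the non-coding (tRNA-like) DNA strand as RNA by replacing T with U."""
    return seq.upper().replace("T", "U")


gene = slice_1based(EXCERPT, EXCERPT_START, R12_5_START, ASSUMED_GENE_LENGTH)
rna = dna_to_rna(gene)

if __name__ == "__main__":
    print(gene)
    print(rna)
```

The same helper applies to any of the other listings once the full sequence is joined into a single string, with `seq_start` set to 1.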

