UBC Theses and Dissertations


Concerted evolution of a cluster of X-linked tRNASer4,7 genes from Drosophila melanogaster. Leung, Jeffrey. 1988

Full Text


CONCERTED EVOLUTION OF A CLUSTER OF X-LINKED tRNASer4,7 GENES FROM Drosophila melanogaster

by

JEFFREY LEUNG
B.Sc., University of British Columbia, 1980

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
in
THE FACULTY OF GRADUATE STUDIES
GENETICS PROGRAM
DEPARTMENT OF ZOOLOGY

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
January, 1988
© Jeffrey Leung, 1988

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Zoology
The University of British Columbia
1956 Main Mall
Vancouver, Canada V6T 1Y3
Date: [illegible]

ABSTRACT

Multigene families have posed an acute problem for evolutionary biologists ever since the revelation that many families exhibit unexpected sequence homogeneity within and between individuals of a species. A family that is shared between several species, in contrast, often reveals substantial heterogeneity between them. This cohesive and species-specific pattern of variation, which disengages from the classical mode of random genetic drift and selection, has been formally described as Molecular Drive (Dover, 1982). Based on initial observations (Cribbs, 1982), the tRNASer4 and tRNASer7 genes on the X-chromosome of Drosophila melanogaster also showed intriguing characteristics reminiscent of Molecular Drive.
However, in this unusual case, the coevolution process would not only encompass the individuals within a family, but would also ensnare members from a different family. This thesis is an in-depth study of the concerted evolution of both gene families and provides evidence consistent with the view that they are undergoing Molecular Drive. Eight tRNASer4,7 genes have been cloned from bands 12DE on the X-chromosome of D. melanogaster by molecular walking. There are two tRNASer4 and two tRNASer7 genes that contain sequences expected from their known tRNAs (Cribbs et al., 1987a). Of the 86 nucleotides, they differ from each other only at positions 16, 34 and 77 (non-standard numbering, see Sprinzl et al., 1987). The difference at position 34 corresponds to the anticodon and accounts for their difference in codon recognition. These genes have been designated as either 444 or 777 genes, based solely on the three diagnostic differences. However, there is also a single 474 gene and two 774 genes, which are recombinant structures of the bona fide genes. The remaining gene, 444*, has the three nucleotides diagnostic of tRNASer4 but contains a mutation at the tip of the extra arm. Thus, collectively, the entire cast of tRNASer4,7 genes at 12DE forms a graded series of transitional states, bridging the narrow sequence variability between true tRNASer4 and tRNASer7.

Flanking sequences of these hybrid genes and of the 444* gene show segmental homologies related to both the 444 and 777 genes within the cluster, again a strong indication that both gene types are undergoing concerted evolution. Examination of selected genes from two distantly related sibling species, D. erecta and D. yakuba, shows that their equivalent flanking sequences have diverged from those of D. melanogaster.
As expected, the base changes in these species, often occurring as clusters, are also non-random and appear to have been propagated to certain respective members to maintain a species-specific and cohesive pattern of variation consistent with Molecular Drive. One possible mode of spreading sequence variation and creating the hybrid genes in the process could involve an initial stage of asymmetric pairing between 444 and 777 DNA. To examine this possibility, a tRNAArg gene cluster, also from 12DE, was conveniently exploited as independent "monitors". This family shows fluctuations in the number of genes among the different species and strains (Newton, unpublished), which could also be explained by asymmetric pairing of DNA followed by unequal exchange. Thus, even though the tRNAArg and tRNASer4,7 genes have embarked on different evolutionary pathways, both phenomena may be explained by their common susceptibility to local asymmetric pairing of DNA.

TABLE OF CONTENTS

Abstract
Table of Contents
List of Tables
List of Figures
Abbreviations
Acknowledgments
Dedication
Introduction
    Modified Nucleotides in tRNAs
    The Universal Cloverleaf
    The tRNA Tertiary Structure
    Structures of Eukaryotic tRNA Genes
    Intron-Containing tRNA Genes
    tRNA Variant Genes and Pseudogenes
    Number, Diversity and Organization of tRNA Genes
    Transcription of tRNA Genes
    Flanking Modulatory Sequences
    Formation of Transcriptional Complexes
    Maturation of tRNA Transcripts
    Other Unusual tRNA-mediated Cellular Functions in Eukaryotes
        1. Protein Degradation
        2. Primers for Reverse Transcription of Retrotransposons
        3. Chlorophyll Biosynthesis
        4. Induced and Naturally Occurring Suppressor tRNAs
    The Present Studies
Methods and Materials
    REAGENTS
        Enzymes Used in Molecular Cloning
        Oligonucleotides
        Nucleotides
        Phenol
        Formamide
        Acrylamide
        Agarose
        Galactosides
        Autoradiography
        Supplies for Culture Media
    BACTERIAL STRAINS
    CULTURE MEDIA AND CONDITION
        E. coli
        Fruit flies
    MASS COLLECTION OF EMBRYOS FROM D. melanogaster
    PLASMIDS AND BACTERIOPHAGE VECTORS
    INTRODUCTION OF PLASMID AND DOUBLE-STRANDED BACTERIOPHAGE M13 DNA INTO E. coli
        Reagents
    BACTERIAL TRANSFORMATION
    ISOLATION OF PLASMID AND DOUBLE-STRANDED M13 DNA
    LARGE-SCALE DNA ISOLATION
        Plasmid DNA
        Double-Stranded M13
        Lysis by Triton X-100
        Lysis by Alkali - Large-Scale DNA Preparation
        CsCl Gradient Purification of DNA
        Purification of DNA by Column Chromatography
        Small-Scale Mini-Preparation
    ISOLATION OF TEMPLATE DNA FOR SEQUENCING
        Double-Stranded DNA
        Single-Stranded DNA
        Bacteriophage M13
        The pEMBL Plasmids
        Preparation of the Helper Phage IR1
    DNA SEQUENCING
        Chain-Terminator Method
        Single-Stranded DNA Templates - M13 and pEMBL Plasmids
        Double-Stranded DNA Templates - pUC13 and Double-Stranded pEMBL Plasmids
        Chain-Termination Reactions
        Single-Stranded Templates
        Double-Stranded Templates
        Purification of Radiolabeled Restriction Fragments For Maxam-Gilbert Sequencing
    TREATMENT OF GLASSWARE AND PLASTICWARE
    ISOLATION OF GENOMIC DNA FROM Drosophila
        Quick Method
        Large-Scale Method I
        Large-Scale Method II
    PARTIAL DIGESTION OF GENOMIC DNA FOR LIBRARY CONSTRUCTION
        D. melanogaster Genomic DNA
        D. erecta and D. yakuba Genomic DNAs
    SIZE FRACTIONATION OF D. melanogaster DNA
        NaCl Linear Gradient
        Gel Fractionation
    CONSTRUCTION OF GENOMIC LIBRARIES
        Choice of Cloning Vectors
        Bacteriophage Lambda
        Large-Scale Lambda Preparation
        CsCl Gradient Purification of Live Lambda Bacteriophage
        Preparation of Lambda Vector Arms
        Preparation of Cosmid DNA
        Preparation of Cosmid Vector Arms
        Ligation of Lambda Vector Arms to Drosophila DNA
        D. melanogaster Libraries
        Drosophila Sibling Species Libraries
        Ligation of Cosmid Arms to D. melanogaster DNA
        cosPneo Vector
        pJB8 Vector
    IN VITRO PACKAGING OF BACTERIOPHAGE AND COSMID DNA
        Freeze-Thaw Two-Strain Packaging Extracts
        In vitro Packaging Using the Two Strain System
        cos' Packaging Extracts
        In vitro Packaging Using cos' Extracts
    AMPLIFICATION OF GENOMIC LIBRARIES
    PREPARATION OF RADIOLABELLED PROBES
        Nick-Translation
        Oligonucleotide Probes
        Construction of tRNASer4,7- and tRNAArg-Specific Probes by Strand Synthesis
    EMPIRICAL EVALUATION OF GENOMIC LIBRARIES BY SOUTHERN HYBRIDIZATION
    SCREENING GENOMIC LIBRARIES
        Plating Bacteriophage λ Libraries
        Plating Cosmid Libraries
        Lysis of Membrane Bound Bacteriophages or Bacterial Colonies
        Prehybridization
        Hybridization
        Isolation and Purification of λ Clones
        Isolation and Purification of Cosmid Clones
    RESTRICTION ENDONUCLEASE DIGESTS
    GEL-ELECTROPHORESIS
        Agarose Gels
        Acrylamide Gels
    RECOVERY OF RESTRICTION FRAGMENTS FROM GELS
        Agarose Gels
        Acrylamide Gels
    SOUTHERN TRANSFER
    RESTRICTION MAPPING
        Low Resolution Restriction Endonuclease Mapping
        Restriction Endonuclease Mapping by Partial Digestion
        A Novel Restriction Endonuclease Mapping Method by Indirect Labelling with Sequencing Oligonucleotide Primers
    MOLECULAR CLONING IN PLASMID AND DOUBLE-STRANDED M13 BACTERIOPHAGE VECTORS
        Restriction Endonuclease Digestion of Vector DNA
        Dephosphorylation of Vector DNA
        DNA Ligation
    PREPARATION OF [3'-32P]tRNA
        Synthesis of Cytidine 3',5'-diphosphate
        RNA Ligase-Catalyzed Addition of [5'-32P]pCp
    DNA DOT BLOTS
    ORIENTATION OF tRNASer4,7 GENE TRANSCRIPTION
CHAPTER I
    Characterization of the Entire tRNASer Gene Cluster at Polytene Bands 12DE by Chromosomal Walking
    RESULTS
        1 (A). A Chromosomal Walk in the pDt73 Region
          (B). Interspersed and Tandemly Repeated Elements
        2 (A). Chromosomal Walk in the pDt17R Region
          (B). Localization of tRNASer Genes Within the Walk
          (C). Sequence Analyses of pE4.6 and pE1.8
        3 (A). Chromosomal Walk in the pDt27R Region
          (B). Localization of tRNA Genes Within the Walk
        4 (A). Chromosomal Walk in the pDt16R Region
CHAPTER II
    Flanking Sequence Relatedness in the tRNASer4,7 Genes in D. melanogaster And Sibling Species
    RESULTS
        Part I - Homologies in the 5'-Flanking Regions of the melanogaster Hybrid Genes: Wedded Patchworks
            (A). The 474 Gene is Most Closely Related to the 444-1 Gene
            (B). The 774 Gene in pDt16R is Most Closely Related to pDt17R-777
            (C). The 774 Gene in pDt17R is Possibly Related to pDt16R-777
            (D). The 444* Gene Has a Patchwork 5'-Flanking Region Characteristic of pDt17R-777 and 444-1 Genes
            Sequence Homologies in 3'-Flanking Regions
        Part II - Examination of Loci Homologous to pCS474 and pDt16R in Drosophila Sibling Species
            (A). Detection of Homologous DNA By Genomic Southern Hybridization
            (B). Isolation of pCS474 Homologous Fragment from D. erecta
            (C). Isolation of pDt16R Homologous Fragment from D. erecta
            (D). Isolation and Sequencing of the D. erecta tRNASer Genes Homologous to pDt27R
            (E). Isolation of pDt16R Homologous DNA Segment From D. yakuba
            (F). Analysis of pDt27R Homologous Region in λDY16-82
        Part III - Rates of Flanking Sequence Divergence in Homologous tRNASer Genes From Different Drosophila Species
CHAPTER III
    tRNAArg Genes at 12DE
    RESULTS
        tRNAArg Genes in λDE16 From D. erecta
        tRNAArg Genes in λDY16-82 From D. yakuba
DISCUSSION
    1. The Overall Molecular Organization of 12DE
    2. Co-evolution of the tRNASer4,7 Genes
    3. A Model Postulating the Origin of Type II Homology Patches
    4. Possible Mechanisms Involved in Generating the Hybrid Genes
    5. tRNAArg Genes From Drosophila Sibling Species
APPENDIX
CHAPTER IV
    tRNAVal3b Genes and Related Sequences
    RESULTS
        Sequence Analysis of pDt41R
        Homologies With Another tRNAVal3b Containing Plasmid
    DISCUSSION
CHAPTER V
    Dosage Compensation
    Regulatory Genes
    cis-Acting Regulatory Sequences
    RESULTS
    DISCUSSION
REFERENCES

LIST OF TABLES

Table I. List of Oligonucleotides
Table II. Deoxy-dideoxyribonucleoside Triphosphate Mixes For Chain-Termination Sequencing
Table III. DNA Sequencing Reactions by the Maxam-Gilbert Method
Table IV. Specific Buffers For Restriction Enzymes Used In Library Construction
Table V. A Summary of tRNA Genes Identified in Bands 12DE
Table VI. A Summary of Identified Drosophila melanogaster Variant Genes

LIST OF FIGURES

Figure 1. Generalized two-dimensional representation of a tRNA molecule
Figure 2. Tertiary interactions in yeast phenylalanine tRNA
Figure 3. Two typical chromosomal sites enriched for tRNA genes
Figure 4. 5'-Flanking sequences of different tRNA genes from S. cerevisiae
Figure 5. Splicing of tRNA in yeast
Figure 6. Sequence of tRNASer7
Figure 7. Fractionation of MboI partial digest of Oregon-R DNA by NaCl gradient
Figure 8. Restriction maps of cosPneo and λEMBL3
Figure 9. Systematic testing of intactness of the restriction ends in both vector and genomic DNAs before packaging
Figure 10. Restriction mapping by oligonucleotide indirect labelling method
Figure 11. Transcription orientation of tRNA genes
Figure 12. Molecular walk in the pDt73 domain
Figure 13. DNA sequence of tRNASer 474 gene from Canton S is identical to its homologue in Oregon-R
Figure 14. Interspersed repeated sequences shared between pDt73 and pDt17R domains
Figure 15. Hybridization of a 1.3 kb SstI fragment corresponding to one repeat unit of the Stellate sequences to fly strains deficient for polytene bands 12DE
Figure 16. Chromosomal walk in the pDt17R domain
Figure 17. Restriction mapping of pE4.6 by the oligonucleotide indirect labelling method
Figure 18. Localization of the tRNASer gene in pE1.8 by the oligonucleotide indirect labelling method
Figure 19. Nucleotide sequence of the 444* gene in pE4.6 from the Oregon-R strain
Figure 20. Nucleotide sequence of 774 gene in pE1.8 from Oregon-R
Figure 21. Chromosomal walk in the pDt27R domain
Figure 22. Sequence of pArg12.6 of D. melanogaster
Figure 23. Chromosomal walk in the pDt16R domain
Figure 24. DNA sequence of the pCS16-777 gene from Canton S
Figure 25. DNA sequence of pCS16-774 gene from Canton S
Figure 26. 5'-Flanking homologies in tRNASer4,7 genes in D. melanogaster
Figure 27. 3'-Flanking homologies of non-allelic genes of D. melanogaster
Figure 28. Evolutionary relationship among the eight species of Drosophila species subgroup based upon their polytene chromosome banding patterns
Figure 29. Genomic Southern blot of Drosophila sibling species subgroups with probe pCS474
Figure 30. Genomic Southern blot of Drosophila sibling species subgroup with probe pDt16R
Figure 31. Subclone of the 8.5 kb BamHI fragment from λDE73 from D. erecta
Figure 32. Nucleotide sequence of the 474 gene from D. erecta
Figure 33. Restriction map of λDE16
Figure 34. Nucleotide sequence of the 774 gene from D. erecta
Figure 35. Nucleotide sequence of the 777 gene from D. erecta
Figure 36. Sequence of D. erecta 444-1 gene
Figure 37. tRNASer4 gene of D. erecta homologous to pDt27R 444-2 gene of D. melanogaster
Figure 38. Localization of the 444-1 and 444-2 genes in the 1.6 kb BamHI fragment of λDE16 by oligonucleotide indirect labelling method
Figure 39. Restriction map of pDt16R/pDt27R homologous region from D. yakuba
Figure 40. Localization of the 777 gene in λDY16-82 by the oligonucleotide indirect labelling method
Figure 41. Sequence of the 777 gene in D. yakuba
Figure 42. Localization of the 774 gene in λDY16-82 by oligonucleotide indirect labelling method
Figure 43. Nucleotide sequence of the 774 gene from D. yakuba
Figure 44. Sequence of the 444-2 gene in D. yakuba that is homologous to the D. melanogaster pDt27R 444-2
Figure 45. Comparison of 5'-flanking sequences among homologous genes from different Drosophila species
Figure 46. Comparison of 3'-flanking sequences among the homologous bona fide genes from the different Drosophila species
Figure 47. Evidence for concerted evolution of tRNASer4,7 genes
Figure 48. Sequence of pDeArg-1 from D. erecta
Figure 49. Sequence of 3'-end of pDeArg-6 from D. erecta
Figure 50. Nucleotide sequence of pDyArg-1 from D. yakuba
Figure 51. tRNASer4,7 and tRNAArg genes at 12DE
Figure 52. The current progress in the assignment of the X-linked tRNA genes to polytene bands
Figure 53. Genealogy delineating the formation of the tRNASer genes at 12DE
Figure 54. Schematic stepwise diagrams showing the possible lineages of hybrid tRNASer4,7 genes encountered at 12DE based on their shared flanking homologies
Figure 55. The four tRNAArg genes and flanking sequences between the direct repeats from the Drosophila sibling species are summarized
Figure 56. The sequenced segments of pDt41R, pDt48 and the corresponding region from Canton-S from region 1 of chromosomal site 90BC are shown
Figure 57. The cloverleaf structure of Drosophila tRNAVal3b with sites of the four differences indicated for a hypothetical product from the variant gene
Figure 58. Restriction maps of pDt48 and pDt41R as constructed by the Smith and Birnstiel method
Figure 59. Homologies among all possible HinfI fragments in pDt48 and pDt41R as deduced by Southern-cross hybridization
Figure 60. A summary of repeated sequences from 12DE in the cloned 157 kb
Figure 61. Measuring copy number of the repeats in pDt73
Figure 62. Molecular cloning of the white locus

LIST OF ABBREVIATIONS

A             adenosine
ATP           adenosine 5'-triphosphate
A260          absorbance at 260 nm
A260 units    the amount of material giving an absorbance of 1.0 in 1.0 ml of solution in a 1 cm light path at 260 nm at neutral pH
bp            base pairs
BSA           bovine serum albumin
C             cytosine
CH buffer     40 mM Tris, pH 8.0, 1 mM spermidine, 1 mM putrescine, 0.1% β-mercaptoethanol, 7% DMSO
CIP           calf intestinal phosphatase
Cm            2'-O-methylcytidine
D             dihydrouridine
ddNTP         2',3'-dideoxyribonucleoside triphosphate (nucleosides may be specified as G, A, T, or C)
DEAE          diethylaminoethyl
dNTP          2'-deoxyribonucleoside triphosphate (nucleosides are specified as G, A, T, and C)
DMS           dimethylsulfate
DMSO          dimethyl sulfoxide
DNA           deoxyribonucleic acid
DNase         deoxyribonuclease
DTT           1,4-dithiothreitol
EDTA          ethylenediaminetetraacetate
G             guanosine
HEPES         N-2-hydroxyethylpiperazine-N'-2-ethanesulfonic acid
HZ            hydrazine
i6A           N6-isopentenyladenosine
IPTG          isopropyl β-D-thiogalactoside
kb            kilobase pairs
LB            Luria-Bertani
m7G           7-methylguanosine
M+G           Maxam and Gilbert
P2O5          phosphorus pentoxide
pCp           cytidine 3',5'-diphosphate
PEG           polyethylene glycol
PFU           plaque forming units
pol           RNA polymerase
Q             7-(4,5-cis-dihydroxy-1-cyclopenten-3-ylaminomethyl)-7-deazaguanosine
R             a purine nucleoside
RNase         ribonuclease
rpm           revolutions per minute
rRNA          ribosomal RNA
RT            room temperature
S             Svedberg, sedimentation unit
SDS           sodium dodecyl sulfate
snRNAs        small nuclear ribonucleic acids
SSC           standard saline citrate (0.15 M NaCl, 0.015 M sodium citrate)
T             thymidine
t6A           N-[(9-β-D-ribofuranosylpurin-6-yl)carbamoyl]threonine
TBE           Tris:borate:EDTA electrophoresis buffer
Tris          Tris(hydroxymethyl)aminomethane
TE            10 mM Tris-Cl, pH 8.0, 1 mM EDTA
[key illegible]  200 mM Tris-Cl, pH 8.0, 200 mM NaCl, 1 mM EDTA
TEMED         N,N,N',N'-tetramethylethylenediamine
U             uridine
UV            ultraviolet
X-gal         5-bromo-4-chloro-3-indolyl-β-D-galactoside
Y             a pyrimidine nucleoside
[key illegible]  α-(carboxyamino)-4,9-dihydro-4,6-dimethyl-9-oxo-1H-imidazo[1,2-a]purine-7-butyric acid dimethyl ester

ACKNOWLEDGEMENT

I would like to thank, with deep sincerity, Dr. Gordon Tener for his inexhaustible patience, energy, encouragement and financial support throughout the entirety of this project. Sine quibus non .... in the most literal sense; to you with gratitude. I also thank him for his humorous (unintended, I think!) and philosophical discourse on "why rotors have lids" and his generosity in paying for the damages after I left my mark in history; to Dr.
Sinclair for generously contributing ideas and help in many phases of this project. The benefits derived from working with such an intellectually tenacious individual could never be overstated; to Dr. Hayashi for her excellent in situ hybridization experiments in clarifying many gene mapping studies; to Dr. Ian Gillam who unwittingly introduced me to the beauty of Arabidopsis. I would also like to thank Dr. Tom Grigliatti for reading and correcting this thesis, which definitely made it more compact and digestible; but most of all, for free rein into my intellectual twilight zones. Lastly, I thank the individual who had the foresight to stock up on the lab supply of Tylenol.

For my favorite people

Mom and Dad, who suffered hard and brutally long through it all; this thesis truly rings of hollow consolation.

Kathleen, never the one who is bound by longings to which the fruit is sorrow, whose courage to move on is inspiring and whose warm friendship will forever lodge in the dells of my memory.

INTRODUCTION

Transfer RNA (tRNA) is a small species of RNA in the cell. It sediments at about 4S, and is a mixture of molecules some 74-92 nucleotides in length. This is the "adaptor" species first proposed by Crick (1966) as the agent responsible for fitting amino acids to their correct nucleotide triplets on the messenger during protein synthesis. Each transfer RNA thus has the dual properties of specifically recognizing both its particular amino acid and the codon representing it. The first function is achieved by its interaction with the amino acid activating enzymes; the second is mediated by the ribosome, which enables a triplet in messenger RNA to be recognized by a complementary trinucleotide sequence of bases in the tRNA, the anticodon. By now, the sequences of several hundred tRNAs from a variety of different organisms have been determined (Sprinzl et al., 1987).
In all cytoplasmic tRNAs, despite the variation in their exact nucleotide sequence, certain positions are well conserved from one tRNA to another, such as positions U8, A14, G18, G19, A21, U33, G53, T54, Ψ55, C56, A58, C61, C74, C75 and A76 (Ψ is pseudouridine). Some other positions are almost always occupied by a pyrimidine (Y) and yet other sites by a purine (R): Y11, R15, R24, Y32, R37, Y48, R57 and Y60 (Sprinzl et al., 1987). Also, the 3' end of each mature tRNA always terminates in cytidylic acid, cytidylic acid, adenylic acid (CCA-OH). The other end, the 5' end, always carries a 5'-terminal phosphate group and is often guanylic acid. The patterns described above are generalizations distilled from many tRNA sequences, and any particular tRNA may show critical differences from the general rule. One class of tRNAs, the initiator methionine tRNAs, is distinctly different from the rest. Prokaryotic tRNAfMet lacks a base-pair at the 5' end of the acceptor stem and has an A11-U24 base-pair in the D-stem rather than the usual Y11-R24 pair. In eukaryotic tRNAiMet, T54Ψ55 is replaced by A54U55 and Y60 is replaced by A. In tRNAiMet of higher eukaryotes the normally invariant U33 adjacent to the anticodon is replaced by a C residue (Sprinzl et al., 1987; Addison, 1982). The deviations of the initiator methionine tRNAs from the "standard" form may reflect their special status in protein synthesis, as they are also recognized by a different set of translation factors (EF-Tu or eEF1) for binding to the ribosome (reviewed by Lewin, 1987).

Modified Nucleotides in tRNAs

A further striking aspect of all tRNA molecules is their high content of unusual bases (other than G, A, U, or C). Over 50 different modified bases have been isolated from tRNA.
Many of these unusual bases arise from enzymatic modification of pre-existing bases or ribose moieties, such as the addition of methyl (CH3) groups or the replacement of oxygen atoms in the bases by sulfur (Kim, 1978; reviewed by Nishimura, 1978). Although the function of most of the unusual bases is not yet understood, most of them occur only at one or a few characteristic positions in the tRNA structure. Often the same modified nucleotide or its derivatives occupy the same site in homologous tRNAs from a wide variety of organisms. These observations suggest that the modified bases of tRNA play some important roles in structure and function.

Modified nucleotides in the first, or "wobble", position (nucleotide 34) of the anticodon are directly involved in the codon-anticodon interaction. Unmodified A or U residues are almost never found in the "wobble" position; adenosine is usually modified to inosine (I). In ribosome binding assays, I in this position is capable of pairing (or "wobbling") with A, C or U in the third position of the codon. Uridine in the "wobble" position is often modified to 2-thiouridine or its derivatives. These nucleotides will pair only with A in the third position of the codon. In E. coli tRNA, a U in the "wobble" position is sometimes modified to uridine-5-oxyacetic acid, permitting pairing with A, G or U in the codon. The hypermodified base Q (derived from G) or its glycosylated derivatives are found in the first position of the anticodon of some tRNAs. This will base pair with either U or C but has a greater affinity for U.

The third position of the anticodon (nucleotide 36) pairs with the first position (5'-end) of the codon during translation. This interaction must be highly specific to avoid accumulation of error during protein synthesis.
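
The pairing rules just described can be summarised in a short sketch. The Python below is illustrative only: the table `WOBBLE_PARTNERS`, the shorthand labels `s2U` and `cmo5U`, and the function `codons_read` are names assumed for this example. The entries for I, 2-thiouridine, uridine-5-oxyacetic acid and Q follow the text; the entries for unmodified G and C are taken from Crick's wobble hypothesis and are not stated in this section.

```python
# Codon third-position bases recognised by the nucleoside at the wobble
# position (nucleotide 34). Labels are shorthand chosen for this sketch.
WOBBLE_PARTNERS = {
    "I": {"A", "C", "U"},      # inosine
    "s2U": {"A"},              # 2-thiouridine and its derivatives
    "cmo5U": {"A", "G", "U"},  # uridine-5-oxyacetic acid (E. coli)
    "Q": {"U", "C"},           # hypermodified base Q
    "G": {"C", "U"},           # assumption: Crick's wobble rules
    "C": {"G"},                # assumption: Watson-Crick pairing only
}

# Strict Watson-Crick partners for anticodon positions 35 and 36.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codons_read(wobble, pos35, pos36):
    """Return the codons (5'->3') read by an anticodon given 5'->3' as
    positions 34 (wobble), 35 and 36."""
    first = COMPLEMENT[pos36]   # codon position 1 pairs with nucleotide 36
    second = COMPLEMENT[pos35]  # codon position 2 pairs with nucleotide 35
    return sorted(first + second + third for third in WOBBLE_PARTNERS[wobble])


# Anticodon 5'-IGC-3' reads three codons that differ only in the third base.
print(codons_read("I", "G", "C"))  # ['GCA', 'GCC', 'GCU']
```

Only the wobble position is given a set of partners; positions 35 and 36 are held to strict complementarity, in line with the requirement that the third anticodon position pair with high specificity.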
If a tRNA has an A residue in the third position of the anticodon, the A is almost invariably flanked on the 3' side by a hydrophobic base such as N6-isopentenyladenine, Y base, or its derivatives. If a tRNA has a U residue in the third position of the anticodon, the hydrophilic nucleoside t6A or its derivatives are found immediately 3' to the anticodon. The hypermodified bases could function to stabilize the A-U base pairing between the first position of the codon and the third position of the anticodon. G or C residues at the third position of the anticodon are flanked by simple methylated purines or by unmethylated A.

Other modified bases, such as m7G, are found only in tRNAs specific for certain amino acids. Yet others are found only in the tRNAs of some organisms. For instance, 4-thiouridine is restricted to prokaryotic tRNAs while 5-methylcytosine is found only in those of eukaryotes. Any specific functions associated with these modified nucleotides remain speculative at this point.

The Universal Cloverleaf

Usually, each amino acid is represented by more than one tRNA. These multiple tRNAs charged with the same amino acid are called isoaccepting tRNAs. A group of isoaccepting tRNAs is thought to be charged only by a single aminoacyl-tRNA synthetase specific for their amino acid; thus, these isoacceptors must share some common feature(s) enabling the enzyme to distinguish them from the other groups of isoaccepting tRNAs. The entire complement of tRNAs is divided into 20 isoaccepting groups; each group is able to identify itself to its particular, or cognate, synthetase. The common or distinctive features that distinguish one group of isoacceptors from another are not known. Early attempts to correlate these identity features with the primary sequences of tRNAs met with failure.

Even though the sequences of the tRNAs are variable, they nonetheless conform to the same general secondary structure.
Each tRNA sequence can be written in the form of a cloverleaf, maintained by base pairing between short complementary regions (fig. 1). There are four major arms, named for their structure or function. The acceptor arm consists of a base-paired stem that ends in an unpaired sequence whose free 2'- or 3'-OH group is aminoacylated. The other arms consist of base-paired stems and unpaired loops. The "TΨC arm" is named for the presence of this triplet sequence; the "anticodon arm" always contains the anticodon triplet in the center of the loop, and the "D arm" is named for its content of the base dihydrouridine. The most variable feature of tRNA is the so-called "extra or variable arm", which lies between the anticodon and the TΨC arms. Depending on the length of the extra arm, tRNAs can be divided into two classes. "Class 1 tRNAs" have a small extra arm, consisting of only 3-5 bases, and represent about 75% of all tRNAs. "Class 2 tRNAs" (tRNAsLeu, tRNAsSer and the prokaryotic tRNAsTyr) have a large extra arm with 13-21 bases, and about 5 base pairs in the stem. In fact, in this class of tRNAs, it could be the longest arm in the entire tRNA molecule.

The base pairing that maintains the secondary structure is virtually invariant. There are always 7 base pairs in the acceptor stem, 5 in the TΨC arm, 5 in the anticodon arm, and usually 3 (sometimes 4) in the D arm. Within a given tRNA, most of the base pairings will be conventional partnerships of A-U and G-C, but occasional G-U, G-Ψ, or A-T pairs are found. The additional types of base pairs are less stable than the conventional pairs, but still allow a double-helical structure to form in RNA.

The tRNA Tertiary Structure

Even though the cloverleaf structure is often conveniently used to illustrate the conformation of the tRNA, X-ray crystallographic studies showed that it can fold into a higher order structure by additional H-bonding between regions that are unpaired in the cloverleaf structure.
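
The cloverleaf rules in the preceding section (class assignment by extra-arm length and the near-invariant stem sizes) can be written as a small check. This is a minimal sketch: the function names `classify_by_extra_arm` and `stems_conform`, the ASCII key `TpsiC` for the TΨC arm, and the "unclassified" fallback for lengths outside both stated ranges are assumptions made for illustration; only the numeric ranges come from the text.

```python
# Expected number of base pairs in each stem, as stated in the text.
EXPECTED_STEM_BP = {
    "acceptor": {7},
    "TpsiC": {5},
    "anticodon": {5},
    "D": {3, 4},  # usually 3, sometimes 4
}


def classify_by_extra_arm(n_bases):
    """Assign a tRNA to class 1 or class 2 from its extra-arm length."""
    if 3 <= n_bases <= 5:
        return "class 1"
    if 13 <= n_bases <= 21:
        return "class 2"
    # Lengths outside both ranges are not covered by the text.
    return "unclassified"


def stems_conform(stem_bp):
    """True if every stem has one of the expected base-pair counts."""
    return all(stem_bp[arm] in allowed
               for arm, allowed in EXPECTED_STEM_BP.items())


print(classify_by_extra_arm(4))   # class 1
print(classify_by_extra_arm(15))  # class 2
print(stems_conform({"acceptor": 7, "TpsiC": 5, "anticodon": 5, "D": 4}))  # True
```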
The crystal structure of yeast tRNAPhe was first published at 2.5 Å resolution in 1975 (reviewed by Kim, 1978; Rich and RajBhandary, 1976). It is a flat, L-shaped molecule that is about 20-25 Å thick (Rich and RajBhandary, 1976). The amino acid acceptor CCA group is localized at one end of the L, extending out into the solvent, some 70 Å from the anticodon, which occurs at the opposite end. The dihydrouracil-rich (D), the variable and the TΨC loops stack to form the corner of the L.

Fig. 1. Generalized two-dimensional representation of a tRNA molecule. By convention, the tRNA is written in the form of a cloverleaf structure. The various arms of the molecule (acceptor stem, D arm, anticodon arm, extra arm and TΨC arm) are indicated in the diagram. Some of the more conserved positions are indicated as the actual base, rather than a number. For semi-invariant bases, Py and Pu are used to indicate the presence of either pyrimidine or purine. An asterisk indicates that the base is modified, but that the form of the modification may vary. Circles signify highly variable positions where extra bases are often found (for example at positions 17 and 20, and at the variable arm). Base pairing within the stem structures is shown as lines. (Modified from Lewin, 1987, and Kim et al., 1974.)

The double-stranded regions of tRNAPhe approximate an RNA A helix. In this helix the base pairs are tilted with respect to the helix axis and do not intersect with the axis. This results in a 6 Å hole running through the center of the helix. The A helix has a very deep major groove and a very shallow minor groove (fig. 2). A major contribution to the stability of tRNA structure is made by the extensive base stacking present in the molecule. All but five bases in tRNAPhe are involved in the stacking interactions (Holbrook et al., 1978) that form the two columns or arms of the L-shaped structure. Many tertiary hydrogen-bonding interactions involve bases that are invariant in all known tRNAs, strongly supporting the belief that all tRNAs have basically the same tertiary configuration. Hydrogen bonds form an intricate network holding the two arms of the tRNA in the correct orientation to one another. All the base pairs in the major groove of the D stem are involved in tertiary hydrogen bonds with the variable loop. The conserved GG doublet in the D loop is bonded to the TψC sequence in the T loop. Uridine 8 and A9, located between the acceptor stem and the D stem, are hydrogen bonded to A14 and A23, respectively. With the exception of the G19-C56 bond, none of the tertiary hydrogen bonds involve conventional AU and GC pairs. Other tertiary interactions stabilizing sharp bends in the molecule involve groups of the ribose-phosphate backbone, including the 2'-OH of the ribose sugars. Only a few tertiary hydrogen bonds hold the anticodon stem to the remainder of the molecule, raising the possibility that the relative orientation of the anticodon region may change during protein synthesis. Since the structure of yeast tRNAPhe was elucidated, the tertiary structures of several other tRNAs have been determined at varying degrees of resolution (E. coli tRNAfMet, Woo et al., 1980; yeast tRNAAsp, Moras et al., 1980; yeast tRNAiMet, Shevitz et al., 1980).
It is comforting to know that the structure of yeast tRNAPhe appears to be typical of at least those tRNAs with a small variable loop. The structure of tRNA in solution is very similar to its structure in a crystal lattice. A growing body of evidence gathered by a wide variety of techniques, including oligonucleotide binding, tritium exchange, base-specific chemical modification and NMR spectroscopy, tends to support this conclusion (reviewed by Kim, 1978).

Fig. 2. Tertiary interactions in yeast phenylalanine tRNA. (A) The molecule is drawn in the conventional cloverleaf structure with solid lines connecting bases that are additionally hydrogen-bonded. (B) Sequence rearranged to show continuous stacking of the anticodon stem on the D stem and of the acceptor stem on the TψC stem. Note the close interaction between the TψC and D loops. (C) Diagram illustrating the folding of the yeast tRNAPhe molecule. The ribose-phosphate backbone is drawn as a continuous ribbon, and internal hydrogen bonding is indicated by crossbars. Positions of unpaired bases are indicated by rods that are shortened intentionally. The shaded areas represent the two ends of the folded molecule, the anticodon and the amino acid acceptor stem. The numbering of the nucleotides is identical to the above. (After Kim et al., 1974.)

Structure of Eukaryotic tRNA Genes

A DNA sequence containing all the information necessary to code for a complete tRNA structure is generally defined as a tRNA gene. However, this is merely an operational definition, since the extent of the regulatory elements flanking the structural sequence that are necessary for proper transcription is as yet ill-defined. An inventory of genes from a variety of organisms has now been cloned and sequenced (Sprinzl et al., 1987), representing most of the 61 possible "sense" codons of the genetic code.
In contrast to prokaryotes, the 3' terminal CCA end of the mature tRNA is not encoded in eukaryotic genes but must be added post-transcriptionally by the enzyme tRNA nucleotidyltransferase (Deutscher, 1982). Almost all transcriptional units are monomeric and appear not to depend on the coherent organization of an operon. An exception to this generalization has been observed in yeast, where the tRNASer genes exist in dimeric forms with the tRNAMet genes (Mao et al., 1980). To date, no specialized sequence in the 5'-flanking region has been found to be conserved for all tRNA genes of an organism. In fact, the general observation is that even within the set of genes for a single tRNA isoacceptor, little 5'-flanking sequence homology can be found. Similarly, the 3'-flanking sequences of tRNA genes are also highly variable, with the exception of their high A+T nucleotide content. The best conservation is, however, associated with a stretch of T nucleotides that functions as a polymerase III termination signal (Silverman et al., 1979; Valenzuela et al., 1977). The single D. melanogaster tRNAHis species is similar to all other tRNAHis species that have been sequenced in that the 5' stem of the acceptor arm is one nucleotide longer than in all other tRNA species. The 5' terminal nucleotide is an unpaired guanylate residue and is not encoded by the tRNA exon. In vitro transcription of the tRNAHis gene from the 48F region of chromosome 2 demonstrated that this terminal guanylate residue is added post-transcriptionally, which appears to be a common mechanism for eukaryotic tRNAHis biosynthesis (Altwegg and Kubli, 1980; Cooley et al., 1982 and 1984).

Intron-Containing tRNA Genes

About 10-20% of nuclear-encoded tRNA genes in eukaryotes contain introns. Of the 300-400 tRNA genes in yeast, perhaps 40 contain introns.
These intervening sequences range from 8 to 60 nucleotides in length. Within any family of isoacceptor tRNA genes they are, so far as is known, homologous; between different gene families, however, the intronic sequences are completely divergent. Introns appear to be rare in Drosophila tRNA genes, with their presence thus far confined to a pair of tightly linked tRNALeu genes at chromosomal site 50AB (see below). Sequences at the exon-intron boundaries are in general not conserved, although the intron is always located one base pair 3' to the anticodon in all cytoplasmic tRNA genes examined thus far (Abelson, 1979). For most precursor tRNAs transcribed from such intron-containing genes, the intervening sequences contain complementarity to the anticodon and to the anticodon stem and loop. Notable exceptions have been found in the X. laevis tRNATyr gene, a tRNATrp gene of Dictyostelium discoideum, and an S. pombe tRNALys gene.

The influence of introns on the expression of tRNA genes has been examined in the yeast tRNATyr gene (SUP6). The intron was precisely excised from this gene (Guthrie and Abelson, 1982) and its ochre suppression phenotype was not affected. Similar studies, although in vitro, were carried out by Wallace et al. (1980). They showed that deletions of 8, 10, 13, or 20 bp from the intron of a yeast tRNA3Leu gene resulted in altered templates that suffered no impairment, compared to wild type, in their ability to direct in vitro transcription. Insertion of extra oligonucleotides of up to 30 bp into a natural HpaI site inside the wild-type intron also had no ill effect on the templates. However, when an extra 103 bp fragment was inserted into the HpaI site, the transcription rate was slightly reduced. Thus, the in vitro data support the contention that transcription of tRNA genes is not sensitive to the absence, presence or variable length of introns. This notion is also consistent with the finding in S.
cerevisiae, that only one of the two closely related tRNASer genes contains an intron (Olson et al., 1981). Both unlinked genes are equally functional in vivo, as both have been obtained as suppressing mutations: SUP-RL1 amber (19 bp intron) and SUQ5 ochre (no intron), respectively. Additional evidence has recently been obtained in D. melanogaster, where two closely linked tRNALeu genes at chromosomal position 50AB have almost identical introns of 38 and 45 bp in length (Robinson and Davidson, 1980), while a third copy, as yet cytologically undetermined, is intron-less (V. Dartnell, personal communication). In these Drosophila genes, however, there is as yet no evidence from either in vitro or in vivo activity implicating effects of introns on their expression.

tRNA Variant Genes and Pseudogenes

An unusual feature of eukaryotic tRNAs and their genes that has emerged from molecular cloning and sequencing studies is the simplicity of tRNAs relative to the number of tRNA genes. Individual tRNAs usually correspond to multiple, identical gene copies, even if these genes are derived from different chromosomal positions. However, potential tRNA genes displaying nucleotide heterogeneity from known corresponding tRNAs have frequently been encountered. Examples include genes from the initiator tRNAMet families of D. melanogaster (Sharp et al., 1981a), X. laevis (Koski and Clarkson, 1982), and human (Santos and Zasloff, 1981), and D. melanogaster variant genes for tRNA3bVal (Addison et al., 1984; Leung et al., 1984); tRNA4Val (Addison, 1982; Addison et al., 1981; Rajput et al., 1982); tRNA5Lys (DeFranco et al., 1982); tRNAGlu (Hosbach et al., 1980); tRNAArg (Newton, 1984); and tRNA4,7Ser (Cribbs, 1982; Cribbs et al., 1987a). All these genes show 1 to 6 bp differences from, but otherwise exhibit strong homology and identical anticodon sequences to, known corresponding tRNAs.
In some cases where in vitro transcriptional activities have been measured, there does not appear to be any impairment of function that can be attributed to the nucleotide differences (see Table VI). A more extreme example of a variant tRNAHis gene has recently been studied by Cooley et al. (1982 and 1984). The structural sequence differs from the tRNA by a consecutive alteration of 8 bp, beginning at position 38 in the anticodon stem, which could be explained by a simple inversion of the normal sequence. This gene is also poorly transcribed in vitro and the precursors are not properly processed. However, the transcription inhibition could be alleviated by replacing the 5'-flanking sequence with that from another bona fide tRNAHis gene. Their results imply that the internal sequence alteration exerts only a minor influence on the gene's poor transcriptional ability in vitro.

All such reported variant tRNA gene sequences have been classified as pseudogenes by Sharp et al. (1985). While the sequence heterogeneity is reminiscent of features found in Xenopus 5S rRNA genes (Jacq et al., 1977), their transcriptional competence is not. Thus, the term "pseudogenes" may be misleading, since it implies non-functional templates. Furthermore, the possibility that such variant tRNAs, including the more extreme tRNAHis, participate in specialized cellular metabolism analogous to that of tRNAAla and tRNASer in the silk gland of Bombyx (Sprague et al., 1977; Hentzen et al., 1981) cannot as yet be eliminated. In S. cerevisiae, such variant genes are apparently either rare or totally absent (Guthrie and Abelson, 1982). The reason is unknown, but it may suggest that the mechanisms of sequence rectification in the yeast genome are much more stringent than those in higher eukaryotes. Other recorded heterogeneities are more drastic; they consist of incomplete genes or remnants homologous to known tRNA genes.
DNA sequencing of plasmids hybridizing to initiator tRNA in Drosophila has revealed that one clone has several DNA segments homologous to parts of tRNAiMet. The longest region of homology corresponds to positions 7 through 39 within the coding sequence, which represents approximately 50% of the intact gene (Sharp et al., 1981a). The D. melanogaster DNA insert of this plasmid hybridizes to more than 30 sites in the D. melanogaster genome and has a pattern of hybridization reminiscent of middle repetitive or mobile genetic elements. Parallel observations have been obtained with a plasmid clone cross-hybridizing to a tRNAArg probe (Newton et al., manuscript in preparation). This plasmid contains the 3' half of an expected tRNAArg gene, starting at position 37 and ending at the mature 3' end at position 76. In situ hybridization with a 600 bp restriction fragment containing this gene fragment also shows the characteristic pattern of middle repetitive elements. The immediate 3' end does not contain the anticipated poly-T tract as the putative termination signal; curiously, however, it does include the triplet CCA, which is normally added post-transcriptionally. A similar case of a tRNAPhe pseudogene has been identified in the mouse genome (Reilly et al., 1982). This truncated gene contains homology to the known tRNAPhe from position 39 to 76, including the terminal CCA triplet. The inclusion of the CCA sequence at the end of the tRNAArg and tRNAPhe genes is unusual, and it has been proposed that the mature tRNA was reverse transcribed and then integrated into the genome by a retrotransposon-like mechanism (Denison et al., 1982; Hollis et al., 1982; Wilde et al., 1982). Other pseudogenes are less striking in their departure from complete genes. In the rat genome, the genes for tRNAAsp, tRNAGly and tRNAGlu are tightly clustered on a 33 kb EcoRI fragment which is reiterated about ten times in the haploid genome.
Sequence analysis of six copies reveals that five of the tRNAGly genes have deletions of seven nucleotides between residues 20 and 26 (Brown and Sugimoto, 1973; Shibuya et al., 1982). Three of the tRNAGlu genes lack 14 nucleotides, extending from 11 nucleotides before the 3' end to 3 nucleotides beyond it. All of the above fragmented genes in Drosophila (Sharp et al., 1981a), mouse and rat (Shibuya et al., 1982) failed to support RNA synthesis in vitro. Even if they were transcriptionally competent, the hypothetical transcripts would be expected to fail to achieve the correct tertiary conformation typical of tRNAs. Whether these sequences retain the ability to compete for transcription factors has not been examined. These incomplete genes, with their degenerate states, may be more correctly described as pseudogenes. A novel pseudogene arrangement representing the fusion of two different tRNA gene sequences has recently been reported in D. discoideum (Dingermann et al., 1985). Overlapping a segment of DNA encoding a tRNAVal, a cloverleaf-like structure resembling a tRNAHis pseudogene could be superimposed on the 5' terminal sequence, GTTCG, of the tRNAVal gene, which serves to form the common tRNA TψC loop of the tRNAHis gene. The tRNAHis pseudogene does not encode several of the conserved nucleotides, found in all tRNAs, in the D loop and the anticodon loop. Thus, a putative RNA transcribed from the pseudogene would fail to achieve the tertiary structure of a normal tRNA.

Number, Diversity and Organization of tRNA Genes

The total number of tRNA genes in Drosophila melanogaster has been estimated to be 750 per haploid genome by Ritossa et al. (1966) and Tartof and Perry (1970). A slightly lower value of 590 copies per haploid genome was given by Weber and Berger (1976). These estimates thus correspond to approximately 0.013% - 0.015% of the total DNA.
By reverse phase chromatography, total Drosophila tRNAs can be resolved into 63 major and 39 minor isoaccepting peaks (Grigliatti et al., 1973; White et al., 1973). Many of these are probably "homogeneic"; that is, tRNAs that are transcribed from the same genes but are modified to different extents post-transcriptionally. If transcription is proportional to the number of these genes, this would predict an average redundancy of 10-13 copies for each tRNA gene. Crude estimates of gene numbers for particular isoacceptor tRNAs localized at specific sites were obtained by Elder et al. (1980a,b), based on grain densities from in situ hybridization. Their results suggest that two genes each for tRNA2Met are localized in the regions 48A and 72F-73A, whereas eight and five tRNA2Arg genes are localized at 42A and 84EF, respectively. However, hybridization methods are generally deemed inaccurate, as demonstrated by Tener et al. (1980) in their attempt to estimate the number of genes by in vitro hybridization both in solution and on filters. The plateau level of hybridization (or the equilibrium) is dependent on the concentration of tRNA used. These experiments therefore indicate that many hidden factors can complicate such hybridization experiments, such as the extent of modification of the tRNA probes and the presence of polymorphic tRNAs and pseudogenes (see above). Other features of tRNA gene organization and arrangement may also affect the ability to detect genes by hybridization. A major finding of the DNA sequence analysis of the D. melanogaster recombinant plasmid pCIT12 concerned the arrangement of the individual genes (Hovemann et al., 1980). Eight tRNA genes have been detected on pCIT12, which are irregularly spaced and arranged such that five genes are in one transcriptional direction and three in the other.
Since the transcription direction is different for various genes of the same isoacceptor, the tRNA genes are capable of forming inverted repeat structures in which the homology extends over the entire coding region of the tRNA genes. Structures with inverted repeat stems of 70-100 bp were observed by electron microscopy during analysis of heteroduplexes of pCIT12 and the vector DNA ColE1 (Yen and Davidson, 1980). The nature of the inverted repeats has been clarified by DNA sequencing; they are formed by homologous tRNA genes having opposite polarity. The occurrence of inverted repeats thus explains the difficulty in detecting tRNA genes in the original heteroduplex analysis with radioactive tRNA probes (Hovemann et al., 1980). If similar tRNA gene arrangements occur with a reasonable frequency in the genome, one would expect estimates of gene numbers to be lower than the actual values.

One major advantage of studying tRNA gene organization in Drosophila is the availability of large polytene chromosomes as hybridization templates. Transfer RNA probes of high specific activity can be obtained by iodination with 125I and hybridized directly to the denatured chromosomes (Commerford, 1971; Prensky, 1976). With radiolabelled total 4S RNA as a probe, over 50 sites of hybridization could be detected, and approximately half of these represent major sites of hybridization (reviewed by Elder et al., 1980a). Most of them are distributed randomly over the four arms of the two large autosomes. The only major site on the X is at bands 12E, and no hybridization was observed over the small chromosome 4. Further refinement of in situ localization was conducted using highly purified tRNA isoacceptors (Grigliatti et al., 1973, 1974; Delaney et al., 1976; Dunn et al., 1978, 1979a; Kubli and Schmidt, 1978; Schmidt et al., 1978; Hayashi et al., 1980; Schmidt and Kubli, 1980).
These parallel lines of experiments showed that in general: (1) many isoacceptors can be found at more than one site in the Drosophila melanogaster genome, and (2) more than one isoacceptor can be detected in the same region. Even though in situ hybridization is a powerful mapping technique, its resolution is too low to tell whether genes for the different isoacceptors are intermingled or segregated from each other. This problem has been overcome by molecular cloning and DNA sequencing, with which the exact number of genes and their relative arrangement within a cluster can be determined. The first isolated recombinant plasmid containing Drosophila tRNA genes was pCIT12, derived from the 42A region (Yen et al., 1977). DNA sequencing of segments containing tRNA genes (Hovemann et al., 1980) showed the presence of eight genes within a 9.3 kb region: three for tRNAAsn, three for tRNA2Lys, one for tRNA2Arg, and one for tRNAIle. The analysis was extended in both directions by "molecular walking" (Yen and Davidson, 1980; see Chapter I), in which a total of 94 kb of sequence derived from 42A was recovered on overlapping recombinant λ phages. By restriction mapping and subsequent DNA sequencing, a total of 18 genes were identified (including those on pCIT12) scattered over a region of approximately 46 kb: eight for tRNAAsn, four for tRNAArg, five for tRNA2Lys, and a single tRNAIle gene. These genes are irregularly spaced and are transcribed from both strands of the DNA. Furthermore, redundant genes for a particular tRNA isoacceptor contain identical sequences (fig. 3). The other major hybridization site, at 90BC, has been similarly analyzed by molecular walking (DeLotto and Schedl, 1984). Although the analysis is incomplete, at least six tRNA genes have been identified scattered over approximately 31 kb, which DeLotto and Schedl divided into seven smaller regions in their discussion.
In region 1, there is one gene for tRNA3bVal and one for tRNAPro; in region 2, one or possibly two for tRNAAla and one for tRNAPro; in region 4, two for tRNAThr. The other regions (3, 5, 6 and 7) have not been sequenced but are thought to contain additional tRNA genes. The characterized tRNA genes are also irregularly interspersed, as at 42A, and they are likewise transcribed from both DNA strands (fig. 3).

Fig. 3. Two typical chromosomal sites enriched for tRNA genes. Both sites have been characterized by molecular walking and DNA sequencing. The size of the cloned region is indicated on the right in kilobases (kb). Top: The 50 kb region from the 42A site has 18 genes encoding five different tRNA isoacceptors (modified from Yen and Davidson, 1980). Bottom: The 31 kb region from the 90BC site has at least six genes encoding four different tRNA isoacceptors. The structural genes are depicted as dots and their direction of transcription is indicated by arrow heads from 5' to 3'. At the 90BC site, genes in regions 5, 6, and 7 have not been sequenced but are known to contain homology to total 4S RNA from Southern hybridization studies. The tRNA3bVal and tRNAPro genes in region 1 are identical to those reported in Chapter IV of this thesis. The sequence of one tRNA gene (?) has only been partially determined, but it probably corresponds to another tRNAAla gene (modified from DeLotto and Schedl, 1984).

Although not as extensively studied as 42A and 90BC, other smaller gene clusters derived from different chromosomal locations have also been isolated in plasmid clones. These include genes for tRNA2Lys (Gergen et al., 1981), tRNAGlu (Hosbach et al., 1980), tRNALeu and tRNAIle (Robinson and Davidson, 1980), tRNAGly (Hershey and Davidson, 1980), tRNAiMet (Sharp et al., 1981a), tRNA4Val (Addison, 1982 and Addison et al.,
1981), and tRNAArg (Newton, unpublished). All these smaller clones show the same pattern of irregular gene arrangement seen at 42A and 90BC.

Hybridization studies with yeast DNA by Schweizer et al. (1969) and Feldmann (1976) showed that there are approximately 360 tRNA genes. This would suggest that the average reiteration frequency is on the order of eight genes per tRNA species. The general features that emerged from the isolation of various nonsense suppressor tRNA genes and from random cloning experiments (Guthrie and Abelson, 1982) are that a particular tRNA species is encoded by multiple, but solitary, genes found on different chromosomes. In fact, there is little evidence for the clustering either of isoaccepting species or of tRNA genes per se. With possibly rare exceptions, such as dimeric tRNA genes that are coordinately transcribed (Schmidt et al., 1980), the overall distribution of tRNA genes may in fact be close to random. This lack of organization is contrary to that seen in Drosophila, where tRNA genes are typically found in clusters containing multiple copies of several different tRNA genes (see above).

In the haploid human genome, there are approximately 1000 tRNA genes representing about 60 different genes of 10 to 20 copies each (Hatlen and Attardi, 1971). There is some evidence suggesting that the organization of tRNA genes in human is similar to that in the fruitfly. The initiator tRNAMet genes have been cloned and sequenced. They are solitary genes embedded within sequences of high homology and are distributed randomly throughout the genome (Santos and Zasloff, 1981). In contrast, interspersion of different tRNA gene types has been detected in λ phage clones from a human library. One copy each of the genes for tRNALys, tRNAGln, and tRNALeu has been localized within 1.6 kb, and they are separated from each other by about 0.4 to 0.5 kb (Roy et al., 1982).
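As a rough cross-check of the copy-number figures quoted in this section, the short sketch below divides the estimated gene totals by the number of distinct tRNA species. The helper name is illustrative only, and the value of about 60 distinct Drosophila species is an assumption made here for the arithmetic (the text reports 63 major chromatographic peaks, some of which are probably homogeneic); the totals of 590-750, 360 and 1000 genes are those cited above.

```python
# Back-of-envelope check of the reiteration figures quoted in the text.
# mean_copies() is an illustrative helper, not taken from any cited study.

def mean_copies(total_genes: float, distinct_species: float) -> float:
    """Average number of gene copies per distinct tRNA species."""
    return total_genes / distinct_species

# Assumption: roughly 60 distinct tRNA species in Drosophila. This number is
# not stated explicitly in the text and is used only for illustration.
ASSUMED_DROSOPHILA_SPECIES = 60

drosophila_low = mean_copies(590, ASSUMED_DROSOPHILA_SPECIES)   # lower estimate of total genes
drosophila_high = mean_copies(750, ASSUMED_DROSOPHILA_SPECIES)  # higher estimate of total genes
human = mean_copies(1000, 60)  # about 1000 genes, about 60 different genes

# For yeast the text quotes the total (about 360) and the mean (about 8),
# so the implied number of species is obtained by inverting the ratio.
yeast_implied_species = 360 / 8

if __name__ == "__main__":
    print(f"Drosophila: {drosophila_low:.1f} to {drosophila_high:.1f} copies per species")
    print(f"Human: {human:.1f} copies per species")
    print(f"Yeast: about {yeast_implied_species:.0f} species implied")
```

Under these assumptions the Drosophila values fall close to the 10-13 copies quoted earlier, and the human value lies within the stated range of 10 to 20 copies.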
Recently, a tRNAGlu gene was identified on a 2.4 kb fragment that also contains a sequence capable of folding into a tRNA-like structure (Goddard et al., 1983). Whether this is fortuitous or a pseudogene is not known. Clusters comprising different neighboring tRNA genes could also be the predominant theme in the rat and mouse genomes. Although detailed structural analyses in these two organisms are lacking, several recombinant clones containing a mixture of genes for tRNAAsp, tRNAGly, and tRNAGlu have been discussed in the section dealing with pseudogenes.

Transcription of tRNA Genes

Transcription studies of tRNA genes from different organisms demonstrated that the promoter elements necessary to direct accurate transcription reside within the genes (Ciliberto et al., 1983; Murphy and Baralle, 1984; Stewart et al., 1985). Nuclease-mediated deletions performed on these templates have shown the essential internal control regions (ICRs) to be split into two non-contiguous sequences corresponding to the D and T loops within the tRNA, now variously termed the 5' ICR or A box and the 3' ICR or B box, respectively (Sharp et al., 1983a). Because these ICRs are highly conserved sequences throughout eukaryotic tRNA genes (excluding organelle tRNAs), they would appear to serve a critical function in gene regulation as well as in the tRNA itself. The comparison of ICRs in over 100 tRNAs and tRNA genes has led to the generalized 5' ICR sequence, 5'-TRRYNNARYGG-3', corresponding to positions 8 to 19 within the D loop, and the 3' ICR sequence, 5'-GGTTCGANTCC-3', corresponding to positions 52 to 62 within the T loop (Sprinzl et al., 1987; Sharp et al., 1985). The sequence of the 3' ICR is highly constrained, incidentally, in both prokaryotes and eukaryotes. The stringent conservation may reflect the added importance of the T loop in the proper function of the tRNA itself, rather than a role strictly confined to gene regulation per se. Mutational analyses performed on the S.
cerevisiae tRNATyr (Kurjan et al., 1980; Allison et al., 1983), the Xenopus tRNAiMet (Folk and Hofstetter, 1983) and the S. pombe tRNASer genes (Willis et al., 1984) have all confirmed the importance of the split promoter elements in mediating accurate tRNA gene transcription. Many point mutations that caused drastic decreases in the transcriptional ability of the templates all mapped to these elements. Additional sequences outside these central promoter elements may also affect tRNA gene function. In the S. pombe tRNASer gene (Willis et al., 1984), a transcription-down mutation (A4$) has been mapped at the junction of the extra arm and the T stem. Similar down mutations within the extra arm and T stem coding region have been reported for the tRNAPro gene of C. elegans (Ciliberto et al., 1982) and the tRNATyr gene of S. cerevisiae (Kurjan et al., 1980; Allison et al., 1983). It is not clear whether these mutations have uncovered yet another separate control element, or whether the extra arm is merely an extension of the 3' ICR. Sharp et al. (1981b) favored the latter explanation based on their 3' deletion analyses of the D. melanogaster tRNAArg gene. However, their interpretation is weakened by their use of grossly altered templates, rather than more selectively targeted mutations. Thus, the separate contributions of the extra arm and the T loop to the overall transcription efficiency of the gene could not realistically be distinguished on the basis of their studies. In contrast to the 3' ICR, the canonical sequence of the 5' ICR is much more degenerate. Furthermore, a comparison of all known sequences of tRNAs and tRNA genes showed that the D loops can also be variable in length (Sprinzl et al., 1987). Extra nucleotides have been localized adjacent to positions 17 and 20, and are numbered 17A, 20A and 20B accordingly. Also, the 5' ICR (GGTCTAGTGG) of the C.
elegans tRNAPro gene is functionally interchangeable with the first 11 nucleotides (AGCCAAGCAGG) of the 5S rRNA gene promoter from Xenopus (Ciliberto et al., 1983). Even though both sequences conform to the 5' ICR consensus motif, they in fact differ from each other by many point changes. This high degree of functional flexibility in the 5' promoter would imply that the DNA sequence per se is not critical in conferring transcription competence on the tRNA gene. However, their "hybrid" 5S/tRNA constructs have only been tested in Xenopus extracts; it would be interesting to test whether extracts derived from C. elegans are equally indiscriminate. This is because certain polymerase III transcription extracts have been postulated to confer species specificity on some tRNA genes, through some unknown interaction between a "compatible" 3' ICR and the 5'-flanking sequence (Sharp et al., 1985; see Formation of Transcriptional Complexes).

If the primary DNA sequence of the 5' ICR is not critical for transcription of tRNA genes, then could maintenance of the stem-loop structure be sufficient to sustain template activity? Questions of this sort have been explored by constructing three D-stem mutants of a yeast tRNALeu gene with synthetic oligonucleotides (Mattoccia et al., 1983). The first mutant contains simultaneous changes of GCC to AAA at positions 10, 11 and 12, while the second mutant contains changes of GGC to TTT in the complementary strand of the stem, corresponding to positions 24, 25 and 26. Both of these mutants would disrupt the proper base pairing in the D stem. The third mutant is a double mutant, coupling both sets of changes to preserve the stem-loop configuration in the 5' ICR.
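The logic of the three D-stem mutants can be illustrated with a minimal sketch that tests whether the two triplets named above are able to pair. It assumes strict Watson-Crick pairing only (G-U pairs are ignored) and that the two triplets lie directly opposite each other in the stem, as the description implies; the function and variable names are hypothetical and are not taken from the cited studies.

```python
# Illustrative sketch: can the two sides of the D stem still base pair?
# Sequences are written 5'->3' on the coding (DNA) strand, as in the text.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

def can_pair(five_prime_side: str, three_prime_side: str) -> bool:
    """True if the two stem sides form a fully Watson-Crick paired stem."""
    return reverse_complement(five_prime_side) == three_prime_side.upper()

# (positions 10-12, positions 24-26) for each template described in the text
D_STEM_VARIANTS = {
    "wild type": ("GCC", "GGC"),
    "mutant 1 (AAA at 10-12)": ("AAA", "GGC"),
    "mutant 2 (TTT at 24-26)": ("GCC", "TTT"),
    "double mutant": ("AAA", "TTT"),
}

def stem_status() -> dict:
    """Map each variant name to whether its D stem can still pair."""
    return {name: can_pair(a, b) for name, (a, b) in D_STEM_VARIANTS.items()}

if __name__ == "__main__":
    for name, paired in stem_status().items():
        print(f"{name}: {'paired' if paired else 'disrupted'}")
```

In this simplified model the wild-type and double-mutant stems pair, whereas each single mutant is disrupted: the double mutant restores the structure but not the original sequence.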
Transcription of the yeast genes in the heterologous Xenopus germinal vesicle system showed that transcription of all three mutants is not dramatically affected. However, accurate excision of the intron occurs only in the double mutant, where the proper D stem-loop structure can be maintained (Baldi et al., 1983). The behaviour of the mutants is entirely different in homologous yeast extracts (Newman et al., 1983). The AAA10-12 mutation reduced transcription 10-fold, while the complementary mutant TTT24-26 showed very little effect. The double mutant is also poorly transcribed. It would appear that in the homologous system, the sequence of the control region is critical, while the capacity to form a stem-loop structure is not. These completely opposite results thus caution that the degeneracy of the 5' ICR consensus sequence distilled from many diverse organisms may not necessarily indicate functional flexibility. It could also mean that the great diversity in sequence confers some "sequence context" specific for a particular tRNA gene that is read differently by different transcriptional complexes.

Flanking Modulatory Sequences

While the internal split promoter sequences are essential for faithful transcription of the tRNA genes, deletion of the flanking sequences upstream and downstream can also markedly affect the rates of transcription. One method for identifying potential modulatory elements is simply to compare the flanking sequences of a large number of tRNA genes. For some members of certain tRNA gene families, the 5'- and 3'-flanking sequences are highly conserved. These include genes coding for the Drosophila melanogaster tRNAiMet (Sharp et al., 1981a), tRNAGlu (Hosbach et al., 1980), tRNAGly (Hershey and Davidson, 1980), and tRNAArg at 12DE (Newton, 1984; Chapter III), the human tRNAiMet (Santos and Zasloff, 1981), tRNATrp of D. discoideum (Peffley and Sogin, 1981) and the rat tRNAAsp (Shibuya et al., 1982).
The sequence conservation in these gene members often extends several hundred base pairs in both directions from the structural genes. However, it is unlikely that these flanking sequences are conserved by virtue of function; rather, their patterns are suggestive of duplicative events, either by unequal crossing-over (tRNAArg and tRNAGly genes of Drosophila) or by transposition-like mechanisms (tRNATrp gene of D. discoideum). With a few exceptions (see below), sequence conservation of this type is generally rare. In fact, flanking sequence similarities that can be attributed to a transcriptional control function are not common.

5' negative modulatory sequences resembling RNA polymerase III termination signals have been shown to inhibit in vitro transcription of tRNA2Lys (DeFranco et al., 1980 and 1981) and tRNA2Arg (Dingermann et al., 1982) genes from D. melanogaster. The sequence GGCAGTTTTTG is well conserved in front of a number of tRNA2Lys genes. This conserved element is positioned from about -12 to -25 from the start of the structural gene, although the absolute positions vary among the different genes (Hoveman et al., 1980). Deletion of this sequence in tRNA2Lys gene 2, and its replacement with pBR322 DNA, led to a dramatic increase in transcriptional efficiency in X. laevis germinal vesicle (GV) extract (DeFranco et al., 1980). The function of this element is extremely sensitive to its position relative to the gene; moving the sequence only one base pair closer to the tRNA gene can substantially neutralize its inhibitory effect (DeFranco et al., 1981). Another tRNA2Lys gene (gene 4 in their studies) has only four consecutive T residues in its conserved element; this gene, however, is efficiently transcribed in GV extracts. The D. melanogaster tRNA2Arg gene also has five consecutive T residues beginning at -21 from the structural gene (Dingermann et al., 1982).
This sequence also appears to inhibit efficient transcription of the tRNA gene, but only in homologous extracts and not in extracts derived from either HeLa cells or GV. A similar stretch of five consecutive T residues has also been identified in another Drosophila gene, tRNA3Lys. These residues are located at -20 to -24 in front of the gene, and yet when tested by in vitro transcription in Kc0 cell extracts this gene is highly efficient. Taken together, these studies show that the view of 5' poly-T residues as merely general RNA polymerase III termination signals that inhibit transcription is too simplistic. As pointed out by Sharp et al. (1985), the length of such poly-T tracts does not consistently correlate with the strength of the inhibitory effects observed in extracts prepared from HeLa cells and Drosophila Kc0 cells. In this regard, it is worth noting that an active SUP locus in yeast (i.e. functional in vivo), which codes for an ochre suppressor tRNATyr, contains six T residues positioned at -14 to -19. Moreover, this gene is also efficiently transcribed in homologous cell-free extracts (reviewed by Sherman, 1982). Likewise, in Bombyx mori there are two blocks of five T residues within the first 21 bp 5' to the tRNA2Ala gene. The natural terminator is composed of a single block of four T residues located 17 bp downstream from the mature coding sequence. Yet this gene is constitutively expressed in the silkworm as well as efficiently transcribed in vitro (Sprague et al., 1980). An extensive study of the X. laevis tRNAiMet gene by systematic resections of the 5'-flanking region described a surprisingly complex mosaic of modulatory elements, including other inhibitory sequences with the potential to form Z-DNA (Clarkson et al., 1981).

Positive modulatory sequences have also been tentatively identified.
In the Drosophila tRNA2Arg gene, removal of sequences between -8 and -33 in the 5'-flanking region caused a 95% reduction in transcription efficiency (Sharp et al., 1981b). However, the nature of the positive modulatory element(s) was not well defined by these crude deletion studies. More recently, by using site-directed mutagenesis, Sajjadi (1987) showed that a pentanucleotide, TCGCT, may play a positive modulatory role in the transcription of a Drosophila tRNAVal gene. A degenerate form of this sequence, TNNCT (N = any nucleotide), is also imperfectly correlated with other tRNA genes that are efficiently transcribed in vitro. Since the element is rather short and the degenerate form may occur frequently, Sajjadi (1987) proposed that its functional competence may either be positionally dependent (beyond -30) or act in concert with additional surrounding modulatory elements.

The most completely characterized positive modulatory element to date is that in S. cerevisiae. Four of the nine tRNA3Leu genes show extensive 5'- and 3'-flanking sequence homologies, in addition to homology in their intervening sequences (Raymond and Johnson, 1983; Frischloff et al., 1984). Deletion mutagenesis performed on the amber suppressor tRNA3Leu gene showed that the 5'-flanking region between -1 and -15 is critical for in vitro activity in yeast cell extracts (Raymond et al., 1985). This region also contains a pentadecanucleotide sequence, TTTCAACAAATAAGT, that is highly conserved in all four genes. Clones with progressively deleted 5' flanks were transformed into different yeast strains containing the amber mutations lys2-801, met8-1 and tyr7-1. Upon transformation with the yeast-vector clones, suppression is very effective at the met8-1 locus with all forms of the tRNA3Leu constructs. Suppression of the lys2-801 and tyr7-1 mutations in the yeast host strain parallels the template activities in vitro, correlating with the absence or presence of the putative modulatory element.
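Two of the comparisons discussed above are simple enough to sketch in code: scoring a 15-base candidate against the pentadecanucleotide TTTCAACAAATAAGT as a percent match, and scanning a flank for the degenerate motif TNNCT. The sketch below is only an illustration. The candidate variant and the flanking sequence are invented, the function names are hypothetical, and "percent match" is taken to mean the fraction of identical positions in an ungapped comparison, which is an assumption about how such scores are computed.

```python
import re

# Pentadecanucleotide quoted in the text for the tRNA3Leu 5' flank.
REFERENCE = "TTTCAACAAATAAGT"


def percent_match(candidate: str, reference: str = REFERENCE) -> float:
    """Percent of identical positions in an ungapped, equal-length comparison."""
    if len(candidate) != len(reference):
        raise ValueError("sequences must have the same length")
    same = sum(a == b for a, b in zip(candidate.upper(), reference.upper()))
    return 100.0 * same / len(reference)


def find_degenerate(motif: str, sequence: str) -> list:
    """Return 0-based start positions of a motif in which N matches any base.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    pattern = "(?=" + motif.upper().replace("N", "[ACGT]") + ")"
    return [m.start() for m in re.finditer(pattern, sequence.upper())]


if __name__ == "__main__":
    # Invented variant carrying three mismatches (shown in lower case).
    print(percent_match("TaTCAACAAgTAAtT"))  # 12 of 15 positions identical
    # Invented flank containing TCGCT and one other TNNCT occurrence.
    print(find_degenerate("TNNCT", "GATCGCTAATTACCTG"))
```

Because TNNCT fixes only three of five positions, it is expected to occur by chance fairly often in any flank, which is consistent with the caution in the text that position or neighbouring elements may matter.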
They noted that this sequence is also well conserved in 14 other yeast tRNA genes, although its position varies somewhat between them (fig. 4). The tRNALeu, tRNATyr, tRNAArg and tRNAGlu genes, which show the best fit to the sequence, all code for tRNAs that are abundant in S. cerevisiae (Ikemura and Ozeki, 1982). The authors proposed that this sequence may be one mechanism by which yeast cells adjust tRNA biosynthesis to match the demand created by codon use preferences.

DNA sequences referred to as sigma and delta have been found adjacent to tRNA genes of yeast (del Rey et al., 1982; Eigel and Feldmann, 1982). The 340-bp sigma element is repeated many times in the genome and, when found adjacent to a tRNA gene, is always located either 16 or 18 bp to the 5' side of the mature coding sequence. The position of the 340-bp delta sequence relative to the tRNA genes is not as precise. Both of these elements have been hypothesized to influence transcriptional regulation of the adjacent genes, and they share a common sequence, -CAACA-, found very near their ends. This same sequence constitutes part of the conserved pentadecanucleotide observed in tRNA3Leu.

Fig. 4. 5'-Flanking sequences of different tRNA genes from S. cerevisiae. The putative positive modulatory sequence of the tRNA3Leu gene, TTTCAACAAATAAGT, is given in capital letters and used as the reference sequence (100%). The other genes with similar pentadecanucleotide sequences are listed below. The bases which correspond to the tRNA3Leu sequence are capitalized and mismatches are in lower case. The position of the sequence relative to the coding sequence is depicted and the genes are ordered by percent match (from Raymond et al., 1985).

NAME       SEQUENCE          MATCH (%)
LEU3       TTTCAACAAATAAGT   100
SUP53      TaTCAACAAgTAAtT   80
TYRSUP4    aTTCAACAAtTAAaT   80
TYRTYG     TTaCAACAAAaAaGa   73
ARG19?     aTTCAACAAATAgta   73
GLUPY20    aTgCAACAaATAaGT   73
ARG18U     TTaCAACAAAaAgta   67
TRP        TTTCAAgAtggAAGa   67
GLN        TTTtAgCAAAaAAag   67
SERAL1     aTTtAAaaaATAAtT   60
PHEPT5     gagCAACtAATAtaT   60
GLUPY5     TTaaAtCAAAaAtGa   60
MET3       aTTCAgtAgaTaAGT   60
HISFD12    TaTCaAttgAaAAGT   60
HISFD2     gTTCAtaAAgaAAtT   60

Other examples of positive modulatory sequences have been tentatively identified in genes coding for a human tRNAGlu (Goddard et al., 1983) and in the tRNA2Ala gene from Bombyx mori (Larson et al., 1983; Young et al., 1986). The human tRNAGlu gene is transcribed very efficiently in HeLa cell extracts and its 5'-flanking sequence is capable of forming a tRNA-like structure. This tRNA-like structure has been brought to attention as a potential regulatory element, but has not been critically tested. From deletion analyses and flanking sequence replacement of the tRNA2Ala gene derived from Bombyx, the hypothetical positive modulatory element has been roughly localized between positions -16 and -37, although at present the nature of this sequence is not as tightly defined as that in yeast (Young et al., 1986).

The influence of the 3'-flanking region on transcription of tRNA genes has not been as well characterized. Initial work showed that the human tRNAiMet gene supported both in vitro and in vivo transcription even though its natural 3'-flanking region had been entirely replaced with a thymidine kinase gene (Adeniyi-Jones et al., 1984). However, St. Louis (1985), in transcription experiments employing a Drosophila tRNASer hybrid gene (designated pDt73) whose 3' end had been fortuitously removed during cloning, reported that this gene is virtually inert in homologous extracts.
Re-attachment of a 3' region from another tRNA4Ser gene allowed this construct to regain a detectable, but low, level of transcription. The 3'-flanking region of the Bombyx gene mentioned above appears to be required for full transcriptional activity (Wilson et al., 1985). The use of high template DNA concentrations to overcome the effects of an inhibitor present in extracts was found to be partly responsible for masking the contribution of this region in their previous investigations. This region has now been shown to participate in factor binding during transcriptional activation (unpublished, cited in Young et al., 1986). This observation has been reinforced by studies using a yeast suppressor tRNATyr (SUP4-o) gene (Allison and Hall, 1985). Deletions into the oligothymidylate sequence 3' to the gene appear to diminish its ability to compete for a limiting transcription factor in the extracts. This is also consistent with nuclease protection experiments (Klemenz et al., 1982; Camier et al., 1985), in which this region in several tRNA genes is resistant to nuclease attack.

Formation of Transcriptional Complexes

All Class III genes, including 5S rRNA, VA RNA, Alu I, some snRNAs and tRNAs, are transcribed by RNA polymerase III. Initial studies with the purified enzyme from mature X. laevis oocytes and human KB cells failed to produce transcriptional activity with either 5S rDNA or VA DNA templates (Parker et al., 1976). Instead, faithful transcription of these genes could be elicited in the presence of bulk chromatin, along with the addition of purified RNA polymerase III. These observations led to the suggestion that several protein components in chromatin are necessary to carry out the accurate and selective catalytic process.
Several of these components appear to be shared among the transcriptional complexes of all Class III genes, because a partially purified RNA polymerase III transcriptional complex from Kc0 cells can potentiate transcription of both 5S rRNA and tRNA genes (Burke et al., 1983). As well, serum antibodies from patients with the autoimmune disease systemic lupus erythematosus have been shown to specifically immunoprecipitate ribonucleoproteins, or RNPs (Lerner et al., 1981). These small RNP particles have been shown to form a complex with tRNAs and 5S rRNA in uninfected cells (Rinke and Steitz, 1982), and with VA RNA (Gottesfeld et al., 1984). Addition of the serum antibodies can inhibit transcription of 5S rRNA and tRNA genes in HeLa cell extracts, raising the distinct possibility that the inhibited antigen could be a basic component of Class III transcriptional complexes (Rinke and Steitz, 1982).

Fractionation of cytoplasmic extracts from human KB cells on phosphocellulose and by additional chromatographic steps revealed that, in addition to RNA polymerase III, at least two other distinct fractions were required for reconstitution of specific tRNA gene transcription (Segall et al., 1980). The interchangeability of the phosphocellulose factors from human cells with similar fractions derived from Xenopus also suggests that these components are evolutionarily well conserved (Shastry et al., 1982). Similar purification steps carried out with cell extracts derived from Bombyx and S. cerevisiae appear to correlate well with the human studies (Ruet et al., 1984). The above investigations would thus imply that the transcriptional complex contains at least two distinct fractions in addition to RNA polymerase III.
Although it is now well established that an additional factor, TFIIIA, is required for specific transcription of 5S rRNA genes, fractionation of other eukaryotic extracts has thus far failed to reveal a further repertoire of transcriptional components (Segall et al., 1980; Burke et al., 1983; Shastry et al., 1982; see below).

The 3' ICR of tRNA genes appears to bind stably to at least one component of the transcriptional complex (Lassar et al., 1983; Newman et al., 1983). The specificity of this interaction has been examined by DNase I protection or competition assays (Klemenz et al., 1982; Fuhrman et al., 1984; Van Dyke and Roeder, 1987). In addition, a yeast SUP53 tRNA gene carrying a mutation at the highly conserved nucleotide C56 (C56 to G56) fails to bind this factor(s) and shows a concomitant decrease in its competitive ability (Newman et al., 1983). This factor, variously referred to as Factor C, TFIIIC, or tau, has been partially purified from yeast (Ruet et al., 1984) and HeLa cells (Fuhrman et al., 1984). The HeLa cell factor is required for formation of stable transcription complexes and for faithful transcription of both an adenovirus VAI gene and the Bombyx tRNA2Ala gene. Recently, Van Dyke and Roeder (1987) have suggested that TFIIIC may exist in two distinct forms, a cytoplasmic form and a nuclear form. Both forms of TFIIIC possess functional activity when assayed by in vitro transcription using a VAI RNA gene as the template. However, DNase I protection experiments showed that only the nuclear form is able to afford a protection ladder in the 3' ICR. Because the cytoplasmic form is incapable of binding and is physically segregated from the gene, it was suggested that this form may be inactive in vivo (Van Dyke and Roeder, 1987). Barring experimental artifacts, they further proposed that a general mechanism of Class III gene regulation may depend upon the interconversion of the active and nascent forms of TFIIIC.
Similar results have also been obtained by Yoshinaga et al. (1987), although their interpretation of the data differed slightly in detail. By chromatography and sedimentation velocity gradient analysis, they showed that the two forms of TFIIIC, named 1 and 2, are distinct components of approximately 400-500 kDa and 200 kDa, respectively, rather than the result of simple activation of the same nascent form. TFIIIC2 can bind tightly to the 3' ICR while TFIIIC1 has very low affinity for the 5' ICR, as revealed by DNase I protection experiments. Either form alone can sustain only barely detectable transcription in vitro using the VAI gene as the template. However, active transcription complexes can be reconstituted in the presence of both components.

While the VA RNA gene is able to interact with TFIIIC without other components, stable complex formation with tRNA genes, at least those of Drosophila and human, requires the presence of another factor, TFIIIB. Furthermore, TFIIIB does not appear to remain stably bound, but recycles rapidly (reviewed by Lassar et al., 1983). This is consistent with the mechanism described by Dingermann et al. (1983) for stable complex formation using various deleted tRNA2Arg genes in an unfractionated Drosophila Kc0 cell extract. In their scheme, the two transcription factors, TFIIIB and TFIIIC, interact with the D and T control regions of the gene, respectively. From competition experiments, it appears that both the 5'- and 3'-flanking sequences aid in the binding stability of these factors (Sharp et al., 1983b; Schaack et al., 1983). The cooperative binding of the two factors would then bring about stable complex formation, although the binding activity of TFIIIB has yet to be shown experimentally.

The factors described above, however, appear to display some species specificity.
In some cases, the Drosophila Kc0 cell TFIIIC is functionally equivalent to that in HeLa cell extracts, and can be combined with human TFIIIB to promote transcription of tRNA genes. Sharp et al. (1985) suggested that for this to occur, there may be some type of "compatibility" between TFIIIC and the 5'-flanking region of the tRNA gene. Such "compatibility" may not always be maintained between species. As well, the Drosophila TFIIIB cannot replace the human counterpart in the heterologous transcription system (Dingermann et al., 1982; Burke et al., 1983). Thus, tRNA gene transcription may involve a general mechanism, but a higher level of complexity is introduced by the additional finding of species specificity.

From competition assays, tRNA genes have been shown to sequester, rapidly and stably, a limiting component when added to transcriptionally active cell-free extracts. These assays rely on the ability of a test gene (or gene fragment) to inhibit the transcription of a "reference" gene. Mutant Drosophila tRNAArg genes with varying degrees of deletion from either the 5' or the 3' side were examined for the ability to compete for limiting components (Sharp et al., 1983b), and it appears that the 3' ICR is the most important region for stable complex formation. However, the stability and the rate of binding are also affected by the presence of DNA throughout the coding as well as the flanking regions. In particular, removal of the 5' ICR invariably leads to a drastic reduction in the maximum rate and strength of complex formation. This kinetic effect implies that the stable complex relies on some kind of recognition of the 5' ICR prior to stable binding to the 3' ICR (Schaack et al., 1983). While formation of this stable complex for the tRNAArg gene occurs fairly rapidly, there is a further lag phase of 10 to 30 minutes before transcription is detectable (Schaack et al., 1983).
This latent period is also temperature sensitive within the range of 24 °C to 30 °C, suggesting a second priming step, perhaps involving rearrangement of the tRNA gene and the bound components prior to initiation of transcription (Sharp et al., 1985; Schaack et al., 1983).

Maturation of tRNA Transcripts

tRNA transcripts are initially synthesized as precursors containing extra nucleotides at both the 5' and the 3' ends. The initiation sites of transcription are usually 4-7 nucleotides upstream of the structural gene, and normally coincide with a purine residue. Termination is thought to involve an oligothymidylate sequence located near the 3' end of the mature coding sequence. Maturation of the transcript probably occurs by removal of, first, the 5' and then the 3' extra nucleotides by ribonucleases, followed by addition of the CCA end by nucleotidyltransferase (reviewed by Young et al., 1980; Lund et al., 1980; Ghosh and Deutscher, 1980). If intervening sequences are present, they are excised by splicing enzymes. The accuracy of the process depends on the proper conformation of the tRNA precursor (Ogden et al., 1979). Since the tertiary structure is maintained by unconventional base pairing involving modified nucleotides, this would imply that at least a limited amount of modification must have occurred before excision of the intron.

The trimming at the 5' end is thought to require an RNase P endonuclease, first identified in E. coli by both biochemical and genetic means (Shimura et al., 1980). An enzyme with similar catalytic activity has also been partially purified from Sc. pombe. This enzyme can process the 5' end of an E. coli tRNATyr, plus a variety of yeast tRNA precursors produced in vitro, to the mature 5' terminus (Kline et al., 1981). From these in vitro studies, this reaction appears to be a one-step process.
The in vivo transcripts of yeast tRNATyr, tRNA2Ser and tRNAminorSer genes have been examined by a modified Northern blotting procedure (Hopper and Kurjan, 1981). For all three tRNA genes, only three species of transcripts were detected. Two correspond to the hypothetical precursors of 108 and 92 nucleotides; the latter is probably a transcript with both the 5' and 3' extra nucleotides removed but retaining the intron. The last transcript is 78 nucleotides in length, corresponding to a full-size mature tRNA. Thus, their results agree with the in vitro studies, suggesting that both the 5' leader and the 3' tail are removed in a single-step process. However, injection of a yeast precursor tRNATyr into the nucleoplasm of the Xenopus oocyte showed that removal of the 5' leader is at least a three-step reaction with progressive removal of small oligonucleotides, rather than a single catalytic step (Melton et al., 1980). The reasons for the discrepancy are not known, but there could be some fundamental differences among the organisms used as model systems.

The enzymology of splicing of eukaryotic tRNAs has been best characterized in yeast. Several temperature-sensitive mutants defective in the process have been isolated (Peebles et al., 1979). The precursors accumulated at the nonpermissive temperature in the rna1 or los1 mutants have provided substrates for the assay of splicing in vitro. Peebles et al. (1979) have isolated an activity from a ribosome wash that is capable of removing the introns from all ten of the precursors accumulated in the mutant strains. The splicing components appear to be fairly pure, since approximately 96% of the input tRNA precursors can be spliced, with very little random degradation or abortive splicing. In this in vitro system, smaller RNAs with the mobility of half-tRNA-sized molecules occasionally appear transiently.
From their kinetic analyses, they proposed that these half-molecules are probably true intermediates in the splicing reaction. Furthermore, accumulation of higher amounts of the half-molecules can be enhanced by the omission of ATP. These half-molecules have subsequently been purified by gel electrophoresis and shown to be substrates for the second step in the splicing process, formation of a phosphodiester bond between the two half-tRNAs (Knapp et al., 1979). Both the splicing endonuclease and the ligase required for rejoining the processed tRNA have been physically separated, although in vivo they may be integral components of a larger splicing complex. In these subsequent reports, the endonuclease appears to be integrally bound in membrane, rather than associated with ribosomes as reported initially (Peebles et al., 1983). The discrepancy has never been explained, but it could be due to contamination by rough endoplasmic reticulum, which is intimately associated with ribosomes. The activity of the enzyme is stimulated in the presence of spermidine (to stabilize the secondary structure of the tRNA precursor) and non-ionic detergents. The ligase, in contrast, appears to be a peripheral protein also associated with membrane, but it is easily dislodged during the preparation steps (Greer et al., 1983). Its activity is stimulated in the presence of Mg2+ and ATP. Their intimate association with membranes is consistent with the observation that splicing of pre-tRNA is coupled with transport of the mature transcript from the nucleus. Also, in the los1 and rna1 yeast mutants, precursors are found to accumulate in the nucleus (Guthrie and Abelson, 1982).

Characterization of the splicing intermediates has revealed several interesting features (fig. 5). The intervening sequence is probably excised as a discrete, linear polynucleotide with 5'-OH and 2',3'-cyclic phosphodiester termini (Knapp et al., 1979). Similar analyses of all the gapped tRNA products in yeast reveal that in each case the endonuclease reaction produces a 2',3'-cyclic phosphate in the 5' half-tRNA and a 5'-OH terminus in the 3' half. It has been proposed that a cyclic phosphodiesterase activity associated with the ligase can catalyze the formation of a 2'-P at the 5' half-tRNA. The 5'-OH terminus of the 3' half-tRNA is then phosphorylated, the 5'-phosphate is adenylated by an activated ligase, and this is followed by ligation of the half-molecules and release of AMP. The 2'-P is subsequently removed by an unknown phosphatase (Greer et al., 1983).

Fig. 5. Splicing of tRNA in yeast. The pathway proposed for joining of tRNA halves by yeast ligase is summarized after Greer (1986). The first step shows the formation of halves from pre-tRNA by yeast endonuclease. The sequence of the subsequent reactions is tentative since the precise order has not been determined and many of the enzymes involved have not been identified. IVS, intervening sequence. The different symbols around the phosphates are to facilitate tracing each through the ligation pathway. The yeast product (last step in the diagram) shows a 2' phosphate which is subsequently removed by a phosphatase. (Diagram labels: endonuclease; cyclic opening; kinase; ligase; yeast product.)

In HeLa cells and also possibly in Xenopus, no 2',3'-cyclic phosphates have been found (Filipowicz and Shatkin, 1983). In these higher eukaryotes, the 5' half- and 3' half-molecules contain only 3'-P and 5'-OH groups, respectively, and the ligation steps probably involve a slightly different RNA ligase.

As mentioned in an earlier section, the transcription of tRNA genes is not particularly influenced by the presence of the intervening sequence. However, as shown by Johnson and Abelson (1983), the proper splicing of the intron in a yeast SUP6-o tRNA is required for the correct modification of the mature transcript in an ensuing step.
The precise deletion of the intron from the gene significantly reduced the suppressor activity of its product relative to that of the unaltered gene. Analysis of the anticodon of the tRNA showed that the sequence normally contains a Ψ in its middle position. Removal of the intron engendered a defect in the pseudouridylation reaction as well as a concomitant decrease in the amount of suppressor tRNA. As to the absence of Ψ in the anticodon, the authors suggested that the "Ψ synthase" probably requires the proper pairing of the intervening sequence with the anticodon. It is important to note, though, that tRNATyr is the only yeast tRNA sequenced to date that contains Ψ in the anticodon. The presence of intervening sequences in the other tRNAs does not appear to be correlated with anticodon modifications in general.

Other Unusual tRNA-Mediated Cellular Functions in Eukaryotes

1. Protein Degradation

Besides their familiar role in protein synthesis, tRNAs have also been implicated in the ubiquitin- and ATP-dependent pathway of protein degradation. In an earlier investigation, it was shown that a free α-NH2 group of the protein substrate is an important structural determinant for recognition by the ubiquitin system (Hershko et al., 1984 and 1986). More recently, Ferber and Ciechanover (1987) have shown that tRNA is essential for conjugation of ubiquitin to, and for the subsequent degradation of, proteins with acidic amino termini. Both bacteria and eukaryotes contain an unusual class of enzymes, aminoacyl-tRNA-protein transferases, which catalyze post-translational conjugation of specific amino acid residues to the mature amino termini of acceptor proteins. The best studied enzyme so far is the arginyl-tRNA-protein transferase (Ferber and Ciechanover, 1987), which transfers an arginine to the amino terminus of proteins that are destined for proteolysis.
The degradation process can be inhibited by the addition of either RNase A or micrococcal nuclease, and can be entirely restored by the subsequent addition of aminoacylated tRNAArg after removal of the nucleases, suggesting that this tRNA species is critical in the proteolysis pathway. More recently, another possible tRNA- and ATP-dependent modification, histidylation of substrates with acidic amino termini, is being investigated (Ciechanover et al., 1985; Ferber and Ciechanover, 1987). Modification of proteins by lysine and leucine has also been reported (Shyne-Athwal et al., 1986), although its relevance to proteolysis by the ubiquitin system is as yet unclear. These discoveries raise another interesting question: are only certain isoacceptors of a tRNA species reserved for this special function, or are all isoacceptors equally accessible to the ubiquitin-dependent proteolysis pathway?

2. Primers for Reverse Transcription of Retrotransposons

Molecular analyses of several mobile elements identified in the Drosophila genome have shown that they contain sequences that would encode putative enzymes similar in amino acid sequence to the retroviral reverse transcriptases. One class of these transposable elements, known as copia, has been extensively examined. Copia-related virus particles with reverse transcriptase-like enzyme activity have been identified in Drosophila cells (Shiba and Saigo, 1983). Furthermore, Flavell (1984) has found linear and circular extra-chromosomal copia sequences that can be attributed to reverse transcription. His conclusions have been extended by Arkhipova et al. (1984), who detected genome-sized RNA-DNA complexes that are presumably intermediates in the reverse transcription of two Drosophila retrotransposons, mdg1 and mdg3.

Proper initiation of retroviral reverse transcription requires a particular species of host cell tRNA as a primer, which can bind to the viral genomic RNA via 18 Watson-Crick base pairs (Varmus, 1983).
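The primer relationship just described amounts to a reverse-complement comparison between the 3'-terminal nucleotides of the tRNA and the primer binding site (PBS) on the genomic RNA. The following minimal sketch shows that comparison. The sequences are invented for illustration and are not those of any real tRNA or element, the function names are hypothetical, and only strict Watson-Crick pairs are counted as matches (G-U pairs are treated as mismatches, which is a simplifying assumption).

```python
# Minimal sketch: count mismatches between the 3' end of a tRNA and a
# candidate primer binding site, both written 5' -> 3' in RNA letters.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq.upper()))


def primer_mismatches(trna: str, pbs: str, length: int = 18) -> int:
    """Number of positions at which the PBS fails to pair with the
    3'-terminal `length` nucleotides of the tRNA."""
    tail = trna.upper()[-length:]
    pbs = pbs.upper()
    if len(tail) != length or len(pbs) != length:
        raise ValueError("tRNA tail and PBS must both be `length` nt long")
    expected = reverse_complement_rna(tail)
    return sum(a != b for a, b in zip(expected, pbs))


if __name__ == "__main__":
    # Invented 3' portion of a tRNA, ending in CCA.
    trna_3prime = "ACGGUUCGAAUCCUGCUGCCA"
    pbs_exact = reverse_complement_rna(trna_3prime[-18:])
    pbs_variant = "A" + pbs_exact[1:]  # one altered position
    print(primer_mismatches(trna_3prime, pbs_exact))
    print(primer_mismatches(trna_3prime, pbs_variant))
```

A count of zero corresponds to exact complementarity over the chosen length, while a count of one corresponds to a candidate primer that differs from the ideal partner at a single position.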
Such primer tRNA has been shown to interact specifically with retroviral reverse transcriptase. The high affinity of the enzyme for the exposed surfaces of the L-shaped tRNA (the stem structures) may be related to the enzyme's ability to open the acceptor stem, allowing the denatured stem to bind to the retroviral primer binding site (Garret et al., 1984). Using synthetic oligonucleotides as probes, Inouye et al. (1986) have isolated three potential tRNA primer coding sequences for the retrotransposon 297. Sequence analysis showed that they are related to tRNA4Ser and tRNA7Ser genes. They have also isolated both tRNAs and showed that tRNA7Ser contains the predicted 18 nucleotides (including the CCA-OH end) from the 3' end exactly complementary to the putative primer binding site of 297, while tRNA4Ser differs by one nucleotide. Thus, the authors proposed that tRNA4Ser and/or tRNA7Ser can probably act as primers for this class of retrotransposons. Similar homologies between specific tRNAs and potential primer binding sites have also been found for other retrotransposons such as 17.6 (tRNA4Ser/tRNA7Ser), 412 and mdg1 (tRNAArg), mdg3 (tRNALeu/tRNAIle) and gypsy (tRNALys) (reviewed by Saigo, 1986).

3. Chlorophyll Biosynthesis

A molecule of chlorophyll is synthesized through a series of intermediates that are light regulated. One step in the pathway is the conversion of glutamate to a compound known as δ-aminolevulinate, or DALA (Schon et al., 1986). The components performing this conversion have been isolated from barley and Chlamydomonas and can be separated by serial chromatography. One of the components is extremely sensitive to ribonucleases and has been shown by direct nucleotide sequencing to be a chloroplast glutamate tRNA isoacceptor (denoted tRNADALA), which is encoded by the chloroplast genome.
Glutamate attached by an aminoacyl bond to the CCA-OH end of the tRNA is the essential substrate for the subsequent steps in the biosynthetic pathway. The remaining two glutamate tRNA isoacceptors have also been purified from barley chloroplasts and examined for possible activity in the δ-aminolevulinate conversion reaction. Both showed negative results, even though both species can be efficiently charged by the aminoacyl-tRNA synthetases present in the preparation. These results strongly suggest that tRNADALA is a highly specialized glutamate tRNA isoacceptor, probably adapted specifically for chlorophyll biosynthesis. However, the features that distinguish this isoacceptor from the other two have not been well characterized, except that tRNADALA appears to be hypermodified in the anticodon.

4. Induced and Naturally Occurring Suppressor tRNAs

Termination codons, or stop codons, are UAA (ochre), UAG (amber), and UGA (opal). These codons normally signal cessation of protein synthesis and release of the growing polypeptide from tRNA. Occasionally, a mutation within the reading frame converts a "sense" codon into a stop codon. The consequence of this mutation is premature termination, resulting in the production of a truncated protein. Nonetheless, the detrimental effect of the stop codon can be relieved by "suppressor" tRNAs carrying mutations in the anticodon, which can pair with one of the stop codons and insert an amino acid in its place. The biology of these mutationally induced suppressor tRNAs in prokaryotes (Murgola, 1985) and in eukaryotes (Korner et al., 1978) has been extensively reviewed and will not be discussed here. However, there have been reported cases of naturally occurring suppressor tRNAs that appear to constitute part of the normal and functional machinery of the cell (below).
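The effect of a nonsense mutation and its relief by a suppressor tRNA can be illustrated with a small sketch. The function name `translate`, the example mRNA and the choice of serine as the inserted amino acid are all hypothetical, chosen only for illustration; the sketch also assumes the standard genetic code and treats suppression as fully efficient, which real suppressors are not.

```python
from itertools import product

# Standard genetic code, built in the conventional U, C, A, G ordering.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(mrna, suppressors=None):
    """Translate an mRNA (5'->3', in frame) until the first unsuppressed stop.

    `suppressors` maps a stop codon to the amino acid inserted by a
    suppressor tRNA. ASSUMPTION: suppression is treated as 100% efficient.
    """
    suppressors = suppressors or {}
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        aa = suppressors.get(codon, CODON_TABLE[codon])
        if aa == "*":  # unsuppressed stop codon: release the polypeptide
            break
        protein.append(aa)
    return "".join(protein)


wild_type = "AUGUCGUGGAAAUAA"   # hypothetical message: Met-Ser-Trp-Lys-stop
opal_mutant = "AUGUCGUGAAAAUAA" # UGG (Trp) mutated to UGA (opal)

print(translate(wild_type))                    # MSWK
print(translate(opal_mutant))                  # MS (truncated)
print(translate(opal_mutant, {"UGA": "S"}))    # MSSK (read through)
```

In this toy model the suppressor reads every UGA it meets, including genuine terminators, which is one reason why efficient suppression would be expected to be costly to a cell.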
Selenium is present in many biological systems in trace amounts, higher levels being toxic (Stadtman, 1974). More than 80% of the element can be traced to proteins containing selenocysteine, an analogue of cysteine in which the sulphur atom has been replaced by an atom of selenium. Whether selenium is incorporated into protein during translation or as a post-translational modification has long been a matter of contention. Previous investigations have identified a specific selenocysteyl-tRNA (tRNASec), which suggests that the modified amino acid is directly incorporated into the protein during translation (Hawkes et al., 1985). This controversy appears to be resolved by recent reports on the cloning of two genes coding for selenocysteine-containing proteins: a mammalian glutathione peroxidase (Chambers et al., 1986) and the E. coli formate dehydrogenase (Zinoni et al., 1986). DNA sequencing of both genes showed that the triplet corresponding to the selenocysteine position in the protein is UGA, which is usually recognized as a termination codon. Thus, the use of UGA for encoding selenocysteine seems to apply to both eukaryotes and prokaryotes, and tRNASec would function analogously to a suppressor tRNA. However, how the cell distinguishes this supposed "nonsense" codon from its natural usage as a termination codon remains unknown.

Two naturally occurring opal suppressor serine tRNAs have been identified in mammalian, avian and Xenopus tissues (Hatfield et al., 1982; Diamond et al., 1981). In all cases, they represent about 1-3% of the total seryl-tRNA in these tissues. These natural suppressors are 90 nucleotides in length and are thus the longest tRNAs sequenced to date (Diamond et al., 1981; Hatfield et al., 1982). Those from bovine liver have been sequenced; they are >90% homologous (5 differences) and their anticodons are CmCA and NCA (N is probably a modified U).
The most unusual aspect is that in all higher eukaryotic genomes studied, only one coding sequence is detectable. Since the two tRNAs differ by several pyrimidine transitions, including one in the wobble position of their anticodon as mentioned above, the implication is that these transitions must occur post-transcriptionally (Hatfield et al., 1982; O'Neill et al., 1985; Pratt et al., 1985; Lee et al., 1987). Moreover, it was anticipated that at least the anticodon CmCA should recognize the tryptophan codon UGG, as predicted by simple Watson-Crick base pairing; instead, both isoacceptors have been shown to recognize UGA in ribosome binding assays, and this was confirmed in in vitro protein synthesis experiments. These isoacceptors are distinguished by their unique ability to form phosphoseryl-tRNA in the presence of a kinase from bovine mammary (Hatfield et al., 1982) and liver tissues (Mizutani and Hashimoto, 1984). The enzymes have been partially purified and appear to consist of at least two different components. Moreover, these enzymes specifically phosphorylate only these two seryl-tRNAs, and no other serine tRNA isoacceptors (Mizutani and Hashimoto, 1984; Sharp and Stewart, 1977). The unique property of the opal suppressor tRNASer to form phosphoseryl-tRNA may indicate some special role in cellular events requiring suppression. As with the tRNASec discussed above, they may translate only UGA codons that have the appropriate neighboring sequence context and dictate the insertion of phosphoserine directly into protein (Hatfield, 1985).

The Present Studies

The present investigation deals with the characterization of a group of genes coding for the major serine tRNA isoacceptors, tRNA4Ser and tRNA7Ser, which are localized to the polytene bands at 12DE on the X chromosome. From previous in situ hybridization, using highly purified tRNAs as probes, Hayashi et al. (1980) showed that this X-linked region constitutes the major hybridization site.
Minor hybridization has also been detected at three other loci, all autosomal: at 23E on the left arm of chromosome 2 (2L), 56D on 2R and 64D on 3L. From RNA sequencing of tRNA4Ser and tRNA7Ser, Cribbs (1982) demonstrated that these two different tRNA isoacceptors are highly homologous in sequence, having only three differences in 86 nucleotides: the C16, I34 (inosine) and A77 in tRNA7Ser are replaced by D16 (dihydrouridine), C34 and G77, respectively, in tRNA4Ser. Note that the numbering system used here for tRNA4,7Ser does not follow the convention of Sprinzl et al. (1987), in which the 77th nucleotide would be number 68. I have used the alternative system, which gives the actual position of the nucleotide in the molecule, to maintain consistency in numbering between the tRNA and the corresponding gene. This numbering system was first adopted by Cribbs (1982), and it is retained here also to maintain consistency between his report and this one. The cloverleaf structure of tRNA7Ser is shown in Fig. 6; its differences from tRNA4Ser are indicated accordingly. Because the two tRNAs are highly similar in sequence, they are indistinguishable by hybridization; thus, in situ hybridization does not reveal the actual distribution pattern, that is, whether both gene types are located at all four cytological sites or whether they are segregated at different sites.

The nucleotide at position 34 is in the anticodon and accounts for the different codon recognition of the two isoacceptors. tRNA4Ser is UCG-specific, while tRNA7Ser can read the codons UCA, UCC and UCU (White et al., 1975; Cribbs et al., 1987a). Since the two tRNAs recognize non-overlapping sets of codons, they are functionally distinct.

Five recombinant plasmids hybridizing to Drosophila melanogaster tRNA4,7Ser have been recovered by Dunn et al. (1979b). Sequences corresponding to their putative genes have been obtained (Cribbs, 1982; Newton, 1984).
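The non-overlapping codon sets of the two isoacceptors follow from the base at anticodon position 34. As a minimal sketch, assuming Crick's wobble rules and ignoring any effect of other modified nucleotides, the codons read by an anticodon can be enumerated as follows. The function name `codons_read` and the two lookup tables are illustrative and are not taken from the thesis.

```python
# Crick's wobble rules: base at the first (5') anticodon position ->
# bases it can pair with at the third codon position.
WOBBLE = {"C": "G", "A": "U", "U": "AG", "G": "CU", "I": "UCA"}

# Standard Watson-Crick pairs for the other two anticodon positions.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codons_read(anticodon):
    """Return the set of codons (5'->3') read by an anticodon (5'->3').

    Pairing is antiparallel: anticodon base 3 pairs with codon base 1,
    base 2 with base 2, and base 1 (the wobble base) with codon base 3.
    """
    first, second, third = anticodon.upper()
    return {PAIR[third] + PAIR[second] + w for w in WOBBLE[first]}


print(sorted(codons_read("IGA")))  # tRNA7Ser anticodon: ['UCA', 'UCC', 'UCU']
print(sorted(codons_read("CGA")))  # tRNA4Ser anticodon: ['UCG']
```

Under these rules inosine at position 34 accounts for three of the four UCN serine codons, and cytidine for the remaining one, so the two sets do not intersect.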
Since the coding sequences corresponding to tRNA4Ser and tRNA7Ser are expected to differ at the three nucleotides, for convenience (and for describing "hybrid" genes later, see below) Cribbs (1982) has designated them as either 444 or 777 genes, based solely on the three diagnostic differences.

Fig. 6. Sequence of tRNA7Ser. The three nucleotides that distinguish tRNA4Ser from tRNA7Ser are replaced as shown at positions 16, 34 and 77 [from Cribbs (1982)].

The major impetus to my present investigations stems from the molecular analysis of the coding sequences for tRNA4Ser and tRNA7Ser (Cribbs, 1982; Newton, 1984). Four of the plasmids, pDt17R, pDt16R, pDt73 and pDt27R, are derived from the major X-linked site at 12DE; the other, denoted pDt5, hybridizes to the 23E site on chromosome 2 (Hayashi et al., 1980). DNA sequence analysis showed that both pDt5 (Newton, 1984) and pDt17R (Cribbs, 1982) contain a single 777 gene, that is, one corresponding to tRNA7Ser. pDt27R contains two 444 genes matching the predicted sequence of tRNA4Ser (Newton, 1984). In this thesis, I refer to these genes, with known corresponding tRNA products, as "bona fide" genes. pDt16R contains two genes: one corresponds to an expected 777, the other is a 774. The latter gene is a "hybrid" structure with positions 16 and 34 allied with tRNA7Ser, but the last nucleotide at position 77 is characteristic of a tRNA4Ser. pDt73 contains a 474 gene, which has the anticodon of a tRNA7Ser, but the other two nucleotides are diagnostic of tRNA4Ser.
Hence, the entire cast of genes appears to form a graded series of intermediate sequences ranging from tRNA4Ser to tRNA7Ser.

As pointed out previously by Cribbs (1982), within the same tRNA gene family, members encoding functionally distinct isoacceptors usually show sequence divergence of between 10 and 30% (Sprinzl et al., 1987). Also, the pattern of mutational events in the different members tends to be random; that is, neither the positions nor the types of the nucleotide changes in the genes can be reliably predicted. In contrast, it is unusual that the genes coding for the two functionally distinct serine isoacceptors, tRNA4Ser and tRNA7Ser, would show such a high degree of homology (96%). It is even more striking that the related variant genes are merely permutations of the above, with the nucleotide changes played out with almost absolute predictability. These observations prompted Cribbs (1982) to speculate that the tRNA4Ser and tRNA7Ser genes are probably not free to diverge; rather, their similarity in sequence would allow them to keep in check with one another and to evolve continually as a cohesive unit. The intermediate forms would thus reflect the imperfections in this "checking" process. The driving force for such a maintenance process is not clear but has been suggested by Cribbs (1982) to be non-reciprocal recombination, specifically gene conversion. Such a concept would fit well with the concerted evolution of other multigene families (both coding and non-coding), which has been forged into a unifying theory known as Molecular Drive by Dover (1982). The formulation of this theory is based on the observation that in many multigene families that are found in many different species (for example, tRNA genes), the members exhibit unexpected and substantial sequence homogeneity within a species but not between species.
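The three-digit designation of Cribbs (1982) and the quoted 96% homology can both be reproduced with a short sketch. The function names and the lookup table are hypothetical. The gene-level bases are an assumption inferred from the RNA residues reported for positions 16, 34 and 77 (inosine taken to derive from a genomic A, dihydrouridine from a genomic T); the thesis text itself gives only the RNA residues.

```python
# Diagnostic positions (Cribbs numbering) and the digit assigned to each base.
# ASSUMPTION: gene-level bases inferred from the RNA residues
# (tRNA7Ser: C16, I34 -> A, A77; tRNA4Ser: D16 -> T, C34, G77).
DIAGNOSTIC = {
    16: {"C": "7", "T": "4"},
    34: {"A": "7", "C": "4"},
    77: {"A": "7", "G": "4"},
}


def classify_gene(base16, base34, base77):
    """Return the three-digit designation, e.g. '444', '777', '774' or '474'."""
    digits = []
    for pos, base in zip((16, 34, 77), (base16, base34, base77)):
        try:
            digits.append(DIAGNOSTIC[pos][base.upper()])
        except KeyError:
            raise ValueError(f"unexpected base {base!r} at position {pos}")
    return "".join(digits)


def percent_identity(length, differences):
    """Percentage of positions that are identical between two aligned sequences."""
    return 100.0 * (length - differences) / length


print(classify_gene("C", "A", "A"))        # 777 (bona fide tRNA7Ser gene)
print(classify_gene("T", "C", "G"))        # 444 (bona fide tRNA4Ser gene)
print(classify_gene("C", "A", "G"))        # 774 (hybrid, as in pDt16R)
print(classify_gene("T", "A", "G"))        # 474 (hybrid, as in pDt73)
print(round(percent_identity(86, 3), 1))   # 96.5
```

With three diagnostic positions and two states each, eight designations are possible in principle; the text describes four of them (444, 777, 774 and 474).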
Family homogeneity, or cohesive evolution, could be achieved by several molecular mechanisms. For gene families with their members arranged in tandem arrays, such as rDNA (Coen and Dover, 1983), both gene conversion and standard recombination are thought to operate to maintain sequence homogeneity of the gene members. However, for multigene families whose members are irregularly spaced and in random orientation, such as the tRNA genes, standard recombination may cause duplication and deletion of the gene members. Instead, it has been hypothesized that gene conversion may be the predominant force in sequence maintenance. When variations arise in a member of the family, they may become fixed in a population as a consequence of stochastic and directional (biased) transmission of the variation. This concerted pattern of fixation of variations, or sequence turnover, in a gene family and in a population is defined as Molecular Drive. This is in opposition to the Mendelian mode of evolution, which is modeled on the premise that mutations are unitary and passive events, and that their spread through the gene family relies on the activities of selection and the vagaries of drift. Note that these two activities, in turn, must rely on basic theoretical and empirical assumptions that would allow appropriate allotment of "adaptive" and "non-adaptive" values to each. By contrast, no such assumptions are necessary for Molecular Drive, for this alternative process in multigene evolution cannot be studied mathematically using traditional ad hoc assumptions.

The other major impetus to the present work stems from the phenomenon of dosage compensation associated with the X chromosome in Drosophila. Females have two X chromosomes, while males have one; but despite the dosage differential of the X chromosome in the two sexes, most of the X-linked genes exhibit a more nearly equal expression than expected based strictly on the number of genes present.
That is, the normal two-dose female is roughly equivalent to the one-dose male in X-linked gene expression. This equalization or buffering effect was first recognized by Muller in 1932 (reviewed by Stewart and Merriam, 1982) and has been termed dosage compensation. The phenomenon can manifest itself in another way. In mutants of short segmental aneuploid series, rather than in chromosomally wild-type males and females as discussed above, the genes in question can exhibit a dosage effect. In females with a small deletion removing one copy of the gene, the total output of the gene product at that site would be only 50% of that in males, even though both sexes are now hemizygous for this gene. In males with a duplicated copy of a gene, the total output would be twice that of normal females (i.e. both sexes have two copies of the X-linked gene). As predicted from this dosage effect rule, it follows that a duplicated female with three copies of a gene would be only 50% more active at that locus than wild-type males, despite a threefold difference in gene copy number. Hence, in an apparently paradoxical manner, the ability of the X chromosome to maintain equal expression between the sexes is also reflected in the differential escape from the dosage compensation mechanisms in the segmental aneuploids.

To test whether the tRNA4Ser and tRNA7Ser genes follow the rules of dosage compensation, Birchler et al. (1982) analyzed the dosage effects of the genes in genetic crosses using X:Y translocations (Stewart and Merriam, 1974 and 1975) that result in progeny with one, two, or three doses of the 12A-13A region in females and one or two doses in males. If the locus responds to compensation, then the level of gene product would be expected to be directly proportional to the dosage of the short chromosomal segment in each sex, but the expression in males would be approximately twice as great per copy.
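The dosage-effect rule described above reduces to simple arithmetic: output is proportional to copy number within a sex, with each male copy assumed to be twice as active as each female copy. The sketch below is only an idealized illustration of that rule; the names `PER_COPY_ACTIVITY` and `expected_output` and the arbitrary activity units are hypothetical.

```python
# Relative activity per gene copy for a dosage-compensated X-linked locus.
# ASSUMPTION: an idealized twofold per-copy difference between the sexes.
PER_COPY_ACTIVITY = {"female": 1.0, "male": 2.0}


def expected_output(sex, copies):
    """Expected gene product (arbitrary units) for a compensated locus."""
    return PER_COPY_ACTIVITY[sex] * copies


# Wild type: two-dose female equals one-dose male.
print(expected_output("female", 2), expected_output("male", 1))  # 2.0 2.0

# Deletion female (one copy) versus wild-type male (one copy): 50%.
print(expected_output("female", 1) / expected_output("male", 1))  # 0.5

# Duplication male (two copies) versus wild-type female (two copies): twofold.
print(expected_output("male", 2) / expected_output("female", 2))  # 2.0

# Duplication female (three copies) versus wild-type male: 50% more active.
print(expected_output("female", 3) / expected_output("male", 1))  # 1.5
```

A locus that escapes compensation would instead show equal per-copy activity in the two sexes, so the same comparisons would give ratios of 1, 1 and 3.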
Although their results are complicated by the presence of other minor tRNASer sites on the autosomes, they do suggest that the X-linked tRNA4Ser genes are compensated but, interestingly, not the tRNA7Ser genes.

Ideally, both concerted evolution and dosage compensation of the tRNA4,7Ser genes should be investigated by using a unique, or at least a distinguishable, marker that can be easily followed. Indeed, such markers have been exploited in the extensive genetic and molecular analyses of gene conversion of suppressor tRNASer genes in S. pombe (Munz et al., 1981). However, a parallel approach in Drosophila would be more difficult, since the groundwork on suppressor tRNAs is virtually non-existent. Furthermore, genetic screens for convertants of tRNA genes are relatively simple in yeast; both lethality and phenotypes based on spore colours can be engineered to assist in the recovery of convertants. While this may be possible in theory, a similar approach with the Drosophila tRNA4,7Ser genes may be a much more difficult and laborious task in practice.

As an alternative, I have elected to walk molecularly through the 12DE region in order to analyze all members of this gene cluster (Chapter I). An immediate benefit of this expansive cloning study would be a clearer identification of the number and types of genes encoded by this region, which has not been resolved by in situ hybridization. The DNA sequences of these genes, in conjunction with those of the autosomal copies of tRNASer genes being determined by Dr. D. A. R. Sinclair, should at least provide some idea of whether the hybrid genes are likely to be reciprocal or non-reciprocal products. I have also attempted to address the possible origins of these hybrid genes in three ways.
The first (Chapter II, Part I) is to identify "sequence signatures" in the flanking regions of the hybrid genes that may delineate their possible relationship with the rest of the tRNA4,7Ser genes at 12DE and, where possible, with those on the autosomes. The second is to analyze strain differences in representative tRNA4,7Ser genes. It was reasoned that if different permutations of the hybrid genes can be identified at homologous loci, then this would attest to the dynamics of sequence turnover as intimated by Cribbs (1982), and would provide convincing evidence for interactions between the tRNA4,7Ser genes. The third (Chapter II, Part II) is a parallel study of other Drosophila sibling species that have diverged from melanogaster for various increments of time. This last approach should delimit the approximate times of origin of the hybrid genes. The cross-species approach should also provide a deeper insight into their mode of evolution. If in fact the tRNASer genes are undergoing a cohesive mode of evolution, then the prediction based on the theory of Molecular Drive would be that gene members or their surrounding sequences should show more sequence similarity within a species than between species. Furthermore, for co-evolution of irregularly spaced multigenes, conversion has been invoked as the predominant mode of transmission of genetic variations. In the current molecular models of conversion, heteroduplex formation has been espoused as the key intermediate step in the process (see Discussion). If the hybrid tRNA4,7Ser genes are formed as a consequence of conversion, this would suggest that DNA slippage and mispairing between the tRNA4Ser and tRNA7Ser genes would be required as the initiating events. Such an occurrence, slip-sliding of DNA, has been hinted at as a distinct possibility by previous studies on a tRNAArg cluster located 600 bp downstream from the tRNA4Ser genes within pDt27R mentioned previously (Newton, 1984).
The tRNAArg genes are arranged as four tandemly duplicated units, including large amounts of flanking sequences. Each duplicated unit is demarcated on either side by an eight base pair direct repeat, TAGCCCAA. This duplication pattern conveys the impression that the units are likely to have been formed by unequal crossing-over near the short direct repeats. Thus, in Chapter III, I have used the tRNAArg genes as independent "markers" to test whether DNA slippage can account for both the gene duplication and the gene conversion observed at 12DE, by examining the organization of tRNAArg genes in distantly related melanogaster sibling species.

The general organizational pattern of the tRNA4,7Ser genes at 12DE may suggest why the tRNA4Ser genes are dosage compensated while the tRNA7Ser genes are not, despite their close proximity. One possible scenario is that the two gene types are segregated at this chromosome site, permitting some form of differential regulation at the level of dosage compensation. Alternatively, the supposed inability of the tRNA7Ser genes to dosage compensate could be an artifact stemming from insufficient sensitivity in the assays employed by Birchler et al. (1982). In Chapter V, I have also attempted to examine this problem by comparing the repetitive sequences surrounding the tRNASer genes with those around the promoter region of white, to search for potential candidates involved in dosage compensation.

Chapter IV is a tangential excursion into the sequence organization of tRNA3bVal at the chromosomal bands 90BC, as part of a comprehensive analysis of the in vitro and in vivo expression of these genes (Dunn et al., 1979a; Larsen et al., 1982).

METHODS AND MATERIALS

REAGENTS

Enzymes Used in Molecular Cloning

Restriction endonucleases were purchased from Bethesda Research Laboratories (BRL), New England BioLabs (NEBL), Boehringer Mannheim Canada (BMC) and P-L Biochemicals (P-L).
Other enzymes were purchased from the following sources:

Enzyme                        Suppliers
Klenow enzyme                 Promega, BMC, P-L, BRL
Polynucleotide Kinase         P-L, BMC, Dr. D. L. Cribbs
E. coli DNA polymerase I      P-L, BMC
S1 nuclease                   BRL
DNase I                       BMC
Calf Intestinal Phosphatase   BMC
Ribonuclease A                Sigma
Proteinase K                  BRL, BMC
Lysozyme                      Sigma
T4 DNA Ligase                 P-L, BRL
T4 RNA Ligase                 BRL, Dr. D. L. Cribbs

Oligonucleotides

All oligonucleotides used in this work are listed in Table I.

TABLE I - LIST OF OLIGONUCLEOTIDES

Name    Sequence                       Supplier
GT6     5'-GCAGTCGTGGCCGA-3'           T. Atkinson
GT7     5'-CGCTCCCAGAGGGAATCTG-3'      T. Atkinson
Arg5'   5'-ATCCATTAGGCCACACGG-3'       T. Atkinson*
Arg3'   5'-CGAGTCCTGTCACGGTCG-3'       T. Atkinson*
F1      5'-GTAAAACGACGGCCAGT-3'        T. Atkinson*
R1      5'-CAGGAAACAGCTATGAC-3'        T. Atkinson*
Pex     5'-CCCAGTCACGACGTT-3'          P-L Biochemicals

* Purified by C. H. Newton
* Also purchased from P-L Biochemicals

Oligonucleotides synthesized by T. Atkinson (UBC) were supplied as a crude powder and were purified before use. They were dissolved in 100 µl of sterile distilled water, and an aliquot of 1-2 A260 units of the crude material (10-20 µl) was made 50% in formamide. The mixture was heated at 90 °C for 3 minutes and immediately applied to a 20% polyacrylamide denaturing gel (1% bis-acrylamide, 45 mM Tris-Cl pH 8.3, 45 mM boric acid, 1 mM EDTA, 8.4 M urea). Electrophoresis was carried out at 1,500 volts for about 3 hours and the bands were visualized by shadowing over fluorescent silica gel plates under UV illumination. The band corresponding to the full-length oligonucleotide was excised with a scalpel and the oligonucleotide was eluted overnight at 37 °C in 1 ml of 0.5 M ammonium acetate, 10 mM MgCl2. The supernatant was passed through a C18 SEP-PAK and the column was washed with successive 1 ml volumes of 60% methanol. The eluate containing the oligonucleotide was evaporated to dryness with a Savant Speed Vac Concentrator.
The oligonucleotide pellet was redissolved in 50-100 µl of TE and its concentration determined by UV absorbance at 260 nm.

Nucleotides

Deoxyribonucleoside triphosphates and 2',3'-dideoxyribonucleoside triphosphates were purchased from P-L Biochemicals. The nucleotides were dissolved in TE to approximate concentrations of 10 mM. The exact concentrations were determined spectrophotometrically. [α-32P]-deoxyribonucleoside triphosphates and [γ-32P]-ATP were purchased from Amersham. They were supplied in solution as triethylammonium salts with specific activities of ~3000 Ci/mmol. Cytidine 3'-monophosphate, containing an equal amount of cytidine 2'-monophosphate contaminant, was purchased from Sigma.

Phenol

Liquified phenol was purchased as an 88% aqueous solution from Mallinckrodt and purified by distillation by C. H. Newton. Aliquots were stored at -20 °C in the dark. When required, they were thawed at 65 °C; 8-hydroxyquinoline and β-mercaptoethanol were added to final concentrations of 0.1% (w/v) and 0.2% (v/v), respectively, and the phenol was stored in the dark at 4 °C. Just prior to use, a small volume was transferred to a glass test tube, extracted several times with 1 M Tris-Cl pH 8.0 and used for periods of up to one week.

Formamide

Analytical grade formamide was purchased from BDH Chemicals and deionized by stirring overnight with Bio-Rad mixed bed resin AG501-X8(D) (15 g/100 ml). The resin was dried in vacuo overnight before use. After deionization, the resin was removed by filtration through Whatman glass microfiber filters (934-AH) and the deionized formamide was stored in small aliquots in the dark at -20 °C.

Acrylamide

Acrylamide (Eastman Kodak) was stored at 4 °C in brown bottles as a 40% aqueous solution. Just before use, bis-acrylamide (Eastman Kodak) was added as required. The acrylamide:bis-acrylamide solution was deionized overnight as described (see Formamide above) and filtered through Whatman 3MM paper.
Agarose

Agarose (ultra PURE grade) used for most analytical gels was purchased from BRL. For isolation of DNA from preparative gels, low melting agarose purchased from Bio-Rad Laboratories was occasionally used.

Galactosides

Isopropyl-β-D-thiogalactoside (IPTG) was purchased from BRL. It was dissolved in distilled water to a final concentration of 100 mM and stored at -20 °C. 5-Bromo-4-chloro-3-indolyl β-D-galactoside (X-gal) was purchased from Sigma. It was used as a 2% solution in dimethylformamide and stored at -20 °C in the dark.

Autoradiography

Curix RP1 X-ray film (Gevaert) and Dupont Cronex Lightning-Plus intensifying screens used for autoradiography were purchased from local suppliers.

Supplies For Culture Media

Bacto-tryptone, Bacto-yeast extract and Bacto-agar were purchased from Difco. Type-A hydrolysate of casein (NZ amine) was purchased from Humko Sheffield Chemical (division of Kraft). Soy flour and live yeast for Drosophila cultures were purchased from local stores.

BACTERIAL STRAINS

The following strains of E. coli were used as hosts for recombinant DNA molecules:

LE392  F-, hsdR514 (rk-, mk+), supE44, supF58, lacY1 or Δ(lacIZY)6, galK2, galT22, metB1, trpR55, λ-; a derivative of the E. coli strain ED8654 (Borck et al., 1976; Murray et al., 1977).

RR1  F-, hsdS20, ara-14, proA2, lacY1, galK2, rpsL20, xyl-5, mtl-1, supE44, λ- (Bolivar et al., 1977).

JM101  Δ(lac, pro), supE, thi, strA, sbcB15, endA, hspR4, F' traD36, proAB, lacIq, lacZΔM15 (Messing et al., 1981).

JM103  Δ(lac, pro), thi, strA, supE, endA, sbcB, hsdR-, F' traD36, proAB, lacIq, ZΔM15 (Messing et al., 1981).

Q358  hsdRk-, hsdMk+, supE, φ80r (Karn et al., 1980).

Q359  hsdRk-, hsdMk+, supE, φ80r, P2 (Karn et al., 1980).

DH1  F-, recA1, endA1, gyrA96, thi-1, hsdR17 (rk-, mk+), supE44, λ- (D. Hanahan, 1983).

DH5α  F-, recA1, endA1, gyrA96, thi-1, hsdR17 (rk-, mk+), supE44, λ-, relA1, φ80dlacZΔM15 (D. Hanahan, 1985).

NS428  N205 (λ Aam11, b2, red3, cIts857, Sam7) (Sternberg et al., 1977).
NS433  N205 (λ Eam4, b2, red3, cIts857, Sam7) (Sternberg et al., 1977).

JC8111  recB21, recC22, sbcB15, recF143 (Horii and Clark, 1973).

SF8  hsdR-, hsdM-, recBC, lop-11 (ligase overproducer), supE44 (su2+), gal-96, SmR, leuB, [marker illegible in source], thr (Davis et al., 1980).

SMR10  E. coli C-1a (λ cos2, ΔB, red3, xis1, gamam210, cIts857, nin5, Sam7)/λ (Rosenberg et al., 1985).

CULTURE MEDIA AND CONDITIONS

The following media were used for growth of E. coli.

LB  1.0% Bacto-tryptone, 0.5% Bacto-yeast extract, 0.5% NaCl

YT  0.8% Bacto-tryptone, 0.5% Bacto-yeast extract, 0.5% NaCl

2x YT  1.6% Bacto-tryptone, 0.5% Bacto-yeast extract, 0.5% NaCl (Sanger et al., 1980)

LB-glucose  1.0% Bacto-tryptone, 0.5% Bacto-yeast extract, 0.5% NaCl, 1% glucose

TB ("Terrific Broth")  1.2% Bacto-tryptone, 2.4% Bacto-yeast extract, 4% glycerol, 17 mM KH2PO4, 72 mM K2HPO4 (Tartof and Hobbs, 1987).

M9 Salts  50 mM Na2HPO4, 25 mM KH2PO4, 8.5 mM NaCl, 20 mM NH4Cl, 1 mM MgSO4, 0.1 mM CaCl2, 10 mM glucose, 0.001% thiamine (Miller, 1972).

SOB  2% Bacto-tryptone, 0.5% Bacto-yeast extract, 10 mM NaCl, 2.5 mM KCl, 10 mM MgCl2, 10 mM MgSO4 (Hanahan, 1983).

SOC  2% Bacto-tryptone, 0.5% Bacto-yeast extract, 10 mM NaCl, 2.5 mM KCl, 10 mM MgCl2, 10 mM MgSO4, 20 mM glucose (Hanahan, 1983).

NZYM  1% NZ-amine, 0.5% Bacto-yeast extract, 0.5% NaCl, 10 mM MgCl2 (Leder et al., 1977).

LKB  1% Bacto-tryptone, 0.5% Bacto-yeast extract, 1% NaCl, 4 mM NaOH (Rosenberg et al., 1985).

For plates, Bacto-agar was added to the liquid medium to a final concentration of 15 g/l. For top agar overlays, 7.5 g/l of Bacto-agar was used. In experiments where plaque lifts were anticipated, agarose was used in the top overlays instead. Strains harbouring plasmids were grown in media containing 25 µg/ml to 100 µg/ml of ampicillin, depending on the health of the host.
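As a convenience for scaling these recipes, the percentages can be converted to weighed amounts for any batch volume. The sketch below is illustrative only: the names `MEDIA`, `AGAR_G_PER_L` and `grams_needed` are hypothetical, only two of the listed media are included, and the percentages are assumed to be weight per volume (so that 1% corresponds to 10 g per litre), which the text does not state explicitly.

```python
# Recipes as percent (ASSUMED w/v) of each solid component, from the list above.
MEDIA = {
    "LB": {"Bacto-tryptone": 1.0, "Bacto-yeast extract": 0.5, "NaCl": 0.5},
    "YT": {"Bacto-tryptone": 0.8, "Bacto-yeast extract": 0.5, "NaCl": 0.5},
}

# Bacto-agar concentrations given in the text for plates and top overlays.
AGAR_G_PER_L = {"plate": 15.0, "top": 7.5}


def grams_needed(medium, volume_l, agar=None):
    """Grams of each solid component for `volume_l` litres of medium.

    `agar` may be None (liquid medium), 'plate' or 'top'.
    """
    # 1% (w/v) corresponds to 10 g per litre.
    amounts = {name: pct * 10.0 * volume_l for name, pct in MEDIA[medium].items()}
    if agar is not None:
        amounts["Bacto-agar"] = AGAR_G_PER_L[agar] * volume_l
    return amounts


# Example: 500 ml of LB for plates.
for component, grams in grams_needed("LB", 0.5, agar="plate").items():
    print(f"{component}: {grams} g")
```

Components specified in molar units (for example the salts in SOB or M9) would need formula weights and are deliberately left out of this sketch.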
To screen for E. coli hosts (JM101, JM103 and DH5α) harbouring vectors (M13, pUC and pEMBL) exhibiting the α-complementation phenotype at the lac locus, 50 µl of a 2% X-gal solution and 10 µl of a 100 mM IPTG solution were either added to the soft overlay before plating cells (M13 transformants) or applied evenly onto the surface of the plates with the aid of a bent glass rod before spreading cells (pUC and pEMBL transformants). It was fortuitously observed that at least two other E. coli hosts (JC8111 and SF8) could also be screened by a "pseudo-α-complementation phenotype". If the colonies of these strains were kept small (<0.2 mm), those harbouring recombinant pUC or pEMBL plasmids containing inserts remained pale green on X-gal selection plates for a few hours longer than those harbouring wild-type vectors (dark blue-green). If these small colonies were stored at 4 °C, this "pseudo-α-complementation" phenotype could be prolonged and reliably applied as a selection scheme to these, and possibly also to other as yet untested, E. coli strains.

For routine experiments, cells were usually grown at 37 °C. Hosts to be used for plating λ phages the next day were usually cultured at 30 °C overnight with moderate shaking to prevent the cells from overgrowing. E. coli strains carrying the temperature-sensitive mutation cIts857 (NS428, NS433 and SMR10) were propagated at temperatures at or below 32 °C. Growth was monitored by measuring A550 using a Cary 210 spectrophotometer.

Fruitflies

Wild-type Drosophila melanogaster isogenic for all the major chromosomes was constructed by Dr. G. M. Tener (UBC). The D. melanogaster mutant bearing a deletion from 12A-12E on the X chromosome, Df(1)g-l f B/In(1)AM, was obtained from Dr. D. A. R. Sinclair (UBC). The Drosophila sibling species D. erecta, D. yakuba, D. teissieri and D. mauritiana were obtained from the Pasadena Drosophila stock center (Pasadena, California). D. simulans was obtained from Dr. T. A.
Grigliatti (UBC). Fruitflies were cultured on Drosophila soy food containing the following ingredients in one litre of tap water (Dr. G. M. Tener, unpublished): 100 g full fat soy flour, 20 g yeast extract, 17 g agar, 1 g citric acid, 9 g trisodium citrate, 40 g glucose, 40 g sucrose, 15 ml of 10% methyl p-hydroxybenzoate in 95% ethanol, 20 mg streptomycin and 10 mg tetracycline. Prior to DNA isolation, adult flies were cultured under non-crowded conditions and transferred to fresh food every 3-4 days. Live yeast was frequently seeded on the surface of the medium to increase the fecundity of the flies.

MASS COLLECTION OF EMBRYOS FROM D. melanogaster

For mass isolation of Drosophila embryos, flies were cultured at 25 °C in standard cages under high humidity and constant 12 hr light/dark cycles. Weigh boats containing 2% agar with a thin layer of yeast paste on top were placed inside the cages as collecting vessels. Embryos were collected every 12 hr by flushing the yeast paste into a small metal screen under lukewarm tap water. The retained embryos were rinsed free of debris and dechorionated in 50% bleach for 3 minutes, followed by several quick rinses under running tap water. The dechorionated embryos were transferred to 1.5 ml Eppendorf polypropylene tubes and stored at -70 °C.

PLASMIDS AND BACTERIOPHAGE VECTORS

The following vectors were used routinely in cloning:

M13 vectors     mp8, mp9, mp10, mp11, mp18   (Messing, 1983)
pEMBL vectors   8-                           (Dente et al., 1985)
pUC vectors     pUC8                         (Messing, 1983)
cosmid vectors  pJB8                         (Ish-Horowicz and Burke, 1981)
                cosPneo                      (Steller and Pirrotta, 1985)
λ vectors       EMBL3                        (Frischauf et al., 1983)
                EMBL4                        (Frischauf et al., 1983)
                2001                         (Karn et al., 1984)

INTRODUCTION OF PLASMID AND DOUBLE-STRANDED BACTERIOPHAGE M13 DNA INTO Escherichia coli

Reagents

Routine transformation of E. coli was performed using a solution of 50 mM CaCl2. For E. coli strain RR1, a 100 mM CaCl2 salt solution was required (Dagert and Ehrlich, 1979).
Both solutions were sterilized by autoclaving. For preparation of frozen competent cells, the modified reagents derived from Hanahan (1983 and 1985) were used. The salt solutions were made into individual 10 x stocks and sterilized by autoclaving. Hexamine cobalt (III) chloride was sterilized by filtration.

FB   100 mM KCl, 50 mM CaCl2, 15% glycerol (v/v), 10 mM potassium acetate, adjusted to final pH 6.2 (Hanahan, 1985).

MHB  45 mM MnCl2, 10 mM CaCl2, 100 mM RbCl, 3 mM hexamine cobalt (III) chloride, 10% (v/v) glycerol (modified from Hanahan, 1983; M. Fettes, personal communication).

BACTERIAL TRANSFORMATION

For standard bacterial transformation, competent cells of E. coli were prepared using the CaCl2 mediated method described by Dagert and Ehrlich (1979). Cells were usually starved in CaCl2 for 12-16 hours at 4 °C before use to enhance their competence.

High efficiency competent cells of the E. coli strain DH5a were prepared using protocol I of "Frozen Storage of Competent Cells" described by Hanahan (1985). Ten colonies picked from an overnight SOB plate were used to inoculate 100 ml of SOB. The culture was grown to an A550 of 0.5 to 0.7 at 37 °C with good aeration and rapidly chilled by swirling on ice for 5 minutes. The cells were harvested by centrifugation at 2,000 rpm for 15 minutes at 4 °C in a clinical centrifuge. The pellet was resuspended in 1/3 volume of ice-cold FB and incubated on ice for 30 minutes. The cells were pelleted again as described and then gently dispersed in 1/12.5 volume of FB. Aliquots of 200 µl of the competent cells were dispensed into pre-chilled 1.5 ml Eppendorf polypropylene tubes. The cells were quick-frozen in an ethanol/dry-ice bath and stored at -70 °C until needed.

Competent cells of the E. coli strain JC8111 were prepared as described by Hanahan (1983) with minor modifications (M. Fettes, personal communication).
Several colonies from a fresh plate were inoculated in 100 ml of 2x YT supplemented with 20 mM MgCl2 and grown to an A550 of 0.7-0.8. The cells were chilled on ice for 15 minutes and pelleted by centrifugation at 3000 rpm for 5 minutes in an SS34 rotor. The pellet was resuspended in 1/3 volume of MHB and stored on ice for 30 minutes. The cells were pelleted again as described and resuspended in 1/12.5 volume of MHB. Two aliquots of DMSO (280 µl total) were added 10 minutes apart with the cells kept chilled on ice. After a further 5 minute incubation, 200 µl aliquots of the competent cells were flash frozen as described.

Frozen competent cells of strain DH5a or JC8111 were thawed at room temperature just prior to use. Plasmid DNA (< 20 µl) was added and the cells were stored on ice for 10 minutes. They were heat shocked at 42 °C in a heating block for two minutes (or 37 °C for 5 minutes for the weaker strain JC8111) and then cooled rapidly on ice for one minute. SOC was added to the cells to a final volume of one ml and the cells were incubated for 30 minutes in a 37 °C water bath with occasional agitation by tube inversion. Aliquots of 10 µl to 100 µl of the transformed cells were plated on appropriate selective and indicator media.

When the bacteriophage M13 was used as the cloning vector, transformation was performed as described above using the E. coli strains JM101 or JM103 made competent by the CaCl2 method, except that after heat shock at 42 °C, 3 to 4 ml of soft agar overlay containing X-gal and IPTG were added to the cells. The contents were quickly poured onto plates that had been pre-warmed at 37 °C.

ISOLATION OF PLASMID AND DOUBLE-STRANDED M13 DNA

Two methods were used to isolate supercoiled plasmid and phage DNA from E. coli. The first method uses Triton X-100 to gently lyse cells (Davis et al., 1980) and was adopted exclusively for isolating DNA from large cultures (1 to 2 litres).
The second method (Birnboim and Doly, 1979; Maniatis et al., 1982) employs SDS and alkali to lyse the cells, and has been generally used for rapid isolation of DNA from "mini-preps" of 1 to 2 ml cultures. A scaled up version of this method has also been applied successfully in isolating DNA from large cultures. Both the Triton and alkaline lysis methods were satisfactory, but the latter was more convenient to use and was generally preferred.

LARGE SCALE DNA ISOLATION

Plasmid DNA

A single ampicillin resistant colony of E. coli was inoculated into 25 ml of LB or YT containing an appropriate concentration of ampicillin. The cells were grown overnight at 37 °C with vigorous shaking to ensure good aeration. The cells (10 ml) were inoculated into 1 litre of M9 medium and growth was continued until an A550 of 0.6 was reached. Chloramphenicol was added to a final concentration of 100 µg/ml (dissolved in 5 ml of 95% ethanol) and incubation was continued for 12-16 hours. The cells were harvested at 4 °C by centrifugation in a Sorvall GSA rotor at 6000 rpm for 10 minutes.

Double-stranded M13 DNA

The method employed here is adapted from a procedure by Dr. Mark Zoller (Cold Spring Harbor). A single colony of the E. coli strain JM101 or JM103 was inoculated into 2 ml of M9 salts and incubated overnight at 37 °C. Approximately 200 µl of the overnight culture was inoculated into 5 ml of 2x YT and growth was continued for 2 hr at 37 °C. The cells were diluted 10 fold with 2x YT. A small volume (1-2 ml) of the cells was infected with a single plaque of M13 and incubated at 37 °C for 6 hr. Another aliquot (4-5 ml) of the cells was inoculated into 500 ml of M9 salts and was grown to a cell density of A550 = 0.7. The small volume of M13 infected cells was then added to the large culture and incubation was continued at 37 °C for 90 minutes. The cells were collected by centrifugation at 4 °C in a GSA rotor at 6000 rpm for 15 minutes.
They were lysed by using the Triton X-100 method and the bacteriophage DNA was purified by two passages through CsCl gradients. Both of the lysis methods and the DNA purification procedure are described below.

Lysis by Triton X-100

The cell pellet was resuspended in 2.5 ml of 50 mM Tris-Cl (pH 8.0), 25% sucrose by gentle pipetting. EDTA was added to a final concentration of 250 mM followed by 2.5 mg of lysozyme. The cells were mixed by vortexing and then stored at 4 °C for 20 minutes. Lysis was achieved by adding 3.5 ml of 2% Triton X-100 followed by a further 10 minute incubation on ice. The lysate was cleared by centrifugation at 4 °C in a Beckman type-30 rotor at 25,000 rpm for 60 minutes. The lysate was transferred to sterile 30 ml Corex tubes or polypropylene tubes. An equal volume of redistilled 1:1 phenol:chloroform (v/v) was added and agitated by gentle vortexing. The phases were separated by a brief centrifugation in an SS34 rotor for 5 minutes. The phenol extraction procedure was repeated and then followed by two washes with chloroform. The aqueous phase was carefully transferred to a clean Corex tube. Sodium acetate (pH 6.0) was added to a final concentration of 0.3 M followed by 0.6 volume of isopropanol. The DNA was pelleted by centrifugation at 7,000 rpm in an SS34 rotor for 30 minutes at 4 °C. After brief drying, the pellet was resuspended in TE (10 mM Tris-Cl, pH 8, 1 mM EDTA). The DNA was treated with ribonuclease A (100 µg/ml) at 37 °C for 30 minutes and was further purified by CsCl equilibrium centrifugation.

Lysis by Alkali

Large Scale DNA Preparation

A scaled up alkali lysis procedure was used essentially as described by Maniatis et al. (1982) with minor modifications. A single ampicillin resistant colony was inoculated into 5 ml of LB or YT containing 50 µg to 100 µg/ml of ampicillin and grown overnight at 37 °C.
A 2.5 ml aliquot of the culture was inoculated into 500 ml of the same medium and allowed to grow with vigorous shaking at 37 °C until the culture was almost saturated (A550 = 1.0 to 1.5). Cells were harvested by centrifugation at 4 °C in a GSA rotor at 5000 rpm for 10 minutes. The pellet was resuspended in 5 ml of 50 mM glucose, 25 mM Tris-Cl (pH 8.0), 10 mM EDTA. Solid lysozyme was added to a final concentration of 5 mg/ml and mixed with the cells by gentle vortexing. After 5 minutes at room temperature or 20 minutes on ice, 10 ml of 0.2 M NaOH, 1% SDS was added by rapid ejection from a pipet. An ice-cold solution of 3 M potassium acetate (7.5 ml) was added and the contents mixed gently by inverting the tube 2-3 times. After 10 minutes, the lysate was cleared by centrifugation in an SS34 rotor at 10,000 rpm for 30 minutes at 4 °C. After this step, the DNA was treated identically as above in preparation for CsCl gradient centrifugation.

CsCl Gradient Purification of DNA

The DNA pellet was dissolved in TE and solid CsCl (1.13 g/ml) was added. Ethidium bromide was added to the DNA in the dark to a final concentration of 0.6 mg/ml. The contents were transferred to Beckman "quick-seal" tubes with a Pasteur pipet and sealed with a heat sealer. The tubes were centrifuged at 20 °C in a VTi65 rotor at 65,000 rpm for four hours or at 50,000 rpm for 14 hours. Plasmid DNA was identified with the aid of a long wave UV lamp (365 nm) and removed with a 3 cc B-D syringe equipped with a 26 gauge needle. The DNA was extracted several times with equal volumes of water-saturated n-butanol in the dark to remove the ethidium bromide. The CsCl was subsequently removed by dialysis in several changes of 20 mM Tris-Cl (pH 7.4), 1 mM EDTA at 4 °C. Alternatively, two volumes of distilled water were added to the DNA, which was then precipitated with two volumes of 95% ethanol.
The concentration of DNA was determined by absorbance at 260 nm using a Cary-120 spectrophotometer, assuming 1 A260 = 50 µg/ml (Davis et al., 1980).

Purification of DNA by Column Chromatography

When absolute purity was not required, plasmids were prepared by column chromatography based on a procedure developed by Dr. Ian Gillam (UBC). The agarose matrix, A-15 m, was equilibrated in 100 mM acetic acid (pH 5.0), 0.02% sodium azide by repeated washing and the slurry was packed into an "upward flow" column (Pharmacia, 90 cm x 2.5 cm). DNA samples were applied as a 5% sucrose solution and cushioned into the column bottom by a "chase" consisting of 100 mM acetic acid pH 5.0, 0.02% sodium azide, 10% sucrose. The DNA was eluted upward at a flow rate of 10 ml/hr for 16 hr. Fractions of 1 ml volume were collected and monitored for the presence of plasmid DNA by absorbance at 260 nm. Fractions (usually between 34 and 43) containing plasmid DNA were pooled and concentrated by precipitation with ethanol. This procedure is economical but it suffers from the disadvantage that the plasmids are usually contaminated by trace amounts of chromosomal DNA. In addition, the quality of DNA is less predictable with respect to the amount of nicking compared to the CsCl gradient procedure.

Small Scale Mini-Preparation

This procedure has been described by Maniatis et al. (1982) and is a modification of the method developed by Birnboim and Doly (1979). Cells from a 2 ml overnight culture were harvested in a 1.5 ml Eppendorf tube in an Eppendorf micro-centrifuge (as for all subsequent centrifugation steps) and the pellet was resuspended in 100 µl of 50 mM glucose, 25 mM Tris-Cl (pH 8.0), 10 mM EDTA. Lysozyme was omitted as it is unnecessary with most E. coli strains (except perhaps DH1). Two volumes of a freshly prepared solution of 0.2 M NaOH, 1% SDS were added and the lysate was briefly incubated on ice.
A 150 µl volume of potassium acetate (pH 4.8) was added and the contents were gently mixed by inverting the tube several times. The tube was allowed to sit on ice for 15 minutes and then centrifuged at 4 °C at 16,000 rpm. The supernatant was transferred to another clean tube and extracted once with an equal volume of 1:1 phenol:chloroform (v/v) as described. The phases were separated by a five minute centrifugation at room temperature and the aqueous phase was transferred to a clean tube followed by the addition of two volumes of 95% ethanol (room temperature). After a 5 minute incubation, the DNA was precipitated by a 15 minute centrifugation. The pellet was rinsed once with 70% ethanol and dried briefly in vacuo. The DNA was resuspended in 50 µl TE and 1-3 µl were used for restriction analysis.

ISOLATION OF TEMPLATE DNA FOR SEQUENCING

Double-Stranded DNA

Plasmids for sequencing were prepared by the small scale alkali lysis method as described by Birnboim and Doly (1979) and modified according to Birnboim (1983), Pelham (1985) and Hattori and Sakaki (1986). After the bacterial debris was precipitated with the aid of potassium acetate, 1/4 volume of a 10 M LiCl solution was added to the cleared lysate (Pelham, 1985; Birnboim, 1983). After incubating the tube on ice for 15 minutes, the rRNA was precipitated by centrifugation at 4 °C for 15 minutes in an Eppendorf centrifuge (as for all subsequent centrifugation steps). The supernatant was extracted once with an equal volume of phenol:chloroform (v/v) and the DNA precipitated by ethanol as described. After the pellet was resuspended in TE, ribonuclease A was added to a final concentration of 40 µg/ml. The digest was incubated at 37 °C for 30 minutes, then 0.6 volume of 20% polyethylene glycol (PEG-8000), 2.5 M NaCl was added. The tube was chilled on ice for 30-60 minutes and the DNA was then precipitated by a 5 minute centrifugation at room temperature (Hattori and Sakaki, 1986).
The supernatant was removed with a drawn-out Pasteur pipet and the DNA pellet was rinsed once with 70% ethanol. After drying in vacuo, the pellet was resuspended in 50 µl TE.

Single-Stranded DNA

Single-stranded DNA templates were prepared from either the bacteriophage M13 (Messing, 1983) or the pEMBL plasmids (Dente et al., 1983). DNA fragments of less than 1.0 kb were usually propagated in M13 vectors for sequencing. Larger fragments, however, were unstable and frequently suffered from deletions. It was this observation that led to the alternative use of pEMBL plasmids for cloning larger DNA fragments (reviewed by Dente et al., 1985). When cells harboring such plasmids are superinfected by the helper bacteriophage IR1, one strand of the plasmid can be packaged and extruded into the medium as virion capsids. Thus in theory, DNA can be stably maintained in double-stranded form until the single-stranded template is needed. Unfortunately, colonies stored on plates (LB or M9 salts) at 4 °C overnight can often become erratic in both the efficiency of superinfection and the yield of virions.

Bacteriophage M13

A single plaque of M13 was picked with a sterile Pasteur pipet and inoculated into 1.5 ml of 2x YT with moderate shaking at 37 °C to elute the phage. Host cells (either E. coli strain JM101 or JM103) were freshly prepared by inoculating a single colony into 10-25 ml of YT. When a cell density of A550 = 0.6 was reached, a 20 µl aliquot was added to the eluted M13 above. The tube was shaken vigorously for 8-14 hours at 37 °C to ensure good aeration for phage growth. The culture was transferred to a clean 1.5 ml polypropylene tube and centrifuged for 5 minutes in an Eppendorf microcentrifuge (as for all subsequent centrifugation steps). The supernatant containing M13 virions (1.3 ml) was removed and added to 200 µl of 20% polyethylene glycol (PEG-8000), 2.5 M NaCl.
The tube was inverted several times to mix and then incubated at room temperature for 15 minutes. The virions were then precipitated by centrifugation for 5 minutes. The supernatant was removed completely with a flame-drawn Pasteur pipet and the pellet dispersed in 200 µl of TES (200 mM Tris-Cl, pH 8.0, 200 mM NaCl, 1 mM EDTA). The bacteriophage was extracted twice with an equal volume of 1:1 phenol:chloroform (v/v) and the DNA precipitated by ethanol as described. The template was resuspended in 25-50 µl TE.

The pEMBL Plasmids

A single ampicillin resistant colony was inoculated into 1.5 ml of 2x YT containing 50-100 µg/ml of ampicillin. The tube was shaken vigorously at 37 °C until a cell density of A550 = 0.1-0.2 was reached. Assuming that 1.0 A550=7.5X 10 cells/ml, a twenty fold excess of the helper phage IR1 was added to ensure efficient infection. The culture was shaken vigorously for a further 4-6 hours to allow packaging and extrusion of single-stranded DNA as virion capsids. The virions were collected from the medium by precipitation with 20% PEG-8000, 2.5 M NaCl and the template DNA was prepared as described for M13. Since the packaging process is indiscriminate, half of the capsids should theoretically contain the single-stranded pEMBL plasmid (Dotto et al., 1981). In practice, however, this efficiency is often drastically reduced by the size of the insert (>5.0 kb) and the age of the cells before infection (>24 hr). In the latter case, the preparation of fresh transformants appears to be the only viable alternative (this laboratory and Luck, 1986).

Preparation of the Helper Phage IR1

A single colony of JM101 was inoculated into 5 ml of LB and incubated at 37 °C with vigorous shaking until A550 = 0.2. Approximately 1.5 x 10^ IR1 (obtained from Dr. Andrew Spence) was added to the culture and growth was continued until an A550 of 0.6 was reached. The culture was then inoculated into 250 ml of LB and allowed to grow at 37 °C with good aeration until saturation.
The cells were removed by centrifugation in a GSA rotor at 10,000 rpm for 10 minutes. The supernatant, containing the free virions, was transferred to sterile bottles and stored at -70 °C without glycerol.

DNA SEQUENCING

Most of the DNA sequences were determined by the chain terminator method as originally described by Sanger et al. (1977). This method has been generally applied to sequence determination using single-stranded templates. Recently, this method has also been adopted for use in sequence determination involving double-stranded DNA molecules (Chen and Seeburg, 1985; Hattori and Sakaki, 1986). This latter development eliminates the necessity for obtaining inserts cloned into both orientations, as is the case for both M13 (Sanger et al., 1980; Messing, 1983) and pEMBL templates (Dente et al., 1983). Sequences can now be readily obtained from both strands of the DNA by using both the universal forward and reverse sequencing primers. The different procedures involved in template preparation for single-stranded and double-stranded DNA have been discussed above. While the actual sequencing conditions are almost identical, different treatments are required for annealing the sequencing primer to the two different types of templates. These procedures are discussed below.

The chemical degradation method of Maxam and Gilbert (1980) was also used in sequencing DNA fragments, provided that detailed restriction mapping information was available. However, this method is much more labor intensive and generally requires 5 to 10 times the amount of radiolabelled nucleotides to label a unique end of a restriction fragment for sequencing. High density polyacrylamide gels (12-20%) are often necessary to resolve sequences close to the labelled end and therefore cannot be easily dried down. The wet gels lead to more scattering of the radioactivity, resulting in a decrease in band resolution on the X-ray film.
Furthermore, even with the aid of intensifier screens, the sequencing gels frequently require a much longer exposure time, particularly with sequences farther from the radiolabelled end.

Chain-Terminator Method

Single-Stranded DNA Templates

M13 and pEMBL Plasmids

About 0.5 µg of the template (5 µl), 0.5 to 1 pmole of sequencing primer (1 µl) and 2 µl of 10 x Hin buffer (100 mM Tris-Cl pH 7.6, 500 mM NaCl and 50 mM MgCl2; Messing et al., 1981) were mixed in a 1.5 ml Eppendorf polypropylene tube. The mixture was heated in a 65 °C water bath for 10 minutes and then placed inside a small test tube containing water at 65 °C. The annealing mix was allowed to cool slowly to room temperature for 15-20 minutes.

Double-Stranded DNA Templates

PUCH and Double-Stranded pEMBL Plasmids

Plasmid DNA (1-2 µg) was denatured in 0.2 M NaOH at room temperature for 5-20 minutes. It was then neutralized with the addition of 2.5 M ammonium acetate (pH 7.5) and precipitated with the addition of two volumes of cold 95% ethanol. The pellet was rinsed once with cold 70% ethanol and then dried briefly under vacuum. The denatured DNA template was then annealed with 1 pmole of sequencing primer and 1 µl of 10 x Seeburg buffer (70 mM Tris-Cl pH 7.5, 70 mM MgCl2, 50 mM β-mercaptoethanol and 1 mM EDTA; Chen and Seeburg, 1985) in a final volume of 10 µl at 65 °C for 10 minutes and treated identically in subsequent steps as single-stranded DNA templates.

Chain Termination Reactions

Single-Stranded Templates

The sequencing reactions were performed as droplets inside a sterile petri plate to facilitate the handling of a large number of templates (courtesy of Dr. David Goodin). The E. coli Klenow fragment of DNA polymerase I (0.8-1.0 units, about 0.25 µl straight from the stock tube), 1.5 µl of [α-32P]dATP and 1 µl of a 15 µM solution of dATP were added to the annealed primer-template mix on ice.
Aliquots of 2.5 µl were added to 2 µl of pre-distributed mixes of dideoxy- and deoxyribonucleotides (see Table II, top) and the reactions were initiated by incubation at 37 °C. After 15 minutes, 1 µl of a chase solution containing 0.5 mM of all four dNTPs was added to each reaction. After a further 15 minute incubation at 37 °C, the reactions were stopped by adding 4 µl of a stop mix (90% formamide, 0.07% bromophenol blue, 0.07% xylene cyanol). The sequencing products were denatured by heating at 90 °C in a water bath for 3 minutes prior to loading onto the sequencing gels.

Double-Stranded Templates

For sequencing double-stranded templates, at least 2 units of Klenow polymerase enzyme and 2 µl of [α-32P]dATP were used. In addition, a reaction temperature of 42 °C appeared to reduce artifactual bands over long tracts of A/T rich sequences. However, at this higher temperature, 2 µl of a half-diluted chase solution was used to compensate for the increased evaporation of the sequencing droplets. To exploit the high GC nucleotide content within the structural tRNA genes, [α-32P]dGTP was occasionally used as the radiolabelled nucleotide and custom designed sequencing oligonucleotides targeting internally to the tRNA genes were used. The relative concentrations of the deoxyribonucleotides and dideoxyribonucleotides in the sequencing mixes adjusted for using [α-32P]dGTP are listed in Table II (bottom).

Purification of Radiolabeled Restriction Fragments For Maxam-Gilbert Sequencing

The 3' ends of restriction fragments for sequencing were labeled using 1-2 units of the Klenow fragment and 50-80 µCi of the [α-32P]dNTP appropriate to the restriction enzyme recognition site. The restriction fragments were resolved by gel electrophoresis in a 5% polyacrylamide gel. The appropriate bands were localized by autoradiography and excised from the gel matrix with a scalpel.
The DNA fragments were eluted from the gel strip by soaking in 0.6 ml of 500 mM ammonium acetate, 10 mM magnesium acetate, 1 mM EDTA, 0.1% (w/v) SDS and 10 µg/ml E. coli tRNA carrier (Maxam and Gilbert, 1980) at 65 °C overnight in a 1.5 ml Eppendorf tube. The tube was briefly centrifuged and the supernatant was transferred to a clean 1.8 ml Eppendorf tube. The DNA was recovered by precipitation in ethanol. When the restriction fragments were labeled at both of their 3' ends, the two labeled ends were separated either by cutting with a second restriction enzyme or by strand separation, heating at 90 °C for 2 minutes in 30% DMSO (v/v), 1 mM EDTA, 0.07% xylene cyanol and 0.07% bromophenol blue, followed by quick chilling in an ethanol-dry ice bath. The separated ends were then resolved again by gel electrophoresis in a non-denaturing 5% polyacrylamide gel. The resolved products uniquely labeled at one end were recovered as described above.

TABLE II - DEOXY-DIDEOXYRIBONUCLEOSIDE TRIPHOSPHATE MIXES FOR CHAIN-TERMINATION SEQUENCING

[α-32P]dATP as the Radiolabelled Nucleotide

G mix ddGTP ddATP ddTTP ddCTP 89 - dGTP dATP dTTP dCTP 1.5 30 30 A mix T mix 8.0 C mix 13 13 21 30 30 21 21 1.5 30 30 2

This protocol is provided by Dr. Joan McPherson.

[α-32P]dGTP as the Radiolabelled Nucleotide

G mix A mix T mix C mix ddGTP ddATP ddTTP ddCTP 20 - 300 - 750 - 200 dGTP dATP dTTP dCTP . 113 113 113 . 160 8 160 . 160 160 10 . 8 160 160

The protocol is provided by Lawrence Shitnin. All concentrations in the two protocols are in µM. For longer sequences, the final concentrations of the dideoxyribonucleoside triphosphates in the sequencing mixes were decreased by either 50% or 67%.

Maxam and Gilbert Sequencing Reactions

The sequencing reactions were performed according to Maxam and Gilbert (1980) with minor modifications by Dr. A. Delaney. The procedure is summarized in Table III.
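The footnote to Table II states that the dideoxyribonucleoside triphosphate concentrations were lowered by 50% or 67% to read longer sequences. The Python sketch below works through that adjustment. The starting values are the ddNTP concentrations as read from the dGTP-labelled panel of Table II and should be treated as illustrative only; the function name `reduce_ddntp` is hypothetical and is not part of the original protocol.

```python
def reduce_ddntp(mixes: dict[str, float], percent: float) -> dict[str, float]:
    """Return ddNTP concentrations lowered by `percent` (e.g. 50 or 67).

    A lower ddNTP:dNTP ratio makes chain termination less frequent,
    so the average extension product is longer.
    """
    if not 0 <= percent < 100:
        raise ValueError("percent must be in [0, 100)")
    factor = 1.0 - percent / 100.0
    # Round to suppress floating-point noise in the printed values.
    return {name: round(conc * factor, 2) for name, conc in mixes.items()}


# Illustrative ddNTP concentrations (same units as Table II),
# one per sequencing mix, as read from the dGTP-labelled panel.
DDNTP = {"ddGTP": 20.0, "ddATP": 300.0, "ddTTP": 750.0, "ddCTP": 200.0}

if __name__ == "__main__":
    for pct in (50, 67):
        print(pct, reduce_ddntp(DDNTP, pct))
```

Only the dideoxynucleotide in each mix is scaled here; the deoxynucleotide concentrations are left unchanged, which is how the footnote reads.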
TREATMENT OF GLASSWARE AND PLASTICWARE

Glass test tubes, capillaries, Pasteur pipets and Eppendorf tubes for use in the construction of genomic libraries were all treated with dichlorodimethylsilane as described in Maniatis et al. (1982). They were then rinsed exhaustively in distilled water, dried and sterilized. Eppendorf tubes and pipet tips for general DNA manipulations were used from unopened packages without sterilization.

ISOLATION OF GENOMIC DNA FROM Drosophila

Three different methods have been applied to isolate genomic DNA from Drosophila. For most purposes, a quick method has been adapted from the procedure by de Cicco and Glover (1983) designed for isolating DNA from a single fly, such as mutants that are difficult to grow. However, it suffers from the limitation of producing DNA that is too small for the construction of genomic libraries, although it remains adequate for routine genomic Southern blots. The other two large scale methods (I and II) are more labor intensive but produce DNA that is sufficiently large (>120 kb) for the construction of both lambda and cosmid libraries. Since protocol II of the large scale method is more expedient, it is still generally preferred even though it gives dirtier DNA.

Quick Method

A small number of adult flies (1-100) were placed in a 1.8 ml Eppendorf tube and homogenized in 100-200 µl of 10 mM Tris-Cl pH 7.5, 60 mM NaCl, 10 mM EDTA, 0.15 mM spermine, 0.15 mM spermidine and 5% sucrose. An equal volume of 1.25% SDS, 300 mM Tris-Cl pH 9.0, 100 mM EDTA and 5% sucrose was added and the homogenate incubated at 65 °C for 30 minutes. A volume of 30-60 µl of 8 M potassium acetate was added and the mixture was chilled on ice for 45 minutes, then centrifuged at 12,000 x g for 10 minutes at 4 °C. The supernatant was transferred to a clean Eppendorf tube, extracted with an equal volume of 1:1 phenol:chloroform (v/v) and two volumes of 95% ethanol were added to precipitate the DNA. The pellet was washed twice with 70% ethanol, briefly dried in vacuo and then redissolved in 20-50 µl of TE. Ribonuclease A was added to a final concentration of 10-20 µg/ml and incubated at 37 °C for 30 minutes. Usually 5-10 µl was sufficient for a single restriction digest.

TABLE III - DNA Sequencing Reactions by the Maxam-Gilbert Method

[32P] DNA (µl):          5                   10                     10                5
Carrier DNA (1 mg/ml):   1 µl                1 µl                   1 µl              1 µl
Add:                     200 µl cacodylate   10 µl dH2O             10 µl dH2O        15 µl 5 M NaCl
                         buffer
                         1 µl DMS            3 µl 10% formic acid   30 µl HZ          30 µl HZ
Incubate:                6', RT              15', 37 °C             10', RT           10', RT
Add:                     G-stop, 50 µl       HZ-stop, 230 µl        HZ-stop, 200 µl   HZ-stop, 200 µl
95% ethanol (-70 °C):    1 ml                1 ml                   1 ml              1 ml

All reactions:
Incubate:                       -70 °C, 15 minutes
Microfuge:                      5 minutes
Reprecipitate:                  300 µl 0.3 M sodium acetate, 1 ml 95% ethanol
Wash pellet and dry in vacuo:   2 x in 1 ml 95% ethanol
Resuspend in:                   27 µl dH2O, 3 µl 10 M piperidine
Strand scission:                90 °C, 45 minutes
Desiccate:                      overnight over P2O5 to collect piperidine
Resuspend in:                   100 µl dH2O
Repeat desiccation

Large Scale Method I

Method I is a compiled adaptation from Holmgren (1984), Ish-Horowicz (1979), Kidd et al. (1983), Scott et al. (1983) and Maniatis et al. (1982). Adult fruit flies (1-2 g) were homogenized in 5 ml of 10 mM Tris-Cl pH 7.5, 60 mM NaCl, 10 mM EDTA, 0.15 mM spermine and 0.15 mM spermidine (4 °C in a Dounce homogenizer). The homogenate was filtered through two layers of Nitex screen to remove the large debris and the filtrate was centrifuged for 10 minutes at 4 °C in an SS34 rotor at 4000 rpm. The supernatant was discarded and the pellet dispersed thoroughly in 5 ml of 20 mM Tris-Cl pH 8.0, 100 mM NaCl and 10 mM EDTA.
The cells (and nuclei) were lysed by rapid mixing with 1 ml of 10% SDS and then proteinase K was added to a final concentration of 200 µg/ml. The tube was incubated at 37 °C for one hour and then gently extracted three times with 1:1 phenol:chloroform (v/v), followed by two chloroform washes with minimum agitation. The aqueous phase was poured into dialysis tubing and dialyzed extensively against 50 mM Tris-Cl pH 8.0, 10 mM NaCl, 10 mM EDTA until the A270 of the dialysis buffer was less than 0.05 (Maniatis et al., 1982). The DNA was treated with ribonuclease A (100 µg/ml) at 37 °C for 1-3 hours and then with 100 µg/ml proteinase K for a further 60 minutes. The DNA was phenol extracted and dialyzed as described above. It was concentrated by adding 1/5 volume of 2.5 M ammonium acetate, 100 mM MgCl2, 1 mM EDTA and 2 volumes of 95% ethanol. The pellet was washed once or twice with 70% ethanol and dried in vacuo. It was then rehydrated in a small volume of TE (0.5-1.0 ml) without disturbance for 2-4 days at 4 °C.

Large Scale Method II

High molecular weight DNA produced by this method, modified from McGinnis and Beckendorf (1983), contains trace amounts of eye and cuticle pigments (brown color) and is contaminated with some low molecular weight RNA. However, the impurities appear to be innocuous with respect to restriction digests and ligation efficiencies. About 1 g of adult flies was frozen in liquid nitrogen and homogenized to a fine powder using a mortar and pestle (Blin and Stafford, 1976). The powder was quickly transferred with a prechilled spatula or a small paint brush into 5 ml of solution A (30 mM Tris-Cl pH 8.0, 100 mM NaCl, 10 mM EDTA, 10 mM 2-mercaptoethanol and 0.5% Triton X-100 (w/v)) and vortexed vigorously for about 20 seconds. The debris was pelleted at 4000 rpm at 4 °C in an SS34 rotor for 10 minutes and then washed in 5 ml of solution B (100 mM Tris-Cl pH 8.4, 20 mM EDTA, 100 mM NaCl).
The debris was pelleted again and then dispersed in 3 ml of the same buffer. The cells and nuclei were lysed by rapid ejection of 0.3 ml of 10% SDS and then proteinase K was added to a final concentration of 100 µg/ml. The tube was incubated at 50 °C for 1 hour and then gently extracted with an equal volume of 1:1 phenol:chloroform (v/v) as described. The aqueous phase was transferred to a sterile 30 ml Corex tube with a wide bore pipet and 2 volumes of 95% ethanol (-20 °C) were layered gently on top. The aqueous phase was extracted by rotating the tube gently at a 30° angle on ice until a stringy precipitate formed at the interface (D. Jones, personal communication). The ethanol was replaced intermittently with fresh aliquots and the process was repeated until the aqueous phase almost completely disappeared. The matted ball of DNA was retrieved with a siliconized glass hook and washed by repeated dunking in fresh 70% ethanol. Excess ethanol was removed by dabbing the ball of DNA along the inside of a clean Corex tube, with due caution not to dry the DNA completely. The DNA was then dispersed in 0.5-1.0 ml of TE at 65 °C for one hour. The recovery of DNA from both large scale methods is approximately 0.5-1.0 mg/g of flies.

Drosophila embryos frozen in liquid nitrogen were difficult to homogenize manually to a fine powder. In this case, they were homogenized directly in 5 ml of solution A on ice in a Dounce homogenizer and subsequently treated identically as described in method II above. The recovery of DNA is usually 1.5-3 mg/g (wet weight) of embryos.

PARTIAL DIGESTION OF GENOMIC DNA FOR LIBRARY CONSTRUCTION

D. melanogaster Genomic DNA

Libraries suitable for chromosomal walking were constructed by partial cleavage of D. melanogaster genomic DNA with the restriction enzyme MboI. This enzyme recognizes and cuts at the tetranucleotide sequence -GATC- which, in theory, should occur once in every 256 base pairs.
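The figure of one site per 256 base pairs follows from treating the genome as a random sequence with equal base frequencies, so that a given tetranucleotide occurs with probability (1/4)^4 at any position. A minimal Python sketch of that arithmetic is given here; the function names are illustrative, and the equal-frequency, independent-position model is an assumption that a real genome only approximates.

```python
def expected_site_spacing(site: str, base_freq: float = 0.25) -> float:
    """Mean distance (bp) between occurrences of `site` in a random sequence.

    Assumes independent positions and the same frequency for every base.
    """
    return 1.0 / (base_freq ** len(site))


def fraction_longer_than(length_bp: int, site: str, base_freq: float = 0.25) -> float:
    """Fraction of complete-digest fragments longer than `length_bp`.

    Under the same random model, fragment lengths are geometrically
    distributed with per-position cutting probability p.
    """
    p = base_freq ** len(site)
    return (1.0 - p) ** length_bp


if __name__ == "__main__":
    print(expected_site_spacing("GATC"))          # 256.0
    print(fraction_longer_than(15000, "GATC"))    # vanishingly small
```

Under this idealised model a complete digest would leave essentially no fragments in the size range needed for lambda or cosmid cloning, which is consistent with the use of a partial digest.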
The -GATC- sequence was assumed to occur with sufficient frequency within the D. melanogaster genome to permit the generation of a pseudo-random set of overlapping fragments representative of the entire genome. The rate of MboI digestion was ascertained empirically in a series of preliminary experiments consisting of 25 ul of genomic DNA and 0.5 units of MboI in a final volume of 100 ul of 1 x MboI restriction buffer. Aliquots of the digest (20 ul) were removed at various time intervals and the reactions were terminated by adding 1 ul of 0.5 M EDTA and 5 ul of 25% Ficoll, 0.07% bromophenol blue, 0.07% xylene cyanol. The extent of the digest was analyzed by electrophoresis in a 0.2-0.3% agarose gel. For the preparative reaction, 750 ul of the genomic DNA was digested in 3 ml of 1 x MboI restriction buffer, maintaining the same ratio of enzyme units to DNA volume. However, as a precaution to prevent over-digestion, the predetermined time points for obtaining the optimal size range of DNA were reduced by 50% (Seed et al., 1982). Three aliquots of 250 ul were collected and 10 ul from each were analyzed as above by gel electrophoresis to ensure that the DNA was digested to the correct extent. The three aliquots were pooled and gently extracted three times with 1:1 phenol-chloroform (v/v) and then precipitated with 2 volumes of 95% ethanol. The pellet was then slowly redissolved in 0.5 ml of TE.

D. erecta and D. yakuba Genomic DNAs

Approximately 30-45 ul (~35 ug) of genomic DNAs from the two Drosophila species were digested with 5 units of BamHI in 100 ul of 1 x BamHI buffer at 37 °C. Aliquots of 25 ul were removed every 30 minutes and the reaction stopped by adding 1 ul of 0.5 M EDTA. The four aliquots were pooled, extracted with an equal volume of 1:1 phenol:chloroform (v/v) and ethanol precipitated as described.
The pellet was dispersed in 100 ul of 1 x CIP buffer (50 mM Tris pH 9.0, 1 mM MgCl2, 0.1 mM ZnCl2 and 1 mM spermidine) and incubated with 0.3-0.5 units of CIP per ug of DNA at 45 °C for 30 minutes. The DNA was extracted with 1:1 phenol:chloroform (v/v), ethanol precipitated, and redissolved in TE at a final concentration of 0.125-0.5 mg/ml as estimated by gel electrophoresis against λ standards. The DNA was then ligated to λEMBL3 arms without prior size fractionation.

SIZE FRACTIONATION OF D. MELANOGASTER DNA

NaCl Linear Gradient

The D. melanogaster genomic DNA partly digested with MboI was fractionated on 13 ml NaCl linear gradients (1.25-5 M NaCl in TE) formed by a Hoefer multiple sucrose gradient maker (Dillelo and Woo, 1985). The gradients were centrifuged at 39,000 rpm in a SW40.1 rotor for 35 hr at 18 °C. One ml fractions were collected and diluted with an equal volume of TE. The DNA was precipitated by adding two volumes of 95% ethanol and centrifugation for 1 hr in a SW40.1 rotor at 20,000 rpm. The pellets were resuspended in 250 ul of TE and 10 ul from each fraction was analyzed by gel electrophoresis in a 0.2% agarose gel (fig. 7). The appropriate size fractions were pooled (15-23 kb for lambda and >35 kb for cosmid libraries) and dialyzed against 4 liters of TE. The DNA was then precipitated by adding 50% volume of 7.5 M ammonium acetate and 2 volumes of 95% ethanol. The pellet was solubilized in 20-30 ul of TE and a small amount (1-2 ul) was used to determine the concentration by A260 or by gel electrophoresis against λ standards of known concentrations.

Gel Fractionation

DNA has also been successfully fractionated by agarose gel electrophoresis, but this method is not applicable in constructing cosmid libraries, where gentler treatments are preferred due to the stringent requirement for large DNA.
After digestion with restriction enzyme, the DNA was extracted with 1:1 phenol:chloroform (v/v) and ethanol precipitated, and the pellet was resuspended in 100 ul TE. The DNA was loaded into several slots in a 0.5% mini-agarose gel containing 1 ug/ml ethidium bromide. Electrophoresis was carried out in the dark and the DNA was inspected with the aid of a long wave (365 nm) UV lamp. The gel region containing DNA in the 9.5-23 kb range was excised; the gel slice was placed inside dialysis tubing and the DNA eluted by electrophoresis for 2 hours. The supernatant was extracted several times with n-butanol to remove the ethidium bromide and to concentrate the DNA. It was then passed through siliconized glass wool to remove debris, extracted with 1:1 phenol:chloroform (v/v) and then ethanol precipitated as described. The pellet was resuspended in 10 ul of TE and the final concentration was estimated by agarose gel electrophoresis against known concentrations of λ standards.

Fig. 7. Fractionation of MboI partial digest of Oregon-R DNA by NaCl gradient. A typical example is shown here where the density of the gradient increases from fraction 1 to fraction 12. Small amounts of the DNA from 1 ml fractions were analyzed by electrophoresis in a 0.2% agarose gel. DNA fragments in the range of 15 kb to 35 kb were pooled for the construction of λ libraries (fractions 6 and 7) and fragments larger than 35 kb were pooled for the construction of cosmid libraries (fractions 8, 9 and 10). The exact amounts of DNA recovered from the fractions were determined by A260 and/or by electrophoresis against a known amount of standard, usually λ DNA. The two lanes "A" are uncut λ (50 kb), and lane "B" is HindIII-cut λ; the sizes of the fragments liberated are shown on the right edge of the figure (50, 23, 9.5 and 6.5 kb). Not shown in the figure are additional size markers generated by cutting λ cIts857 with SalI, which liberates two fragments of 35 kb and 15 kb.
CONSTRUCTION OF GENOMIC LIBRARIES

Choice of Cloning Vectors

Both cosmid and lambda vectors have been utilized to construct "walking" genomic libraries. The 9.2 kb cosmid vector, cosPneo, designed specifically as a Drosophila shuttle vector, was used to construct the cosmid walking library (Steller and Pirrotta, 1985). The cosmid contains the transposable P-element to permit direct germline transformation, a feature which is extremely useful in identifying genes by mutant rescue even when the gene products are unknown (Haenlin et al., 1985). The selectable marker, neo, confers neomycin (or its analog G418) resistance to the larval progeny of transformed flies (Steller and Pirrotta, 1985); since this phenotype is a gain of novel function, an unconstrained variety of fly strains (a convenient feature in mutant rescue) or perhaps species could theoretically serve as recipients (fig. 8A). In contrast, with vectors utilizing either ry (Rubin and Spradling, 1983), Adh (D. A. Goldberg et al., 1983) or w (Klemenz et al., 1987) as selectable markers, only the corresponding mutants can be used as recipients. For a smaller cosmid library, the 5.2 kb cosmid pJB8 developed by Ish-Horowicz and Burke (1981) was used to clone BamHI digested D. melanogaster DNA.

Replacement vectors λEMBL3 and λEMBL4 (Frischauf et al., 1983) and λ2001 (Karn et al., 1984) containing polylinkers flanking the middle "stuffer" fragment were used as lambda cloning vehicles (fig. 8B). These improved vectors are designed to select against wild-type phages at two levels during library construction and thus circumvent the need for purifying the vector arms (Karn et al., 1983). First, two different restriction enzymes are used to cut within the polylinkers. The religation of the "stuffer" fragment to the vector arms can then be prevented by the selective removal of the excised linker by isopropanol precipitations. Second, the "stuffer" fragment in the λ vectors contains gam+ function, and gam+ phages are unable to form plaques on E. coli strains lysogenic for phage P2 (Spi+). Although this phenomenon is not well understood, it has been exploited as an additional selection step against the wild-type vector. If the central fragment is replaced by genomic DNA, the phage would become insensitive to P2 inhibition (Spi-) and should grow on P2 lysogens.

Fig. 8. Restriction maps of cosPneo and λEMBL3. (A) The polylinker cloning sequence in cosPneo is shown as a series of elevated restriction sites. Note that not all of them are unique (e.g. HindIII). There are three cos sequences in the vector, allowing efficient cloning of a greater size range of genomic DNA. bla indicates the β-lactamase gene. The two P-element terminal repeats are indicated as boxes with darkened triangles, ori indicates the origin of replication in E. coli, hsp70 indicates the Drosophila heat shock gene and the marker, neo, indicates the neomycin phosphotransferase gene used to select for transformants based on their resistance to G418. [After Steller and Pirrotta (1985).] The two arms are generated by cutting with BamHI within the polylinker sequence and HpaI nestled among the cos sites. The HpaI site may be substituted by other convenient unique restriction sites as long as both arms retain at least one cos sequence. (B) Most of the overlapping Oregon-R genomic clones from λ were obtained from a library constructed using the versatile vector λEMBL3. The polylinker sequence is shown as elevated restriction sites. In the λEMBL4 cloning vector, the polylinkers are in the opposite orientation. In the new version, known as λ2001, the polylinkers contain sites for XbaI, BamHI, HindIII, EcoRI, SstI and XbaI. L arm = left arm and R arm = right arm of the phage, respectively. [After Frischauf et al. (1983).]
Bacteriophage Lambda

Large-Scale Lambda Preparation

Large-scale growth of bacteriophage lambda was based on the method developed by Yamamoto et al. (1970) and modified by Maniatis et al. (1982). E. coli host strains LE392 or Q358 were grown overnight at 30-37 °C in 20-50 ml of NZYM. The cells were harvested by centrifugation at 4000 rpm in an SS34 rotor for 10 minutes and then resuspended in 0.4 volume of 10 mM MgCl2. Approximately^ cells were mixed with 10* bacteriophages and the suspension was incubated at 37 °C with intermittent agitation. After 20 minutes, the infected cells were inoculated into 500 ml of NZYM medium pre-warmed to 37 °C. The culture was shaken vigorously at 37 °C to allow concomitant growth of both the cells and the bacteriophage. After the cells lysed to release the phage particles (characterized by considerable bacterial debris), 1-2 ml of chloroform was added to the culture and the incubation continued for another 15 minutes to complete the lysis. The cell debris was removed by centrifugation at 7000 rpm in a GSA rotor for 15 minutes at 4 °C and the phage supernatant was transferred to clean centrifuge tubes containing 29.2 g of NaCl (final concentration 1 M) and 50 g of PEG-8000 (final concentration 10%). The tube was mixed at 37 °C with slow shaking until the contents were dissolved and it was left overnight at 4 °C. The bacteriophage was precipitated by centrifugation at 7000 rpm in a GSA rotor as described and the pellet was dispersed in 6 ml of SM buffer (10 mM Tris-Cl pH 7.5, 100 mM NaCl, 10 mM MgCl2 and 0.02% gelatin). The suspension was digested with DNase I (10 ug/ml) and RNase A (20 ug/ml) for 30 minutes at 37 °C. An equal volume of chloroform was added and the suspension was cleared by centrifugation at 10,000 rpm in an SS34 rotor at 4 °C for 5 minutes.

CsCl Gradient Purification of Live Lambda Bacteriophage

To purify the bacteriophage further, 0.6 g of solid CsCl was added to each ml of supernatant.
The phage suspension was transferred to 13 mm x 51 mm Beckman Quick-Seal tubes and centrifuged at 65,000 rpm in a VTi65 rotor at 4 °C for 60 minutes. The particles (bluish band) were retrieved with a 3 ml syringe equipped with a 21 gauge needle and the CsCl was removed by dialysis against a 1000-fold volume of 10 mM Tris-Cl pH 8.0, 10 mM NaCl, 10 mM MgCl2 (two changes, one hour each). Bacteriophage protein was removed by adding 20 mM EDTA, 0.5% SDS and 50 ug/ml proteinase K. After incubation at 65 °C for one hour, the supernatant was extracted several times with 1:1 phenol:chloroform (v/v) and the phases were separated by brief centrifugations as described. The aqueous phase was transferred to a dialysis sac with a wide bore pipet and dialyzed extensively against TE.

Preparation of Lambda Vector Arms

Approximately 20 ug of λEMBL3, λEMBL4 or λ2001 vector was digested with 10 units of BamHI in 50 ul of 1 x BamHI buffer for 1 hour at 37 °C. Another 10 units of enzyme was added and the DNA was digested for another 30 minutes. Small aliquots of the digest were analyzed in an agarose gel to ensure that the digestion was complete. The DNA was extracted with 1:1 phenol:chloroform (v/v) and ethanol precipitated as described. The pellet was resuspended in 50 ul of 1 x EcoRI buffer and 10 units of EcoRI was added to the tube. After 1 hour at 37 °C, another 10 units were added and incubation was continued for 30-60 minutes. The digest was phenol:chloroform (1:1 v/v) extracted as before and sodium acetate was added to a final concentration of 0.3 M, followed by 0.6 volume of isopropanol. The tube was incubated on ice for 15 minutes and then centrifuged for 5 minutes in an Eppendorf microfuge. The pellet was resuspended in 200 ul TE and the isopropanol precipitation step repeated. The excised linkers (<10 bp) should remain in the supernatant and therefore be selectively eliminated during the precipitation steps (Frischauf et al., 1983).
The pellet was washed 1-2 times with 70% ethanol and dried in vacuo, and then resuspended in 40 ul of TE.

Preparation of Cosmid DNA

Single ampicillin-resistant colonies of the E. coli strain DH1 transformed with cosPneo (a gift from Dr. V. Pirrotta) and pJB8 (purchased from Amersham) were inoculated into 10 ml of LB supplemented with 100 ug/ml of ampicillin and grown overnight to saturation at 37 °C. Approximately half of each overnight culture was used to inoculate a 500 ml culture grown in the same medium. The large scale alkaline isolation and purification of the cosmid DNA by CsCl gradient centrifugation were performed as described (see CsCl Gradient Purification of DNA, P. 60).

Preparation of Cosmid Vector Arms

Approximately 40 ug of cosPneo was linearized at the unique HpaI site in 200 ul of 1 x HpaI buffer (Steller and Pirrotta, 1985) (see fig. 8A). To prevent religation at this site during subsequent steps, 2.5 units of calf intestinal phosphatase were added directly to the restriction digest and the tube was transferred to 45 °C for 30 minutes. The DNA was extracted with phenol:chloroform (1:1 v/v) and ethanol precipitated as described, and the pellet was resuspended in 200 ul of 1 x BamHI buffer. The DNA was cut at the BamHI site within the polylinker with about 40 units of BamHI for 2 hours to generate two cosmid arms of 4.2 and 5.0 kb in size. The efficiency of each step above was ascertained by either agarose gel electrophoresis or ligation and transformation of E. coli strain DH1 (fig. 9, left panel).

Cosmid arms from pJB8 were prepared essentially as described by Ish-Horowicz and Burke (1981). The vector (20 ug) was linearized at either the HindIII or SalI site in 100 ul of the appropriate restriction buffer with 20 units of enzyme. The reaction was terminated by heating at 68 °C for 15 minutes and the DNA was dephosphorylated with 5 units of calf intestinal phosphatase at 45 °C for 30 minutes to prevent the formation of tandem vectors.
The DNA was extracted with phenol:chloroform (1:1 v/v) and ethanol precipitated, followed by resuspension in 100 ul of 1 x BamHI buffer. The vector was cleaved with 10 units of BamHI for 6 hours and the reaction terminated by extraction with an equal volume of phenol:chloroform (1:1 v/v) and precipitation with ethanol. The pellet was redissolved to a final concentration of 0.5 mg/ml. An equimolar mixture of the BamHI + HindIII cleaved and the BamHI + SalI cleaved pJB8 was used for cloning.

Fig. 9. Systematic testing of intactness of the restriction ends in both vector and genomic DNAs before packaging. In the left panel, the cosPneo vector is tested for efficiencies of BamHI cutting and phosphatase treatment at the HpaI ends. Lane 1: cosPneo cut with HpaI and treated with calf intestinal phosphatase. Lane 2: as above except the DNA was incubated with 2 units of T4 DNA ligase overnight. The inability of the vector to form concatamers shows that the phosphatase treatment is essentially complete. Lane 3: as in lane 1, except the vector is further cleaved with BamHI to liberate the two vector arms of 4.2 kb and 5 kb. Lane 4: as in lane 3, except the DNA was incubated with 2 units of T4 DNA ligase overnight. Approximately 95% of the DNA was re-ligated at the BamHI site. A residual amount of vector arms refractory to ligation was consistently observed over several tries, suggesting that the BamHI probably contained trace amounts of nuclease contaminant. Lane 5: approximately 0.2 ug of vector arms were mixed with 0.2 molar equivalents of tester "insert" DNA generated by MboI digest of λL47.1. Lane 6: as in lane 5, except the DNA mixture was incubated overnight with 2 units of T4 DNA ligase in a 10 ul volume. The results showed that under these conditions, ligation is efficient. The right panel shows the results of testing the intactness of the MboI ends in the genomic DNA. Lane 7: 0.25 ug of fractionated genomic DNA was mixed with a 10-fold molar excess of vector arms. Lane 8: identical to lane 7, except 2 units of T4 DNA ligase was incubated with the DNA overnight in a 10 ul volume. The conversion of the genomic DNA to high molecular weight material shows that the MboI sites of the DNA remained intact through the several steps of preparatory manipulations. The 9.5 kb band is religation of the excess cosmid arms. The two lanes "A" are HindIII-generated λ size markers, which are shown on the right edges of both panels. The ligation of MboI genomic partial digest to λ vectors was also tested similarly as in this example before packaging (data not shown).

Ligation of Lambda Vector Arms to Drosophila DNA

D. melanogaster Libraries

A two-fold molar excess of λEMBL3 vector arms was mixed with 15-35 kb MboI partially digested Drosophila DNA (Maniatis et al., 1982). The final concentration of DNA was adjusted to approximately 400 ug/ml in 1 x T4 DNA ligase buffer and 2-3 units of T4 DNA ligase were added to the reaction. The ligation mix (10-20 ul) was withdrawn into a siliconized glass capillary and the ends were sealed by heating with a flame. The ligation reaction was carried out at 14-16 °C, usually for 16-24 hours, and the extent of the reaction was analyzed by electrophoresis of a small aliquot in a 0.3% agarose gel. Successful ligation was characterized by conversion of the discrete lambda arms and genomic DNA into high molecular weight DNA of more than 100 kb (concatameric form); the mix should also be viscous when pipetted. A small aliquot of the ligation mix (0.5 ug) was test packaged with extracts prepared as described below. The relative efficiency of recombinant phages compared to religated wild-type vectors was measured by infecting the E. coli strains Q358 and Q359 (P2 lysogen). Efficiencies ranging from 75-90% were routinely obtained. The ligation of unfractionated EcoRI digested genomic DNA to λEMBL4 arms was performed under similar conditions as those described above.
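The two quantities used in this ligation step, a molar excess of vector arms and the proportion of recombinant phage, reduce to simple ratios. The Python sketch below shows one way to compute them. It assumes that moles of double-stranded DNA are proportional to mass divided by length, and it reads the "relative efficiency" as the titer on the P2 lysogen (where only recombinant, Spi- phage plate) divided by the titer on the non-lysogenic host. The function names, the combined arm length of 29 kb and the example titers are hypothetical values chosen for illustration, not figures from the thesis.

```python
def vector_mass_for_molar_ratio(insert_mass_ug: float,
                                insert_size_kb: float,
                                vector_size_kb: float,
                                molar_ratio: float) -> float:
    """Mass of vector (ug) giving `molar_ratio` vector molecules per insert.

    Moles of linear dsDNA are taken as proportional to mass / length.
    """
    return insert_mass_ug * molar_ratio * vector_size_kb / insert_size_kb


def recombinant_fraction(titer_on_p2_lysogen: float,
                         titer_on_nonlysogen: float) -> float:
    """Fraction of plaque-forming phage that are recombinant, assuming that
    only recombinants plate on the P2 lysogen and all phage plate on the
    non-lysogenic host."""
    return titer_on_p2_lysogen / titer_on_nonlysogen


if __name__ == "__main__":
    # Hypothetical example: 1 ug of 20 kb inserts, arms totalling 29 kb (assumed).
    arms_ug = vector_mass_for_molar_ratio(1.0, 20.0, 29.0, 2.0)
    print(f"vector arms needed: {arms_ug:.2f} ug")
    # Hypothetical titers (pfu/ml) on the lysogenic and non-lysogenic hosts.
    print(f"recombinant fraction: {recombinant_fraction(8.0e5, 1.0e6):.0%}")
```

Because vector arms are usually longer than an individual insert class, a two-fold molar excess corresponds to a larger mass excess, which is worth keeping in mind when adjusting the total DNA concentration of the ligation.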
Drosophila Sibling Species Libraries

Approximately 0.5 ug and 0.25 ug of BamHI partially-cleaved genomic DNA from D. erecta and D. yakuba, respectively, were ligated to 1 ug of λEMBL3 arms with 2 units (0.5 ul) of T4 DNA ligase in a final volume of 10 ul of 1 x ligase buffer as described. Ligation for the D. teissieri library was performed in a 5 ul volume of 1 x ligase buffer containing 2 ug of λ2001 vector arms, 0.25 ug (2 ul) BamHI insert DNA and 2 units of T4 DNA ligase.

Ligation of Cosmid Arms to D. melanogaster DNA

cosPneo Vector

An approximately 10-fold molar excess of cosPneo arms was ligated to 35-50 kb MboI partially digested genomic DNA. The final DNA concentration in the ligation reaction was adjusted to 225 ug/ml in 1 x T4 DNA ligase buffer. A small aliquot (1-2 ul) was removed and stored at 4 °C as a control. T4 DNA ligase (2-3 units) was added to the rest of the ligation and gently mixed by pipetting. The ligation mix was transferred to a siliconized glass capillary and the reaction was performed as described above. Unlike ligation of genomic DNA to λ vector arms, viscosity is not a reliable indicator of ligation efficiency. The efficiency here must be ascertained by electrophoretic analysis of 1-2 ul of the reaction as compared to the control in a 0.3% agarose gel (fig. 9, right panel).

pJB8 Vector

The ligation conditions were similar to those described above except that, in this case, the D. melanogaster DNA was digested to completion with BamHI prior to the ligation step.

IN VITRO PACKAGING OF BACTERIOPHAGE λ AND COSMID DNA

Two slightly different systems have been used interchangeably throughout this work to regenerate phage particles in vitro from recombinant λ and cosmid molecules. A popular method involves preparation of whole cell extracts from pairs of E. coli K-derived strains of λ lysogens that have complementary defects in the λ packaging proteins.
When the lysogens are combined, the added λ or cosmid recombinant molecules (and a low level of endogenous prophage DNA) are packaged by the full complement of bacteriophage proteins. A recent report, however, showed that extracts prepared from these strains are contaminated with the EcoK restriction system, a hidden variable which can contribute to the loss of up to 80% of the unmodified recombinant molecules during in vitro packaging (Rosenberg, 1985). Recently, a much improved "cos' system" utilizing only a single E. coli B-derived λ lysogen strain that lacks this packaging bias has been constructed (Rosenberg et al., 1985). Furthermore, the endogenous λ prophage is disarmed by a deletion in the cos sequence, and thus cannot be packaged by the crude extract.

Small aliquots of the ligation mixes were always test packaged using whole cell lysates prepared as described below before embarking on a full scale experiment using the more expensive Gigapak cell-free extracts, which are usually 10 to 30 times more efficient (1-3 x 10^9 pfu/ug λ DNA). However, the pJB8 and λEMBL4 D. melanogaster libraries, composed of approximately 150,000 recombinant clones each, were collected only from test packaging experiments without resorting to the Gigapak system.

Freeze-Thaw Two-Strain Packaging Extracts

The procedure is adapted from the "freeze-thaw protocol I" in Maniatis et al. (1982) with minor modifications described below. Single colonies of the two E. coli strains NS428 (Aam) and NS433 (Eam) were inoculated into 10 ml of M9 medium (supplemented with 2% casamino acids) and incubated at 32 °C overnight (Sternberg et al., 1977). A small aliquot from each overnight culture was inoculated into separate flasks of 100 ml of M9 to an initial A600=0.1, and the cells were grown at 32 °C with vigorous shaking until the A600 of each culture was approximately 0.3 (midlog phase).
The lysogens were induced by immersing the flasks in a 65 °C water bath until the internal temperature, as measured by submerging an alcohol sterilized thermometer, reached 45 °C. The flasks were transferred quickly to a 45 °C shaker for 15 minutes and then the cells were incubated at 39 °C until they approached stationary phase (2 hours or about A600=1.0). A small sample of the lysogens was transferred to a glass test tube and a drop of chloroform was added. Successful induction of the lysogens was indicated by rapid cell lysis (Maniatis et al., 1982). The two cultures were then mixed together and rapidly cooled by swirling in an ice bath for 5 minutes. The cells were pelleted at 4000 rpm in a GSA rotor for 10 minutes at 4 °C and then redispersed in 100 ml of ice cold M9 lacking casamino acids. The cells were harvested again by centrifugation as described above and then thoroughly resuspended in 1 ml of CH buffer. Aliquots (50 ul) were dispensed into prechilled 1.5 ml Eppendorf tubes and flash frozen in liquid nitrogen. Efficiencies of these extracts were routinely between 0.5-1.0 x 10^8 plaques from 1 ug of input "wild type" λ DNA (cIts857).

In vitro Packaging Using the Two Strain System

A tube of the freeze-thaw extract was slowly thawed on ice for 3 minutes; <0.5 ug of DNA (λ or cosmid) in 66 mM Tris-Cl pH 7.9, 10 mM MgCl2, 1.5 ul of 0.1 M ATP and an empirically determined volume of CH buffer (20-25 ul) were added and mixed thoroughly with a glass rod. The packaging reaction was incubated in a 37 °C water bath for 60 minutes. Another tube of packaging extract was thawed on ice; 1 ug of DNase I and 2.5 ul of 1 M MgCl2 were added and 20 ul of this second extract was added to the packaging reaction. The addition of a second portion of extract improved the efficiency of packaging 2- to 5-fold. After a further 30 minute incubation, 0.9 ml of SM buffer and drops of chloroform were added.
The tube was vortexed gently and then centrifuged for 2-3 minutes in a microfuge. The supernatant was transferred to a clean 1.5 ml Eppendorf tube and stored over a few drops of chloroform at 4 °C.

cos' Packaging Extracts

The E. coli B-derived strain SMR10 (a gift from Dr. F. Stahl) was grown in 100 ml LKB medium and induced similarly to that described above (also see Rosenberg et al., 1985). After induction, the cells were harvested by centrifugation as above and the pellet resuspended in 9 ml of TSP (40 mM Tris-Cl pH 7.8, 10 mM spermidine and 10 mM putrescine) in the cold. The cells were pelleted again and then dispersed in 0.1 ml of TSP with the aid of a pipet. Aliquots of 20 ul of the concentrated cells were distributed to sequentially numbered 1.5 ml Eppendorf tubes containing 5 ul of 50% DMSO, 7.5 mM ATP, pH 7.0 and flash frozen in liquid nitrogen. The tubes were stored at -80 °C for up to two weeks. The in vitro packaging efficiencies of the cos' extracts ranged from 0.4-1.0x10** per ug of input λ wild-type DNA (cIts857). However, it was noted that the efficiency was reduced somewhat with extracts that were flash frozen last. Attempts to stabilize the extracts by including 5%, 10% or 25% sucrose in the TSP were not successful.

In vitro Packaging Using cos' Extracts

DNA to be packaged was resuspended in 10 mM Tris-Cl pH 8.0, 50 mM KCl and 1 mM EDTA. After addition of the DNA to the just thawed extract, the tube was transferred to a 37 °C water bath for 60 minutes. Then 0.5 ml of SMC buffer (0.7% Na2HPO4, 0.3% KH2PO4, 0.05% NaCl, 0.01% NH4Cl, 1 mM MgCl2, 0.1 M CaCl2, 50 ug/ml DNase I) was added and the tube was gently vortexed to disperse the pellet. The cell debris was removed by a brief centrifugation and the supernatant stored at 4 °C over a drop of chloroform.

AMPLIFICATION OF GENOMIC LIBRARIES

E. coli strain Q359 for propagating the λ libraries was grown overnight in NZYM supplemented with 0.4% maltose to induce high level expression of the λ receptor gene lamB. The cells were then concentrated 2.5 fold in 10 mM MgCl2 as described (see Large-Scale Lambda Preparation, P. 79). A small volume of host cells (1.0 ml) was infected with 15,000 to 20,000 plaque forming units at room temperature for 5 minutes to coordinate attachment of the phage and then transferred to a 37 °C water bath for 20 minutes to allow injection of the phage DNA into the host. Soft agar (7.5 ml) was added and the suspension was plated onto 150 mm NZYM plates, and the phages were grown overnight at 37 °C. SM buffer (10-12 ml) was added to the plates and the phage eluted by slow diffusion at 4 °C for several hours. The supernatant was collected and debris removed by centrifugation in an SS34 rotor at 7000 rpm for 20 minutes. Aliquots were stored at -70 °C in the presence of 7% DMSO, or at 4 °C with a few drops of chloroform as preservative. For the λEMBL3 "walking" library, a total of at least 300,000 unique plaques were collected. For the Drosophila sibling species libraries, approximately 250,000, 40,000 and 250,000 unique plaques were obtained for D. erecta, D. yakuba and D. teissieri, respectively.

The recA E. coli strain DH1 was used to propagate the cosmid libraries. The cells were grown overnight and concentrated 20 fold in 10 mM MgCl2 as described. Twenty-five ul of the cells were adsorbed with 10,000-20,000 packaged cosmids at room temperature and then at 37 °C as above. After infection, 0.5 ml of LB medium was added and the cells were incubated at 37 °C for 45 minutes to allow expression of the β-lactamase gene.
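The clone counts quoted for these libraries can be put in context with the standard Clarke and Carbon estimate, N = ln(1 - P) / ln(1 - f), where f is the ratio of insert size to genome size and P is the probability that any given sequence is present among N independent clones. The Python sketch below evaluates it. The haploid genome size of 1.65 x 10^8 bp and the 18 kb mean insert are assumed values used only for illustration, and the estimate presumes random, equally sized inserts, which a real partial-digest library only approximates.

```python
import math


def clones_needed(probability: float, insert_bp: float, genome_bp: float) -> int:
    """Number of independent clones needed so that a given sequence is
    represented with the stated probability (Clarke-Carbon estimate)."""
    f = insert_bp / genome_bp
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - f))


def coverage_probability(n_clones: int, insert_bp: float, genome_bp: float) -> float:
    """Probability that a given sequence is present among n_clones clones."""
    f = insert_bp / genome_bp
    return 1.0 - (1.0 - f) ** n_clones


if __name__ == "__main__":
    GENOME_BP = 1.65e8   # assumed haploid genome size, not taken from the thesis
    INSERT_BP = 18_000   # assumed mean insert size for a lambda library
    print("clones for 99% coverage:", clones_needed(0.99, INSERT_BP, GENOME_BP))
    print("coverage with 300,000 clones:",
          coverage_probability(300_000, INSERT_BP, GENOME_BP))
```

On these assumptions a few tens of thousands of independent lambda clones already give 99% coverage, so libraries of several hundred thousand unique plaques or colonies leave a comfortable margin for non-random cloning and for losses during amplification.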
The cells were then concentrated by centrifugation (30 seconds in the microfuge), resuspended in a small volume (100-200 ul of 0.5% NaCl) and plated on LB-glucose plates supplemented with 40 ug/ml of ampicillin. The presence of glucose helps to inhibit the growth of fortuitously packaged endogenous λ DNA. The ampicillin-resistant colonies were pooled by washing the plates with 0.5% NaCl, collected by centrifugation in an SS34 rotor at 6,000 rpm for 10 minutes and resuspended in LB-glucose supplemented with 40 ug/ml ampicillin and 15% glycerol. For the cosPneo "walking" library, at least 500,000 unique colonies were obtained at an efficiency of 4 x 10^5/ug genomic DNA.

PREPARATION OF RADIOLABELLED PROBES

Nick-Translation

Radiolabelled probes were prepared by the method of nick-translation (Rigby et al., 1977) with minor modifications (Dr. Ross MacGillivray, personal communication). DNase I (1 mg/ml) was freshly diluted in 10 mM Tris-Cl pH 7.5, 5 mM MgCl2 and 1 mg/ml BSA to a final concentration of 10 ug/ml and incubated on ice for 20 minutes. DNA (0.5-1 ug) was added to a 50 ul cocktail containing 50 mM Tris-Cl pH 7.5, 5 mM MgCl2, 10 mM β-mercaptoethanol, 0.02 mM dGTP, 0.02 mM dTTP, 14 mM dATP, 14 mM dCTP, 0.02 mM CaCl2, 35-70 pmoles each of [α-32P]dATP and [α-32P]dCTP, 50 pg of the activated DNase I and 10 units of E. coli DNA polymerase I (Kornberg enzyme). The reaction was incubated at 16 °C for 2.5-4.0 hours and terminated by adding 75 ul of 1% SDS, 10 mM EDTA and heating at 68 °C for 15 minutes. E. coli tRNA was added (25 ug) as a carrier and the nick-translated DNA probe was separated from unincorporated radiolabelled nucleotides in a small column containing AcA 54 resin (LKB) pre-equilibrated in 10 mM Tris-Cl pH 7.5, 0.2 M NaCl and 0.25 mM EDTA. Fractions of ~400 ul were collected and the specific activity determined by Cerenkov radiation, which is usually in the range of 10^ cpm/ug of input DNA.
The probe was denatured with 0.1 M NaOH at 65 °C for 15 minutes, then neutralized with 0.15 M NaH2PO4 and added to the hybridization mix.

Oligonucleotide Probes

About 10-20 pmoles of the oligonucleotide was incubated in 10 ul of 1 x kinase buffer containing 100 mM Tris-Cl pH 8.0, 10 mM MgCl2, 50 mM DTT, 30-50 uCi of [γ-32P]ATP and 2-5 units of T4 polynucleotide kinase. The reaction was incubated at 37 °C for 30-45 minutes and stopped by heating at 68 °C for 10 minutes. The radiolabelled probe was used without separation from the unincorporated radiolabelled nucleotides.

Construction of tRNA^Ser_4,7- and tRNA^Arg-Specific Probes by Strand-Synthesis

Approximately 7 ug of pDt5, a recombinant plasmid containing a single tRNA^Ser_7 gene (Newton, 1984; Cribbs et al., 1987b), was digested with HaeIII and DdeI in combination. From the available DNA sequence data (Newton, 1984), the combination of enzymes should produce a 133 bp fragment containing a truncated tRNA^Ser_7 gene, starting at the HaeIII site at nucleotide 9 within the gene. The DdeI ends of the restriction fragments were repaired by filling in with all four dNTPs using the Klenow enzyme and the fragments were resolved by electrophoresis in a 5% polyacrylamide gel.

The tRNA^Arg probe was constructed similarly by cutting pDt27R with HaeIII and DdeI. From sequence analysis, this plasmid contains four duplicated tRNA^Arg genes sharing different extents of almost perfect sequence homology in their flanking regions (Newton, 1984). The restriction digest should produce four overlapping 70 bp fragments, each containing a partial tRNA^Arg gene truncated at the HaeIII site (see above) and 8 bp 3' to the gene. Restriction fragments corresponding to the predicted sizes from the above two experiments were excised from the gel and the DNA eluted overnight in 1 ml of M+G elution buffer (P. 66).
The fragments were recovered by precipitation with ethanol and cloned into the SmaI site of M13mp9. Single-stranded DNA templates were prepared from randomly chosen bacteriophage plaques, and their nucleotide sequences determined. One clone containing the coding strand of each of the predicted partial tRNA7Ser and tRNAArg genes was obtained, and these clones were used as templates to generate primer-extended hybridization probes.

Hybridization probes were constructed by annealing the oligonucleotide primer, Pex, to the template as in the initial step in sequencing by the chain-terminating method. [α-32P]dATP (10"2 pmoles) and 1 ul of a "primer-extension mix" (0.5 mM each of dGTP, dCTP and dTTP) were added, and the reaction was started by incubation of the mix with 0.5 units of Klenow enzyme at room temperature for 10 minutes. The reaction was stopped by heating the tube at 68 °C for 10 minutes and unincorporated radiolabelled nucleotides were eliminated by chromatography through a small column containing AcA 54 resin as described. The extended probe was denatured in 0.1 M NaOH, neutralized in 0.15 M NaH2PO4 and added to the hybridization mix.

For increased detection sensitivity, 5 units of HaeIII was added to release the double-stranded insert from the template. After 30 minutes at 37 °C, solid urea was added to the digest to a final concentration of 7-8 M and the probe was denatured by heating at 90 °C for 3 minutes. The reaction mixture was applied to a 5% polyacrylamide gel containing 8 M urea and electrophoresis was carried out at 800 volts for 2-3 hours. Radiolabelled bands were detected by autoradiography and the single-stranded probe was excised, eluted from the gel and concentrated by ethanol precipitation as described. Alternatively, after the extension step, the probe was denatured from the template by adding 0.1 M NaOH and purified by passage through a Bio-Gel A-5m column equilibrated in 0.1 M NaOH (Dr. D. Cribbs, personal communications).
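The fragment sizes quoted above for the HaeIII-DdeI double digests were predicted from sequence data. A minimal Python sketch of that kind of prediction is given here. It assumes a linear molecule and reports top-strand cut positions only; the recognition sequences are the standard ones for HaeIII (GG^CC) and DdeI (C^TNAG), while the test sequence and the function names are hypothetical and are not taken from pDt5 or pDt27R.

```python
import re

# Recognition pattern and top-strand cut offset for each enzyme.
# HaeIII cuts GG^CC (blunt); DdeI cuts C^TNAG (N = any base).
SITES = {
    "HaeIII": ("GGCC", 2),
    "DdeI": ("CT[ACGT]AG", 1),
}


def cut_positions(seq, enzymes):
    """Return sorted top-strand cut positions for the given enzymes."""
    seq = seq.upper()
    cuts = set()
    for name in enzymes:
        pattern, offset = SITES[name]
        # A lookahead is used so that overlapping sites are all reported.
        for match in re.finditer(f"(?={pattern})", seq):
            cuts.add(match.start() + offset)
    return sorted(cuts)


def fragment_lengths(seq, enzymes):
    """Return fragment lengths (in bases) of a complete digest of linear DNA."""
    bounds = [0] + cut_positions(seq, enzymes) + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical 23-base sequence with one HaeIII site and one DdeI site.
toy = "AAAAGGCCTTTTTTCTGAGAAAA"
print(fragment_lengths(toy, ["HaeIII", "DdeI"]))  # [6, 9, 8]
```

With a real plasmid sequence the same two functions would list every predicted fragment, from which the band expected to carry the truncated gene could be picked out by size.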
EMPIRICAL EVALUATION OF GENOMIC LIBRARIES BY SOUTHERN BLOTTING

All newly constructed λ libraries were evaluated for completeness in sequence representation. Approximately 10** E. coli host cells were infected with 2.5 x 10^7 bacteriophage and the infected cells were inoculated into 50 ml of NZYM prewarmed to 37 °C (see Large Scale Lambda Preparation). The phage particles released from lysed cells were precipitated with 1 M NaCl and 10% PEG-8000 as described and the pellet was dispersed in 1 ml of DNase I buffer (50 mM Tris-Cl pH 7.5, 5 mM MgCl2 and 0.5 mM CaCl2). The suspension was digested with DNase I (100 ug/ml) and ribonuclease A (200 ug/ml) at 37 °C for 30 minutes. Bacterial debris was then pelleted by a brief centrifugation at 12,000 x g and the supernatant transferred to a clean 1.5 ml Eppendorf tube. SDS (1%), EDTA (5 mM) and proteinase K (150 ug/ml) were added and the tube was incubated at 68 °C for 60 minutes. The supernatant was phenol extracted and the bacteriophage DNA precipitated with ethanol. The DNA was then digested with various enzymes and the fragments resolved in a 0.5-0.6% agarose gel followed by Southern blotting onto a sheet of Hybond nylon membrane. The treatment of the filter and hybridization with various radiolabelled probes were performed as described (see Southern Blotting). A similar protocol for evaluating genomic DNA libraries has been independently developed by Phillips et al. (1985).

SCREENING GENOMIC LIBRARIES

Plating Bacteriophage λ Libraries

Approximately 2 x 10^9 cells were infected with 50,000 bacteriophages and the infected cells were plated on 150 mm NZYM plates as described (see AMPLIFICATION OF GENOMIC LIBRARIES, P. 88). When the phage plaques were nearly confluent, the plates were placed at 4 °C for at least 30 minutes to allow the top agarose to harden.
A dry Hybond nylon circle (137 mm) was placed onto the surface of the top agarose for about 1 minute to wet, allowing diffusion of the bacteriophage and free DNA onto the membrane (Benton and Davis, 1977). The orientation was recorded by puncturing holes in three to four asymmetric locations in the membrane and into the agar beneath with a 21-gauge needle. After the membrane was evenly wetted, it was peeled off with a pair of blunt-ended forceps and a replica was made with a second filter following the same procedure, except that it was left on the surface of the top agarose for 30 seconds longer. If the original plaques were small, the filter-bound bacteriophages were amplified by incubating the filter with the phage side up on a fresh NZYM plate overnight (modified from Woo, 1979).

Plating Cosmid Libraries

E. coli cells harboring cosmids were plated on several 150 mm LB-glucose plates supplemented with 40-50 ug/ml of ampicillin at densities between 20,000 and 40,000 cells per plate. The plates were incubated at 37 °C until the colonies were barely visible and they were stored at 4 °C for 1-2 hours to allow the colonies to harden. The colonies were transferred onto a moistened sterile Hybond nylon membrane by blotting as above. A replica copy of the colonies was made by pressing a second moistened sterile membrane to the first and their orientations relative to one another and to the plate were marked with asymmetrically located holes as described above. The membranes were then placed onto fresh LB-glucose plates (supplemented with 40-50 ug/ml ampicillin) with the colony side up and, along with the original plates, were incubated for approximately 6-8 hours at 37 °C to allow the colonies to grow to 1-2 mm in diameter. The cosmids were amplified overnight by transferring the membrane-bound colonies onto LB plates supplemented with 250 ug chloramphenicol/ml (Hanahan and Meselson, 1980 and 1983).
Lysis of Membrane Bound Bacteriophages or Bacterial Colonies

Membrane bound phages or bacterial colonies were lysed by floating the filters on a shallow pool of 10% SDS for 5-10 minutes. The membranes were immersed in a denaturing solution (0.5 M NaOH, 1.5 M NaCl) and then neutralized in 1 M Tris-Cl pH 7.5 for 2 to 10 minutes at each step. They were then washed briefly in 0.5 M Tris-Cl pH 7.5, 1.5 M NaCl and then dried in air for 30 minutes. The DNA was immobilized onto the membrane by irradiation with UV (254 nm) for 2-3 minutes as described in the supplier's manual (Amersham).

Prehybridization

The membranes were washed in several changes of 3 x SSC, 0.1% SDS at 65 °C and gently scrubbed with a toothbrush to remove bacterial debris. The membranes were prehybridized in 10-20 ml of 1 x Denhardt solution, 6 x SSC, 0.1% SDS and 1 mM EDTA for 5 minutes to overnight at 65 °C. If an oligonucleotide was to be used as a probe, the membranes were prehybridized at temperatures between 37 and 60 °C in 10 x Denhardt solution, 6 x SSC and 0.1% SDS (Zoller and Smith, 1983). Prehybridization was carried out in a petri plate and a circular piece of Mylar plastic cut to size was then placed on top of the stack of membranes to ensure that they remained submerged. The petri plate was then placed inside a Pyrex dish with a few sheets of moist paper towels and the assembly was sealed with Saran wrap. Occasionally, carrier DNA (calf thymus or salmon sperm) was also included in the prehybridization buffer.

Hybridization

For nick-translated or primer-extended probes, the membranes were hybridized in 1 x Denhardt solution, 6 x SSC, 0.1% SDS and 1 mM EDTA in a petri plate at 65 °C for 8-14 hours. The volume of the hybridization was kept to a minimum containing about 10& cpm of radiolabelled probe per filter. After hybridization, the filters were washed 3-4 times in 1 x SSC, 0.5% SDS (w/v) at 65 °C to remove excess probe.
If the background remained unacceptably high after the initial washes, as determined either by autoradiography or by monitoring with a Geiger counter, the membranes were re-washed in 0.2 x SSC, 0.1% SDS (w/v) at 68 °C several times aided by gentle scrubbing with a gloved hand.

When an oligonucleotide probe was used, it was first heated at 65 °C for 5 minutes to denature any secondary structures before adding to the hybridization mix containing 10 x Denhardt solution, 6 x SSC and 0.1% SDS. The hybridization was conducted at temperatures determined by the formula Td (°C) = [4(G+C) + 2(A+T)] - 5, where Td equals the hybridization temperature in °C (Meinkoth and Wahl, 1984). After 1-3 hours, the membranes were washed once briefly in 6 x SSC, 0.1% SDS (w/v) at room temperature to eliminate most of the excess probe and then the washings were repeated twice more with the same solution at temperatures contingent upon the nucleotide content of the probe (see above).

After the washes, the filters were placed on a sheet of expired X-ray film as support (bleached to remove the film coating) and the hybridization signals were detected by autoradiography at -70 °C. Cronex enhancing screens were used whenever possible.

Isolation and Purification of λ Clones

Usually the positive signals on the film could not be assigned to a single phage plaque due to the high plating densities and thus a plug of agar was removed from the corresponding area of the plate with the wide end of a sterile Pasteur pipet. The mixture of phages was eluted in 1 ml of SM buffer, and the suspension was replated at lower densities by 10 fold serial dilutions. Plaque lifts and screening of filter-bound phage DNA by hybridization to radiolabelled probes were conducted as described. Isolated positive plaques were removed with a sterile Pasteur pipet and eluted in 100 ul of SM buffer.
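The temperature rule given above for oligonucleotide probes, Td = 4(G+C) + 2(A+T) - 5, is simple arithmetic on the base composition of the probe. The Python sketch below illustrates it; the function name and the 17-base example sequence are hypothetical and do not correspond to any of the oligonucleotides used in this work.

```python
def hybridization_temperature(oligo):
    """Return Td in degrees C from Td = 4(G+C) + 2(A+T) - 5."""
    seq = oligo.upper()
    if not seq or any(base not in "ACGT" for base in seq):
        raise ValueError("expected a non-empty DNA sequence of A, C, G and T")
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return 4 * gc + 2 * at - 5


# Hypothetical 17-mer with 9 G+C and 8 A+T: 4*9 + 2*8 - 5 = 47.
print(hybridization_temperature("GTAAAACGACGGCCAGT"))  # 47
```

The rule is intended for short probes; the sketch simply applies it to whatever sequence is supplied and does not check probe length.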
The phage eluate was added to 100 ul of the appropriate E. coli host resuspended in 10 mM MgCl2 and incubated for 15 minutes at 37 °C. The infected cells were inoculated into 50 ml of NZYM and grown at 37 °C until the cells began to lyse (6-8 hours). The purification of released bacteriophage and the isolation of DNA have been described (see Large Scale Lambda Preparation on P. 79 and Empirical Evaluation of Genomic Libraries on P. 91).

Isolation and Purification of Cosmid Clones

Cosmid colonies recovered on the agar plug were resuspended in 1 ml of 0.5% NaCl. The resuspended cells were replated at lower densities on LB-glucose plates (+40 ug/ml ampicillin) and rescreened with radiolabelled probes as described. Well-isolated positive colonies were inoculated in 5 ml of LB-glucose (40 ug/ml of ampicillin) in a 125 ml Erlenmeyer flask and shaken vigorously at 37 °C for 14-16 hours. The culture was transferred into a 14 ml graduated polypropylene tube and the cells were harvested by centrifugation for 15 minutes in a clinical centrifuge. The pellet was dispersed in 100 ul of 50 mM glucose, 25 mM Tris-Cl pH 8.0 and 1 mM EDTA and the plasmid DNA isolated by the alkaline lysis method as described (see Small Scale Mini-Prep, P. 61).

RESTRICTION ENDONUCLEASE DIGESTS

For general analytical restriction digests, five different buffers have been found to adequately accommodate the spectrum of restriction enzyme requirements throughout this work. All restriction buffers are stored at -20 °C as 10 x stocks containing 100 mM Tris-Cl pH 7.8, 100 mM MgCl2, 10 mM EDTA, 60 mM B-mercaptoethanol in addition to one of the following salt requirements: 0 mM NaCl, 0.6 M NaCl, 1.0 M NaCl, 1.5 M NaCl and 60 mM KCl.
For genomic library construction or for restriction endonucleases with more fastidious requirements, a separate set of restriction buffers was prepared as 10 x stocks according to the specifications of the suppliers, filter sterilized and stored as small aliquots at -20 °C (see Table IV).

TABLE IV. SPECIFIC BUFFERS FOR RESTRICTION ENZYMES USED IN LIBRARY CONSTRUCTION

            *Final Composition in mM
Buffer      Tris-Cl   MgCl2   NaCl   KCl   DTT   pH
BamHI       10        10      100    -     10    7.5
EcoRI       10        10      150    -     10    7.5
HpaI        10        10      -      50    10    7.5
HindIII     10        10      60     -     10    8.0
SalI        8         6       150    -     10    7.6
MboI        50        10      50     -     -     8.0

*All buffers were made as 10 x stocks.

Restriction digests were performed in 1.5 ml Eppendorf tubes in accordance with the supplier's instructions. The final DNA concentration in the reaction was never more than 200 ug/ml and digests were usually carried out with 2-5 fold the recommended enzyme units. Plasmids isolated from "mini-preps" containing large amounts of contaminating RNA were digested with inclusion of ribonuclease A at a final concentration of 40 ug/ml. Restriction digestions requiring more than one endonuclease were always performed sequentially, with small aliquots of the reactions removed for analysis by gel electrophoresis in between steps. For digestions involving enzymes requiring different concentrations of the same salt, the enzyme with the lower salt requirement was used first; when the first reaction was complete, the salt concentration of the digest was appropriately adjusted before adding the second enzyme. If different salts were required by the restriction enzymes (e.g. NaCl and KCl), after completion of the digest with the first enzyme, the mix was dialyzed as a droplet on Millipore VM filters (0.05 um in pore size) floating on the surface of 5 ml of TE inside a small petri plate (Dr. Robert Devlin, personal communications).
After 30 minutes, the droplet was recovered inside a clean 1.5 ml Eppendorf tube, one-tenth volume of the second restriction buffer was added and the reaction continued with the addition of the second enzyme.

GEL ELECTROPHORESIS

Agarose Gels

Agarose was dissolved in the appropriate volume of 45 mM Tris-Cl pH 8.3, 45 mM boric acid and 1 mM EDTA (0.5 x TBE) by boiling. After cooling to 55-60 °C (warm to touch), ethidium bromide was added to a final concentration of 1 ug/ml and the solution was poured into Plexiglass trays; sample slots were moulded by inserting a plastic comb at one end and the gel was allowed to solidify at room temperature. For routine analytical gel electrophoresis such as monitoring the progress of a restriction digest, "mini-gels" measuring 6.5 cm x 10 cm x 0.4 cm were cast; for preparative gels, dimensions measuring 20 cm x 25 cm x 0.5 cm were used. For low percentage agarose gels (0.2-0.3%) required for the analyses of large molecular weight genomic DNA, a supporting frame of 0.5% agarose was cast with a Plexiglass mould and allowed to set before a 0.2-0.3% agarose solution was poured into the center. Electrophoresis was carried out horizontally with the gel submerged in 0.5 x TBE at 2-5 volts/cm. The gel was photographed over a UV transilluminator or by shadowing with a hand held UV lamp using Polaroid type 667 film in a Polaroid MP-4 camera.

Acrylamide Gels

A stock containing 40% acrylamide and 2% bis-acrylamide in deionized water was stored in a brown bottle at 4 °C. Non-denaturing gels were prepared by mixing appropriate volumes of the acrylamide stock and 10 x TBE (final 0.5 x TBE) and filtering through Whatman glass microfibre filters (934-AH). The gel solution was degassed, ammonium persulfate was added to 0.06% from a 10% stock and N,N,N',N'-tetramethylethylenediamine (TEMED) to 0.05 to 0.1%.
The preparation of denaturing gels was identical except solid urea was added to a final concentration of 8.4 M before addition of the catalysts.

Glass gel plates, measuring 20 cm x 35 cm, were scrubbed clean with scouring powder and rinsed with water. After air drying, the inner surfaces of the plates were washed with 95% ethanol and then 2% dimethyldichlorosilane dissolved in heptane was liberally applied. A Kimwipe saturated with 95% ethanol was used to remove excess dimethyldichlorosilane. Mylar spacers 0.35 to 0.5 mm thick were placed between the plates, which were then assembled with electrical tape. The acrylamide solution was poured slowly down one side of the space between the plates to avoid trapping air bubbles. Sample slots were cast by inserting a gel comb into the top of the gel solution and the sides of the plates were tightly clamped to ensure good contact between the plates, the spacers and the gel comb. After polymerization (1 hour), the slot former and the electrical tape along the bottom of the plates were removed and the gel was clamped into a vertical electrophoresis apparatus. Both the top and the bottom reservoirs were filled with 0.5 x TBE. The slots were flushed clean just before the samples were loaded and the gels were run at 2-10 V/cm.

After electrophoresis, the tape along the edges of the plates was removed and the plates separated with the aid of a thin spatula. Distilled water containing 1 ug/ml ethidium bromide was poured onto the gel and distributed across the surface evenly with a glass spreader. After 20 minutes, the DNA bands were visualized and photographed under UV illumination. If the DNA was radiolabelled, the gel was wrapped in Saran wrap and autoradiographed directly.

"Wedge" shaped sequencing slab gels were cast in siliconized plates as described by Chen and Seeburg (1985). The plates were initially separated by 0.35 cm thick Mylar spacers as in regular thin sequencing gels.
However, progressively shorter strips of 0.17 cm thick Mylar (1/3, 1/7 and 1/10 of gel length) were inserted into the bottom edge of the gel to increase the thickness to 0.86 cm. After the acrylamide solutions (6% or 8%) were poured between the plates, a slot former with 0.25 cm wide teeth was inserted at the top. Sequencing reactions of up to 1 ul were loaded into each slot immediately after it was flushed clean of urea and electrophoresis was carried out at 30-32 watts (constant power). The variable gel thickness causes the DNA to migrate slower as it approaches the thicker bottom. This results in even spacing of all adjacent DNA fragments throughout the gel. To prevent the "smiling" of samples close to the edges, an aluminum plate (20 cm x 20 cm) was clamped to the gel assembly to maintain an even temperature distribution across the surface. After electrophoresis, the gel was transferred onto a sheet of Whatman 3MM paper by blotting and completely dried in a slab gel dryer (Bio-Rad). The gel was then covered with a sheet of Saran wrap and autoradiography was performed by placing a sheet of X-ray film in direct contact with the dried gel and exposing at -70 °C. With double-stranded sequencing, exposure time can be as short as one hour.

Polyacrylamide gels for resolving Maxam and Gilbert sequencing reactions were regular "non-wedged" gels at concentrations between 8 and 20%. After electrophoresis, the gels were protected in Saran wrap without drying. Autoradiography was performed at -70 °C with the aid of intensifying screens whenever possible.

RECOVERY OF RESTRICTION FRAGMENTS FROM GELS

Agarose Gels

Small restriction fragments under 4 kb were recovered using DEAE membranes according to Dretzen et al. (1981). After the DNA fragments from a restriction digest were sufficiently resolved by electrophoresis, a small piece of DEAE membrane (rinsed in distilled water) was inserted into the gel through a slit cut perpendicular to the desired DNA band.
The gel was turned through 90°, with the membrane now nearest to the positive electrode, and the DNA was transferred onto the membrane electrophoretically. The membrane was then rinsed in 1 ml of 0.15 M NaCl, 0.1 mM EDTA and 20 mM Tris-Cl pH 8.0 by vortexing to remove any adhering agarose. The DNA was eluted from the membrane by incubation in 100 ul of 1.0 M NaCl, 0.1 mM EDTA and 20 mM Tris-Cl pH 8.0. The supernatant was extracted several times with n-butanol to remove the ethidium bromide. After adding one-half volume of 7.5 M ammonium acetate, the DNA was recovered by precipitation with two volumes of ethanol. The recovery of DNA from the DEAE membrane was approximately 60-70%, but precipitously less efficient with fragments above 4 kb.

Larger restriction fragments above 4 kb were recovered either by electroelution into a dialysis sac as described (Size Fractionation of D. melanogaster DNA, P. 73) or by using low melting point agarose (LMP). The LMP agarose was cast as normal agarose except solidification was at 4 °C. Restriction fragments were resolved electrophoretically at room temperature and specific fragments were excised with a scalpel. The gel slice was placed in a 1.5 ml Eppendorf tube and melted by heating at 70 °C for 10 minutes and sufficient TE was added to a final volume of 0.7 ml. The supernatant was extracted twice with equal volumes of phenol. The aqueous phase was transferred to a clean 1.5 ml Eppendorf tube and re-extracted twice with equal volumes of 1:1 phenol:chloroform (v/v) and once with chloroform alone, and the volume of the aqueous phase was then reduced by repeated extraction with n-butanol. Ammonium acetate was added (50% volume), mixed by vortexing and the DNA precipitated by addition of two volumes of ethanol. The pellet was rinsed several times with 70% ethanol, dried briefly in vacuo and redissolved in a small volume of TE.
The concentration of the fragment was approximated by gel electrophoresis against known marker DNA.

Polyacrylamide Gels

The recovery of specific DNA fragments by elution from acrylamide gel slices has been described (see Maxam and Gilbert DNA Sequencing, P. 66).

SOUTHERN TRANSFER

Transfer was performed essentially as described by Southern (1975) with minor modifications discussed below. To facilitate transfer, the DNA was partially depurinated by submerging the agarose gel in 250 ml of 0.25 M HCl for about 30 minutes and then briefly rinsed several times in tap water (Alwine et al., 1979). The DNA was denatured and cleaved at the depurinated residues in situ with 250 ml of 0.5 M NaOH, 1.5 M NaCl for 30 minutes followed by repeated rinses in tap water. The gel was then neutralized in 250 ml of 1.5 M ammonium acetate, 0.02 M NaOH for 30 minutes (Frei et al., 1983) and was then placed upside down on 2-3 sheets of Whatman 3MM paper saturated with the same buffer (Wahl et al., 1979). A piece of Hybond nylon cut to size was rinsed in the 1.5 M ammonium acetate, 0.02 M NaOH solution and then placed on top of the gel followed by several sheets of dry Whatman 3MM and a stack of paper towels. Transfer of the DNA onto the nylon membrane was essentially complete after approximately 2 hours (Meinkoth and Wahl, 1984).

For acrylamide gels, the DNA fragments were usually small enough to obviate the acid treatment. The gel was denatured and neutralized with the same solutions and then it was transferred onto a dry sheet of Whatman 3MM paper to facilitate handling. To prevent the gel from adhering irreversibly to the nylon filter during transfer, the gel was covered with a thin layer of 0.5% agarose just prior to overlaying with the membrane (Gergen et al., 1981). The assembly was then placed on top of several sheets of Whatman 3MM wicks connected to a reservoir containing approximately 500 ml of 1.5 M ammonium acetate, 0.02 M NaOH.
Transfer was carried out for approximately 16-24 hours.

After transfer, the filter was dried in air and the DNA immobilized onto the filter by irradiation with UV for 3 minutes (Amersham). The filter was washed in 0.1% SDS at 65 °C for 20 minutes as a substitute for prehybridization. Hybridization to radiolabelled probes and conditions for washes to remove excess probe were performed as described (see Screening Libraries, P. 93).

RESTRICTION MAPPING

Low Resolution Restriction Endonuclease Mapping

Low resolution mapping was routinely performed by single and multiple restriction digests as described by Danna (1980). In most cases, λ and cosmid mapping data were derived from composite maps based on subclones of smaller restriction fragments. Electrophoresis in agarose or polyacrylamide gels was used routinely to display the digestion products.

Restriction Endonuclease Mapping by Partial Digestion

Higher resolution mapping for smaller DNA fragments was performed by partial restriction endonuclease digestion as described by Smith and Birnstiel (1976). DNA fragments to be mapped were gel purified and radiolabelled at their 3' ends with [α-32P]-deoxyribonucleoside-5'-triphosphates and the Klenow enzyme (see Maxam and Gilbert Sequencing). The fragments were suspended in 50 ul of the appropriate 1 x restriction buffer with about 1 ug of unlabelled calf thymus DNA as a carrier. Approximately 1 unit of the appropriate restriction endonuclease was added, mixed and incubated at 37 °C (or 65 °C for TaqI). Aliquots of 10 ul were removed at various times and the reaction stopped by the addition of 1 ul of stop mix (0.25 M EDTA, 12.5% Ficoll, 0.05% bromophenol blue and 0.05% xylene cyanol). The partially digested products were resolved by electrophoresis in a 5% polyacrylamide gel and the radiolabelled fragments were detected by autoradiography.
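In a partial digest of a fragment labelled at one end only, every labelled band extends from the labelled end to one restriction site, so the band lengths read from the autoradiograph give the site positions directly. The Python sketch below shows this bookkeeping. It assumes that the longest labelled band is the uncut fragment; the function name and the band sizes in the example are hypothetical.

```python
def map_sites_from_partials(labelled_lengths):
    """Convert labelled band lengths (bp) into a map measured from the labelled end.

    Returns (site_positions, full_length, intervals), where intervals are the
    distances between consecutive sites, including both ends of the fragment.
    """
    lengths = sorted(set(labelled_lengths))
    if not lengths or lengths[0] <= 0:
        raise ValueError("band lengths must be positive")
    full_length = lengths[-1]   # assumed to be the undigested fragment
    sites = lengths[:-1]        # every shorter band ends at a site
    bounds = [0] + lengths
    intervals = [end - start for start, end in zip(bounds, bounds[1:])]
    return sites, full_length, intervals


# Hypothetical band sizes read from an autoradiograph, in bp.
sites, full_length, intervals = map_sites_from_partials([1200, 150, 480, 900])
print(sites)        # [150, 480, 900]
print(intervals)    # [150, 330, 420, 300]
```

The intervals are the fragment sizes expected from a complete digest with the same enzyme, which provides a simple check on the map.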
A Novel Restriction Endonuclease Mapping Method by Indirect Labelling with Sequencing Oligonucleotide Primers

I have developed an alternative restriction mapping method based on the "indirect end-labelling" technique used by Wu (1980) to map the DNase I hypersensitive sites 5' to the Drosophila heat shock genes hsp70. The advantage of the Oligonucleotide Indirect Labelling method (OIL) is that purification of end-labelled fragments is unnecessary and no special "mapping" vectors are required, such as those utilizing sequences (Little and Cross, 1985) or SP6, T7 and T3 promoters (Wahl et al., 1987) as reference points.

[Fig. 10 diagram: release of the insert with PvuII and PstI, HpaII partial digestion, resolution by gel electrophoresis, Southern hybridization with F1 or Pex, and autoradiography.]

Fig. 10. Restriction mapping by oligonucleotide indirect labelling method. The insert (thin line) is first released from the vector (thick line) by restriction cutting, in this example, with PvuII at the 5'-end including the priming site and with PstI at the 3'-end within the polylinker cloning sequence. It is important to note that the 5' cutting site must be upstream from the sequencing priming site. A variety of such ideal sites are available, which are otherwise very rare cutters (consult Yanisch-Perron et al., 1985). The 3' cutting site can usually be conveniently found in the polylinker sequence, but sites internal to the cloned inserts can also be used, as long as the region to be mapped is included in the released fragment. The distribution of sites for a particular restriction enzyme within the released insert is then determined by partial digestion (HpaII sites in this particular example), and the mixture of fragments (including both insert and vector) is then resolved by gel electrophoresis, followed by Southern transfer onto a sheet of membrane. Fragments specific to the insert spanning from the fixed PvuII site to the various HpaII sites are then indirectly labelled by hybridization to the sequencing primers, either F1 or Pex (table I). These specific fragments are depicted as thick rectangular blocks in the autoradiography cartoon. Since the hybridization is specific to the insert, fragment purification away from vector sequences is unnecessary. As in all other published restriction mapping methods, the only limitation is the extent of resolution of the fragments by gel electrophoresis.

A recombinant plasmid (pUC or pEMBL) was digested with one of the restriction enzymes having a rare recognition site 5' to the sequencing primer annealing site at nucleotides 379-395 (Yanisch-Perron et al., 1985). It was then redigested with another endonuclease which cleaves 3' to the cloned insert. Changes in restriction buffers were accomplished by dialysis on Millipore filters (see Restriction Endonuclease Digests, P. 96). The DNA was then divided into several aliquots, and one-tenth volume of the appropriate 10 x restriction buffer was added to each tube. Respective endonucleases for mapping (1-3 units) were then added to each tube, mixed and incubated at 37 °C for most of the enzymes used or 65 °C for TaqI. Aliquots of the digests were removed at time points between 3 and 30 minutes and transferred to 1.5 ml Eppendorf tubes containing several ul of a stop mix (0.25 M EDTA, 12.5% Ficoll, 0.5% bromophenol blue and 0.5% xylene cyanol). The digested products were resolved in 0.8 to 1.0% agarose gels containing 1 ug/ml ethidium bromide. After electrophoresis, the DNA was transferred onto Hybond nylon for about 2-4 hours by the method of Southern (1975) as modified by Meinkoth and Wahl (1985). The filter was then hybridized to the forward sequencing primer, F1, radiolabelled with [γ-32P]ATP and T4 polynucleotide kinase.
Since F1 does not abut the end of the released restriction fragment, care was taken to choose mapping enzymes that only cleave 3' to the priming site. The partially digested products specific to the cloned insert were detected by autoradiography for 6 hours to overnight (fig. 10). The entire procedure can be accomplished in 2-3 days, which is slightly faster than the well-established Smith and Birnstiel method. The latter method generally required more manipulations and a longer exposure time for autoradiography.

MOLECULAR CLONING IN PLASMID AND DOUBLE-STRANDED M13 BACTERIOPHAGE VECTORS

Restriction Endonuclease Digestion of Vector DNA

Approximately 5 ug of vector DNA was digested with a 2-4 fold excess of the appropriate restriction endonuclease in 50 ul of 1 x restriction buffer at 37 °C. After 2 hours, an additional 1-2 units of the restriction endonuclease were added and the digest was incubated further at 37 °C to ensure that the vector was cut to completion. The enzyme was inactivated by heating at 68 °C for 15 minutes.

Dephosphorylation of Vector DNA

To prevent religation of the vector DNA, the 5' phosphate was removed by calf intestinal phosphatase (CIP). It was noted that CIP did not absolutely require CIP buffer for activity but can also function efficiently in all restriction buffers (C. H. Newton, personal communications); however, the relative efficiencies under the two different sets of conditions have not been systematically explored. In general, about 0.5 units of the enzyme were added directly into the restriction mix for each ug of DNA. The dephosphorylation reaction was conducted at 45-55 °C for 30 minutes (BMC data). The enzyme was then inactivated by adding trinitriloacetic acid to 10 mM (a Zn chelator) and heating at 68 °C for 15 minutes (Frishauf et al., 1983). Undesirable salts and monophosphates, which can inhibit DNA ligase, were removed by dialysis on Millipore filter discs over TE for 30 minutes (see Restriction Digests, P.
96) and the vector DNA was diluted to a final concentration of 0.1 mg/ml with TE and stored as a stock at -20 °C.

DNA Ligation

Bacteriophage M13 vector DNA was used in amounts of 20-50 ng, while plasmid vectors were used in amounts of 100-200 ng, per ligation reaction. The efficiency of recovering recombinant clones was empirically determined by mixing various quantities of insert DNA with a constant amount of vector and ligating them in a 10 ul volume containing 1 ul of 10 x T4 ligase buffer (66 mM Tris-Cl pH 7.5, 5 mM MgCl2, 5 mM DTT) and a final concentration of 0.4 mM ATP. For intermolecular ligation involving "sticky-ends", 0.1 units of T4 DNA ligase was added and the reaction incubated at 12-16 °C overnight; if the ligation involved blunt-ends, 1 unit of T4 DNA ligase was added instead and the reaction incubated at 4 °C overnight.

A recently published method was also used to enhance the efficiency of both "sticky-end" and blunt-end intermolecular ligation (Hayashi et al., 1986). The T4 DNA ligase buffer was made as a 5 x stock (0.33 M Tris-Cl pH 7.6, 33 mM MgCl2, 50 mM DTT, 0.5 mM ATP, 50% PEG-8000 and 0.75 M NaCl) and the reaction was carried out at 16 °C for 30 minutes to 4 hours with either 0.6 units ("sticky-ends") or 7 units (blunt-ends) of T4 DNA ligase. The ligation products can be used to directly transform E. coli with satisfactory results without further manipulations.

PREPARATION OF [3'-32P] tRNA

Synthesis of Cytidine 3',5'-Diphosphate

The labelling of tRNA molecules at the 3' end was conducted as described by Tanaka et al. (1980) and modified according to Dr. D. L. Cribbs (unpublished). Cytidine 3'-monophosphate (6.1 nm) was phosphorylated at the 5' end with approximately 3 pmoles of [γ-32P]-ATP and 2 units of T4 polynucleotide kinase in a 10 ul mixture containing 10 mM Tris-Cl pH 8.3, 10 mM MgCl2 and 10 mM DTT.
The reaction was incubated at 37 °C for 60 minutes and then inactivated by heating in a boiling water bath for 1 minute.

RNA Ligase-Catalyzed Addition of [5'-32P]-pCp

1-2 ul of the product above, [5'-32P]-pCp, was used to radiolabel the 3' end of 1-2 ug of tRNA (purified tRNAs were a gift from Dr. I. C. Gillam and total 4S RNA was a gift from V. Dartnell) by using T4 RNA ligase in a 30 ul reaction volume containing 50 mM HEPES pH 8.3, 10% DMSO, 15% glycerol, 10 mM MgCl2, 3 mM DTT and 5 mM ATP. The ligation reaction was conducted at 4 °C for 16-24 hours and terminated by adding 1 ml of 2 x SSC.

DNA DOT BLOTS

Nylon membranes were washed in distilled water and then rinsed in 1 M ammonium acetate. They were then placed on a platform consisting of dry paper towels on the bottom and moist Whatman 3MM paper on top. The assembly was covered in Saran wrap to prevent drying of the membranes. Plasmid DNA (both single- and double-stranded) was denatured in 0.3 to 0.4 M NaOH for 10 minutes at room temperature and then chilled on ice. Just prior to spotting onto the membrane, the DNA was diluted with an equal volume of cold 2 M ammonium acetate. The samples were taken up with a Pipetman and the DNA was delivered manually onto the membranes as small spots of 2 to 3 mm. The DNA spots were rinsed with drops of 1 M ammonium acetate and then the membranes were washed in 200 ml of 4 x SSC to remove dust particles, followed by drying in air (Kafatos et al., 1979). The DNA was then immobilized onto the nylon by UV irradiation for 3 minutes (Amersham). The conditions for prehybridization, hybridization and washes to remove excess probe were identical to those described for screening genomic libraries (P. 93).

ORIENTATION OF tRNA4,7Ser GENE TRANSCRIPTION

DNA fragments containing tRNA4,7Ser genes were cut with two different restriction endonucleases and cloned into either M13 or pEMBL vectors.
The two different restriction ends permit cloning of the DNA fragment in only one orientation, into the compatible ends within the vectors. tRNA4,7Ser genes cloned into bacteriophage M13 were transformed into E. coli strains JM101 or JM103 and the transformants were plated on YT plates. In the case of pEMBL vectors, the transformants were superinfected with the helper phage IR1 before plating the in vivo packaged virions. Either plaque lifts or DNA dot blots were prepared from the virions and probed with the tRNA4,7Ser-specific oligonucleotides GT6 and GT7, which correspond to nucleotides 1 to 14 of the non-coding strand and nucleotides 40 to 58 of the coding strand, respectively. The direction in which the tRNA4,7Ser genes were transcribed can be deduced from the hybridization results with the strand-specific oligonucleotides coupled with the orientation of the cloned insert in question (fig. 11).

[Fig. 11 diagram: an EcoRI-XbaI fragment shown in its two possible orientations, with the annealing sites of GT6 (5'-GCAGTCGTGGCCGA-3', +1 to +14) and GT7 (5'-CGCTCCCAGAGGGAATCTG-3', +58 to +40).]

Fig. 11. Transcription orientation of tRNA genes. Restriction fragments containing tRNA genes are excised with two different enzymes (for example, EcoRI and XbaI). The fragment is then force-cloned in one orientation in vectors capable of producing single-stranded DNA (M13 and pEMBL). One strand of the DNA is extruded into the growth medium as virions. The purified DNA is used as template in either sequencing reactions or dot blot hybridization using the two tRNA4,7Ser gene-specific oligonucleotide primers GT6 and GT7. At the bottom of the figure are the two possible orientations of a hypothetical tRNA gene. The possibilities can be differentiated by their hybridization behaviour with respect to the two oligonucleotides.
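The logic of fig. 11 can be sketched in a few lines of Python. This is an illustrative sketch only and not part of the original procedure: the oligonucleotide sequences are read from the labels in fig. 11, the 58-nucleotide gene segment is transcribed from fig. 13, and the function names (`revcomp`, `anneals`, `packaged_strand`) are hypothetical. An exact-match substring search stands in for hybridization.

```python
# Strand-specific probing, modelled as exact-match string search.
# Sequences are transcribed from figs. 11 and 13 and may carry transcription errors.

GT6 = "GCAGTCGTGGCCGA"        # same sequence as the tRNA (non-coding strand), +1 to +14
GT7 = "CGCTCCCAGAGGGAATCTG"   # same sequence as the template strand, +58 to +40

# First 58 nt of the 474 gene, written as the tRNA-like (non-coding) strand.
TRNA_LIKE = "GCAGTCGTGGCCGAGTGGTTAAGGCGTCTGACTAGAAATCAGATTCCCTCTGGGAGCG"

_COMP = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return seq.translate(_COMP)[::-1]


def anneals(oligo, single_strand):
    """True if the oligo can base-pair with the single-stranded DNA."""
    return revcomp(oligo) in single_strand


def packaged_strand(virion_dna):
    """Infer which gene strand the virion carries from the two probes."""
    hits = (anneals(GT6, virion_dna), anneals(GT7, virion_dna))
    if hits == (True, False):
        return "template"      # GT6 pairs with the coding (template) strand
    if hits == (False, True):
        return "tRNA-like"     # GT7 pairs with the non-coding strand
    return "undetermined"


print(packaged_strand(TRNA_LIKE))           # tRNA-like
print(packaged_strand(revcomp(TRNA_LIKE)))  # template
```

Combining the inferred strand with the known, forced orientation of the insert in the vector then gives the direction of transcription relative to the two cloning sites.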
CHAPTER I

Characterization of the Entire tRNASer Gene Cluster at Polytene Bands 12DE by Chromosomal Walking

The two hybrid tRNASer gene sequences, 474 in pDt73 and 774 in pDt16R, that were studied by Cribbs (1982) have been hypothesized to be products of gene conversion between the bona fide 444 and 777 genes. However, based on these limited results, the alternative possibility implicating standard reciprocal exchanges between the two bona fide gene types cannot as yet be dismissed. The two alternatives can be distinguished by the fact that gene conversion involves unidirectional transfer of genetic information, and hence the hypothetical reciprocal hybrid sequences (747 and 447) would not be expected. On the other hand, reciprocal exchanges involve bidirectional transfer of genetic information and, barring differential selection on the recombinants, the process should yield an equal number of reciprocal hybrid genes. As an initial step in determining whether gene conversion can be sustained as a viable hypothesis, the entire gene cluster at 12DE was characterized by a chromosomal walk. The results, along with those obtained for the autosomally linked tRNA4,7Ser genes (D. A. R. Sinclair, unpublished observations), should provide the critical insight into whether these hybrid sequences are reciprocal products or not.

The walk at bands 12DE was initiated by using plasmids pDt27R, pDt73, pDt16R and pDt17R as entry probes ("R" for these small plasmid clones designates reclones of single HindIII Drosophila inserts as described in Dunn et al., 1979b). Each entire plasmid, or a purified equivalent restriction fragment, was radiolabelled by nick-translation and then used to screen genomic lambda or cosmid libraries by hybridization to filter-bound recombinant clones (see Methods and Materials).
DNA isolated from such putative genomic clones was initially characterized at low resolution by mapping with hexanucleotide-recognizing restriction endonucleases. Unique restriction fragments that were the furthest from the initial probes were purified and then used in turn as radiolabelled probes to isolate adjacent DNA segments in both directions further along the chromosome. The identification of an overlapping clone would thus represent a single step in the chromosomal walk. The walk was continued stepwise in this way, and the final nested set of overlapping genomic clones should, when properly aligned with their restriction sites coincident, yield a composite map representative of a large chromosomal region. The validity of the chromosomal organization, as inferred from the ensemble of recombinant molecules, can be assured to a high degree of confidence by the consistency of the restriction maps in the overlapping segments and by genomic Southern blots.

All λ and cosmid recombinant molecules derived from the Oregon-R libraries are suffixed with "R" (not to be confused with the designation used for the entry probes above); otherwise, they are derived from the Canton-S library (Maniatis et al., 1978). One of the reasons for using recombinant libraries from both the Oregon-R and Canton-S Drosophila strains is that the genes residing in pDt73 (474) and pDt16R (774 and 777) represent three of the four permutations of the tRNA4,7Ser structural gene sequences thus far encountered at 12DE. I have compared these homologous genes between the two fly strains to ascertain empirically the dynamics of genetic exchange at this chromosomal site. It was reasoned that if the elapsed time separating the different strains is sufficiently long and if the genetic exchange process is dynamic, then perhaps other permutational forms may exist at the homologous sites.
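The reasoning behind this comparison can be made concrete with a small sketch. It is illustrative only: each gene is reduced to its three diagnostic positions, written as a string of 4s and 7s, and the helper names are hypothetical. The reciprocal partner of a hybrid is taken to be the same gene with every diagnostic position swapped, which is the product a reciprocal exchange between 444 and 777 would generate alongside it.

```python
# Gene types are encoded by their three diagnostic positions, e.g. "474".
PARENTALS = {"444", "777"}
_SWAP = str.maketrans("47", "74")


def reciprocal(gene):
    """Partner expected from a reciprocal exchange between 444 and 777."""
    return gene.translate(_SWAP)


def missing_reciprocals(observed):
    """Map each observed hybrid to its reciprocal partner if that partner is absent."""
    hybrids = sorted(set(observed) - PARENTALS)
    return {h: reciprocal(h) for h in hybrids if reciprocal(h) not in observed}


# Types named in the text for pDt73 and pDt16R.
observed = {"474", "774", "777"}
print(missing_reciprocals(observed))  # {'474': '747', '774': '447'}
```

Under the gene conversion hypothesis the returned dictionary is expected to stay non-empty as more of the cluster is characterized; recovery of 747 or 447 would instead favour reciprocal exchange.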
The results showed that this gene cluster is composed of eight tRNA4,7Ser and six tRNAArg genes. Despite repeated efforts, this chromosomal site remained as four separate domains and has not been joined successfully into a single coherent region. One reason could be that this region is dense with repetitive sequences, which may not be easily maintained in most E. coli hosts used in library construction. None of the X-linked tRNASer genes recovered showed the configurations anticipated from reciprocal exchanges, and the homologous genes from Canton-S and Oregon-R remained identical in their structural sequences. A portion of the results has contributed data published in Cribbs et al. (1987b).

RESULTS

1. (A) Chromosomal Walk in the pDt73 Region

At least 33.5 kb of contiguous sequence was isolated, almost all from the Canton-S λ library (Maniatis et al., 1978). Each positive phage clone occurred at the expected frequency of one in 12,000 (or one per genome) before impasses in both directions were encountered (fig. 12). In an attempt to overcome the impasse, the 2.0 kb EcoRI fragment (coordinate 4.5 to 6.5) was isolated from λ736 as a probe to screen a cosmid library. This library contained BamHI-cut Oregon-R DNA cloned into the cosmid vector pJB8. Of 40,000 colonies screened, one isolate was obtained. However, this isolate, cos40.1R, overlaps with the λ clones by ~25 kb, giving only 1.5 kb of new sequence extending to the left (coordinate 0 to 1.5). An attempt to establish further overlapping clones using the 0.5 kb BamHI+HincII fragment (from coordinate 0 to 0.5) to screen both the Canton-S library and a newly constructed Oregon-R lambda library failed to reveal any positive phage.

A single tRNASer gene in the molecular walk was subcloned from λ731 as a 4.2 kb EcoRI fragment into pUC13 and sequenced with the gene-specific primers GT6 and GT7.
GT6 is identical in sequence to nucleotides +1 to +14 of the non-coding strand of the tRNA4,7Ser genes (that is, corresponding to the tRNA sequence), while GT7 corresponds to nucleotides +40 to +58 of the coding (template) strand. The primers yield complementary sequences within the structural genes (except at the priming sites) in addition to the 3' and 5' flanking sequences, respectively. To confirm the sequence data, the 3.7 kb HindIII+EcoRI restriction fragment containing the tRNA gene was also subcloned into pEMBL8+ and sequenced using oligonucleotide F1. As in the original Oregon-R isolate in pDt73, the Canton-S gene is also a hybrid 474 gene based on the three diagnostic nucleotides (fig. 13). The gene is designated as pCS474. Comparison of the 5'-flanking sequences in the two fly strains shows only 2.8% divergence. Differences are predominantly single nucleotide substitutions or small deletions of one to two nucleotides. A poly-T putative termination signal occurs 19 base pairs 3' to the structural gene in pCS474 but was removed in pDt73 (Cribbs, 1982) during cloning as the result of a fortuitous HindIII site (beginning at nucleotide 250 in fig. 13) between the

Fig. 12. Molecular walk in the pDt73 domain. Approximately 33.5 kb was obtained before impasses were encountered. The coordinates of the walk are shown in the top line, measured in kilobases (kb). The dashed line shows the relative location of the entry probe, pDt73, in the walk. The single hybrid tRNASer gene, 474, is depicted as an arrowhead pointing in the direction of transcription. The BamHI (B), EcoRI (E) and HindIII (H) restriction sites in the chromosomal region are shown above the thick line. Below, the series of overlapping λ phages from the Canton-S library and the cosmid clone from the Oregon-R library are individually identified with numbers. The tick marks at the bottom represent the distribution of the sites for PstI, HincII, SstI and XbaI.
The reiteration of these four restriction sites, beginning at coordinate 24, marks the left-most boundary of the Stellate sequences described in the text.

[Fig. 12: restriction map of the pDt73 walk; graphic not reproduced.]

+50   tttatattta gttatcagtt ctgsaattcc aatttatatt atcagcttag
+100  attttgcaca agatatggaa aatacttttt gtttttgtaa attaatataa
+150  tactcttaac tttatattag tttcttaaat tttattgata ttttttttgc
+200  gcatatatca agGCAGTCGT GGCCGAGTGG TTAAGGCGTC TGACTAGAAA
+250  TCAGATTCCC TCTGGGAGCG TAGGTTCGAA TCCTACCGAC TGCGtttgta
+300  agcttaattt gtatttttac aaacaaaaaa aaatactatt
      tagcctcacc gcggaaattg tatatgtaag ataattatag
      tgcatt

Fig. 13. DNA sequence of the tRNASer 474 gene from Canton-S is identical to its homologue in Oregon-R. The structural sequence of the gene is shown in capital letters. The three diagnostic nucleotides at positions 16, 34 and 77 within the gene are underlined. The HindIII site (AAGCTT), situated in the trailer sequence, is at nucleotide 250 in the diagram (dotted underline).

structural gene and the poly-T sequence. Hence, no comparison of 3'-flanking sequence diversity is possible.

(B) Interspersed and Tandemly Repeated Elements

For convenience, many of the original walking probes were simply prepared as EcoRI fragments. However, probing with some of these fragments almost always retrieved a confounding background of "false positive" clones bearing varying degrees of cross-homology with the original probes, suggesting that the pDt73 region contains many sequences that are present elsewhere in the genome. A subset of these sequences, particularly those in the immediate vicinity of the 474 gene, were usually short (<1.0 kb) but highly redundant or shared only a limited degree of homology with other sequences in the genome.
When probes containing these repeated elements were radiolabelled and hybridized to Southern blots of restriction enzyme-cleaved genomic DNA, autoradiography revealed only the bands corresponding to the unique sequence DNA fragments present in the probe. The repeated sequences present in the probe hybridized weakly throughout the genome and so contributed only a feeble background to the autoradiograph (for an example, see fig. 29). Repeated sequences fitting this description have been characterized by Pirrotta et al. (1983) and were termed "repetitious", as opposed to the longer "repetitive" elements sharing extensive and more conserved homology.

In order to obtain unique sequences as walking probes, each recombinant clone was subjected to fine structure mapping using PstI, HincII, SstI and XbaI (fig. 12, bottom) and repetitious elements were localized by "reverse" Southern hybridization (Pirrotta et al., 1983). In this procedure, the restriction enzyme-cleaved recombinant molecules were resolved by gel electrophoresis and the DNA fragments transferred onto a sheet of cellulose nitrate or nylon membrane. Total genomic DNA radiolabelled by nick-translation, serving as the probe, was hybridized to the filter-bound DNA; fragments containing repetitious elements gave a stronger signal than expected due to the additive contribution from many genomic sites (for an example, see fig. 60). From such mapping and hybridization studies, the walk can be partitioned into two zones based on the organizational pattern of the repeats, with an abrupt transition boundary established approximately at coordinate 24. The repeated sequences to the left of the boundary appear to be interspersed within the walk and elsewhere in the genome, whereas to the right of the boundary the repeated elements are tandemly arranged. The interspersed repeats are illustrated by two cases below, which show opposite reiteration patterns.
The 1.5 kb EcoRI+BamHI fragment from coordinate 0 to 1.5 contains sequences that are dispersed throughout the region left of the boundary, as shown by its intense hybridization encompassing the entire cos40.1R clone (fig. 14, panel C, lane 7). In contrast, the distribution of the 2.0 kb EcoRI fragment (coordinate 4.5 to 6.5) appears to be restricted to an 8.1 kb region in the cosmid clone (coordinate 14.5 to 23.6) (fig. 14, panel B, lane 7). Both fragments, in addition, share limited homology with λ746, derived from a chromosomal walk in a separate region at 12DE (fig. 14, panels B and C, lanes 1-6).

To the right of the boundary, the region is composed of 1.3 kb tandem repeats characterized by the recurrence of restriction sites for the enzymes HincII, SstI and XbaI (fig. 12), but refractory to cutting by the more commonly employed enzymes EcoRI, HindIII, BamHI or PstI. When one repeating unit was isolated as an SstI fragment from λ735 and hybridized to genomic DNA cleaved with EcoRI, HindIII and BamHI, predominantly one intense band larger than 23 kb was detectable within an hour of exposure (data not presented). There are also some minor diffuse bands of reduced intensity in approximately the 3.0 kb and 4.5 kb range in both the HindIII and the BamHI digests. When the genomic Southern hybridization was repeated using DNA from the D. melanogaster mutant Df(1)glfB/In(1)AM, heterozygously deficient for the chromosomal bands 11F10-12F1, the hybridization intensity of the large molecular weight band was decreased by 50% (fig. 15). Thus, a large fraction of these repeats appears to exist as a large contiguous cluster on the X-chromosome. This mapping experiment also shows that the smaller 3.0 and 4.5 kb hybridization bands are probably Y chromosome-specific, since they are only evident in the male DNA [In(1)AM/Y]. When the same SstI repeat unit was used to probe genomic DNA from the distantly related sibling species D. erecta, D. teissieri and D.
yakuba, three relatives of D. melanogaster (see phylogenetic tree, fig.

Fig. 14. Interspersed repeated sequences shared between the pDt73 and pDt17R domains. Panel A shows restriction digests of λ746, derived from the pDt17R molecular walk. (Lane 1): HindIII; (lane 2): HindIII + BamHI; (lane 3): EcoRI; (lane 4): EcoRI + BamHI; (lane 5): BamHI; (lane 6): HindIII + EcoRI. (Lane 7) is cos40.1R cut with BamHI + EcoRI. Some faint bands present in the digests are probably the result of minor cross-contamination by λ1722 DNA (a clone overlapping with λ746, see fig. 16) during loading. Panel B shows the hybridization pattern of the 2.7 kb EcoRI fragment isolated from the pDt73 chromosome walk at coordinates 1.5 to 4.2. This fragment is repeated at least once internally in the same domain, as shown by its hybridization to the 3.9 kb EcoRI fragment at coordinate 18.6 to 22.5 (lane 7), but it also shows very strong homology to a region in the pDt17R walk mapped to coordinate 25 (see fig. 16). Panel C shows the hybridization pattern of the 1.5 kb BamHI+EcoRI fragment from coordinates 0 to 1.5 in the pDt73 chromosome walk. This fragment has a hybridization pattern just the reverse of that of the probe described in panel B. It contains sequences that are reiterated many times within the pDt73 domain (lane 7), but shows only limited homology with sequences in the pDt17R walk. The localization of these sequences in the pDt17R walk has not been precisely determined but there are at least two copies, one of which occurs near the 777 gene in λ746 (4.0 kb EcoRI band in lane 3) and the other, with much stronger homology, is located in a 3.5 kb EcoRI fragment >20 kb downstream from the 777 gene (lane 3). This smaller band was later shown not to be derived from λ746, but from the next overlapping phage λ1722, present as a cross-contaminant in the gel lane during loading.

Fig. 15.
Hybridization of a 1.3 kb SstI fragment corresponding to one repeat unit of the Stellate sequences to fly strains deficient for polytene bands 12DE. Top panel: lane 1 is DNA from In(1)AM/Y (male) (note that In(1)AM is a rearranged X-chromosome but has no deletion); lane 2 is DNA from Df(1)glfB/In(1)AM (female heterozygous for the deletion at 12D1-12F1) (see fig. 52 for cytology map); lane 3 is DNA from In(1)AM/In(1)AM (female with no deletion). All genomic DNAs were cut with HindIII prior to resolution by gel electrophoresis. The size markers are HindIII-generated λ DNA, as shown in the left-most gel lane (kb = kilobases). After gel electrophoresis, the DNAs were transferred to a sheet of Hybond nylon membrane and hybridized to a nick-translated SstI probe, corresponding to one unit of the tandem repeats. After removal of excess probe by washing (see Methods and Materials), the bands were detected by autoradiography for approximately 5 hours at -70 °C, and the membrane was then reprobed with nick-translated pDt5 as an internal control to monitor the total amount of DNA loaded in each lane. This second probe contains a 4.2 kb HindIII segment of Drosophila DNA derived from polytene bands 23E on chromosome 2L and should not be affected by the deletion (Cribbs et al., 1987b). After removal of excess probe as above, the membrane was exposed for 5, 10 and 15 hours. The figure shown here is from the 10 hr exposure. The two bands labelled "A" and "B" are the main Stellate band and pDt5, respectively. Middle panel shows the quantitation of hybridization bands in the Df(1)glfB/In(1)AM mutant as determined by scanning with a Bio-Rad Video Densitometer (Model 620). Bottom panel is a similar scan, except that the genotype is In(1)AM/In(1)AM. The shaded areas indicated in the densitometer tracings were cut out and the areas compared by weighing (the actual peaks used in the weighing were about 4-fold the areas shown in the figure).
The results showed that approximately 50% of the hybridization intensity in band A was removed by the deletion at the 12D1-12F1 region, relative to the internal control pDt5. Since the response of the X-ray film may not be in the linear range, it is not certain whether the entire cluster of Stellate sequences has been removed on one homologue in the mutant Df(1)glfB/In(1)AM, but it does suggest that a large proportion of the Stellate cluster overlaps with the deleted region. In lane 1, containing male DNA, unique band clusters are observed at positions corresponding to ~3.0 kb and 4.5 kb to 5.0 kb, which are almost certainly Y-chromosome-specific regulatory sequences as discussed in Livak (1984).

28), there was total absence of hybridization (data not shown). The simplest explanation would be that these repeats have been acquired only recently in the melanogaster group.

The above observations bear some resemblance to those obtained by Livak (1984) with a similarly cloned sequence known as Stellate that is tandemly reiterated 200 times at polytene bands 12F. Due to the difference in the choice of restriction enzymes in the present mapping studies, it was difficult to determine if the sequences were identical to Stellate. However, both sets of sequences share a 1.3 kb repeated pattern of HincII sites and both show the conspicuous absence of restriction sites for the enzymes EcoRI, HindIII, BamHI and PstI. As well, the sequences in his studies are also species-specific, being confined to the melanogaster group. Recently, a clone, pSX1.3, containing one unit of the XbaI repeat was obtained from Dr. K. Livak. Hybridization experiments showed that it is homologous to the SstI fragment of λ735 (data not shown). Hence, it appears that the X-chromosome segment 12E to 12F in D. melanogaster is occupied by a large block of Stellate sequences.

2.
(A) Chromosomal Walk in the pDt17R Region

pDt17R was originally cloned as a 10 kb HindIII fragment propagated in the recBC E. coli host SF8, as reported by Dunn et al. (1979b). Subsequent culturing of the plasmid in another host, C600 (rec+), yielded a deleted variant containing only a 4.7 kb HindIII insert, reported by Cribbs (1982). It is possible that the original insert may have contained inverted repeats that were recognized and cleaved by the functional recBC+ restriction system in the latter E. coli strain (Boissy and Astell, 1984; Leach and Stahl, 1983; Nader et al., 1985). Nevertheless, using the deleted variant as a starting point, overlapping phages were initially obtained from the Canton-S library. Three lambda clones, λ171, λ746 and λ1722, were obtained, but at a low frequency of approximately 1 in 20 to 40 Drosophila genome equivalents of phages, before impasses were met (fig. 16). Many of the sequences contained within λ746 (see fig. 14 a and b), and particularly end fragments in λ1722 at coordinate 40 to 45, are repeated and share homology with cos40.1R in the pDt73 region discussed above. No more authentic overlapping clones were obtained from the Canton-S library despite exhaustive screening.

Fig. 16. Chromosomal walk in the pDt17R domain. At least 45 kb of DNA was collected from two different λ libraries before impasses were encountered. The coordinate line in kilobases is indicated at the top of the diagram. The residual DNA segments in the deleted version of pDt17R are shown as dashed lines "a" and "b", which were fused during the deletion events. Hence, there were at least two independent deletion events, between coordinates -12 to -16 and between coordinates -17 to -17.5, that consequently led to the deleted version. The restriction sites for the chromosomal domain are indicated above the thick line: B-BamHI; E-EcoRI; H-HindIII; S-SstI; X-XbaI. Three tRNA genes, 444*,
774, and 777 are found in this region, and their directions of transcription are indicated by small arrowheads. In addition, one tRNAArg gene is associated with a BamHI site, approximately 200 bp upstream from the 444* gene. The direction of its transcription is shown by the wide-stemmed arrow. The λ phages representative of the chromosomal region are shown at the bottom as overlapping lines and are denoted individually by numbers. All coding regions analyzed were derived from λ171.

[Fig. 16: restriction map of the pDt17R walk; graphic not reproduced.]

The phage λ1730R was obtained from an unamplified EcoRI partial library cloned into λEMBL4. The other phage, λ1731R, was obtained from an MboI partial library cloned into the BamHI site of λEMBL3. Further screening of the λEMBL3 library using the EcoRI+SalI fragment from coordinate 0 to 1.5 as a probe identified 17 different cross-reacting putative phages that are unlikely to be overlapping clones, based on limited restriction analyses and Southern hybridization. However, some or all of them could represent variants rearranged to different extents from authentic overlapping clones as a result of unstable sequences, although they were not analyzed further.

(B) Localization of tRNASer Genes Within the Walk

Of the three overlapping phages (λ1731R, λ171 and λ1722) representing the entire 44.5 kb, only λ171 contained tRNASer genes, clustered within an expected 10 kb HindIII fragment (see summary fig. 51, lane 8). Digests with either EcoRI or EcoRI and HindIII in combination showed three bands of hybridization, suggesting that at least three genes are contained within this phage (data not presented). The deleted fragments within pDt17R were mapped by comparing restriction sites within the walk or plasmid subclones of relevant regions and further confirmed by hybridization studies. The results are summarized in the chromosomal walk depicted in fig.
16. Of the two similar-sized residual fragments ("a" and "b") that escaped the deletion events, only fragment "b" hybridizes to both pDt17R and tRNA4,7Ser gene-specific probes (using both plasmids and gene-specific oligonucleotides). This is assumed to represent the remnant of pDt17R that contains the original tRNA7Ser gene. The other gene-containing EcoRI fragments of 4.6 kb and 1.8 kb that have been deleted in pDt17R were subcloned into pUC13 and designated pE4.6 and pE1.8, respectively. Both plasmids were digested with HaeIII, DdeI and HpaII and probed with oligonucleotide GT7. The Southern hybridization results were consistent with each plasmid containing only a single tRNASer gene.

The tRNA genes within the subclones were mapped by an improved strategy, the Oligonucleotide Indirect Labelling (OIL) method (see Methods and Materials, P. 102). pE4.6 was cleaved to completion with PvuII and PstI in combination to release the

Fig. 17. Restriction mapping of pE4.6 by the oligonucleotide indirect labelling method. The Drosophila DNA was first released by complete digestion with PvuII and PstI. The mixture of fragments was then partially cleaved with the "mapping" enzymes HinfI, TaqI and HpaII. All of these enzymes are known to cleave within the structural tRNA4Ser gene; with the exception of HpaII, they also cut within the structural tRNA7Ser gene. The partially digested products from various time points (3' to 30') were then resolved by electrophoresis in a 1.5% agarose gel. Panel A shows a typical result after gel electrophoresis, where a complex pattern of bands comprising both Drosophila and vector DNA is seen. The DNA was then transferred onto a sheet of Hybond nylon filter and probed with F1, the universal sequencing primer. Panel B shows the hybridization pattern with the sequencing probe, which anneals specifically to, and indirectly labels, only the Drosophila-specific DNA fragments close to the PvuII site.
Panel C shows the restriction map of this 4.6 kb fragment reconstructed from the hybridization pattern in (B). RI (EcoRI) marks the actual cloning sites flanking the Drosophila insert. The small dotted line to the bottom left of the restriction map indicates the F1 hybridization site. HinfI (•), TaqI (•) and HpaII (A) sites are shown above the map. The small arrowhead indicates the site of the tRNASer gene, as revealed by almost overlapping sites for the mapping enzymes, and points in the direction of transcription from 5' to 3'. Size standards used were generated from HindIII-cut λ and HinfI-cut pBR322 DNAs. The sizes corresponding to these fragments are shown on the right of panel B.

Drosophila insert. After inactivation of the enzymes by heating at 68 °C for 15 minutes and removal of the salts by micro-dialysis over TE, the digest was divided into three aliquots. Restriction buffers were appropriately adjusted and then 1-2 units of the "mapping" enzymes HinfI, HpaII and TaqI, which cleave within the tRNA4,7Ser structural gene, were added to each of the three respective reactions. At various time increments, small aliquots were removed and the restriction endonuclease inactivated by adding EDTA to a final concentration of 50 mM. The DNA fragments were resolved by agarose gel electrophoresis, transferred onto a sheet of Hybond nylon membrane for about 3 hours (Wahl et al., 1979; Meinkoth and Wahl, 1985) and then probed with 32P-radiolabelled universal sequencing forward primer, F1. The sizes of the hybridization bands thus reveal the spatial distribution of the restriction sites in question, relative to the end of the restriction fragment indirectly labelled with F1. The results of such a mapping experiment are displayed in fig. 17. A putative tRNASer gene is localized approximately 0.5 kb from the EcoRI site (or 0.6 kb from the PvuII end) based on the diagnostic HinfI, TaqI and HpaII cleavage patterns.
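The principle of the indirect end-labelling step can be illustrated with a short sketch. Because the probe anneals at one fixed end of the released fragment, every band seen after partial digestion has a length equal to the distance from that end to one cleavage site, and a gene is inferred wherever sites for all of the mapping enzymes nearly coincide. The band sizes below are invented for illustration and are not measurements from fig. 17; the 100 bp window and the function name are likewise assumptions.

```python
# Band sizes (bp) from partial digests probed at one fixed end.
# Each size equals the distance from the labelled end to a cut site.
# All numbers are hypothetical, chosen only to illustrate the logic.
bands = {
    "HinfI": [150, 610, 1900, 3200],
    "TaqI": [90, 640, 2500],
    "HpaII": [590, 4100],
}


def coincident_sites(bands, window=100):
    """Return (start, end) spans where every enzyme has a site within `window` bp."""
    enzymes = list(bands)
    first, rest = enzymes[0], enzymes[1:]
    spans = []
    for anchor in bands[first]:
        cluster = [anchor]
        for enz in rest:
            near = [s for s in bands[enz] if abs(s - anchor) <= window]
            if not near:
                break  # this enzyme has no site near the anchor
            cluster.append(min(near, key=lambda s: abs(s - anchor)))
        else:
            # Reached only if every enzyme contributed a nearby site.
            if max(cluster) - min(cluster) <= window:
                spans.append((min(cluster), max(cluster)))
    return spans


print(coincident_sites(bands))  # [(590, 640)]
```

With these invented values, the only place where all three enzymes cut within 100 bp of one another lies about 0.6 kb from the labelled end, which is how a candidate gene position would be read off the autoradiograph.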
Similarly, pE1.8 was digested to completion with PvuI and PstI, 5' and 3' to the cloned insert, respectively. The released insert was partially digested with HinfI, HpaII and TaqI, transferred onto nylon membrane and hybridized with F1 as described above. The putative tRNASer gene is estimated to be ~1.0 kb from the EcoRI site (or 1.12 kb from the PvuI end) and it is displayed in fig. 18 (bottom).

Both of the above gene-localization studies have also been independently corroborated by employing the well-established, but more cumbersome, Smith and Birnstiel method (Smith and Birnstiel, 1976). The 1.5 kb and 0.9 kb EcoRI+XbaI fragments from pE4.6 and pE1.8 respectively, predicted to contain tRNASer genes, were purified from agarose gels. They were then end-labelled at either the EcoRI or XbaI site with [α-32P]-dATP and [α-32P]-dCTP, respectively, with the Klenow enzyme. The HaeIII, HinfI, TaqI and HpaII restriction endonuclease sites within the fragments were then mapped by partial digestion under conditions as described in Methods and Materials (P. 101). The localization of the tRNASer genes by this latter method entirely agrees with predictions based on the OIL data obtained

Fig. 18. Localization of the tRNASer gene in pE1.8 by the oligonucleotide indirect labelling method. The 1.8 kb EcoRI Drosophila insert was first released by cutting with PvuI at ~80 bp 5' to the sequencing primer site and at the PstI site 3' to the insert within the polylinker cloning site. The mixture of DNA fragments was partially digested with the "mapping" enzymes HinfI, TaqI and HpaII, exploiting the fact that all three sites occur within the coding sequence of the tRNA4Ser gene (except the HpaII site, which does not occur in the tRNA7Ser gene). Aliquots were removed at various time points as indicated at the top of the autoradiograph (from 3' to 30') and the products resolved by electrophoresis in a 2% agarose gel.
The DNA fragments were transferred onto a Hybond nylon sheet and the 5' end of the Drosophila insert was indirectly and specifically labelled by the universal sequencing primer. The location of the tRNASer gene within the EcoRI insert is shown as a small arrow head at the left edge of the autoradiograph. At the bottom is the restriction map constructed from the sizes of the partial products. The ends of the Drosophila insert are indicated by the EcoRI cloning sites (RI). HinfI, TaqI and HpaII sites are indicated. The first HinfI and TaqI sites (closest to the priming site) are located within the polylinker and are omitted from the restriction map. The overlapping HinfI, TaqI and HpaII sites representing the gene are indicated by the arrow head below the restriction map, pointing in the direction of transcription. The dotted line to the left below the map indicates the primer site. Markers are HindIII-cut λ and HinfI-cut pBR322. Their sizes are shown on the right of the top panel.

[Fig. 18: autoradiograph and restriction map of the pE1.8 insert; RI, EcoRI cloning sites; scale bar, 200 bp.]

above (data not presented).

(C) Sequence Analyses of pE4.6 and pE1.8

Small aliquots of the two EcoRI+XbaI fragments used in the Smith and Birnstiel mapping studies described above were also cloned into the bacteriophage vector M13mp18.
Since the fragments carry two different restriction ends, they can be inserted in a predictable orientation within the polylinker sequence. This experiment served two purposes. First, the smaller size of the insert should reduce the probability of deletions during the propagation of templates in the E. coli (recA+) strains JM101 or JM103 for DNA sequence determination, since the DNA segments had previously proven unstable when propagated in C600 (Cribbs, 1982). Second, the direction of transcription of the tRNA gene can be deduced by hybridization with the strand-specific oligonucleotide probes GT6 and GT7, homologous to different parts of the tRNASer4,7 genes (see Methods and Materials, p. 107), which provides further supporting evidence for the mapping studies. The 1.5 kb EcoRI+XbaI fragment of pE4.6 could only be primed in dot blot hybridization and DNA sequencing with GT7, while GT6 gave only weak smears resulting from non-specific hybridization in both cases. Both sets of results, along with the mapping studies above, provide conclusive evidence that the tRNA gene is transcribed from the EcoRI site towards the XbaI site, as depicted in the molecular walk in fig. 16. The initial DNA sequence data also revealed an RsaI site 28 bp 5' to the gene, and this enzyme was used to generate a 200 bp fragment that was cloned in pEMBL8-. Single-stranded template was prepared from the plasmid by superinfection with IR1 and sequenced using the reverse primer, RI. Subsequent confirmatory data were obtained by the supercoil sequencing method using both gene-specific primers. The sequence shows a tRNASer4-type gene based solely on the three distinguishing nucleotides. However, the gene sequence deviates slightly from that expected by a C-T transition at the tip of the extra arm at nucleotide 213 (designated as the 444* gene), as shown in fig. 19.
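The reasoning behind the strand-specific probe test can be made concrete with a small sketch. It assumes nothing about the real GT6 and GT7 sequences, which are not given in this section; the template and the two probes below are placeholders chosen only to show that an oligonucleotide anneals to a single-stranded template when the template contains its reverse complement, so that only one of two opposite-strand probes can prime a given M13 clone.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def probe_anneals(template, probe):
    """True if the probe can base-pair with the single-stranded template,
    i.e. the template contains the probe's reverse complement."""
    return reverse_complement(probe).upper() in template.upper()


# Placeholder single-stranded template and probes (hypothetical, not GT6/GT7).
template = "GCAGTCGTGGCCGAGCGGTTAAGGCG"
probe_same_strand = template[5:20]                       # identical to the template strand
probe_opposite_strand = reverse_complement(template[5:20])  # complementary to it

print(probe_anneals(template, probe_same_strand))      # cannot pair with an identical strand
print(probe_anneals(template, probe_opposite_strand))  # pairs with the template
```

Because the orientation of the insert in the vector is fixed by its two different ends, knowing which probe anneals identifies the strand carried by the template and hence the direction of transcription relative to the EcoRI and XbaI sites.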
Also, a fortuitous box B promoter sequence, CGAAT, at position 229 is repeated three times at the 3' end of the structural gene (starting at position 245). Whether this duplicated promoter sequence has any influence on the transcription rate is not known. Within this EcoRI+XbaI fragment, there is also a BamHI site corresponding to a tRNAArg gene, with the restriction site constituting part of the coding sequence immediately 3' to the anticodon (coordinate 15 in fig. 16). This gene, designated pArg12.5, was characterized by C. Newton (personal communication) and will not be reported here.

Similarly, in both dot blot hybridization and DNA sequencing, the 900 bp EcoRI+XbaI fragment from pE1.8 could only be primed with GT7. This again indicates that the direction of transcription is from the EcoRI site towards the XbaI site. In this case, however, the gene is in the opposite orientation relative to the 444* gene. An MboI site conveniently located 25 bp 5' to the gene was identified from the initial sequencing and was used to generate small MboI inserts cloned into pEMBL8-. Positive clones were isolated and the plasmids were converted into single-stranded templates and sequenced using primer RI. This sequence was confirmed by the supercoil sequencing method described by Chen and Seeburg (1985) using the original pE1.8 as the template and the two gene-specific primers. The sequence shows a hybrid 774 gene and is designated as p17-774 (fig. 20).

3. (A) Chromosomal Walk in the pDt27R Region

The initial screens of the Canton-S and two different Oregon-R λEMBL3 libraries (which contained MboI partially digested DNA from either adult flies or tissue culture cells) with pDt27R were unsuccessful despite over 30 genome equivalents of phages being screened from each. However, one positive, λ272R, was eventually obtained from an unamplified EcoRI library cloned into λEMBL4 (fig. 21).
"Walking" probes prepared from this phage again failed to obtain any positives from any of the aforementioned libraries.  420R is a plasmid clone  containing a polymorphic 17 kb Hindlll fragment isolated from the  D. melanogaster strain  420 (a gift from C. Newton), which extended the walk to the right for another 10 kb. CosP273R contains a 42 kb Mbol insert extending to the left for another 30 kb. It was obtained after screening more than 20 genome equivalents of cosPneo clones. Since sequences from this  +50 agtatgttaa t c c t t t t a t t a t c c U c a a t g g a t a t t t c a atattggcaa + 100 t a a t t a t t g t agcatcattt gatagttaca a a t t a t g l a a a t t t t a g c g a Rsal  +150  cagtggaaaa gtgaaagtgg c t c g a c t t t c aagtacgtaa t t t g a c a c c a +200 gctataacaa gaaGCflGTCG TGGCCGflGIG GTTflflGGCGT CTGRCTCGRR +250 RTCRGRTTCC CTTTGGGRGC GTRGGTTCGR RJCCTRCCGG CTGCGgatcg +300  aatcgaattt t t t a c a c t t c gcatagagct  ctcaattaaa cttgatgaca aattaaagtc  a c c a t a t t t t ttatgtgcgc  cgtcagtggg  F i g . 1 9 . Nucleotide sequence of the 444* gene i n p£4.6 from the Oregon-R strain. The Rsal site (GTAC) at nucleotide 143 is indicated above the sequence, which has been exploited to generate further subclones. The structural gene is depicted in capital letters. The T mutation at nucleotide 213 is actually a C in the tRNA, corresponding to position 50 in the tRNA structural gene (non-standard nomenclature). The Box B promoter sequence, CGAAT, at position 228 within the gene is also repeated three times starting at nucleotide 244 (dotted underline).  
134  t l c a a t a t t a atgaaaaatc tgaaaaaatt aaccgagtca c g a c t t t a a a +100  tcacttgaat taatcgaatg aatgaactgc gattttggtc tataaattga Mbol  +150  acgtgtggaa gggggcacag aaaaatttct ggatctggat ggcaaatgtc +200 ttcgccaaGC flGTCGTGGCC GRG£GGTTRH GGCGTCTGRC TflGflflRTCfiG +250 RTTCCCTCTG GGRGCGTRGG TTCGfiflTCCT flCCGGCTGCG t t t a a t g c t a +300 t a a t t t t a g c t t a a t t t a g a t a c t t a c a c t gagaaaaaaa accgcaatga +350  tgcaatatca t t t a a a a a t a aataaaacag aaagtaatta a t t t t t t c a a  ccaaatcaga c t a a t c t t a g t  F i g . 20. Nucleotide sequence of the 774 gene in pEl ,8 from Oregon R. The Mbol site (CATC) used in subcloning is shown above the sequence. The structural gene is in capital letters and the three diagnostic nucleotides are underlined.  vis region appear to be rare and contain many repeats (see summary, fig. 51 lane 6), no more attempts were made to extend the walk further. (B). Localization of tRNA Genes Within the Walk Almost the entire  Drosophila insert contained within pDt27R was sequenced by C. Newton  (1984). It has been shown to contain two identical tRNA4$ genes clustered near one end of er  the insert. From alignment of restriction sites and Southern blotting, the two tRNA^ genes er  in the corresponding chromosomal walk have been localized to the HindlH+BamHI fragment approximately at coordinate 40 (fig. 21). Since the equivalent genes have not been sequenced, they can only be assumed to be tRNA4$ genes. From his sequence analysis, Newton also er  showed the cluster of four BamHI sites approximately 600 bp downstream from the tRNA4$  er  genes. As discussed in the pDtl7R chromosome walk, these also correspond to four duplicated t R N A S genes (designated as pArgl2.1 to pArgl2.4 in fig. 21 and Newton, 1984). The similar Ar  arrangements of BamHI sites within the walk is assumed to reflect  a coincidental  arrangement of the four tRNAArg genes. 
From Southern analysis of cosP273R, an additional tRNAArg gene (pArg12.6) has been localized within a 360 bp HindIII fragment at coordinate 28 (fig. 21). This fragment was subcloned into pEMBL8- and sequenced by the modified double-stranded method described by Hattori and Sakaki (1986). The sequence is displayed in fig. 22. The structural sequence is identical to that of pArg12.5 in the pDt17R walk, but both differ from the cluster of four duplicated tRNAArg genes at coordinate 40 by a C-T transition at position 13 within the A block promoter region. However, the recent cloning and sequencing of the entire family of tRNAArg-related genes showed that the C13 nucleotide is actually the exception, peculiar only to those duplicated genes within pDt27R (C. Hunter Newton, manuscript in preparation).

4. (A) Chromosomal Walk in the pDt16R Region

Only three λ phages were isolated from the Canton-S library before impasses were met. λ1161 occurred at the expected frequency of one per genome equivalent of phages, while λ1162 and λ1163 occurred at frequencies of one per 40 genome equivalents of phages (fig. 23).

Fig. 21. Chromosomal walk in the pDt27R domain. More than 56 kilobases of genomic DNA from this region were collected before impasses were met. The coordinate line in kb is shown at the top of the figure. The interruption in the map denoted by two slashes to the left indicates unmapped DNA, which is known to be composed largely of repeated DNA (see fig. 60, lane 6) and shows no detectable hybridization with total 4S RNA (a gift from V. Dartnell). The entry probe for initiating the walk is indicated by the dashed line. The chromosome region is shown as a thick line with its restriction sites (B, BamHI; E, EcoRI; H, HindIII) above. The two tRNASer genes are indicated by small arrow heads pointing in the direction of transcription. Five tRNAArg genes have been identified (wide stemmed arrows).
Four duplicated copies of a t R N A ^ gene are associated with the cluster of four BamHI sites 600-bp downstream from the t R N A ^ genes. These genes are expanded below the chromosome map and the relative sizes of the duplicated units are delimited by ticks. At the DNA sequence level, these boundaries are composed of a short direct repeat TAGCCCAA (see fig. 55). The Arg 12.1 and Arg 12.2 genes are 600 bp in size, while Argl2.3 and Arg 12.4 are 200 bp with a single large deletion in the 5-flank relative to the larger units. Another solo copy is associated with a BamHI site at coordinate 28 to the left, which is transcribed in the opposite direction. The duplicated tRNAArg genes are distinct from the solo copy and the rest of the gene family at nucleotide 13 as described in the text. At the bottom are the overlapping recombinant clones, none of which was present in the Maniatis library (Maniatis et at, 1982). The inverted triangle in the clone 420 indicates a frequent Hindlll restriction polymorphism detectable in numerous commonly used lab Oregon-R stocks, including those used in deficiency localization of the tRNA genes (see fig. 52).  1-  137  444-1  AH rg E B iHB  BH BBB B E H  12.6  1  ' • '  444-2  E EE  B  '  "  Arg »  —1  . •  1  1  Arg Arg Arg 12.1  12 2  »  12.3  12.4  »  •  1—I  I  .COSP273R  1  •  •  -  BH •  138  +50 aagcttcgtt tcgcgttgaa actgaatttt ttgcaattca acccttccca +100 cttattatag ttttcgttct gttctcacta gcaaatgttc tcactccagt + 150 ttctctcgcc tctccctctt tatatttgtt gttacggcct ggtaatccaa +200 ctGHCCGTGT GGCCIflflTGG flTRflGGCGTC GGflCTTCGGR TCCGflflGRTT +250 GCflGGTTCGfl GTCCTGTCflC GGTCGaccgc tctatctttt  ttttaatatt +300  catattttcc ttgagctatg aatattacag cttttattaa ttggccaagt  caattgctgc  F i g . ZZ. Sequence of pArglZ.6 of D, melanogaster. The fragment was cloned as a 360 bp Hindlll fragment and sequenced with universal primers Ft, RI and also the two gene-specific oligonucleotides Arg5' and Arg3'. 
The structural gene is depicted in capital letters and the T (at nucleotide 165) is emphasized by underlining. The characteristic BamHI site (GGATCC), constituting part of the coding sequence 3' to the anticodon, begins at nucleotide 188.

Using pDt16R again as the probe, λ2161R was isolated from the MboI λEMBL3 library after screening about 20 genome equivalents of phages. No more authentic clones were obtained from any of the λ or cosmid libraries using various probes spanning the entire λ1161. As predicted from the pDt16R (Oregon-R strain) restriction map (St. Louis, 1985), each tRNASer gene is located on an HhaI fragment 980 bp in length. This was also confirmed by digesting λ1161 from the Canton-S strain with HhaI and Southern blotting using GT7 as the probe. To determine whether other permutational forms of tRNASer4,7 genes exist at the equivalent site in this fly strain, the HhaI fragments were isolated from an agarose gel, their ends were trimmed with S1 nuclease, and they were then ligated into the SmaI site of pUC13. Sequencing results showed that one corresponds to a 777 gene (fig. 24) while the other corresponds to a 774 gene (fig. 25), identical to those originally identified in pDt16R. Comparison of corresponding flanking sequences between the Canton-S and Oregon-R fly strains showed the expected 3-4% sequence divergence. Again, the differences consist mostly of single nucleotide substitutions and insertions or deletions of one to two base pairs. The HhaI fragments of the two strains were presumed to be identical based solely on the criterion of size. In fact, they are not. From restriction mapping (St. Louis, 1985) and sequence analysis of pDt16R (Cribbs et al., 1987b), it was predicted that the HhaI site should be 70 bp downstream from the 777 gene. In the Canton-S sequence, however, the expected recognition site GCGC has been altered by the insertion of an extra nucleotide to give GCTGC (fig. 24).
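The effect of such small changes on an HhaI site (recognition sequence GCGC) is easy to check by scanning for the site directly. The following is a minimal sketch: the function name `find_sites` is an assumption introduced here, and the two flanking bases on either side of each site are placeholders, not the actual Canton-S or Oregon-R flanks. Only the GCGC to GCTGC insertion and the GTGC to GCGC substitution are taken from the text.

```python
def find_sites(seq, site="GCGC"):
    """Return the 0-based start positions of every occurrence of a
    recognition site in seq (overlapping occurrences included)."""
    seq = seq.upper()
    site = site.upper()
    width = len(site)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == site]


# Placeholder flanks ("AA...TT", "TT...AA") around the changes described in the text.
intact_site = "AAGCGCTT"        # GCGC present
after_insertion = "AAGCTGCTT"   # one inserted T gives GCTGC: site destroyed
before_substitution = "TTGTGCAA"  # GTGC: no site
after_substitution = "TTGCGCAA"   # T to C substitution gives GCGC: site created

for name, seq in [
    ("intact", intact_site),
    ("after insertion", after_insertion),
    ("before substitution", before_substitution),
    ("after substitution", after_substitution),
]:
    print(name, find_sites(seq))
```

A single-base insertion is therefore sufficient to abolish cleavage at the expected position, and a single substitution elsewhere is sufficient to create a compensating site.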
A new HhaI site is probably created by a nucleotide substitution 154 bp downstream, changing the sequence GTGC to GCGC in the Canton-S strain. This is inferred from the other cloned HhaI fragment containing the 774 gene, in which the sequence adjoining the SmaI cloning site is AAACCAATTT (nucleotides 1-10 in fig. 25). This block of nucleotides occurs three base pairs downstream from the hypothetical HhaI site. The site was probably removed along with the preceding three base pairs by the S1 or other contaminating nucleases during the trimming step. Since both tRNASer genes in Canton-S were also localized to two HhaI fragments of identical size, there would necessarily have been two other fortuitous and compensating changes which created new HhaI sites to maintain the parity in size for the two

Fig. 23. Chromosomal walk in the pDt16R domain. At least 22 kilobases of genomic DNA were collected. As in the previous cases, impasses were met repeatedly in all genomic libraries and no further attempts were made to extend the walk. The coordinate line, measured in kb, is indicated at the top of the figure. The dashed line represents pDt16R, the entry probe used to initiate the walk. The two tRNASer genes, 774 and 777, are shown as small arrow heads pointing in their direction of transcription. The chromosome region is displayed as a thick line with the restriction sites BamHI (B), EcoRI (E), SalI (SI) and HindIII (H) marked above it. The polymorphic sites E and H are indicated as inverted triangles in the recombinant phages λ2161R and λ1162, respectively.
Ser  141  10  774  E BH 1 1 1  E 1  20  15  777  BH 1 1  H 1  SI  B  •  i  E i  2161R  1161 1163 -  1162  142  +50 gctaccactt ggcgtaataa aatcaaatta gtggaaacag aaaatatttc +100 gagtttatga agataaaaaa attcattgaa caaacgtcaa ctattttcac + 150 cttcatagcc attatcatcg accactcatt gcttactcag ctttttatgc +200 ctatatctta caatagacgc cccgatcctc aaaagcgatc caatcttctt +250 ttcatgccaa cttgacgatc cgcgatcatt aaGCRGTCGT GGCCGRGCGG +300 TTRRGGCGTC TGRCTRGRRR TCRGRTTCCC TCTGGGRGCG TRGGTTCGRfl +350 TCCTRCCGRC TGCGaatagt aatctgtttt ttggaagtcc agaaaataga  +100 tcgacagaag atcagaaaaa gtattaagaa gctgctctct  tataatgctt +450  aaaaaatatt tcgtagtaaa agagtgaagt gtgtggcaaa taaaatcatg  cacctttgta aagttactga tat  F i g 2 4 . DNA sequence of the pCS 16-777 gene from Canton S. Sequencing strategy is described in the text. The structural gene is depicted in capital letters and the three diagnostic nucleotides are underlined. The mutated Hhal site, (CCCC-GCTGC), is located at position 381, 67 nucleotides downstream from the 3-end of the structural gene (dotted underline).  143  +50 anaccaaUt  a a c t t t t t t g oGtttaatca t t a t c t a t t g UoaGooagt  +100  gatattaata  gttatacgat  c g a c t t t t c g ctataaaaag  atcagtgata  +150 ttaatgtagc t a g a g t c g g g taataaagcc tctggagtca tcaaaGCRGT +200 CGTGGCCGAG CGGTTRRGGC GTCTGRCTRG RRRTCRGRTT CCCTCTGGGfl +250 GCGTRGGTTC GRRTCCTRCC GGCTGCGgtt tataagtgcc aattttttta +300 aaataattaa gccaaactaa taaattcaaa aggtaacatc attaggaata +350  tatataaaac acaatttttt agtattaaat tagttataca atagtttttt  tgcaatcctt gtgttatgca atctgtaag  F i g . 2 5 . DNA sequence of pCS16-774 gene from Canton S. The 5-end of the insert is near the polylinker cloning site. The sequence corresponding to the mature tRNA is presented in capital letters. The three diagnostic nucleotides are underlined. The proposed new Hhal site is probably three nucleotides upstream from the 5'-end of the insert, where the sequence GTGC in Oregon-R could have been mutated to GCGC in the Canton-S strain.  
fragments.

In summary, a total of eight tRNASer and six tRNAArg genes have been uncovered at 12DE (see Table V in Discussion). Two of the tRNASer genes, mapping to the pDt17R domain, have not been described previously. Neither of these two genes (a 444* and a 774 gene) corresponds to the reciprocal hybrid sequences anticipated from regular exchanges. Thus, assuming that no differential selection operated on the different recombinant products, the observations remain consistent with the conversion model postulated by Cribbs (1982) and Cribbs et al. (1987b) (also see Discussion concerning the autosomally-linked tRNASer4 genes). Unfortunately, the equivalent genes isolated from Canton-S, in the hope of gaining new insight into the dynamics of genetic exchange, aroused only pedestrian curiosity since they are identical to their Oregon-R counterparts. It is not known whether the turnover of this gene cluster is particularly slow, since the rates of turnover for other multigene families have only been reported from comparisons between Drosophila sibling species, rather than between strains (reviewed by Dover, 1982). This idea will be further explored in Chapter II (Part II), where the pDt73, pDt27R and pDt16R homologous regions from different species will be compared.

The overall organization of the two gene families appears to be typical of all known Drosophila tRNA gene clusters (for example, see fig. 3). The 12DE site reported here is at least 157 kb in size, plus the intervening DNA between the four domains which was not successfully retrieved. The most plausible reason for the failure to link the domains is that the high density of repeated sequences encountered in the walk may be poorly tolerated in most of the common E. coli hosts used in propagating recombinant DNA.
Nevertheless, it is almost certain that all of the tRNASer genes have been cloned from this site (Cribbs et al., 1987b; see Discussion) and that the 474 gene is closest to the centromere based on its proximity to the Stellate sequences.

CHAPTER II

Flanking Sequence Relatedness in the tRNASer4,7 Genes in D. melanogaster and Sibling Species

The repertoire of hybrid genes retrieved from 12DE is entirely consistent with the hypothesis of nonreciprocal recombination (specifically, gene conversion), but does not constitute definitive proof. Thus, other hallmarks have been sought in the tRNASer4,7 gene cluster to provide independent evidence to either support or confute gene conversion as a viable hypothesis. In Part I of this chapter, the flanking regions of all the tRNASer4,7 genes at 12DE were compared systematically. Since conversion is known to occur predominantly via intrachromosomal transmission of genetic information (see Discussion), it was expected that the hybrid genes would retain sequence signatures in their flanking regions tracing to the original parental genes in the cluster that may have participated in the conversion events. To be sure, as in Chapter I, this analysis would again provide only circumstantial evidence consistent with conversion events and would not eliminate reciprocal recombination as a plausible alternative. Therefore, a second and more robust approach was undertaken in Part II of this chapter to distinguish between the two hypotheses more convincingly. To achieve this, genomic DNAs were prepared from five other Drosophila sibling species (simulans, mauritiana, erecta, teissieri, yakuba; see fig.
28) and probed with pCS474, pD

Obviously, if the hybrid genes were derived from unequal (but reciprocal) exchanges between the bona fide tRNASer4 and tRNASer7 genes, then one would expect that at least some of these genes would have been gained or lost during the long evolutionary history of the tRNASer4,7 gene cluster (between 13 and 37 million years). Conversely, if gene conversion were operative, then the gene families from the different sibling species would be unlikely to fluctuate in size or, at least, not as a result of the gene conversion events themselves. Also, segments homologous to pCS474, pDt16R and pDt27R have been cloned from D. erecta, and the latter two segments from D. yakuba, to define precisely the nature of their genes at the DNA sequence level. Because these two sibling species have diverged from melanogaster for approximately 13 and 37 million years, respectively, they should provide more suitable candidates for seeking alternative forms of hybrid genes at the homologous sites. Furthermore, based on the extensive analyses of cohesive evolution of other multigene families, driven mainly by the mechanism of biased gene conversion termed "sequence homogenization" (Strachan et al., 1982 and 1985; Dover, 1982), it has been well documented that the sequences flanking the gene members within a species consistently reveal a higher degree of sequence conservation than between different species. This predicted mode of multigene family evolution, which appears to be widespread, can be directly tested in this study for the tRNASer4,7 genes by inter- vs intra-species comparisons of their flanking sequences.

The results obtained in Part I confirmed the presence of distinct homology blocks in both the 5'- and 3'-flanking sequences shared among the X-linked tRNASer4,7 genes in D. melanogaster, strongly suggesting that they are evolutionarily related.
In fact, in all of the proposed recipient genes involved in the conversion events, the multiple homology blocks always maintained proper spatial alignment relative to those in the putative donor genes. Patchwork homology blocks of this type are indeed consistent with the prediction based on intrachromosomal conversion events (see Discussion). In the cross-species analyses addressing fluctuation in the size of the tRNASer4,7 gene families and flanking sequence intra-specificity (Part II), the results again provided convincing evidence consistent with the model of cohesive evolution via biased gene conversion. No gain or loss of the X-linked tRNASer4,7 genes (or the interspersed tRNAArg genes) has thus far been detected in the DNA segments homologous to the three melanogaster plasmid probes. Sequence comparisons of homologous genes from two other sibling species showed that more conserved sequences were shared by members of a multigene family within a species, precisely conforming to the prediction of "sequence homogenization". Thus, all of the separate lines of evidence forge a strong argument for gene conversion, rather than standard reciprocal recombination, as the viable model. Portions of the results presented here have been published in Cribbs et al. (1987b).

RESULTS

Part I - Homologies in the 5'-Flanking Regions of the melanogaster Hybrid Genes: Wedded Patchworks

5' homology elements that are diagnostic of either tRNASer4 or tRNASer7 genes were first noted by Newton (1984). Depending on their location relative to the start (+1) of the structural genes, they were known as either -5 or -20 boxes. Briefly, both 444 genes within pDt27R contain the consensus sequences AAPyAA at about the -5 position and TTGGGPyT at about the -20 position. In contrast, the 777 genes have the consensus sequences ATPyAA at about -5 and CAAPyTT at about -20 (Newton, 1984; Cribbs et al., 1987b).
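As an aid to reading these consensus elements, the sketch below shows one way of testing a short 5'-flanking sequence against the two -5 box consensuses, with Py (pyrimidine, C or T) written as the IUPAC code Y. The function names and the dictionary layout are assumptions made for this illustration; the consensus strings and the two example sequences (AACAAG from the 444-1 gene and CATCAA from the pDt17R-777 gene) are those quoted in this chapter.

```python
import re

# IUPAC codes used here: Y = pyrimidine (C or T), R = purine (A or G).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "[CT]", "R": "[AG]", "N": "[ACGT]"}

# -5 box consensuses from the text, with Py written as Y.
MINUS5_CONSENSUS = {"444-type": "AAYAA", "777-type": "ATYAA"}


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in consensus.upper()))


def classify_minus5(flank):
    """Return the names of every -5 box consensus found in the flank."""
    flank = flank.upper()
    return [
        name
        for name, consensus in MINUS5_CONSENSUS.items()
        if consensus_to_regex(consensus).search(flank)
    ]


print(classify_minus5("AACAAG"))  # -5 element of the pDt27R 444-1 gene
print(classify_minus5("CATCAA"))  # -5 element of the pDt17R-777 gene
```

The same translation could be applied to the -20 consensuses (TTGGGYT and CAAYTT), although matching by position as well as by sequence would be needed for a full comparison.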
For the hybrid 774 and 474 genes and the singly altered 444* gene, no obviously paired homology boxes belonging to the exclusive domain of either the 444 or the 777 genes can be observed. Rather, their 5'flanking regions are a wedded patchwork of sequences, embracing characteristics of both gene types.  (A) The 474 Gene is Most Closely Related to the 444-1 Gene The 30 nucleotides in the immediate 5' flanking region of the 474 gene share  strong  homology to the corresponding region in the pDt27R 444-1 gene, but show much more divergence from the 444-2 gene.  At the -5 position of the 474 gene, the sequence ATCAAG  resembles the -5 homology element AACAAG in 444-1 gene with one base pair mismatch; but no shared -20 element is detectable (fig. 26, lines la vs 2).  However, other significant  homology blocks can be observed; at about position -15 in the 474 gene, the sequence TTGCGCA again shares strong homology with the corresponding sequence TTGCGIA (1/7 mismatch) in the 444-1 gene. The sequence TATJGATATT at about position -30 in the 474 gene is held in common with a similarly positioned TAGTGiTATT in the 444-1 gene (2/10 mismatches). The major discord from the -10 to the -30 region is confined to the replacement of the core tetranucleotide sequence GGGC of the -20 box by either four (Canton-S. line la) or two (Oregon-R, line 1) T nucleotides in the two 474 genes. These latter nucleotide changes create a sequence context resembling a 777-type -20 box but they may simply reflect fortuitous  148  1. Oregon-R 474 (pDt73) l a . CS474 2. Dt27R 444-1 P  P  -30 -20 -15 -5 t aaat t tTRTTGRTRTTtt—TTGCGCRtatRTCRRG  t aaat ttTRTTGRTRTTt11 tTTGCGCRtatRTCRRG aTRGTG-TRTTgggcTTGCGTRggaRRCRRGta ccggaagattgTTgggaTTtgatccaaflfilRR  3. pDt27R 444-2  4. p0t17R-777 5. p0t16R-774  -33 -20 -13 -5 ct ctTGCRCCTctt gaact caat 111 cGCCRCccaccCRTCRR 11 aaTGTRGCTagagccgggtaat aagGCCRCt agagtCflTCRRa  5a. 
pCSI6-774  11 aaTGTRGCTagagtcgggt aat aaaGX£J£tggagtCRTCRRa  6,  pDt16R-777  6a.pCS16-777 7. pDt17R-774  8. pDt17R-444* 9. pDt17R-777 10. pDt27R 444-1  -27 -20 -5 11 ctTT-CRTGccaacttgaccat ccgcgaTCRTTRfl ttctTTTCRTGccaacttgacgatccgcgaTCRTTRfl aaaaJJIfhlfigat ct ggat ggcaaat gt ctTCGCCRR  -20 -15 -5 gtacgtRRTTT—GRCRCC-RgctRTRRCRRGRR gaact cRRTTTtcGCCRCCCRcccRTcaa at agt gt at t gggct t gcgt aggaRflCRRGJfl  F i g . 2 6 . 5-flanking homologies in t R N A j genes in D. melanogaster. Short 5-flanking regions of genes from chromosomal bands 12DE are aligned to highlight homology blocks (underlined). Genes derived from the Canton-S strain are denoted with "CS". The gene in line I is the same as pDt73. Note that all homology blocks are spatially aligned in linear order despite that all comparisons involved non-allelic genes. S 8 r  4  149  mutational events. Beyond -30, little homology can be detected.  (B) The 774 Gene in pDtlSR is Most Closely Related to pDt!7R-777 From nucleotides -2 to -7 in the pDtl6R-774 gene, the -5 sequence CATCAA matches exactly to that in the pDtl7R-777 gene; but again like the 474 and the 444-1 genes above, no corresponding -20 homology element is shared between them (fig. 26, lines 4 and 5). Two other tracts of nucleotide homology are evident beginning at -14 (GCCAC) and at -35 (TGTAGCT) in the 774 gene. Similar sequence tracts are also situated at almost identical positions in the 777 gene (-13 GCCAC and -34 TGCACCT, respectively). Note that the single A to T base difference in the -14 box is not between the non-allelic 774 and 777 genes (lines 4 vs 5) but is between the identical 774 genes from the two different fly strains (lines 5 vs 5a).  (C) The 774 Gene in oDt!7R is Possibly Related to oDtl6R-777 The seven nucleotides, TCGCCAA immediately 5" to the 774 gene are only remotely similarly to the sequence TCATTAA (3/7 mistmatches) in the 777 gene and may not be significant. 
However, another short tract in the 774 gene, TTTC-TG, beginning at -28, shows strong homology to TTTCATG, which occupies an identical position in the 777 gene (1/7 mismatches) (fig. 26, lines 6a vs 7). Hence, there is weak evidence to suggest that the 774 and the 777 genes are related. However, the 774 gene in pDt17R shows no homology to any of the other available 5' flanking sequences, including those from three of the four autosomal tRNASer7 genes (Cribbs et al., 1987b; D. A. R. Sinclair, unpublished observations).

(D) The 444* Gene Has a Patchwork 5'-Flanking Region Characteristic of pDt17R-777 and 444-1 Genes

The remaining 444* is not a hybrid gene based on the three diagnostic nucleotides, but does have a single mutation at the tip of the extra arm (C50-T50). Nevertheless, its 5'-flanking region has characteristics of both 777 and 444 sequences. As shown in fig. 26, the sequence from -1 to -9 in 444*, AACAAGAA, matches almost exactly AACAAGTA in the same position of the pDt27R 444-1 gene (lines 8 vs 10). On the other hand, strong homology exists between the sequence GACACC-A beginning at position -14 in the 444* gene and the sequence GCCACCCA beginning at position -9 in the pDt17R-777 gene (2/8 mismatches, lines 8 vs 9). Another region of homology exists farther upstream between the same genes. Note that at about the -20 position in the 444* gene, the sequence AATTT resembles the -20 element in the 777 gene. Hence, it appears that at least one extragenic recombinatory event between the -5 and the -9 region may have engendered the hybrid homology pattern observed in the 444* gene. For convenience, the 444* gene will be included in the discussion of hybrid genes.

Sequence Homologies in 3'-Flanking Region

Homology elements in the 3'-flanking sequences, other than the most prominent poly-T termination signals, have not been previously recorded by either Cribbs et al. (1987b) or Newton (1984).
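The mismatch fractions quoted for the 5' homology blocks can be reproduced by a position-by-position comparison of the aligned sequences in which an alignment gap (-) opposite a base is scored as a mismatch. The sketch below assumes that scoring convention, which is inferred from the quoted fractions rather than stated in the text; the function name `mismatches` is hypothetical.

```python
def mismatches(block_a, block_b):
    """Count mismatched positions between two pre-aligned blocks of equal
    length. A gap character ('-') opposite a base counts as a mismatch.
    Returns (number of mismatches, alignment length)."""
    if len(block_a) != len(block_b):
        raise ValueError("aligned blocks must have the same length")
    count = sum(1 for a, b in zip(block_a.upper(), block_b.upper()) if a != b)
    return count, len(block_a)


# Aligned blocks quoted in sections (C) and (D).
print(mismatches("TCGCCAA", "TCATTAA"))    # pDt17R-774 vs pDt16R-777, immediately 5' to the gene
print(mismatches("TTTC-TG", "TTTCATG"))    # pDt17R-774 vs pDt16R-777, tract near -28
print(mismatches("GACACC-A", "GCCACCCA"))  # 444* vs pDt17R-777
```

These calls return 3 of 7, 1 of 7 and 2 of 8 mismatches, respectively, in agreement with the fractions given in the text.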
However, homologies are present in trailer sequences of some of the t R N A  Ser  genes, including all of the hybrid genes. As shown in fig. 27, the pDtl7R-774 gene contains the sequence TTT- A ATGCTAT A ATTT within the first 20 nucleotides immediately 3' to the gene that is homologous to a similarly positioned sequence TTTGTAAGCT-TAATTT of the Canton-S 474 gene (4/17 mismatches, lines 1 vs 2). Even if the poly-T termination signal was not considered to eliminate possible bias, there remains an overall homology of approximately 72%.  In  addition, both genes also share a long poly A sequence tract, albeit at somewhat different positions, that is absent in all other tRNASer 3- flanking sequences examined. However, since the poly A tracts are not positionally aligned and due to the high A-T content in the region, this apparent conservation could be fortuitous. The other 774 gene in pDtl6R also shares a truncated homology block, TTTATAAG, with the above genes starting from nucleotide +2 (fig. 27, lines 3 and 3a), With the exception of the poly T termination signal situated in almost the same position, there does not seem to be other significant homology.  151  +10 +20 +30 +15 1. pDt17R-774 TTT-RRTGCTRTRRTTTtagcttaatttagatacttacactgagRRRRRRRR 2. pCS471 TTTGTRRGCT-TRRTTTgtatttttacaaacRRRRRRRRRtQcta 3. pCS16R-774 gJJTRTflRGtgccaatttttttaaaataattaagccaaact 3a,pDt16R-774 qTTTRTRRGtgccactttttttttaataattaaaccaaQct  4. pDt27R 444-1 5. pDt17R-444*  +6 +20 RRTgaGRRTGTR-TRTTTTRtttcaaatgtttttattttctgaaat RRTc-GRRTCGRRTTTTTTRcacttcgcatagagctaccatatttttta  6. pDt27R 411-2 gaagggtattcctatattttttatgttttaaaaggtgcattcttacagt 7. pDt17R-777  atatgaagagtatcttttttatgtcagatacttttatgtatctatgggat  8. pCSI6-777 8a.pDt16R-777  aatagtaatctgttttttggaagtccagaaaatagatcgacagaaga aat agcaat ct gt11111ggaagt ccagaaaaaat agat cgat agaa  F i g . 27. 3'-flacking homologies of non-allelic genes of D, melanogasterfrom chromosomal bands 12DE. 
As in the previous figure, "CS" denotes genes derived from Canton-S (lines 3 and 8); all others are from Oregon-R clones.

There is conservation between the sequence AATGAGAATGTA-TATTTTA, just outside the pDt27R 444-1 gene, and the sequence AATC-GAATCGAATTTTTTA at the same position in the 444* gene (6/20 mismatches, lines 4 vs 5). Except for the regular intrusion of the poly-T termination signals, the 3'-tails of the remaining bona fide genes, pDt27R 444-2, pDt17R-777 and pDt16R-777 (lines 6, 7, and 8 or 8a, respectively, in fig. 27), are composed of unique sequences and are therefore unrelated to any of the above genes.

There is also an intriguing correlation between the tRNA4,7Ser diagnostic nucleotides at positions 16 and 77 within the coding regions of the hybrid genes and their immediate 5'- and 3'-flanking sequences. If positions 16 and 77 are both diagnostic of a tRNA4Ser gene (as in the 444* and 474 genes), then both flanking sequences are also tRNA4Ser-like. Alternatively, if positions 16 and 77 are diagnostic of tRNA7Ser and tRNA4Ser, respectively, then the 5'- and 3'-flanking sequences are switched accordingly (as in both 774 genes). Thus, it appears that each of the two diagnostic nucleotides within a hybrid gene is inherited together with its adjacent flanking sequence.

Part II - Examination of Loci Homologous to pCS474 and pDt16R in Drosophila Sibling Species

(A) Detection of Homologous DNA by Genomic Southern Hybridizations

D. melanogaster is a member of a small subgroup consisting of eight closely related sibling species that is virtually cosmopolitan in its geographic distribution. Morphologically the species are very similar. The only reliable distinguishing features are slight variations in the male genitalia. The evolutionary relationships among these sibling species, as shown in fig.
28, are primarily constructed from polytene chromosome banding patterns but have also gained support from species hybridization studies (David et al., 1974), electrophoretic mobilities of certain enzymes (Gonzalez et al., 1982; Ohnishi and Voelker, 1983; Eisses et al., 1979) and from their mitochondrial, ribosomal and satellite DNA polymorphisms (Barnes et al., 1978; Strachan et al., 1982; Coen et al., 1982). The arborescent topology depicting their relationships can be divided into two species complexes. D. melanogaster, D. simulans, D. mauritiana and D. sechellia have similar chromosomes. Only one large inversion on chromosome 3R distinguishes the chromosomes of D. melanogaster from those of the other species in this complex. The chromosomes of the remaining four species differ from those of the D. melanogaster complex by at least eight inversions. D. yakuba, the most dissimilar from D. melanogaster, differs by at least 30 inversions (Ashburner et al., 1984).

Fig. 28. Evolutionary relationship among the eight species of the Drosophila melanogaster species subgroup, based upon their polytene chromosome banding patterns (Bodmer and Ashburner, 1984). The numbers on the diagram indicate the minimum number of autosomal inversions that have accompanied each cladistic event. Thus, within the D. melanogaster complex only one large inversion distinguishes the chromosomes of melanogaster itself from those of the other three species. The chromosomes of the members of the yakuba complex differ from those of the melanogaster complex by at least eight autosomal inversions. From left to right, the symbols in the diagram are: yak, yakuba; tei, teissieri; ere, erecta; ore, orena; sim, simulans; mau, mauritiana; sec, sechellia; mel, melanogaster.

Genomic DNAs prepared from D. melanogaster, D. simulans, D. mauritiana, D. erecta, D. teissieri and D. yakuba were digested with the restriction enzymes HindIII, EcoRI and BamHI and the fragments were resolved by agarose gel electrophoresis.
After blotting by the modified method of Southern (1975; see Methods and Materials, p. 100), the filter-bound DNA was probed sequentially with pCS474 and pDt16R, which had been radiolabelled by nick-translation. Hybridization was initially carried out at low stringency, since it was not known whether homologous sequences existed in species other than D. melanogaster. To detect specific hybridization, non-specifically bound probe was subsequently removed by sequential washing at increasing stringencies.

The filter-bound DNA was first challenged with pCS474 at 42 °C in standard hybridization buffer as described (see Methods and Materials, p. 93). After the filter was washed at 42 °C in standard washing buffer (1x SSC/0.5% SDS), only featureless smears appeared in all channels (data not shown). At slightly increased stringency (58 °C wash), specific bands were discernible above the still considerable background. When the washing temperature was raised further to 65 °C, bands from specific hybridization were quite evident in almost all species represented in the EcoRI and BamHI digests (fig. 29). Hybridization is weak in all HindIII digests, presumably due to some loss of DNA during the ethanol precipitation steps just prior to loading the gel, although specific bands could still be detected. However, for D. yakuba, the sibling species most distantly related to D. melanogaster, no specific hybridization could be observed even at this stringency, except for a persistent smear ranging from about 7.0 kb to 23 kb. It is known that pCS474 contains several different types of repetitious elements; thus, it is plausible that the high background in general, and in D. yakuba in particular, is due to strong cross-homology with these repeated elements.

Fig. 29.
Genomic Southern blot of the Drosophila sibling species subgroup with probe pCS474. Lane 1: melanogaster; lane 2: mauritiana; lane 3: teissieri; lane 4: simulans; lane 5: erecta; lane 6: yakuba. The DNAs from the various sibling species were digested to completion with EcoRI, HindIII and BamHI and resolved in a 0.6% agarose gel. After Southern transfer (see Methods and Materials), the filter-bound DNAs were probed at 42 °C and then washed successively at 42 °C, 58 °C and 65 °C. The hybridization bands were visualized after each wash by autoradiography at -70 °C with an enhancer screen for approximately 3-4 days. Only the exposure after washing at 65 °C is shown here. HindIII-cleaved λ DNA and HinfI-cleaved pBR322 were used as size markers. The heavy smears along the edges of the figure are the result of over-exposure of the size standards generated by HinfI-cleaved pBR322.

[Autoradiograph: EcoRI, HindIII and BamHI digests, lanes 1-6 in each panel; size scale in kb.]

When pDt16R was used as the probe, observations similar to those above were obtained (fig. 30). Specific hybridization bands were discernible after washing at 58 °C in standard washing buffer as above (not shown). After further washes under more stringent conditions (65 °C in 1x SSC/0.5% SDS and then 68 °C in 0.2x SSC/0.1% SDS), essentially all background was removed and the specific hybridization bands remained intense in all species, including D. yakuba.

Two Drosophila sibling species, D. erecta and D. yakuba, were selected for further molecular cloning studies of the pDt16R and pCS474 homologous loci. The time of separation of the two sibling species from D. melanogaster is difficult to estimate. Depending on the choice of mathematical treatment, it appears that D. yakuba diverged from D. melanogaster between 13 and 37 million years ago (Ashburner et al., 1984). Other extrapolations (Bodmer and Ashburner, 1984) suggest that D.
erecta is more closely related to D. melanogaster and probably diverged from the latter between 2 and 13 million years ago. Considered together, the three sibling species should, in theory, present an ideal incremental time span of DNA sequence evolution at the two genetic sites.

(B) Isolation of pCS474 Homologous Fragment from D. erecta

A library containing D. erecta DNA partially sheared with BamHI and then cloned into the vector λEMBL3 was screened with nick-translated pCS474. Of a total of 100,000 plaques screened, only one strong positive was identified. The phage, designated λDE73, contains a 17 kb insert composed of two different 8.5 kb BamHI fragments. One of the fragments, showing positive hybridization to both pCS474 and tRNASer gene-specific probes, was subcloned into the BamHI site of pEMBL18+ and mapped by standard restriction digests (data not presented). Southern blotting of the subclone showed that a tRNASer gene was located on a 2.0 kb fragment bounded by a PstI and a BamHI site (fig. 31). This fragment was cloned into pEMBL8- and used as the template for dideoxy-sequencing by both the single- and double-stranded methods. The sequence indicates that it is a 474 gene (denoted as pDe474 in fig. 32), identical to its counterpart in pCS474.

Fig. 30. Genomic Southern blot of the Drosophila sibling species subgroup with probe pDt16R. Lane 1: melanogaster; lane 2: mauritiana; lane 3: teissieri; lane 4: simulans; lane 5: erecta; lane 6: yakuba. The filter from fig. 29 was treated in 0.4 M KOH at 45 °C to remove the old probe and then neutralized at the same temperature in 0.1x SSC, 0.1% (w/v) SDS, 0.2 M Tris-HCl (pH 7.5) for 30 minutes. The conditions for Southern hybridization with pDt16R are as described in fig. 29, except that the last wash was carried out at 68 °C before the autoradiography shown here.

Unfortunately, the homologous DNA segment could not be isolated from λ libraries of D.
yakuba using either pCS474 or the D. erecta 1.5 kb M I insert as a probe, owing to cross-reactivity with a large number of other phages. This observation is consistent with the previous genomic blot, in which potentially authentic hybridization bands in D. yakuba (e.g. in the BamHI digest) could have been obscured by repetitive sequences. In the same genomic Southern blot, a 15 kb BamHI band could be clearly detected in D. teissieri. Since this species is closest to D. yakuba cladistically (fig. 28) and appears to have less cross-homology with repetitive elements, I thought it might serve as a possible alternative. This fragment, though, also proved elusive, even though the library was constructed from gel-purified BamHI-cleaved genomic DNA ranging from 9.5 kb to 23 kb in size and a total of 250,000 recombinant plaques were screened. I have not made further attempts to clone these fragments using E. coli host strains other than Q359 (recA+).

(C) Isolation of pDt16R Homologous DNA Segments from D. erecta

Of approximately 100,000 plaques screened with the nick-translated plasmid, again only one positive (recovered as three identical copies) was isolated. The restriction map of the phage, designated λDE16, is shown in fig. 33. As predicted, the 8.0 kb fragment and one of the two 1.6 kb BamHI fragments observed earlier in the genomic Southern blot showed homology to pDt16R (labelled as the "z" region in fig. 33). In addition, probing with GT7 confirmed the presence of tRNASer genes within a 3.5 kb region bounded by the HindIII and BamHI sites. In the same blot, the gene-specific probe also revealed additional tRNASer genes in the leftmost 1.6 kb BamHI fragment, which is not homologous to pDt16R. When the phage was rehybridized with plasmids used earlier as entry probes in the molecular walk, it clearly showed extensive homology with pDt27R (labelled as the "y" region in fig. 33).
Comparison of the two Southern blots probed with pDt16R and pDt27R indicates that the junction between the two regions is situated within the 2.6 kb PstI+HindIII fragment (dotted part of region "z" in fig. 33). Since this restriction fragment shows strong hybridization with pDt27R but considerably weaker signals with pDt16R, the junction is very likely to be closer to the HindIII site. Thus, in contrast to D. melanogaster, the spacer DNA separating the two regions is much

[Restriction map: sites B, S and P; 474 gene; scale bar 1.0 kb.]

Fig. 31. Subclone of the 8.5 kb BamHI fragment from λDE73 from D. erecta. The restriction sites are B-BamHI; S-SstI; P-PstI. There are no EcoRI or HindIII sites in this subclone. The single 474 gene, denoted by the arrow head, is located between the SstI and PstI sites but the exact location is not known. The direction of transcription was deduced by differential hybridization of gene-specific oligonucleotides to single-stranded templates of the BamHI-PstI fragment.

ccagttatag gtatttattt atttaaggct gcttttaagt tatattcatt  +50
agttatcagc aacgaaattc caatttatat tatcagctgt ggctttgcat  +100
aagctalcgg aagtgttttt gttttataaa laaatactat atcaacatta  +150
tattatttta gggcactctt aacgattcat accaaaGCAG TCGTGGCCGA  +200
GTGGTTAAGG CGTCTGACTA GAAATCAGAT TCCCTCTGGG AGCGTAGGTT  +250
CGAATCCTAC CGGCTGCGtt tgcgagctag atttttgtcc aaaaaaataa  +300
ttaaagtagg aaatacatgt gccataagtg cattccagtg ttggctatcg  +350
cacgaaaaga agtgcactta aataccatat actgccgagt tattttaatc  +400
aagacatcga aatgcctaat ataaaaaagg atttttatat taagcataac  +450
atttgcaaag tgatgttcat tatttgtacc cttgcgtata attgt

Fig. 32. Nucleotide sequence of the 474 gene from D. erecta. The capital letters indicate the structural gene, beginning at position 187, and the three diagnostic nucleotides are underlined.

[Restriction map of λDE16: genes 444-1, 444-2, 774, 777 and the flanking Arg genes; sites B, P and H; scale bar 1.0 kb.]

Fig. 33. Restriction map of λDE16.
Restriction sites for BamHI (B), HindIII (H) and PstI (P) are shown. The lines "y" and "z" underscore the regions of the λ clone that are homologous to pDt27R and pDt16R, respectively. The dotted part of line "z" symbolizes weak hybridization with the pDt16R probe, suggesting that the homology extends only part way into the 1.5 kb HindIII-PstI fragment. However, this same fragment shows very strong hybridization to pDt27R. The tRNASer genes found in this clone are shown above as small arrow heads pointing in their directions of transcription. The 444-1 and 444-2 genes are homologous to the D. melanogaster pDt27R 444-1 and pDt27R 444-2 genes, respectively. The Arg-1 gene in this clone, depicted as a wide-stemmed arrow, is very likely to be the antecedent that evolved into the four duplicated counterparts in melanogaster (Arg12.1 to Arg12.4 in fig. 21). Abutting the λ vector arm on the left is the 3'-half of another tRNAArg gene (Arg-6), which is homologous to Arg12.6 in melanogaster. However, this erecta gene is transcribed in the opposite orientation, and it is much closer to the tRNASer genes than is its melanogaster counterpart. The 774 and 777 genes in this clone are homologous to the corresponding genes in pDt16R from D. melanogaster. The exact locations of the two genes are not known, since the resolution of the restriction fragments in the gel was poor during mapping, but both are very near the HindIII site.

taagaatgtg acattagtag ttatgtgatc ggtttttttt ttttctataa  +50
aaatatcggt gatatgggcc ttaaagtcgg gtgattaggc cacaagtgtc  +100
atccaaGCAG TCGTGGCCGA GCGGTTAAGG CGTCTGACTA GAAATCAGAT  +150
TCCCTCTGGG AGCGTAGGTT CGAATCCTAC CGGCTGCGgt tgtaaatact  +200
attttacttc gaacaagtaa accaaacatt tgaagcaaaa aggttacagt  +250
atagagaata actaattatg caacaattgt taaaaaacct aactctggaa  +300

Fig. 34. Nucleotide sequence of the 774 gene from D. erecta.
It was cloned as a HindIII-DdeI fragment and the 5'-end of the Drosophila insert is immediately adjacent to the polylinker cloning site. The structural gene is depicted in capital letters and the three diagnostic nucleotides are underlined.

ctttttatgc ctatgccttg aaatagagcc cccaatcccc aaaaactatt  +50
caatcgtgtt ttcagccaac ttggcgatcg gtgatcatta aGCAGTCGTG  +100
GCCGAGCGGT TAAGGCGTCT GACTAGAAAT CAGATTCCCT CTGGGAGCGT  +150
AGGTTCGAAT CCTACCGACT GCGatagaaa cttgtttttt tttggaattt  +200
ccgaaaataa tgcaagatcg gaaagtataa tatttagaag atatcttatg  +250
ctgtttaaat atatgtcatg gtgaaaacat aaagtgtata gtaagtgaaa  +300
ttatgcatta aaaaatatat aact

Fig. 35. Nucleotide sequence of the 777 gene from D. erecta. The fragment was cloned as a DdeI fragment. The structural gene is depicted in capital letters and the three diagnostic nucleotides are underlined.

reduced by deletion of at least 18 kb in D. erecta.

The 3.5 kb HindIII+BamHI fragment with homology to pDt16R was subcloned into pEMBL18+. Restriction analysis and Southern blotting of the subclone with the enzymes DdeI, RsaI and HaeIII indicate that there are probably only two tRNASer genes (data not presented). The two smallest DNA fragments containing the separate genes, identified by further blotting (data not shown), were a HindIII+DdeI fragment of 400 bp and a DdeI fragment of 550 bp. The ends of the two fragments were filled in with all four dNTPs and the fragments were subsequently cloned into the BamHI site of pUC13, which had been similarly repaired prior to ligation. As shown in fig. 34, the 400 bp DdeI+HindIII fragment contains a 774 gene (pDe774), while the 550 bp DdeI fragment contains a 777 gene (pDe777) (fig. 35). From the sequence data and the sizes of the templates, the two genes are situated at least 200 base pairs apart, but they have not been mapped precisely.

(D) Isolation and Sequencing of the D.
erecta tRNASer Genes Homologous to pDt27R

As mentioned in the above section, λDE16 also contains other tRNASer genes on the 1.6 kb BamHI fragment abutting the left arm of the phage (arbitrary orientation in fig. 33). This fragment and the adjoining sequences to its right, totalling 6.3 kb, share strong homology with pDt27R. Digestion of this 1.6 kb BamHI fragment with the restriction enzymes HpaII, Sau3A, AluI and HindII, followed by hybridization with GT7, indicated that the insert contains at least two tRNASer genes. These genes were cloned as 350 bp and 700 bp AluI fragments by blunt-end ligation into the filled-in EcoRI site of pUC13. Sequence analysis showed that the small and the large AluI fragments contain tRNASer genes identical to 444-1 (fig. 36) and 444-2 (fig. 37) in pDt27R, respectively. From Southern blotting and mapping experiments (summarized in fig. 38), the positions of the two 444 genes relative to each other and to the flanking BamHI cloning sites (corresponding to two tRNAArg genes, see Chapter III) are virtually identical to those in D. melanogaster. As will be discussed later, the conserved spatial arrangement of the different tRNA genes would argue against simple reciprocal exchanges between tRNA4Ser and tRNA7Ser genes as the likely mechanism for creating the hybrid

ctaagttcgc tgagaaatta gaatcttgtc tagggtattg ggcacgcaga  +50
acaacatgta GCAGTCGTGG CCGAGTGGTT AAGGCGTCTG ACTCGAAATC  +100
AGATTCCCTC TGGGAGCGTA GGTTCGAATC CTACCGGCTG CGgagtaaat  +150
ctttatttta tttagaagta tttttttttt attttttttt taaatttatt  +200
tttgatgttt ttattttagc cagaaattaa actaatatat gttattgaaa  +250
tagaattttc aacataacag cacatgtgaa agttaggtgt tttaatgcat  +300
attaattaat cgtgttacag aattatcgtt ctttaaagat c

Fig. 36. Sequence of the D. erecta 444-1 gene. The 5'-end of the insert is adjacent to the polylinker cloning site. The sequence of the gene is depicted in capital letters. The three diagnostic nucleotides are underlined.
The flanking sequences show that it is homologous to the pDt27R 444-1 gene of D. melanogaster.

ctagtatgtt aacctttgga accgaaattc gcataaaatc ccgaagattt  +50
ttggtattcg atcggtatga aGCAGTCGTG GCCGAGTGGT TAAGGCGTCT  +100
GACTCGAAAT CAGATTCCCT CTGGGAGCGT AGGTTCGAAT CCTACCGGCT  +150
GCGgatggaa catttatttt acataattcc taggagaggg ttacattttt  +200
gtgttccttt tgcttgacaa attcttcctg tctgctgaat cttttatcat  +250
ataacattat ataaaatttc tcattctaat cttattcaag caaccacatc  +300
tcaaattttt ttacgttacc tatttgtctg gcgttgcgtg gacttacaca  +350

Fig. 37. tRNA4Ser gene of D. erecta homologous to the pDt27R 444-2 gene of D. melanogaster. The structural gene is depicted in capital letters and the three diagnostic nucleotides are underlined.

Localization of the 444-1 and 444-2 genes in the 1.6 kb BamHI fragment of λDE16 by the oligonucleotide indirect labelling method. The BamHI fragment was released from the cloning vector by cleavage with BamHI (B). The enzyme HpaII ( ± ), which cleaves at nucleotide 77 in the 3' complementary strand of the aminoacyl acceptor stem of the structural gene, was used to locate the gene within the insert. Also, since the two genes were previously subcloned as AluI fragments, this enzyme was used to determine the order of the two genes within the BamHI fragment (AluI- • ). Aliquots of the partial digests with HpaII and AluI were removed at the time intervals indicated (from 3' to 30'). The products were resolved by electrophoresis in a 2% agarose gel and then transferred onto a sheet of Hybond nylon filter as described. Fragments specific to the Drosophila DNA were detected by hybridization to the oligonucleotide Arg3' (dotted line) followed by autoradiography. The two genes are shown as arrow heads pointing in the same direction of transcription. The numbers 1 and 2 indicate the equivalent 444-1 and 444-2 genes in erecta, respectively.
Note that the spatial arrangements between the two tRNASer genes, and between 444-2 and Arg-1 at the right BamHI site in λDE16, are similar to those in pDt27R in D. melanogaster.

Fig. 38.

[Map of the 1.6 kb BamHI fragment: BamHI (B) sites; scale bar 200 bp.]

genes.

(E) Isolation of pDt16R Homologous DNA Segment From D. yakuba

Of approximately 100,000 plaques screened, three different phages (λDY16-1, λDY16-3 and λDY16-82) were obtained (fig. 39). Only λDY16-82 has been used in the subsequent detailed gene analysis. Like λDE16, this phage contains homologies to both pDt16R (region "x") and pDt27R (region "w"), although in this case their junctions are separated by at least a 2.2 kb SstI+HindIII fragment. The pDt16R homologous segment is localized within the 5.8 kb HindIII fragment (fig. 39) and contains two tRNASer genes. One gene was cloned as a 300 bp Sau3A fragment and subsequently as a larger 2.5 kb SstI+BamHI fragment, identified earlier by standard restriction analysis and Southern blotting. The oligonucleotide indirect labelling (OIL) mapping strategy applied to the latter fragment showed that the gene is very close to the SstI site at one end (fig. 40). Combined sequencing experiments performed on both the cloned Sau3A and the SstI+BamHI fragments indicate that it is a 777 gene (pDy777) (fig. 41). From the sequence data, the direction of transcription is towards the EcoRI and SstI sites in the λ restriction map (fig. 39).

The other gene was mapped to, and cloned as, an 800 bp Sau3A fragment and a 2.5 kb SstI+HindIII fragment (fig. 42). Combined sequencing experiments utilizing the two cloned restriction fragments showed that it is a 774 gene (pDy774) (fig. 43). Mapping experiments showed that it is also very close to the SstI site, but the exact distance from the 777 gene has not been determined. The direction of transcription was deduced by sequencing through the Sau3A sites in the 2.5 kb SstI+HindIII fragment (fig. 42).
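The orientation argument above rests on locating the EcoRI (GAATTC) and SstI (GAGCTC) recognition sequences within the sequenced fragment. A minimal sketch of that lookup is given here in Python, using the last printed row and tail of the D. yakuba 777 sequence in fig. 41. The function name `find_sites`, the constant names, and the assumption that this row begins at nucleotide 301 are illustrative and are not taken from the thesis.

```python
def find_sites(seq: str, site: str, first_position: int = 1) -> list[int]:
    """Return the position of every occurrence of `site` in `seq`.

    Positions are numbered so that seq[0] is `first_position`.
    Matching is case-insensitive and overlapping hits are reported.
    """
    seq, site = seq.lower(), site.lower()
    hits = []
    index = seq.find(site)
    while index != -1:
        hits.append(index + first_position)
        index = seq.find(site, index + 1)
    return hits


# Final printed row and tail of the D. yakuba 777 sequence (fig. 41),
# copied as printed. ASSUMPTION: this row starts at nucleotide 301.
FIG41_TAIL = (
    "ttttggaatt" "ccaaaaatat" "agcaagatta" "gaaattatta" "gaagctagag" "ctct"
)
ROW_START = 301

ecori_sites = find_sites(FIG41_TAIL, "GAATTC", ROW_START)
ssti_sites = find_sites(FIG41_TAIL, "GAGCTC", ROW_START)
print("EcoRI (GAATTC):", ecori_sites)
print("SstI  (GAGCTC):", ssti_sites)
```

Under that numbering assumption, the positions returned (306 for EcoRI and 348 for SstI) agree with those quoted in the legend to fig. 41. Both sites lie 3' to the structural gene, which is what allows them to serve as orientation markers.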
(F) Analysis of pDt27R Homologous Region in λDY16-82

Blotting experiments using radiolabelled pDt27R as the probe detected strong homology in the leftmost 5.0 kb region bounded by an SstI site in λDY16-82 (region "w" in fig. 39). Rehybridization of the filter-bound λ clone, cleaved with RsaI, DdeI and HaeIII, with tRNASer gene-specific probes (mp9Ser7 and oligonucleotides) showed that a single gene is confined within the leftmost 0.8 kb BamHI fragment adjoining the λ vector arm. Sequencing

[Restriction map: genes 444-2, Arg-1, 774 and 777; sites B, E, S and H; regions "w" and "x"; phages DY16-1, DY16-3 and DY16-82; scale bar 1.0 kb.]

Fig. 39. Restriction map of the pDt16R/pDt27R homologous region from D. yakuba. The thick line shows the chromosomal region with the restriction sites marked above. B-BamHI; S-SstI; E-EcoRI; H-HindIII. Below are the regions "w" and "x", which underscore homology with pDt27R and pDt16R, respectively. The dotted part of region "x" indicates weak hybridization to pDt16R in the 0.5 kb H-B restriction fragment. Unlike in λDE16 (fig. 33), the two regions here clearly do not overlap, but are separated by at least 2.2 kb of intervening DNA. The three individual phages collected from this chromosomal site, and their extent of overlap, are shown below the map. Only λDY16-82 has been used for detailed analysis in gene localization and DNA sequencing. The tRNASer genes are shown as small arrow heads, and the tRNAArg-1 gene is shown as a wide-stemmed arrow above the restriction map, pointing in their respective directions of transcription.

Fig. 40. Localization of the 777 gene in λDY16-82 by the oligonucleotide indirect labelling method. The 2.5 kb SstI-BamHI subclone was released from the plasmid vector by cutting with PvuII at about 70 bp 5' to the universal priming site and at the MI site within the polylinker cloning site. The mixture of fragments was treated with the "mapping" enzymes HinfI, TaqI and Sau3A.
Aliquots were removed at the time intervals designated above the autoradiograph (top). The site of the tRNASer gene, indicated by the small arrow head to the left of the autoradiograph, was localized by the overlapping HinfI and TaqI sites and is also shown below in the restriction map. The direction of transcription was deduced by sequencing through the SstI cloning site 3' to the gene. B-BamHI; S-SstI; ( • )-HinfI; ( ° )-TaqI; ( • )-Sau3A. The dotted line below the restriction map shows the F1 priming site.

[Restriction map of the 2.5 kb SstI-BamHI subclone: B and S sites with mapping-enzyme sites; scale bar 200 bp.]

ctataaaatg ccgcattcaa tcagcaatcc tcatcaaaat aaaacaaacg  +50
tcaactactt ttaacttcat taccattatc atcgaccaca cattgcttac  +100
tcagctttta tgcctatacc ttgaaatagt ggccccaacc ccaacccccc  +150
aaaaaacgat ccaatcttgt tttcacgcta acttggcgat catgatcact  +200
aaGCAGTCGT GGCCGAGCGG TTAAGGCGTC TGACTAGAAA TCAGATTCCC  +250
TCTGGGAGCG TAGGTTCGAA TCCTACCGAC TGCGatgcat atgagttttt  +300
ttttggaatt ccaaaaatat agcaagatta gaaattatta gaagctagag  +350
ctct

Fig. 41. Sequence of the 777 gene in D. yakuba. The structural gene is depicted in capital letters and the three diagnostic nucleotides are underlined. Two restriction sites are used to orientate the direction of transcription in the molecular walk: the EcoRI site (GAATTC) at nucleotide 306 and the SstI site (GAGCTC) at nucleotide 348. Both of these sites are highlighted by dotted underlining.

Fig. 42. Localization of the 774 gene in λDY16-82 by the oligonucleotide indirect labelling method. The 2.5 kb SstI-HindIII subclone was released from the plasmid vector by cutting with PvuII 5' to the priming site (dotted line below the restriction map) and at the M I site. The mixture of fragments was partially digested with the "mapping" enzymes HinfI ( • ), TaqI ( • ), HpaII ( • ) and Sau3A ( • ). Fragments were resolved in a 1.5% agarose gel and transferred to a sheet of Hybond as described.
The partial digestion products specific to the Drosophila insert were detected by hybridization to F1. The arrow head at the right edge of the autoradiograph points to the approximate location of the tRNASer gene. The resolution in this region of the gel is too poor to reveal the precise location and transcription direction of the gene. However, from DNA sequencing, the 5' end of the gene is close to a Sau3A site but more than 200 bp from the SstI cloning site. The restriction map is shown below, where the small arrow head points in the direction of transcription (H-HindIII; S-SstI).

[Restriction map of the 2.5 kb SstI-HindIII subclone; scale bar 200 bp.]

gatcagtaat attggcccta aaagtcgtta aagtccggta attaagcctc  +50
tggggtcatc caaGCAGTCG TGGCCGAGCG GTTAAGGCGT CTGACTAGAA  +100
ATCAGATTCC CTCTGGGAGC GTAGGTTCGA ATCCTACCGG CTGCGgttaa  +150
tgaatattat tttattttaa ataattaaat caataatagg cattaccctt  +200
ttactctatg aaattataca

Fig. 43. Nucleotide sequence of the 774 gene from D. yakuba. The fragment was cloned as an 800 bp Sau3A fragment and sequenced using the gene-specific oligonucleotides GT$ and GT7. Confirmatory sequences were obtained by using the universal sequencing primer R1. The structural gene is depicted in capital letters and the three diagnostic nucleotides are underlined.

ggatccgatc ggcaagaaGC AGTCGTGGCC GAGTGGTTAA GGCGTCTGAC  +50
TAGAAATCAG ATTCCCTCTG GGAGCGTAGG TTCGAATCCT ACCGACTGCG  +100
aagtgtataa aactatttta tttattttaa tacaaacaag gcgctttaaa  +150
attctttaaa tatctttatt atgttctaag cacaaggtgt aaaaattgtg  +200
ttttcctttt gcttgacgaa ttcttcccgt ctaatgaatc ttttaccatt  +250
taaaattcta taaaattccg cttttaatc

Fig. 44. Sequence of the 444-2 gene in D. yakuba that is homologous to the D. melanogaster pDt27R 444-2 gene. It was cloned as a 0.8 kb BamHI fragment and sequenced using primers GT&, GT7 and F1. The structural sequence of the gene is depicted in capital letters and the three diagnostic nucleotides are underlined.
The putative termination signal is 16 bp downstream from the gene (at nucleotide 116). The direction of transcription relative to other genes in the molecular walk is deduced from sequencing through the EcoRI site (GAATTC) at nucleotide 218, which is highlighted by a dotted underline.

experiments showed that it is, as expected, a tRNASer gene (designated pDy444-2), by virtue of its limited but notable sequence homologies to previously cloned 444-2 genes from the other Drosophila species (fig. 44). From the sequence data, it is clear that the direction of transcription is towards the EcoRI site (~118 bp downstream). The other expected, tightly linked 444-1 homologous gene was not detectable in the λ clone; presumably it was removed by the fortuitous presence of a BamHI site during construction of the library. Note here, again, that the spatial arrangement of the 444-2 gene and the BamHI site 600 bp downstream (corresponding to a tRNAArg gene, as discussed below) is also well conserved when compared to both erecta and melanogaster.

Part III - Rates of Flanking Sequence Divergence in Homologous tRNASer Genes From Different Drosophila Species

Homologous bona fide genes from the different Drosophila species were compared to determine empirically the rates of divergence in the flanking sequences. The hybrid genes usually show greater rates of divergence; the data obtained for these have not been condensed into a figure and are dealt with only as the occasion requires. In general, 5'-flanking sequences of bona fide homologous genes (444-1, 444-2 and pDt16R-777) from the different species display divergence of approximately 20-30% from each other (fig. 45). In all bona fide genes, convincing coupled -5 and -20 boxes that are gene-type specific are observed. The only exception is pDe444-2, where the -5 box is missing (line 3 in fig. 45).
Only the immediate flanking regions of about 70 nucleotides are shown, but the general trend of sequence conservation is clearly pervasive in all cases where more extensive data are available. Most of the differences between species are due to point mutations scattered sporadically and evenly throughout the lengths of the sequences. A slight deviation from this general pattern is observed between the -5 and the -20 boxes in the 444-1 genes from melanogaster and erecta (line 1 vs 2), where the nucleotide changes are clustered. These clustered nucleotide changes are not random, as seen in a later study, but show homology with a DNA segment in the 5'-flanking sequence of the D. erecta 474 gene.

-20 -10
1. pDe444-1     qQQflaqTTflQflfltCTTqtcTflGqGTRTTGGGCQcGCogflQcflfiCFltGTfl
2. pDt27R444-1  RRaCTTooqTRGtGTRTTGGGCttGCQtflQqRRCRQGTR

3. pDe444-2     TTflflCCTTTGGaaCcGaaflTtcGCRTflflflaTCCcGflflGRTTtnGiiflllcGRTCGGtatgRfl
4. pDt27R444-2  TTRRCCTTTGGccCtGttRTatGCRTRRRcTCCgGRRGRTTgTTGGGflTTtGRTCcoqRRtRR
5. pDy444-2     GGRTccGRTCGGcfiflgflfl

6. pCS16-777    GCCCCgflTCCt CflflflRgcgfiTcCRRTCt TcTTTTCflt GCcflMnGaCGflTCcGcGflTCfllJflfl
7. pDe777       GCCCCcafiTCCCCflflflflflCtflTtCRRTCgTgTTTTCflGCcflMnGgCGflTCgGtGflTCflllflfl
8. pDy777       cccqqCCCCCCqRRRRflCqRTcCRRTCtTqTTTTCRcGCtRRCTTGgCGRTCa-tGRTCRcTRfl

Fig. 45. Comparison of 5'-flanking sequences among homologous genes from different Drosophila species. The percentage homologies are as follows: line 1 vs line 2, 69%; all inter se comparisons among lines 3, 4 and 5, 73%; all inter se comparisons among lines 6, 7 and 8, 80%. Most differences are due to single mismatches. Almost all genes contain convincing -5 and -20 boxes (underlined), except pDe444-2 (line 3), where the -5 box is missing. In the case of the two 444-1 genes from erecta and melanogaster (lines 1 and 2), greater perturbations are observed in the small region between the -5 and -20 homology boxes.

In the 3'-flanking regions of these same genes, sequence divergence is surprisingly rapid. For the 444-1 and 444-2 genes, homology hovers just above randomness, at 45% and 35-40%, respectively. Most of the conserved nucleotides can only be superimposed around the putative poly-T termination signals, with the rest of the sequences being essentially unique. However, the three 777 genes do share higher degrees of homology (~60%), although these remain depressed relative to the 5'-flanks (fig. 46).

Against this backdrop of variability, the two 474 genes of D. melanogaster and D. erecta appear to record a distinctly different pattern of sequence alterations (fig. 47, lines 2 vs 3). The overall homology in the 5'-flanking region is only ~55%, with infrequent blocks of conserved nucleotides that are A-T rich. Also in contrast to the bona fide tRNA genes above, the mutations are not homogeneously peppered throughout. In particular, the flanking region closest to the gene, between -1 and -30, shows the most divergence (homology is at best 35%, even with loopouts). However, the replaced nucleotides are not random; instead, as alluded to above, they embrace striking homology patterns and sequence alliance with their respective intra-specific 444-1 genes ("intra-specific" is defined as within a species; Dover, 1982) (fig. 47, lines 1 vs 2 and lines 3 vs 4).

For the 774 genes, the 5'-flanking sequences display average homologies of ~72%-78%. The values are probably slightly inflated, since insertions or deletions of entire sequence blocks have been scored as single mutational events. Nonetheless, homologies are ample throughout most of the lengths of the sequences displayed in fig. 47 (lines 5-7). The two conserved boxes at the -5 (CATCAA) and -15 (GCCAC) positions, described in an earlier analysis for the D. melanogaster pDt17R-777 gene, are also present in the 774 genes from all three species.
The presence of these two segmental homologies in all three sibling species would suggest that their origin may be ancient and that they may serve some biological function, such as contact points for transcription factors. However, a similar functional significance is more difficult to assign to a third homology segment at the approximate position of -36. From comparison of the 774 genes between the different species, this small region also exists as clustered mutations. At least two separate mutational events are notable. Relative to D. erecta (pDe774, line 5), the yakuba copy carries an additional sequence inserted at -36, AAAAGTCGT, immediately upstream from the GGGCCT tract (pDy774, line 6). In D. melanogaster, there is no extra sequence, but the latter tract is replaced with the sequence TGTAGCTA, which resembles a similar sequence in the intra-specific pDt17R-777 gene (line 8). Although this third segmental homology present in the 774 gene is less dramatic relative to those in the 474 gene, the substitution again does not appear to be capricious, but submits to the same pattern of intra-specific sequence alliance and spatial conservation that has been the general theme with all other hybrid genes studied thus far.

[Fig. 46 sequence alignment omitted; rows: 1. pDe444-1; 2. pDt27R444-1; 3. pDe444-2; 4. pDt27R444-2; 5. pDy444-2; 6. pCS777; 7. pDe777; 8. pDy777]

Fig. 46. Comparison of 3'-flanking sequences among the homologous bona fide genes from the different Drosophila species. The % of homologies are as follows: line 1 vs 2, 45%; inter se comparisons among lines 3, 4 and 5, 35%-40%; inter se comparisons among lines 6, 7 and 8, 60%. Most of the conserved regions correspond to the putative poly-T termination signal. However, the three 777 genes (lines 6, 7 and 8) show higher levels of conservation (which is also observed in their 5'-ends). Thus, the 3'-ends of genes record 25-40% faster sequence divergence in general, relative to their 5'-ends. Note that the trailer sequences between the 3'-end of the structural genes and the poly-T signals are not well conserved. This appears to be a prevalent trend in the evolution of tRNA genes in general.

No homology was detectable in the 3'-flanking sequences of the pDe474 and pDe774 genes near the poly-T termination signal, as would be expected from the melanogaster studies. This is not surprising, given the surveys in the sibling species showing that even the 3' tails of homologous tRNA genes usually have extremely poor sequence conservation. However, pDe474 could also share a common lineage with other tRNASer genes in the cluster that have not been cloned. For example, a likely candidate would be the intra-specific pDt17R-774 equivalent gene, based on the earlier analysis in melanogaster. Since this gene, as well as the remaining tRNASer genes, has not yet been cloned, a complete analysis as for the melanogaster genes is not possible.

In summary, two distinct types of segmental homologies have been identified from sequence comparisons of all available 5'-flanking sequences of tRNASer4,7 genes. The first (type I) is the coupled homology elements that are always held at positions -5 and -20. They can in fact be divided further into two different subclasses, since these elements are highly specific for either 444 or 777 genes.
Because homology boxes are generally extremely rare in the 5'-flanking sequences of tRNA genes, the most striking aspect is the persistence of the -5 and -20 boxes in all of the bona fide tRNASer genes obtained from different Drosophila sibling species spanning at least 13 million years of evolution. It is possible that such a high degree of conservation may reflect some functional status within the cell. For example, they may serve to modulate the rate of, or to direct the correct initiation sites for, transcription dictated by the particular gene types above.

[Fig. 47 sequence alignment omitted; rows: 1. pDe444-1; 2. pDe474; 3. pCS474; 4. pDt27R444-1; 5. pDe774; 6. pDy774; 7. pCS16-774; 8. pDt17R-777; position scale -50 to -10]

Fig. 47. Evidence for concerted evolution of tRNASer4,7 genes. Comparison of the 474 genes from erecta and melanogaster showed rapid sequence divergence in the immediate 30 nucleotides 5' to the genes (lines 2 vs 3). The only obvious conservation is the sequence, CATA, at position -6 in pDe474 and -8 in pCS474. The sequence divergence is far from random; the nucleotide replacements show strong homology with their intra-specific 444-1 genes. The 774 genes from three different species (lines 5-7) also showed coincidental clustered mutations at approximately -40. In melanogaster, the replacement sequence in pCS16-774 is almost identical to a spatially conserved tract in pDt17R-777 (lines 7 vs 8). Note that the co-evolving DNA segments between non-allelic genes within the same species (intra-specific) are reflected as clustered mutations between the homologous genes from different species (inter-specific).

The second type (type II) of segmental homologies is much more heterogeneous in sequence, in size, and in spatial arrangement. One generalization that can be applied is that they are only found in gene pairs; one of the genes is always a hybrid gene and the other is always a bona fide gene. The other generalization is that, with rare exceptions, both 5'- and 3'-segmental homologies are not well conserved across species. Since they do not usually cross species boundaries, this would strongly suggest that these small segmental homologies arose only after the sibling species diverged from 2 to 37 million years ago. The origins of these small homology patches make sense if the focus of attention is now shifted to the comparison of the 5'-flanking sequences of those homologous genes derived from across species. For example, when the homologous 444-1 genes (fig. 45, lines 1 and 2), 474 genes (fig. 47, lines 2 and 3) and the 774 genes (fig. 47, lines 5, 6 and 7) are aligned, a background of mutations randomly distributed throughout the lengths of the DNA is readily apparent. Above this background "noise", there are also major interruptions in homology characterized by clustered base changes. In every case, these major interruptions can be aligned in sequence and in a spatially conserved manner with another tRNASer gene, which is invariably intra-specific. Thus, like the three point changes internal to the genes, the clustered changes in the flanking region are not novel mutations but are merely "renovations" using materials from known donor genes within the 12DE gene cluster.
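As an illustration only (this is not the procedure used in this work), the distinction drawn above between scattered background mutations and clustered base changes can be sketched as a sliding-window count of mismatches over two pre-aligned flanking sequences. The toy sequences, the window size and the threshold below are hypothetical choices made for the example, not values taken from the figures.

```python
def mismatch_positions(seq_a, seq_b):
    """Return 0-based alignment columns where two pre-aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    pairs = zip(seq_a.upper(), seq_b.upper())
    return [i for i, (a, b) in enumerate(pairs) if a != b]


def clustered_windows(seq_a, seq_b, window=10, min_mismatches=6):
    """Return (start, mismatch_count) for each window reaching the threshold.

    window and min_mismatches are arbitrary illustrative settings.
    """
    diffs = set(mismatch_positions(seq_a, seq_b))
    hits = []
    for start in range(len(seq_a) - window + 1):
        count = sum(1 for i in range(start, start + window) if i in diffs)
        if count >= min_mismatches:
            hits.append((start, count))
    return hits


# Hypothetical toy alignment: two isolated point changes (background "noise")
# plus one block of eight consecutive replacements (a clustered change).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
toy_a = "ACGT" * 10
toy_b_chars = list(toy_a)
toy_b_chars[3] = "A"    # isolated point change
toy_b_chars[14] = "T"   # isolated point change
for i in range(22, 30):  # clustered replacement block
    toy_b_chars[i] = COMPLEMENT[toy_a[i]]
toy_b = "".join(toy_b_chars)

if __name__ == "__main__":
    print(mismatch_positions(toy_a, toy_b))
    print(clustered_windows(toy_a, toy_b))
```

In this toy case only the windows overlapping the replacement block are reported; the two isolated point changes never reach the threshold. A real comparison would start from the aligned flanking sequences of figs. 45 and 47 and would need the window and threshold tuned to the background divergence.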
3'-Flanking homologies have also been identified in the tRNASer genes in melanogaster, where more complete data are available. One may argue that since this region is A-T rich, the homologies may only be apparent owing to this biased nucleotide content. This explanation seems unlikely. Again, when homologous genes are compared from across species, the nucleotide sequences in these trailer regions are only slightly above random. Sequence randomness in the trailers tends to be a general feature also found in other families of tRNA genes even when they are closely linked; for example, tRNAArg at 12DE reported in Chapter III and those at 42A reported by Yen and Davidson (1980). Hence, based on considerations from other tRNA gene families and from the homologous tRNASer genes from across species, those segmental homologies for the non-allelic tRNASer genes at 12DE would be highly significant (1.8- to 2.0-fold higher in sequence conservation), despite their biased base content.

The most intriguing correlation from the above sequence comparisons is that both the segmental homologies in the flanking sequences and the internal base changes at positions 16 and 77 within the structural genes are always inherited together as either tRNASer4- or tRNASer7-type. Thus, these telltale signs obtained from the independent analyses above do converge and assemble into coherence, entirely conforming to the central prediction of Molecular Drive.

Furthermore, the above observations do not support a model invoking reciprocal exchange. Note that the local organization of both tRNASer and tRNAArg genes in λDE16 (from D. erecta) and λDY16-82 (from D. yakuba) is extremely similar to that in D. melanogaster, with no gross rearrangements detectable in the two corresponding chromosomal regions. Although the spacer DNA between the local tRNA gene clusters is variable, its rearrangement does not extend into or affect the relative organization of the genes.
Furthermore, in both erecta and melanogaster, it has been clearly demonstrated at the DNA sequence level that the 444-1 and the 474 genes are co-evolving. If these hybrid genes were generated by DNA slippage followed by recombination, it would be expected that the tRNAArg genes on either side would be deleted during straddling and recombination between the tRNASer genes. This definitely has not been observed. Note that while the tRNAArg gene cluster encoded in the pDt27R region does fluctuate in size between different melanogaster strains and species, as will be discussed shortly, the fluctuation would be better explained by a standard recombination process within the small direct repeats flanking these genes, rather than as the direct result of conversion events between the adjacent tRNASer genes. The structures of the tRNASer genes themselves have not changed across species and the reason for this is not clear. However, the fact that the 474 and the 444-1 genes in the two species examined, melanogaster and erecta, always evolve cohesively despite their divergence from each other between 2 and 13 million years ago suggests that perhaps certain types of gene interactions are favored, constrained either by selection or by the type of surrounding sequences (such as stabilization by repetitive elements, hotspots for conversion or pairing constraints of the chromosomes), that may exert a strong influence on the frequencies and types of hybrid genes observed.

CHAPTER III

tRNAArg Genes at 12DE

As suggested by the observations in the previous chapters, it is likely that the hybrid tRNASer genes were formed by conversion events between the bona fide 444 and 777 genes. All of the existing models that have been formulated to explain gene conversion at the molecular level invoke heteroduplex formation as the key intermediate step in the process (reviewed by Szostak et al., 1983).
Thus it is conceivable that occasionally the DNA strands at 12DE may slip and slide, causing adjacent 444 and 777 genes to mispair, forming the heteroduplex as the necessary prelude to gene conversion. Because this region is also rich in repetitive sequences, it is also possible that mispairing at some of these repeats may assist in stabilizing the tRNASer4/tRNASer7 heteroduplex, or may even be critical for initiating the slip-slide events themselves. Since half of the tRNASer genes at 12DE have DNA patchwork reminiscent of gene conversion (see Discussion), and since the repetitive elements in this same region are highly polymorphic between the different Drosophila species, these observations would suggest that DNA slippage and recombination may be common occurrences.

Since the tRNAArg genes are interspersed among the tRNASer genes, it is reasonable to assume that both gene families should be equally susceptible to this local distortion of the DNA. Thus, in this chapter, I have exploited the tRNAArg genes encoded within pDt27R as independent monitors for DNA slippage in this region. The impetus for this work was provided by the available sequence data of pDt27R (Newton, 1984). Downstream from the tRNASer genes, there are four clustered BamHI sites, which are actually part of the coding sequences (beginning at position 36) for four duplicated tRNAArg genes. Each of the duplicated units is demarcated by two flanking direct repeats, TAGCCCAA (fig. 55). Two of the duplicated cassettes are 600 bp in length (pArg12.1 and pArg12.2) and the others are both 200 bp variants (pArg12.3 and pArg12.4) (fig. 21). The 600 bp units differ from the smaller 200 bp ones by extra nucleotides in the 5'-flanks, but all four duplicated units are virtually identical in their 3' regions.
The structural sequences of the tRNAArg genes are also identical, but bear a characteristic T13 to C13 mutation that distinguishes them from all other members in the genome (Newton, unpublished observations).

The above organizational pattern of the tRNAArg genes would thus strongly suggest that the gene cluster arose from repeated duplication of an ancestral gene. The most likely mechanism would involve unequal exchange at those direct repeats demarcating the duplicated units (at least for one of the duplication events). Therefore, even though the evolution of the tRNASer4,7 and the tRNAArg genes may involve two distinctive mechanisms, both are most likely to be related by the same triggering event of local mispairing of DNA.

The degree of fluidity at the tRNAArg gene cluster was first measured by genomic Southern blotting experiments on commonly available laboratory strains of melanogaster (Newton et al., 1987). It was shown that 40 of the 45 strains (from Europe, North and South America, Asia and Africa) have the four duplicated tRNAArg genes arranged as described above, while the rest have only three copies of the genes, with one of the 600 bp units missing. However, no copy number lower than three has been found in these strains. All sibling species, except sechellia and orena, which have not been measured, showed only a single gene at the homologous sites. These encouraging results thus prompted an extended study of their organizational patterns, at the DNA sequence level, in the more distantly related sibling species erecta and yakuba. Their corresponding DNA segments had already been cloned in the earlier studies (Chapter II).

From the previous examination of tRNASer genes presented in Part II of Chapter II (sections C and F), λDE16 and λDY16-82 also harboured pDt27R homologies at one end of the clones (fig. 33 and 39).
Blotting experiments with tRNAArg-specific oligonucleotide probes (Arg5' and Arg3') have confirmed the presence of these genes associated with BamHI sites. However, in both of these species, only one tRNAArg gene has been identified at the homologous locations, in agreement with the previous Southern analyses. For each of these solo genes, there is a small sequence resembling the melanogaster repeats 3' to the gene, but insufficient sequencing data in the 5'-end preclude the identification of a similar upstream copy. However, the high degree of sequence conservation in the 5'-flanks of the tRNAArg genes across species would suggest that a similar copy should exist. The results would thus strongly support the notion that the present-day tRNAArg gene cluster in melanogaster has been derived by unequal exchange at the short repeats flanking the ancestral single unit. Even though the tRNASer and the tRNAArg genes at 12DE have embarked on separate evolutionary pathways, their respective organizational patterns are probably rooted in the common triggering mechanism of regional DNA distortion.

RESULTS

tRNAArg Genes in λDE16 From D. erecta

Two oligonucleotide probes, Arg5' and Arg3', were used to identify coding sequences for tRNAArg genes within the λ clone. The former probe is identical to the coding strand from +3 to +20, while the latter is identical to the tRNA sequence from nucleotides +56 to +72. Blotting experiments revealed a single tRNAArg gene (designated as pDeArg-1) marked with the expected BamHI site ~600 bp downstream from the two tRNASer genes. Moreover, probing with Arg3' also uncovered the 3' half of another tRNAArg gene (designated as pDeArg-6) associated with the left-most BamHI site abutting the λ vector arm (fig. 33). Sequencing experiments show the expected pDeArg-1 downstream from the tRNASer4 genes (fig. 48).
However, the structural sequence of pDeArg-1 differs from its counterparts in melanogaster by having a T, rather than a C, at nucleotide 13 within the Block A promoter. The 28 nucleotides immediately 5' to the structural gene are well conserved between erecta and melanogaster (82%). This extent of homology defines the 5' limit coincidental with the 200 bp duplicated units (pArg12.3 and pArg12.4) in D. melanogaster. Beyond this transition point, the 5' flanking sequence of pDeArg-1 shows about 70% homology with pArg12.1 of melanogaster. The pattern of homology is just the reverse in the 3' flanking region, where significant matches (~80%) with the same region of pArg12.4 of melanogaster can only be detected near the small repeat and beyond, beginning at 55 nucleotides downstream from the gene (nucleotide +195 in pDeArg-1, fig. 48). The interstitial sequence block between the structural gene and the small direct repeat, which coincidentally delimits the 3' tail of all duplicated units in melanogaster, is composed of completely divergent nucleotides.

tttatcttgt aaaagggcct tacttttgct tacatttggg tgacatatgc  +50
tgggaatttg ggcggcaatt gcgcaactGA CCGTGTGGCC TAATGGATAA  +100
GGCGTCGGAC TTCGGATCCG AAGATTGCAG GTTCGAGTCC TGTCACGGTC  +150
Gtgatgctaa gtttttattt ttcgtaagcc caataaaata attccatggg  +200
agattcccta gccaataacc ctttttgtgt aacctgagtg aggtaagcag  +250
ccatcccaac caattggcat a

Fig. 48. Sequence of pDeArg-1 from D. erecta. The gene was cloned as separate BamHI fragments. Several isolates were independently picked and sequenced by using the universal sequencing primer F1. Thus, only sequence from one strand was obtained, but it was confirmed several times from the independent isolates. The gene is depicted in capital letters and the characteristic BamHI site, GGATCC, is located at position 114 (dotted underline). The nucleotide T13 is highlighted by a solid underline (position 91), rather than the C13 that is characteristic of the clustered melanogaster tRNAArg genes. The 3' oligonucleotide corresponding to the melanogaster flanking repeats is located at position 208 in the above figure (boxed). The other direct repeat upstream from the gene has not been sequenced. The region that is highly homologous to the 200 bp melanogaster duplicated units starts at nucleotide 51 and ends at nucleotide 238. The 5'- and 3'-flanking sequences of pDeArg-1 show the most homology with the corresponding regions of the two outermost genes in the melanogaster cluster, pArg12.1 and pArg12.4, respectively. The most rapidly diverging region corresponds to the 54 nucleotide sequence between the end of the tRNA structural gene and the 3' small flanking repeat (nucleotides 152 to 195).

GGATCCGAAG ATTGCAGGTT CGAGTCCTGT CACGGTCGat tgtctaactt  +50
ttttttcttg tatacgacat ataatttcct tgagttctga atattacata  +100
ttttaltaat tcgtcaagcc

Fig. 49. Sequence of the 3'-end of pDeArg-6 from D. erecta. The structural gene is displayed in capital letters. The BamHI site (GGATCC) marking the gene is at the 5'-end of the insert DNA (underlined). The 3' flanking sequence of the gene shows approximately 75% homology with pArg12.6 of D. melanogaster in the pDt27R molecular walk. Several different isolates were sequenced using the universal primer, F1. Only sequence information from one strand of the DNA was obtained, but the above was confirmed several times.

Abutting the left-most BamHI site in λDE16 is the 3' end of another tRNAArg gene, pDeArg-6 (fig. 33). Sequencing experiments demonstrate significant homology between this gene region and its melanogaster counterpart, pArg12.6, a gene that is placed at least 11 kb away from the tRNASer4 genes (established in the molecular walk) (fig. 49). In addition, this tRNAArg gene is transcribed in the opposite orientation in the two species. It is not known whether the intergenic sequences have been deleted in D.
erecta or if the tRNAArg gene has been brought into close proximity to the tRNASer4 genes by a simple inversion. Blotting experiments using probes derived from the pDt27R molecular walk in D. melanogaster failed to resolve this issue, since they hybridized to many sequences in the D. erecta genome (data not shown, but see fig. 60, lane 6).

tRNAArg Genes in λDY16-82 From D. yakuba

The pDt27R homologous region of this phage was also analyzed for the presence of tRNAArg genes. Only one was identified (designated as pDyArg-1), again in association with a BamHI site approximately 600 bp downstream from the lone tRNASer4 gene, as illustrated in fig. 39. Sequence analysis shows that it is identical to pDeArg-1, also retaining the "original" T nucleotide at position 13 (fig. 50). Comparison of the 5'-flanking sequences between pDeArg-1 and pDyArg-1 shows that the 49 nucleotides immediately 5' to the coding sequences have predominantly single base changes, totalling ~23% in divergence. Beyond this point, the mutational events are a mixture of deletions of 2-7 base pairs and extensive base substitutions in strings of at least 20 nucleotides in length. As in the case described in the D. erecta section, the more distantly located 5'-flanking region of pDyArg-1 shows correspondingly more divergence, but still maintains strong (~70% on average) homology with pArg12.1 of melanogaster. The 3'-flank of pDyArg-1 also shows strong homology with pArg12.4 of melanogaster (80%), starting at the small direct repeat and beyond.
The interstitial block between the structural gene and the direct repeat is also composed of unique sequence. When this interstitial block is compared between pDyArg-1 and pDeArg-1, there remains some weak homology (~60%). The differences include many single base changes and small deletions. Thus, it appears that the most rapidly diverging sequence in the three sibling species occurs between the 3'-end of the structural gene and the direct repeat.

ggatccttga tgatgtcttt aatattatta atgcactaac tltaagtata  +50
aaataatgat taaataagta tgttaatgta aagcgaggtt tatctgcaat  +100
atgagaaact atcatcaatg atagtcagct tacttacatg ggcgttacat  +150
atgttcggaa tttcggacgt cgattgcgta actGACCGTG TGGCCTAATG  +200
GATAAGGCGT CGGACTTCGG ATCCGAAGAT TGCAGGTTCG AGTCCTGTCA  +250
CGGTCGtaat gtcaagtttt tatttttcgt aatccccatt taataattgjt  +300
agcccagtct ttttgtaacc tgagtggagt aagcgggaat agcaaccaat  +350
tggcaaaccc aattgaaaga tttattggac ttttacatgg gtcttcttcc  +400
atggacgacg aatcaacatg tggctgccat c

Fig. 50. Nucleotide sequence of pDyArg-1 from D. yakuba. The sequence of the structural gene is displayed in capital letters. The gene was cloned on separate BamHI fragments. Several independent isolates were picked and sequenced with the universal primers. The BamHI site, GGATCC, marking the gene is at position 212 (dotted underline), which is 3' to the anticodon TCG. The nucleotide T13 within the gene is underlined. The putative termination signal is 11 bp downstream from the structural sequence. The region that shows strong homology (>80%) to the 200 bp duplicated unit in D. melanogaster begins at nucleotide 157 and ends at nucleotide 324. Beyond this region, the 5'-flanking and 3'-flanking sequences show 70% to 80% homology to the corresponding regions of the melanogaster genes pArg12.1 and pArg12.4, respectively. The most rapidly diverging region is the sequence between nucleotides 257 and 299, which is between the end of the structural gene and the direct repeat-like oligonucleotide sequence, TAGCCCA. This putative 3' direct repeat is highlighted by a box.

Thus, in summary, there is only one tRNAArg gene in the pDt27R homologous regions in both erecta and yakuba. The structural sequences of the tRNAArg genes from both sibling species are identical but differ from the melanogaster counterparts at nucleotide 13 (C in melanogaster and T in the other two sibling species). Since the nucleotide T also occurs in all other melanogaster tRNAArg genes, except those at 12DE, it would suggest that the ancestral gene was probably similar, if not identical, to those represented by the two sibling species. The 3' tails within the duplicated regions among the genes from the different species are not well conserved. Weak homology is still detectable between species derived from the same sibling species complex, but the tails are completely divergent between species derived from the two different species complexes (e.g. yakuba vs melanogaster). However, sequence identity of about 80% is evident starting at the small direct repeat and beyond for all inter se comparisons. Insufficient sequence data in the 5'-flanking sequences of either pDeArg-1 or pDyArg-1 preclude the confirmation of a similar upstream small direct repeat. However, it most likely exists in these two genes as well, judging from the high degree of homology shared among the available flanking sequences in all three sibling species. A further, minor generalization stemming from sequence comparisons of the tRNAArg genes across species is that the 3'-ends immediately outside the genes are the most rapidly diverging region, which is also the case with the tRNASer genes reported previously.
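For readers who wish to reproduce percentages of this kind, the following is a minimal sketch of an "inter se" (all-pairs) percent identity calculation over pre-aligned sequences. It rests on assumptions: the exact scoring scheme behind the values quoted in this chapter is not restated here, so the sketch simply ignores columns containing a gap, and the three toy sequences and their names are hypothetical rather than taken from figs. 48-50.

```python
from itertools import combinations


def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identical columns, counting only columns without a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are left out of the score
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def inter_se(aligned):
    """Percent identity for every pair of named, pre-aligned sequences."""
    return {
        (name_1, name_2): percent_identity(aligned[name_1], aligned[name_2])
        for name_1, name_2 in combinations(aligned, 2)
    }


# Hypothetical toy alignment (not real flanking sequence).
toy_alignment = {
    "species_1": "TAGCCCAATA",
    "species_2": "TAGCCAAATA",
    "species_3": "TAGCC-AATT",
}

if __name__ == "__main__":
    for pair, value in inter_se(toy_alignment).items():
        print(pair, round(value, 1))
```

Ignoring gapped columns is only one possible convention. Scoring each insertion or deletion block as a single mutational event, as was done for the 774 genes in Chapter II, would give somewhat different values.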
The presence of the small repeats surrounding the tRNAArg genes, well conserved across species (at least for the downstream copies) and demarcating each of the duplicated cassettes in melanogaster, suggests that unequal exchanges at or near the repeats could be responsible for at least the first duplication step in the tRNAArg genes. Subsequent duplication steps could occur anywhere within the duplicated units, not necessarily restricted to the repeats, and would still yield the same final organization. In fact, duplication of the first tRNAArg gene would enhance unequal exchange by providing an increase in the length of homology, and may explain the propensity for the higher copy variant in the melanogaster population (i.e. 90% four copies vs 10% three copies) and the absence of two or one copy variants.

The conserved existence of the tRNAArg genes flanking the tRNASer4 genes in the sibling species would undermine standard recombination as the mechanism for generating the hybrid tRNASer genes. In contrast, the appearance of both hybrid tRNASer and duplicated tRNAArg genes at 12DE could be explained by DNA slippage as the common triggering mechanism contributing to the divergent evolutionary pathways observed in the two gene families.

DISCUSSION

1. The Overall Molecular Organization of 12DE

A chromosomal walk has been conducted in the polytene chromosome bands 12DE in D. melanogaster using both cosmid and λ genomic libraries. A total of eight tRNASer4,7 genes have been collected from this chromosomal region, which is at least 157 kb in size. A compendium of these genes and their sites of origin on the molecular map has been tabulated in Table V. While there are two each of the 444 and 777 genes containing sequences predicted from their tRNAs (Cribbs, 1982), there are three other hybrid structures that are composites of the tRNASer4,7 genes (two 774 and one 474).
The remaining gene, designated as 444*, is found in the pDt17R molecular walk. It is characterized by a C50 to T50 mutation at the tip of the extra arm, while its three diagnostic nucleotides remain indicative of tRNASer4.

Nestled among the tRNASer genes are six tRNAArg genes (Newton et al., manuscript in preparation, and fig. 51). Of these, five are encoded within the pDt27R domain and the remaining one is ~200 bp upstream of the 444* gene in the pDt17R molecular walk. The overall molecular organization of the tRNA genes at 12DE would thus conform to other previously studied clusters where two or more families of tRNA genes have been found to coexist. In this case, the direction of transcription for the X-linked genes also appears to be completely random.

However, there are two unique properties associated with the organization of the tRNASer4,7 genes not found in other clusters. First, this is the only known gene cluster that harbours two different isoacceptors for the same tRNA, whereas all other clusters reported so far house but a single isoaccepting species for any tRNA. Second, tRNASer4 and tRNASer7 recognize two non-overlapping sets of codons and are thus functionally distinct. Yet, their genes are extremely similar in sequence. These two unique properties may facilitate interactions, such as conversion or reciprocal exchange, between the two gene types to produce hybrid sequences. This latter point is discussed more fully in section 2 below.

TABLE V - A Summary of tRNA Genes Identified in Bands 12DE

Domain    Size (kb)    Genes
pDt73     ~33.5        474
pDt17R    ~45          444*, 774, 777; tRNAArg (T13)
pDt27R    ~56.5        444, 444; one tRNAArg (T13), four tRNAArg (C13)
pDt16     ~22.5        777, 774

All of the tRNASer4,7 genes in the D. melanogaster genome have been previously identified as either eight EcoRI or seven HindIII restriction fragments by Southern blotting (Cribbs et al., 1987b).
Approximately half of these fragments from either restriction digest have been assigned as X-chromosomal in origin, based on their weaker intensity of hybridization with DNA prepared from the male (one X chromosome) relative to that from the female (two X chromosomes). All of these X-linked tRNASer restriction fragments were accounted for in the molecular walk. The results are summarized in fig. 51b. Although minor variants of genomic clones were occasionally obtained, the only thoroughly substantiated polymorphism is in the clone 420R, where the HindIII Drosophila insert is ~17 kb rather than the smaller 6.5 kb found in the original pDt27R (fig. 51, lane 5). This larger HindIII fragment probably represents a high frequency variant, since it has been repeatedly observed in genomic Southern blots of several different fly strains (data not presented). Thus, it is likely that of the approximately 12 tRNA4,7Ser genes estimated in the genome (Cribbs et al., 1987b), only eight exist at 12DE.

This chromosomal region appears to be well under-represented in all genomic libraries used in the experiments. With the exception of the pDt73 molecular domain, most λ or cosmid clones occur at 2.5 to 5% of the frequencies expected of impartial representation (Bender et al., 1983; Kaiser and Murray, 1985); further, impasses met in one library were often encountered repeatedly in all others as well. The reason for the under-representation of sequences is unclear, but it may be due to the high density of repetitive elements present at 12DE that are poorly tolerated by the more commonly used E. coli hosts. Some of these repeated sequences are probably non-essential for viability, since those tested are either entirely absent or highly variable in number in the different Drosophila sibling species, as revealed by genomic Southern blotting. For example, the 10 kb HindIII+EcoRI fragment between coordinates 30 and 38 in the pDt27R walk contains sequences that are moderately repeated in melanogaster, simulans, mauritiana and erecta, but highly repeated in both teissieri and yakuba (data not presented). Also by direct molecular cloning, it has been demonstrated that the intervening sequences (at least 20 kb) separating the pDt16R and pDt27R domains in melanogaster are absent in both λ clones derived from erecta and yakuba (figs. 33 and 39).

Fig. 51. tRNA4,7Ser and tRNAArg genes at 12DE. (A) λ and cosmid clones with the least overlap representing the 157 kb from 12DE are shown. The recombinant clones have all been cleaved with HindIII and resolved by electrophoresis in a 0.7% agarose gel. Lane 1, 3L736; lane 2, 31731; and lane 3, 31739 are from the pDt73 walk. Lane 4, 312161R is from the pDt16R walk. Lane 5, 420R and lane 6, cosP273R are from the pDt27R walk. Lane 7, 3L1731R; lane 8, 31731; and lane 9, 311722 are from the pDt17R walk. (B) The DNA from the gel was transferred onto a sheet of Hybond filter and probed with GT7, which is specific for tRNA4,7Ser genes. The hybridization signals in lane 2 (~4.7 kb), lane 4 (~8.0 kb) and lane 8 (~10 kb) correspond to the HindIII Drosophila inserts cloned in pDt73, pDt16R and pDt17R respectively. The hybridization signal in lane 5 is a ~17 kb polymorphic fragment corresponding to pDt27R, which has been encountered in a number of nonisogenic strains. The 1.8 kb hybridization signal in lane 6 is a small fragment overlapping into the tRNASer gene in pDt27R. One end of this fragment contains the HindIII site from the polylinker cloning site in cosPneo; the other corresponds to a real HindIII site in the genome. (C) The same filter hybridized to Arg3'. The only signals clearly detectable are in the pDt27R (lanes 5 and 6) and pDt17R (lane 8) regions. Some faint bands are due to background from over-exposure of the film. Only the tRNASer genes are detectable by hybridization with total 4S RNA, suggesting that the tRNAArg genes may be under-expressed in vivo.

One unusual class of repeats that are also non-essential for viability, but known to be functional, has been encountered during the molecular walk in the pDt73 domain. They were identified previously by Hardy and Kennison (1980) as a genetic unit, called Ste or Stellate, playing a role in male fertility. Subsequent molecular cloning by Lovett et al. (1983) and Livak (1984) showed that Ste is a multigene family consisting of 1.3 kb repeats that are tandemly reiterated ~200 times on the X chromosome near polytene band 12F and ~80 times on the long arm of the Y chromosome. The copy numbers are highly variable among different melanogaster strains, and the repeats are entirely missing in all four of seven sibling species tested (this report and also Livak, 1984). Their function remains unknown, but those on the Y chromosome may also play some negative regulatory role in controlling the expression of the X-linked copies. It has been shown that the poly A+ RNA homologous to a Ste cDNA clone is 30 to 70 times more abundant in XO testes, or in animals carrying Y deficiencies deleted for the presumptive regulatory region, than in XY testes. The high levels of RNA homologous to the 12F sequences are exactly correlated with the appearance of star-shaped crystals in the testes (hence the name) and with sterility in the male.

The four domains obtained by molecular walking presented in Chapter I have not been successfully joined as a contiguous chromosome segment, and thus the overall organization of the tRNASer and tRNAArg genes within the polytene bands 12D to 12E2-3 remains unknown.
However, a number of inferences can still be derived from several related observations. From the pDt73 molecular walk, it is known that the 474 gene is approximately 9.0 kb away from the Ste sequences, which in all probability occupy segments 12E-12F on the polytene chromosome as a continuous block (Chapter I and Livak, 1984). Previous in situ hybridization using purified tRNASer has demonstrated that all coding sequences are localized distal to 12F (Hayashi et al., 1980). Hence, the juxtaposition of pDt73 and the Ste sequences in the molecular walk indicates that the 474 gene is almost certainly the most proximal in the cluster (closest to the centromere). The other two domains, pDt16R and pDt27R, could be adjacent in melanogaster, a possibility that is intimated by the spatial conservation observed in both D. erecta (JLDE16) and yakuba (3LDY16-82) despite their divergence approximately 16 million years ago, and by deletion analyses employing various mutant melanogaster strains (fig. 52). However, this remains a reasonable conjecture only, resting on the assumption that the two domains have also remained sequentially conserved and have not been translocated elsewhere by inversions during D. melanogaster evolution. Using several deficiency mutants deleted for small segments within 12DE, the combination of in situ hybridization and Southern blotting experiments further suggests that all coding sequences are distal to band 12E2, and possibly within band 12E1 (Leung, Hayashi and Sinclair, unpublished observations). The current status of the mapping studies has been summarized in fig. 52.

Fig. 52. The current progress in the assignment of the X-linked tRNA genes to polytene bands. The line at the top of the figure is the genetic map of the X chromosome with well known markers as indicated (y, yellow; cv, crossveinless; v, vermilion; g, garnet; f, forked). The dark circle to the right of the genetic map represents the centromere. The tRNA genes are proximal to the genetic marker garnet, which is located at 12B. Below is the expanded area near garnet as seen in the polytene X chromosome (redrawn and modified from Waring et al., 1983). By molecular walking and by the copy number estimation of the Stellate sequences by Livak (1984), these testis-specific genes are proximal to the tRNA gene cluster and exist as a large block of tandem repeats totalling approximately 260 kb in size between 12E and 12F. The following deficiency mutants are in the process of being analyzed by both in situ hybridization and genomic Southern blotting. Df(1)g'fB deletes the segment from bands 12A to 12E; HA92 deletes the segment from bands 12A6-7 to 12D; KA9 deletes the segment from bands 12E2-3 to 12F/13A; RK2 deletes the segment from bands 12E1 to 13A2-5. Only Df(1)g'fB is known to delete the entire tRNA cluster as well as a large proportion of the Stellate sequences. The two small deficiencies, HA92 and KA9, do not show a decrease in hybridization intensity, as determined by in situ hybridization with tRNASer (S. Hayashi, unpublished) and Southern blotting with cloned probes (data not shown), relative to the 23E site on the autosomal arm 2L. The mutant strain Df(1)g'fB/RK2, which is homozygous for the deletion from bands 12E1 to 12F1, has been constructed by Dr. D. A. R. Sinclair. Hybridization of pDt73 and pDt17R to genomic Southern blots prepared from this strain failed to show any signal. However, re-hybridization of the same filter to pDt27R and pDt16R showed intense hybridization to fragments of the expected sizes (Sinclair, unpublished). These results strongly suggest that the order of the plasmids containing tRNA genes, from proximal to distal along the X chromosome, is pDt73, pDt17R, [pDt16R and pDt27R]. The above results, along with those obtained with the strain HA92 in the in situ hybridization experiments, thus suggest that the tRNA genes are probably located at or very near band 12E1. The Df(1)g'fB/RK2 strain also removes >90% of the Stellate signal from the earlier identified X-linked site (Sinclair, unpublished), but the flies survive to adulthood. This would imply that almost the entire 12E chromosome segment contains no genes essential for viability. The adult Df(1)g'fB/RK2 females are sterile, however, and it is suspected that a female sterile mutation is located near one of the tRNA genes, possibly proximal to the tRNA gene cluster. Previously, a female sterile mutation has been genetically assigned to this region (Waring et al., 1983).

2. Co-evolution of the tRNA4,7Ser genes

Of the 82 nucleotides in the coding sequences of the tRNA4Ser and tRNA7Ser genes, only those at positions 16, 34 and 77 differ between the two. As alluded to earlier, there are several permutational forms that are altered strictly at these three possible positions (in addition to the modified 444* gene). Changes at these sites are significantly non-random, with substitutions alternating only between nucleotides that are diagnostic of one or the other bona fide gene. Such predictability in the observed mutations was the mainstay that inspired Cribbs (1982) to hypothesize that the tRNA4,7Ser genes are co-evolving. It was further proposed that the hybrid sequences, arising from non-reciprocal recombination between the two parental genes, are forceful testimony to the existence of such a dynamic evolutionary process (p. 160 of Cribbs, 1982). If such a process were indeed dynamic and continually renovating the tRNASer genes at 12DE, then a convincing biochemical demonstration of non-reciprocal recombination as the underlying mechanism would be to isolate alternative permutations of the same tRNA4,7Ser genes.
Thus it was disappointing to find that equivalent genes represented by pDt73 (474) and pDt16R (774 and 777 genes), isolated from different fly strains and species, failed to support this contention directly, since the corresponding genes are all identical in sequence (Chapter I). Even though the structural sequences themselves have remained static, the strikingly non-random distribution of the flanking homology patches threaded among the tRNASer genes at 12DE, both in sequence content and in spatial alignment, does indirectly support the notion that the hybrid genes are likely to be descended from multiple lineages (Chapter II, Part I). Further, the morphology of these homology patches is reminiscent of intrachromosomal conversion events (Baltimore, 1981; Cami et al., 1984). Statistically, each small homology patch could have arisen independently by chance approximately once in 1000 to 65,000 nucleotides (for 5 to 8 matches, respectively). However, in pair-wise comparisons of 5' regions of less than 100 base pairs, I have observed more than one homology block. More importantly, these blocks of homology are never scattered in random array; instead they are almost perfectly aligned in a linear order without any artificial introduction of major sequence distortions (e.g. large insertions or deletions, loopouts, inversions etc.). Such homology patterns are, moreover, not the exclusive property of one gene pair, but have been a recurring theme involving almost all tRNASer genes at 12DE (pDt27R 444-2 being the only exception). The intraspecific pattern of flanking sequence homologies (sequence homogenization in Dover's parlance) in the evolution of multigene families, fuelled mainly by biased gene conversion, is central to the theory of Molecular Drive (Dover, 1982).
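The chance-occurrence figures quoted above (roughly once in 1000 nucleotides for a 5-base match and once in 65,000 for an 8-base match) can be reproduced with a back-of-envelope calculation. The sketch below is illustrative only and rests on an assumption that is not stated in the text: the four bases are taken to be equally frequent and independent at every position. The function names are hypothetical.

```python
# Back-of-envelope check of the chance-match estimate quoted in the text.
# Assumption (not from the thesis): the four bases are equally frequent
# and independent at every position.


def expected_spacing(k: int, alphabet_size: int = 4) -> int:
    """Average number of positions examined per chance k-base exact match."""
    return alphabet_size ** k


def expected_chance_matches(window: int, k: int, alphabet_size: int = 4) -> float:
    """Expected number of chance k-base matches when two sequences of
    length `window` are compared in a single fixed register."""
    positions = max(window - k + 1, 0)
    return positions / alphabet_size ** k


if __name__ == "__main__":
    for k in (5, 8):
        print(k, expected_spacing(k), expected_chance_matches(100, k))
```

Under this assumption 4^5 = 1024 and 4^8 = 65,536, in line with the range given in the text, and the expected number of chance matches within a 100 bp window stays well below one, which is why more than one aligned block in such a short region is regarded as notable.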
If the shared homology patches observed in the tRNASer genes were indeed generated by a similar mechanism, then, on the prediction of Molecular Drive, they should occur in only one species, and the identical regions of the homologous genes from other species should take on another set of shared homology patches. This has been directly tested by comparing homologous tRNASer genes across species. Two different types of homology patterns in the 5'-flanking region have been observed in the tRNASer genes. The first type (type I) comprises the homology boxes at -5 and -20. These are well conserved in sequence, spatial distribution and gene-type specificity (that is, they are associated only with either the 444 or the 777 genes), and they transgress species boundaries. Since sequence conservation in 5'-flanking regions of tRNA genes is rare (see Introduction), it is tempting to suggest that the existence of these homology elements may reflect functional selection for their role as important modulatory signals in the cell.

More pertinent to this study is the second type of homology pattern (type II), associated with the hybrid genes. Their occurrence is more difficult to reconcile with functional selection alone. These homology patches are extremely heterogeneous in sequence, in size and in spatial distribution within the 5'-flanking regions. More importantly, these homologies do not usually transgress species boundaries unless they also overlap with the -5 and -20 elements.

What are the origins of these homology patches? This can be best understood if the focus of attention is shifted to the comparison of homologous genes from the different species. When their 5'-flanking sequences are aligned, random mutations scattered sporadically throughout their lengths are readily apparent. Superimposed on the background mutations are regional occurrences of clustered base changes.
These clustered changes are not generated de novo by random mutations, but show a high degree of homology, in both sequence content and positional alignment, with another non-allelic tRNASer gene in the same genome (or species). For further clarification, a specific case is discussed below with the aid of an illustration.

[Illustration: the 444 and 474 genes of melanogaster (mel) and erecta (ere), showing clustered changes between the homologous genes of the two species and DNA trafficking between the non-allelic genes within each species.]

If the homologous 444-1 (black bars) and 474 genes (open bars) are compared between erecta and melanogaster, both sets of genes show clusters of base changes in the immediate 30 base pairs 5' to the structural gene (shown as either open or filled small rectangular patches 5' to the genes). When the 444-1 gene is juxtaposed against the intraspecific 474 gene, what were originally identified as clustered base changes between homologous genes are now almost perfectly aligned in sequence and position between these non-allelic genes (illustrated as similar rectangular patches). It is unlikely that these flanking sequences from homologous genes have undergone such substantial mutational alterations during speciation and yet, within a species, have remained conserved. Parallel evolution of such precision in the gene pairs, driven by random mutation alone, would thus seem weak empirically, not to mention unlikely on the grounds of mathematical probability. I propose instead that the clustered base changes above the background of random mutations are regional perturbations introduced by sequence trafficking between non-allelic genes within the same genome (see section 4 for possible mechanisms); in this specific example, between the 444 and 474 genes.

A similar argument can also be tentatively made for the 774 gene in pDt16R. Relative to the 474 genes above, the data here are less dramatic because the homologous pDt17R-777 genes have not been cloned from D. erecta and D. yakuba.
However, the patterns of homology revealed by the 774 genes from three different species follow a familiar theme: first, interspecies comparisons of homologous 774 genes show coincident clustered base changes; second, in the documented case of melanogaster, the replaced cluster shows sequence alliance with the intraspecific pDt17R-777 gene. Again, it is unlikely that these clustered base changes in the melanogaster 774 gene arose de novo; rather, the replaced block of sequence is consistent with DNA information trafficking from another gene at 12DE.

In the 3'-flanking regions of the melanogaster tRNASer genes, only a few members show a significant level of sequence homology. Their scarcity could indicate that only a small number of "donor" genes were involved in transmitting the 3' sequence information to the different hybrid genes (see figs. 53 and 54). An alternative or additional reason could be that the 3'-flanks invariably show faster rates of sequence divergence, particularly in the region between the 3'-end of the structural gene and the poly-T termination signals, as ascertained by comparisons of homologous genes between different species (fig. 46). As a result, any evidence suggestive of common descent could be quickly obliterated.

Finally, there is another interesting correlation which supports co-evolution of the genes via gene conversion. The changes at nucleotides 16 and 77 internal to the genes accurately presage the type of immediate flanking sequences in the hybrid genes. It appears that the internal and external nucleotide changes are always inherited as a unit, as if they were products of co-conversion.

In summary, the following parallel lines of observation lend strong support to the argument that the X-linked tRNA4,7Ser genes are probably co-evolving as a cohesive unit.
Further, the overall observations from sequence comparisons are in keeping with the central framework of Molecular Drive, namely intraspecific sequence homogeneity in multigene evolution driven mainly by biased gene conversion:

(i) The permutational nature of the hybrid genes suggests that the internal base changes are non-random. Except for the 444*, all nucleotide replacements within the genes alternate strictly between those of either tRNA4Ser or tRNA7Ser.

(ii) The flanking regions of the hybrid genes show patchwork homologies characteristic of known tRNA4,7Ser genes at 12DE.

(iii) Particularly significant are the 5' and 3' segmental homologies in the D. melanogaster tRNASer genes, which show sequence alliance with a non-allelic gene and do not usually transgress species boundaries (i.e. conforming to the concept of sequence homogenization).

(iv) The multiple type II homology blocks between non-allelic genes can be aligned in a linear order without artificial assistance by distorting the sequences. Moreover, such patchwork homology patterns have been a recurring theme involving almost all tRNASer genes at 12DE (pDt27R 444-2 being the only exception).

(v) In the non-allelic gene pairs that show parallel clustered base changes, one member is consistently a bona fide gene, the other a hybrid gene. Such a relationship is reminiscent of a donor and a recipient, respectively, participating in the presumed conversion event.

(vi) The inheritance of the immediate flanking sequences in the hybrid genes is always coupled to the expected nucleotide changes at positions 16 and 77 internal to the genes.

One may still argue that any one of the preceding observations could have arisen fortuitously from random mutation followed by the vagaries of drift and selection. Selection as the all-encompassing force would be difficult to reconcile with the data, since the type II homology patterns are a conglomerate of heterogeneous sequences with no distinctive features.
Moreover, the concatenation of such individual (and independent) chance events, nucleotide by nucleotide and coincidentally in different species, somehow leading inexorably to the consistent patterns observed, would be exceedingly unlikely. A simpler and more compelling hypothesis would be that the tRNASer genes at 12DE are co-evolving as a unit and that the type II homology patches are indicative of active DNA trafficking involved in such an evolutionary process.

3. A Model Postulating the Origins of Type II Homology Patches

The foregoing observations consolidate naturally into a simple yet coherent scheme delineating the possible origins of most of the hybrid genes (including the 444*), although in all likelihood it addresses only the most recent events in their long history of co-evolution (at least 13 to 37 million years) (fig. 28). This scheme invokes the lowest possible number of information transfer events (for now, the events could be interpreted as either recombination or gene conversion). For clarity, I have explained the scheme in two separate figures. The first, shown in fig. 53, is a genealogy tracing the origins of most of the hybrid genes. Each of the steps involved, indicated as A (to form the 474 gene), B (pDt17R-774), C or D as equal possibilities (pDt16R-774), and E (444*), is explained in detail in fig. 54. The repertoire of hybrid genes cascading from these interactions appears to hinge on the pDt17R-774 gene as the crucial intermediate (fig. 53). However, the origin of this intermediate remains difficult to assess.

For simplification, I have occasionally used the general term "recombination" below as an operational description that includes all exchanges of information between genes, and not as a direct inference to the actual mechanism involved. In section 4, the possible mechanisms will be discussed in detail.

[Fig. 53 diagram. Step A: pDt17R-774 x 444-1 => 474. Step B: pDt16R-777 x ? => pDt17R-774. Steps C and D: pDt17R-777 x pDt17R-774 (C) or pDt17R-777 x 474 (D) => pDt16R-774. Step E: pDt17R-777 x 444-1 => 444*.]

Fig. 53. Genealogy delineating the formation of the tRNA4,7Ser genes at 12DE. In all likelihood it only traces the most recent events in their transmission of DNA information. As shown by the diagram, all type II segmental homologies surrounding the tRNA genes can be traced to pDt16R-777 and an unknown tRNASer gene. A, B, C or D as equal possibilities, and E symbolize the steps leading to the formation of all the known hybrid genes and the 444* gene. These steps are explained in detail in fig. 54. Open arrows indicate the direction of, and the products derived from, the information transfer events. Dotted arrows indicate possible additional flow of information that has obscured the origin of the hypothetical 444 gene.

In fig. 54, the following steps are explained:

-Step A: the 474 gene could be created by information transfer events involving the pDt17R-774 and 444-1 genes. Even though the 474 gene could be explained by a simultaneous double recombination event involving the 444-1 and one of the 777 genes, this seems to be a less likely explanation. From such a double recombination event, both the immediate 5'- and the 3'-flanking regions of the 474 gene would be expected to display sequence alliance with the 444-1 gene. However, in the present case, the immediate 3'-flanking region of the 474 gene shows stronger homology with the pDt17R-774 gene than with the 444-1, thus suggesting that the formation of the final recombinant occurred via two or more steps, rather than as simultaneous events.

-Step B: As mentioned above, the origin of the pDt17R-774 gene is probably more complex. Since this is the first intermediate in the scheme, it would be logical to assume that it may have been derived from recombination between a 777 and a 444 gene as the initiating event.
While its 5'-flanking region would suggest that the pDt16R-777 gene could be involved, its 3'-flanking region shows no homology with either of the known bona fide tRNASer genes. One possible scenario could be an information transfer event involving the pDt16R-777 and a 444 gene to form a transient pDt17R-774. The 3'-flanking region of this product could have diverged and then subsequently acted as the donor in creating the 474 and the pDt16R-774 (as discussed below). Alternatively, the present day form of the pDt17R-774 3' end could have already undergone additional interactions with the 474 or the pDt16R-774 genes. This would be analogous to the phenomenon observed for co-evolution of the immediate 5'-flanking regions between the intraspecific 474 and its respective 444-1 genes described previously in Chapter II (Part I).

-Steps C and D: the pDt16R-774 gene could be generated by two equally likely lineages. First, pDt17R-777 and pDt17R-774 could be information donors to form the pDt16R-774 (step C). Alternatively, the same hybrid gene could have been generated from information transfer between the pDt17R-777 and the 474 genes (step D). The similarity in the immediate 3' regions of the 474 and the pDt17R-774 genes would preclude any clear distinction between the two possibilities.

-Step E: the 444* gene could have formed by an information transfer event instigated by pDt17R-777 at approximately between -9 and -15 in the 5'-flank. The structural sequence of the 444* gene, plus about 20 bp in the 3'-flanking region, probably involves the 444-1 gene as the other donor. The single mutation at residue 56 within the coding sequence could be random, or the direct result of misincorporation during repair after the process of information transfer.

Fig. 54. Schematic stepwise diagrams showing the possible lineages of hybrid tRNA4,7Ser genes encountered at 12DE based on their shared flanking homologies. Each of the steps (A, B, C, D and E in fig. 53) leading to the formation of a particular hybrid gene is illustrated. The recipient of the information transfer leading to the present day form of the hybrid gene is sandwiched between the putative donor sequences. The genetic information of one tRNASer gene type is depicted as filled boxes and the associated flanking regions are depicted as thick lines. In contrast, the genetic information of the other tRNASer gene type is shown as open boxes and the flanking sequences are distinguished as thin lines. The pathways of information transfer are shown as square waves between the genes. The dotted part of the square waves indicates the possible extent of the regions involved. Step (A) shows the creation of the 474 gene. The genetic information could have been donated by the pDt17R-774 and 444-1 genes in two independent steps rather than in a simultaneous event. Step (B) attempts to account for the pDt17R-774 gene. While its 5'-flanking homology suggests that pDt16R-777 may be involved, the origin of its 3'-flanking sequence is complex and may have undergone additional interactions with other hybrid genes further down the pathways. Steps (C) and (D) are two possible ways by which the pDt16R-774 gene could be accounted for. In both possibilities, one of the donor sequences is pDt17R-777; the other could be either pDt17R-774 (as in step C) or the 474 gene (as in step D). Step (E) suggests that the 444* gene could have been formed by genetic information transfer involving the pDt17R-777 gene in the 5'-end between -9 and -15, while its 3'-end shows the sequence signature of the 444-1 gene. The mutation at residue 52 is indicated as (*) and is probably the result of a random mutation.

[Fig. 54 panels, each listed as donor / recipient / donor. (A) pDt17R-774 / 474 / 444-1. (B) pDt16R-777 / pDt17R-774 / 444? pDt16R-774? 474? (C) pDt17R-777 / pDt16R-774 / pDt17R-774. (D) 474 / pDt16R-774 / pDt17R-777. (E) pDt17R-777 / 444* / 444-1.]
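Steps A to E of the genealogy in fig. 53 can be written down as a small lookup structure, which makes it easy to ask which genes lie upstream of a given hybrid. The following is only an illustrative sketch: the donor and product names are transcribed from the step descriptions, "?" stands for the unidentified donor of step B, and the function names are hypothetical rather than part of the original analysis.

```python
# Steps of the proposed genealogy (fig. 53), transcribed from the text as
# (donor pair, product). "?" stands for the unidentified donor of step B;
# the text suggests a 444-type gene but does not identify it.
STEPS = {
    "A": (("pDt17R-774", "444-1"), "474"),
    "B": (("pDt16R-777", "?"), "pDt17R-774"),
    "C": (("pDt17R-777", "pDt17R-774"), "pDt16R-774"),
    "D": (("pDt17R-777", "474"), "pDt16R-774"),
    "E": (("pDt17R-777", "444-1"), "444*"),
}


def donors_by_step(product: str) -> dict:
    """Return {step: donor pair} for every step that yields `product`."""
    return {step: donors for step, (donors, prod) in STEPS.items() if prod == product}


def upstream_genes(product: str, seen=None) -> set:
    """Collect every gene that could have contributed, directly or
    indirectly, to `product` under any of the alternative steps."""
    seen = set() if seen is None else seen
    for donors in donors_by_step(product).values():
        for gene in donors:
            if gene not in seen:  # guard against revisiting a gene
                seen.add(gene)
                upstream_genes(gene, seen)
    return seen


if __name__ == "__main__":
    for hybrid in ("474", "pDt16R-774", "444*"):
        print(hybrid, sorted(upstream_genes(hybrid)))
```

In this representation pDt17R-774 appears upstream of both the 474 and the pDt16R-774 genes but not of 444*, which mirrors its description as the crucial intermediate for the hybrid genes.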
Thus, with the possible exception of the pDt17R-774, the rest of the hybrid sequences and the 444* gene could be accounted for, in the simplest way, by step-wise information transfer events involving known genes at 12DE.

4. Possible Mechanisms Involved in Generating the Hybrid Genes

Heterologous recombination between tRNA4Ser and tRNA7Ser genes and/or slippage of DNA strands during replication (Jones and Kafatos, 1982; Goldsmith and Kafatos, 1984; Streisinger et al., 1966) are two mechanisms by which the hybrid sequences might arise. Either mechanism, however, would also presumably result in fluctuations (expansion and contraction) in the size of the gene family due to the irregular spacing and orientation of the tRNASer genes. It is also expected that the intermingled tRNAArg genes would be deleted in some instances as a result of straddling DNA slippage during replication or by unequal exchange. From limited analyses in D. erecta and D. yakuba, no deletion of the tRNA4,7Ser or tRNAArg genes has been detected at the representative loci. Thus, if the hybrid genes were formed by either mechanism, it would necessarily imply that gene dosage may act as an additional selection factor in eliminating segregants containing "improper" gene copy numbers or other associated deleterious defects. Furthermore, unequal exchange between the 444 and 777 genes should yield equal numbers of reciprocal recombinants if there is no differential selection against either class of products. Thus far, only one of the two possible classes of recombinants has been observed (see below). While these two possible mechanisms for generating hybrid genes seem unlikely, they cannot be formally eliminated, since the absolute copy numbers of tRNASer genes at 12DE have not been rigorously measured in the different Drosophila strains and species.
Both gene conversion and double recombination would deserve equal merit as valid explanations only if the sequence structures of each hybrid gene were interpreted independently. Either mechanism could give rise to the hybrid genes and would preserve the number of gene members for both tRNA4,7Ser and tRNAArg in the cluster. Two considerations, however, render double recombination the weaker possibility. First, on theoretical grounds, double crossovers in the same chromosome are not independent events. It has been well established that crossing-over in one region decreases the likelihood that another will occur in an adjacent region along the same chromosome. This phenomenon is known as "interference" (reviewed by Suzuki et al., 1986) and it refers to the observation that the number of double recombinants recovered from a genetic cross is usually lower than the value predicted from the product of their individual probabilities. Interference is generally measured over 10 to 20 map units, but it has been surmised to hold true over short distances of a few hundred base pairs. If the hybrid genes were in fact generated by frequent double crossovers, then this tRNA gene cluster would be a truly unusual genetic site, in that interference would have to be negated in this region to allow a large number of tightly linked double recombination events to proceed unimpeded. A second, and more forceful, argument against double recombination comes from the systematic cloning and sequencing of almost the entire tRNA4,7Ser gene family. A model based on double recombination would also predict two classes of reciprocal recombinants. Since only one class has been found (see below), this observation argues against double as well as single reciprocal exchanges as plausible models.
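The definition of interference given above, a shortfall of observed double recombinants relative to the product of the individual recombination probabilities, corresponds to the standard coefficient of coincidence calculation. A minimal sketch follows; the recombination frequencies and progeny counts in the usage example are hypothetical numbers chosen for illustration and are not data from this study.

```python
def expected_double_recombinants(r1: float, r2: float, n_progeny: int) -> float:
    """Doubles expected if crossovers in the two intervals were independent."""
    return r1 * r2 * n_progeny


def coefficient_of_coincidence(observed: int, r1: float, r2: float, n_progeny: int) -> float:
    """Observed double recombinants divided by the number expected."""
    expected = expected_double_recombinants(r1, r2, n_progeny)
    if expected == 0:
        raise ValueError("expected number of double recombinants is zero")
    return observed / expected


def interference(observed: int, r1: float, r2: float, n_progeny: int) -> float:
    """Interference = 1 - coefficient of coincidence.
    0 means the crossovers are independent; 1 means no doubles are recovered."""
    return 1.0 - coefficient_of_coincidence(observed, r1, r2, n_progeny)


if __name__ == "__main__":
    # Hypothetical cross: intervals of 10 and 20 map units, 1000 progeny,
    # 8 double recombinants recovered.
    print(interference(observed=8, r1=0.10, r2=0.20, n_progeny=1000))
```

With these illustrative numbers about 20 double recombinants would be expected under independence, so recovering 8 corresponds to positive interference of roughly 0.6.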
In the absence of reciprocal products, this model would also necessitate additional and unwarranted contrivances invoking differential selection on the two classes of reciprocal products, with no real supporting evidence. A decidedly more parsimonious explanation would be non-reciprocal recombination (probably by gene conversion-like mechanisms), as first suggested by Cribbs (1982). Non-reciprocal recombination is an operational definition describing the recovery of only one of the two possible classes of recombinants in a given genetic cross. While it is still slightly premature to tell whether the entire cast of hybrid genes (and the 444*) are non-reciprocal recombinants, as far as is known all seem to fit this description, as suggested by three parallel lines of evidence (also see Cribbs et al., 1987b). First, from the molecular walk at 12DE, none of the anticipated reciprocal hybrid sequences (747, 477 genes) have been recovered, and yet in all likelihood the tRNASer gene cluster from this major chromosomal site has been fully recovered. Second, it seems equally unlikely that all reciprocal recombinants are located at the three minor autosomal sites, since three of the possible four genes from these sites have also been cloned, as restriction fragments purified from gels and from recombinant libraries. Two 777 genes separated by a maximum of 15 kb have been isolated from 23E on chromosome 2L from a limited chromosomal walk of approximately 50 kb (J. Leung, D. Sinclair and S. Hayashi, unpublished). The other, as yet unmapped cytologically, has been isolated as a 2.0 kb HindIII fragment identified earlier by genomic blotting. This is also a 777 gene (D. Sinclair, unpublished). Furthermore, no patchwork homologies can be found between these autosomal genes and any of the X-linked tRNASer genes, except at the expected -5 and -20 elements (data not shown).
The last and weakest hybridization band, identified as an approximately 24 kb HindIII or 9.5 kb EcoRI fragment by genomic Southern blotting (Cribbs et al., 1987b) and most likely to contain the remaining autosomal gene, has not yet been cloned. Even if this gene were one of the possible reciprocal recombinants, it still could not fully account for the three hybrid genes and the 444* gene captured in the walk. The third, indirect line of evidence suggests that the homology blocks associated with the hybrid genes can be more simply accounted for by localized interactions of the X-linked tRNASer genes, without invoking recombination between the X and autosomes as discussed above. It is intuitively obvious that such illegitimate exchanges would also be burdened by high frequencies of aneuploidy. In contrast, gene conversion-like events as a viable hypothesis would embody the twin virtues of providing a more literal interpretation of the non-reciprocality of the hybrid genes with minimum assumptions, and of explaining the import of flanking homologies while evading the consequences of aneuploidy or alteration in gene numbers.

Even though the data obtained in the above studies are circumstantial, with the almost complete characterization of the tRNASer gene structures in the genome, standard recombination (by either single or double exchange) emerges as a less appealing explanation. In contrast, the structural sequences of the hybrid genes, their molecular organization as a tightly linked cluster and the patchwork flanking homology patterns do bear extensive and striking similarities to other multigene families that are thought to co-evolve by gene conversion (Dover, 1982 and below). If the tRNA4,7Ser genes are no exception, they may conform to this general mode of cohesive evolution.
It must be acknowledged, though, that while gene conversion may be the dominant mechanism, this does not in any way eliminate the other mechanisms mentioned above as minor players in the co-evolution of the tRNASer genes.

As called to attention by Baltimore (1981) and Egel (1981), co-evolving gene families usually show clustering, probably as a direct reflection of their common origin. From his surveys, Baltimore (1981) showed that allelic members of a gene family usually show strong conservation; but he also noted that base differences between alleles usually occur as clusters of several nucleotides. These clusters, however, are never random mutations, but show very strong sequence identity with non-allelic members adjacent on the same chromosome. Baltimore (1981) and Egel (1981) thus further proposed that such segmental homology blocks may represent molecular evidence for gene conversion between non-allelic members of the family. Gene conversion would make intuitive sense as a mechanism to rectify deleterious mutations and to keep the multigene family (duplicated but irregularly spaced genes) evolving as a cohesive unit (Geliebter and Nathenson, 1987). Since that time, small imperfect homology patches have also been shown in other multigene families as diverse as the human fetal globin genes (Slightom et al., 1980; Stoeckert et al., 1984; Powers and Smithies, 1986), the Cγ2a and Cγ2b genes of the mouse immunoglobulin heavy chains (Ollo and Rougeon, 1983), the Kb, Kd and Q10 mouse class I H-2 genes (Mellor et al., 1983) and the human embryonic ζ-globin genes (Hill et al., 1985). The organization of all these genes conforms to the prediction of Baltimore (1981) in that they are all tightly linked within a small chromosomal domain. With the exception of the human ζ genes, all other studies represent fairly extensive analyses of allelic differences within polymorphic populations.
In the surveys reported above, a large proportion of the differences between alleles exist as clusters of 4-6 nucleotides (analogous to the clustered base changes between homologous tRNASer genes from the different species). These base differences are also non-random, but display strong sequence identities with adjacent and non-allelic members of the gene family. These collective studies thus provide circumstantial evidence suggesting that gene conversion may be a pervasive mechanism for constraining sequence divergence between members of a multigene family. Their data further imply that the proximity of the non-allelic gene members may enhance their frequencies of intrachromosomal gene conversion, either by sister DNA (chromatid) exchange or by looping within a DNA strand (Slightom et al., 1980; Egel, 1981; Baltimore, 1981; Stoeckert et al., 1984; Willard and Waye, 1987). Intrachromosomal gene conversion may indeed represent the predominant mode of information exchange between tightly clustered loci in vivo, and this has been very persuasively simulated in yeast. Klein and Petes (1981) selected a yeast transformant in which the wild-type leu2+ gene, together with a bacterial vector, was integrated in the vicinity of the chromosomal leu2 gene that had previously been inactivated by two frameshift mutations. This transformed strain was then crossed with a leu2+ partner. Hence, one of the chromosomes of the diploid carried two leu2 genes in tandem, separated only by the vector DNA. In the absence of recombination, all four spores of a tetrad should receive a copy of the leu2+ allele and grow without leucine. However, occasional tetrads were observed in which one spore was auxotrophic. When a random sample of these aberrant tetrads was analyzed further (12 of 306), the largest fraction was due to conversion between the duplicated genes (6 of the 12).
Four cases were due to conversion between non-sister chromatids, and only one case was due to crossing-over (the remaining tetrad was apparently the result of multiple events). Exactly parallel studies using duplicated his4 genes, but in mitotic yeast cells, led Jackson and Fink (1981) to the same conclusions. Their studies were even more compelling, since of a total of 127 aberrant colonies examined, at least 88% could be traced directly to intrachromosomal conversion, while the remaining 12% were pooled from more complex events (either reciprocal recombinants or convertants that had subsequently participated in recombination events). Both sets of authors, based on their independent results, have also stressed that intrachromosomal conversion may be the dominant and effective driving force in sequence rectification during both meiosis and mitosis, since the overall number of genes within a family would be faithfully maintained.

Intrachromosomal conversion as the major pathway of information transfer is certainly not an esoteric refuge restricted to fungi. The same type of mechanism has recently been advanced by Liskay and Stachelek (1983 and 1986), with the concurrence of others (Lin and Sternberg, 1984; Smith and Berg, 1984), to be operative also in mammalian cells, using direct duplications of the HSV tk gene (type I Herpes Simplex Virus) inserted into different chromosomal positions as a model. Events consistent with non-reciprocal information transfer, or gene conversion, were found to make up a majority (50-85%, depending on the systems investigated) of the total recombination events (Liskay et al., 1984). These experiments have one undeniable advantage over the fungal studies: because the integration of the plasmids is not targeted to a specific site, the effects on recombination of different chromosomal environments can be compared.
The collective studies strongly suggest that positional effects play either a minor role or no role in affecting the frequency of conversion, and that the rates are largely, if not entirely, dependent on the sequence homology of the inserted duplicated fragments. Although interchromosomal conversion cannot be absolutely eliminated in these studies, most of the cell lines were verified to contain only a single integrant (as determined by Southern blotting) to minimize this effect. Thus, the above conjoining investigations in higher eukaryotes and in yeast seem to reinforce intrachromosomal conversion as a widely employed mechanism for maintaining sequence homogeneity in tightly linked multigene families.

The molecular workings of gene conversion remain the most intractable problem, although several models have been advanced in recent years in an attempt to unwrap its "black-box" mystery. These models range from the simple (single DNA strand invasion of Holliday, 1964) to the more exotic (D-loop formation and branch migration of Meselson and Radding, 1978, and double-stranded break and resolution of multi-branched structures of Szostak et al., 1983); each embodies certain inherent peculiarities stemming from the unique properties of the systems under study. Yet gene conversion is, in all likelihood, a complex phenomenon, with different aspects being exaggerated by the choice of genes and nature of the mutations, cell cycle time, the genetic background and the biological systems employed (Fogel et al., 1978; Fogel et al., 1982; Herskowitz and Oshima, 1981; Hamza et al., 1986). All of the above models, however, do converge on a unifying theme invoking heteroduplex formation as the intermediate. The ensuing repair of the heteroduplex would thus leave a wake of patchwork sequences upon its resolution.
This concept has been tested by constructing heteroduplexes of mouse H-2 genes in vitro from two partial cDNA clones of 1.15 kb and 1.0 kb in length (Abastado et al., 1984) and transforming them into either E. coli (dam- and either recA- or recBC- strains) or COS-1 monkey cells (Cami et al., 1984). The cDNA clones differ by a large insertion at the 5' end (142 bp) and in the length of the 3' polyA tails, in addition to many internal point mutations and small insertions and deletions of 3-9 bp, totalling 8% in sequence divergence (small insertions and deletions were scored as single mutations). The resolved heteroduplexes subsequently recovered from both cell types had indeed acquired blocks of processed patchwork copied from either parental strand, and never from de novo synthesis, as predicted from the above models. Furthermore, the authors drew a tentative correlation between decreasing patchwork length and increasing heterology. That is, in regions of many nucleotide changes, the repair mechanisms showed no template preference, but made frequent switches between the two strands (Cami et al., 1984). These observations have been confirmed by Folger et al. (1985) by transforming heteroduplexes of different insertion mutants of the neomycin genes into mouse LMtk- cells. They also persuasively showed that repair was in fact the predominant mode of generating such processed patchwork, since co-injection of a mixture of different homoduplexes into the cells failed to produce any patchwork or even simple recombinants of any design. Such a repair process, at least in E. coli, can occur prior to, and hence independent of, resolution of the heteroduplex through DNA replication (Fischel et al., 1986). Further, such correction of heteroduplex is severely impaired in yeasts bearing the pms1 mutation, which affects a gene thought to participate in mismatch repair (Bishop et al., 1987).
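The relationship between heterology and patch length described above can be illustrated with a toy simulation. This is a minimal Python sketch under stated assumptions, not a model taken from any of the cited studies: the function name, the switching probability, the seed and the two example sequences are all hypothetical. Repair is assumed to copy from one parental strand at a time and to have a fixed chance of switching template at each mismatch, so that every repaired base comes from one parent or the other and never from de novo synthesis.

```python
import random


def repair_heteroduplex(strand_a, strand_b, switch_prob=0.5, seed=0):
    """Toy patchwork repair of an aligned heteroduplex.

    Returns (repaired_sequence, origin_map), where origin_map records
    for each position which parent ("A" or "B") served as template.
    """
    if len(strand_a) != len(strand_b):
        raise ValueError("strands must be aligned and of equal length")
    rng = random.Random(seed)
    template = "A"
    repaired, origin = [], []
    for a, b in zip(strand_a, strand_b):
        # A template switch can only happen where the strands disagree,
        # so dense heterology produces shorter patches.
        if a != b and rng.random() < switch_prob:
            template = "B" if template == "A" else "A"
        repaired.append(a if template == "A" else b)
        origin.append(template)
    return "".join(repaired), "".join(origin)


# Hypothetical parental strands differing at a few positions.
parent_a = "ACGTACGTACGTACGTACGT"
parent_b = "ACGAACGTTCGTACCTACGA"
patched, origin = repair_heteroduplex(parent_a, parent_b)
print(patched)
print(origin)
```

Because switches are tied to mismatches, runs of identical sequence are always inherited as one block, whereas closely spaced differences tend to break the product into short alternating patches.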
According to the recombination models, sequence homology is essential to the formation of recombination intermediates and subsequent branch migration. Several investigations have been reported on the minimum length of homology that can still support efficient recombination or gene conversion between directly duplicated sequences in higher eukaryotes. Liskay et al. (1987) constructed a series of plasmids, each containing a mutant target HSV tk- gene inactivated by an XhoI linker, and a donor fragment sharing various lengths of homology overlapping the mutant site. Upon transfection into mouse L cells, the integrated constructs would contain directly duplicated tk sequences separated by several kilobases of vector. Their results showed that, for shared homologies between 1.8 kb and 295 bp in length, conversion is efficient, with the rate being directly proportional to the extent of homology (between 9 x 10^-6 and 2.0 x 10^-6 conversion events/cell division). In contrast, conversion with either 200 bp or 95 bp of homology was still detectable, but the rate was reduced at least seven-fold and 100-fold, respectively, relative to that observed with the 295 bp donor fragment. A similar approach was taken using an SV40-pBR322 plasmid construct, also containing directly duplicated fragments, flanking a single SV40 genome (Rubnitz and Subramani, 1984). Recombination between the duplicated fragments would thereby liberate viable SV40 virions. Since SV40 DNA has a specific infectivity of 3.6 x 10^6 PFU/mg, a recombination frequency of only 0.001% would still produce plaques. By progressively trimming the length of homology from 5,243 to 0 bp with Bal31, they showed that the steepest drop in recombination frequency occurred between 163 and 214 bp (0.30% relative to wild-type), with lower but persistent levels of recombination occurring when there was only 14 bp of homology (0.002% compared to wild-type SV40).
In prokaryotes, the region of homology required can be shorter still. It has been shown in vitro that recA protein from E. coli is able to extend hybrid-DNA formation through regions of DNA with an average of 1 mismatch per 10 bp (DasGupta and Radding, 1982).

Another unusual feature concerning the serine tRNAs recognizing UCN codons is that the evolution of their respective genes appears to be cohesive across the eukaryotic kingdom. Thus far, these are the only recorded examples of pairs of tRNAs with almost complete sequence identity that are yet functionally distinct isoacceptors. In all cases they are 96-98% homologous, as completely determined in S. cerevisiae (Etcheverry et al., 1979), S. pombe (Rofer et al., 197...) and D. melanogaster (Cribbs et al., 1987a). Incomplete sequence data indicate that such a pair may be present in the rat as well (Rogg and Staehelin, 1971; Rogg et al., 1973).

The concerted evolution of the genes encoding UCN-reading tRNAsSer has been extensively studied in S. pombe in a series of elegant experiments conducted by Kohli and colleagues. The family of genes coding for two minor serine tRNA isoacceptors consists of three members. Two genes (sup3+ on the right arm of chromosome I and sup9+ on chromosome III) code for tRNAs reading the codon UCA, and one gene (sup12+, on the left arm of chromosome I) codes for a UCG-reading tRNA (Munz et al., 1981). The sequences of all three genes are very similar in the regions corresponding to the mature tRNAs. Besides the obvious base differences at the anticodon, the only difference resides at the tip of the tRNA extra arm (sup3: T; sup9 and sup12: C). All three genes have introns. While those of sup3 and sup9 are identical and 15 bp long, that of sup12 differs from the two at six positions and is 1 bp longer. Intergenic convertants have been obtained by three different clever schemes (Amstutz et al., 1985; Munz et al., 1982; Heyer et al., 1986).
All three were designed to estimate independently the frequencies of conversion (1 x 10^-6 to 2 x 10^-5) and to give indications of the possible minimum lengths of the repaired heteroduplexes. For example, most crosses consist of the selfing of strains carrying a suppressor gene that has been inactivated by a secondary mutation outside the anticodon (Munz et al., 1981). The parents of such a cross are identical throughout the genome (except for mating type) and consequently recombination between alleles cannot lead to the creation of active suppressors. Sequence analysis performed on convertants among the progeny from all three schemes has shown that all three loci were involved in DNA transfer and that information trafficking between any two members of the gene family occurred in both directions. Moreover, with a more relaxed screen that did not demand functional tRNAsSer (either as suppressors or wild-type), some of the convertants contained imported DNA patchwork sequences as short as 18 bp (Heyer et al., 1986).

Thus, in returning to the Drosophila tRNASer genes at 12DE, the constraints on their genic sequences and their patchwork flanking homologies can be explained in the context of what is known about the co-evolution of multigene families discussed above. The tight linkage between tRNA4Ser and tRNA7Ser would suggest that on occasion these genes could mispair owing to the similarity of their coding sequences. Their length and degree of homology should be sufficient for stable heteroduplex formation, as supported by the various published results cited above. This contention is further strengthened by the strong cross-hybridization observed between the two gene types, even under stringent conditions, either by in situ hybridization (Hayashi et al., 1980) or in Southern blotting experiments performed during the course of this work (also consult Cribbs et al., 1987b).
It is also possible that the surrounding repeats could act as accomplices in assisting the formation of heteroduplex. This type of phenomenon is not entirely novel; it has been documented previously as aberrations at the white locus caused by misalignment and high-frequency recombination between adjacent roo (Davis et al., 1987) and between copia (M. L. Goldberg et al., 1983) transposable elements that are at least 38 and 60 kb apart, respectively. The transient heteroduplexes of the misaligned tRNA4,7Ser genes could then be partially repaired using either DNA strand as a template, leaving a wake of processed patchwork in the flanking regions and, in some cases, also partially extending into the structural genes to create the genre of hybrid tRNASer sequences at 12DE. The amount of patchwork homology in both the 5'- and 3'-flanking regions would probably be governed by the fidelity of the repair process as well as the spontaneous mutation rates subsequent to the repair. If these conversion-like events were dynamic, as suggested by Cribbs (1982), it is not clear why other permutational forms have not been found. It is possible that they exist at low frequencies. However, the preponderance of certain permutations may reflect the relative physical proximity of the tRNA4,7Ser genes on the chromosome, or the nature of their surrounding sequences, such as a higher density of shared repetitive elements, which could also favor the type or direction of conversion events among the tRNA4,7Ser genes.
This last point is in fact emphasized by the observation that the 444-1 and the 474 genes do show evidence of cohesive evolution in both erecta and melanogaster. If the subsequent repair of the heteroduplex is never extensive, and initiation is preferentially near the ends of the structural gene, such that one end of the repaired tract frequently terminates somewhere prior to the anticodon, the processed patchwork would certainly go undetected because the final hybrid structure would still be a 474 gene. Obviously, if the other end of the repaired tract extends into the flanking regions, where heterology between the two genes can be conveniently used as "landmarks", the repaired tracts could then be easily discernible as small intraspecific homology patches above the background of heterology. Thus, the inability to detect other permutations may be due simply to constraints imposed by favored interactions, as well as to short repaired tracts.

The persistence of the same hybrid genes at homologous sites in the three different sibling species spanning the entire evolutionary history could also suggest that there may be selection at work. It is not known whether the hybrid genes are functional in vivo, although those from melanogaster can support in vitro transcription at reduced rates compared to the bona fide genes (St. Louis, 1985; St. Louis and Spiegelman, 1985). It must be emphasized that even though the hybrid genes may be transcribed in vivo, the integrity of at least the 474 gene is not essential for viability (see mapping studies in fig. 52). Therefore, functional selection cannot be an all-encompassing factor in favoring (or eliminating) certain types of hybrid genes.

The hybrid tRNA4,7Ser genes with their attendant flanking segmental homologies resemble partial, rather than complete, convertants. Such partial convertants, or transition stages (Strachan et al., 1985), in the evolution of multigene families appear to be extremely prevalent.
In addition, at the DNA sequence level, blanket homogeneity in any multigene family has thus far never been observed (Strachan et al., 1985; Baltimore, 1981 and relevant references on allelic differences cited on p. 221), suggesting that the spread and subsequent fixation of any variant through the entire family is a slow process (Dover, 1982).

5. tRNAArg Genes from Drosophila Sibling Species

The tRNAArg genes corresponding to the pDt27R region have been cloned from D. erecta and D. yakuba. In both these sibling species, there is only one gene at this site. Their structural sequences are identical, but both differ from the melanogaster duplicated units by retaining a "T" at position 13, rather than "C". In comparing the 5'-flanking regions between the sibling species, most conservation is seen in the immediate 28 bp. This corresponds to the 5' limit of the 200 bp duplicated cassettes in melanogaster. Beyond -28, for at least 50 nucleotides, there is much more sequence divergence; but in both the erecta and yakuba genes the residual homologies are closest to pArg12.1 in melanogaster. In both the pDeArg-1 and pDyArg-1 genes, the flanking regions here contain many small deletions of 1 to 15 bp relative to melanogaster. All three sibling species diverge from each other by a minimum of 15%, if all deletions are scored as single differences. Likewise, the most conserved nucleotide block in the 3'-flanking regions of the sibling species corresponds to the direct repeats, TAGCCCAA, at about 68 bp 3' to the structural gene. This small oligonucleotide is in turn embedded in highly homologous, but non-identical, sequences that range from 25 bp long in yakuba to 43 bp long in erecta. Downstream from this region, there is extremely strong homology between pDeArg-1 and pDyArg-1. Both of these sequences also show strong identity with the corresponding 3' region of pArg12.4 of melanogaster. Thus, the most conserved 5'- and 3'-flanking regions in the melanogaster tRNAArg cluster, relative to the sibling species, correspond to those of the two outermost genes, respectively. It was also noted that most of the conserved nucleotide blocks shared among all three sibling species in both the 5'- and 3'-flanking regions tend to be potential hairpin structures. Whether they are fortuitous or in fact play some biological role, such as recognition signals within the cell, is unknown.

The most rapidly diverging sequence in the tRNAArg genes from the different species occurs in the 40-50 nucleotide long interstitial region between the 3' end of the structural gene and the direct repeat. While there are approximately 8 point mutations and one 15 bp long deletion in pDyArg-1 relative to pDeArg-1, both genes are completely divergent from the corresponding segment in melanogaster.

Fig. 55. The four tRNAArg genes and flanking sequences between the direct repeats from the Drosophila sibling species are summarized. The structural gene is represented by the open box. MEL, melanogaster; SIM, simulans; ERE, erecta; YAK, yakuba. The 5'-flanking sequences (thick lines) of the four genes are highly conserved, whereas the 3' sequences between the end of the structural genes and the direct repeats are much more diverged. The genes can be divided into two groups based on the mutation within the coding sequence and the homology imparted by their 3' tails. The 3' tails belonging to melanogaster and simulans are closer in sequence homology to one another (dotted lines), and those belonging to erecta and yakuba show some sequence similarities (~60%). Between the two groups (which coincidentally belong to the two species complexes in fig. 28), however, they are completely diverged.
Across the phylogenetic tree, there appears to be a tendency for the 3' tails to expand in size from yakuba to melanogaster. Also, the two genes from simulans and melanogaster differ from those of erecta and yakuba by a T13 to C13 mutation within the structural sequence (small black squares). The two direct repeats of simulans are not perfectly conserved, while those from melanogaster are identical. Their distances in base pairs from the structural gene are indicated by the numbers above the genes. In melanogaster, the two different 5'-flanking regions indicated depend on the size of the duplicated units (600 bp or 200 bp, shown as 449 or 31, respectively). The smaller duplicated units have a single large deletion in the 5' flank. The 5' direct repeats in erecta and yakuba have not been sequenced but are certainly more than 31 bp upstream from the gene, suggesting that all duplicated units in the present melanogaster are most likely to have been derived from a 600 bp antecedent.

Recently, the corresponding tRNAArg gene has also been isolated from D. simulans by C. Newton (unpublished). DNA sequence analysis showed that it is identical to that in melanogaster, having the less prevalent "C" nucleotide at position 13. Both its 5'- and 3'-flanking sequences are quite homologous to those in melanogaster. Most of the differences are point mutations or single base pair insertions/deletions, totalling approximately 7%. The most diverged region is again the interstitial sequence between the end of the structural gene and the direct repeat. The differences noted here, besides several base changes scattered throughout, also include a large deletion (~13 bp long) near the putative poly-T termination signal. A schematic diagram summarizing the configurations of the tRNAArg genes across the phylogenetic tree is provided in fig. 55.

In Newton's extensive surveys by Southern blotting on approximately 45 different strains of D.
melanogaster from Europe, North and South America, Asia and Africa, he showed that 40 of these contained the four duplicated genes, while the remaining five had three genes. In contrast, all sibling species surveyed contained only a single gene (Newton et al., 1987). Thus, it is likely that the tRNAArg genes duplicated only recently, sometime after the divergence of the simulans and melanogaster species. From the above sequence data, it appears that the single mutation at position 13 within the structural gene occurred before the divergence of the two species at least 4 x 10^6 years ago. With the identification of the direct repeat flanking the tRNAArg genes, it seems most likely that at least the first duplication was by unequal crossing-over at or near these sites. The point of crossing-over is not precise, however, since about 20 bp downstream of the TAGCCCAA repeat are included in pArg12.2 and pArg12.3 before the beginning of the next duplicated unit. The phenomenon observed with the tRNAArg genes would be analogous to the expansion of a cluster of three tRNAGlu genes reported by Hosbach et al. (1980), although in their case no antecedents to the duplicated forms were investigated.

To date, no hypothetical two-gene configuration for the tRNAArg genes has been identified (Newton, personal communication). The reason for this is unclear, but a trivial explanation could be that the strains used in the surveys are not truly representative of the population at large. Alternatively, it is possible that the two-gene configuration is unstable or recombinogenic, quickly catalyzing further duplication events. This could be due to the increase in the length of sequence homology provided by the extra copy of the tRNAArg gene in the hypothetical intermediate.
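The copy-number consequences of unequal crossing-over can be enumerated with a short toy model. The following Python sketch is only an illustration under simplifying assumptions and is not drawn from the survey data: the function name is hypothetical, exchanges are assumed to occur only at the direct repeats bounding each duplicated unit (the text notes that the actual crossover point is less precise), and every pairing of repeats is treated as possible without any weighting by frequency.

```python
def unequal_crossover_products(n, m):
    """List the copy-number pairs obtainable from one exchange.

    n, m : number of tandem gene units on the two participating chromatids.
    A chromatid with k units has k + 1 flanking direct repeats, indexed
    0..k by the number of units lying to their left. Pairing repeat i of
    the first chromatid with repeat j of the second and exchanging there
    gives products carrying i + (m - j) and j + (n - i) units.
    """
    products = set()
    for i in range(n + 1):
        for j in range(m + 1):
            first = i + (m - j)
            second = j + (n - i)
            products.add(tuple(sorted((first, second))))
    return sorted(products)


# Single-gene ancestors: one unequal exchange can yield a two-gene array.
print(unequal_crossover_products(1, 1))
# Two-gene intermediates: one further exchange can yield three or four genes.
print(unequal_crossover_products(2, 2))
```

In this simplified picture the total number of units is conserved at each exchange, so every expanded array is accompanied by a reciprocal contracted one. A two-gene intermediate offers more ways to mispair than a single gene does, which is consistent with the suggestion that it could be rapidly converted into three- and four-gene arrays.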
Certainly the preponderance of the four-gene over the three-gene configuration, and the absence of the one-gene configuration, among the different melanogaster strains strongly suggests that once the duplication process started, there was little impediment to further duplications. Gene copy numbers higher than four have not yet been detected. Since this appears to be an evolutionarily recent phenomenon, it may be that sufficient time for this to occur has not elapsed. Conversely, there may be selection against higher numbers of tRNAArg genes imposed by extraneous factors such as transcriptional components or post-transcriptional modification enzymes. This latter point could be tested by P-element transformation to increase the copy number in the genome artificially. From fig. 55, the 5' direct repeat is more than 31 bp upstream from the structural genes in both erecta and yakuba, and since the homology extends far beyond this region, it would strongly suggest that the predecessor that gave rise to the duplicated genes in melanogaster was likely to be the 600 bp unit.

In λDE16-82, the 3' half of another tRNAArg gene, corresponding to pArg12.6 in D. melanogaster, was also sequenced. They are transcribed in opposite orientations, relative to their respective tRNASer genes, in the two species. Further, the large intergenic sequence separating the tRNAArg from the tRNASer genes is not present in D. erecta. It is not known whether the intergenic DNA is actually deleted in the latter species, or whether the difference in organization simply results from a sizable inversion. Attempts to resolve this issue were hampered by the fact that several probes prepared from D. melanogaster corresponding to this intergenic region were inundated with repeated sequences (see fig. 60), which were also present in erecta and all other sibling species tested (not shown).
In summary, the studies on the tRNASer and tRNAArg genes at 12DE showed that this chromosomal site is undergoing rapid evolution. Both the creation of the hybrid tRNA4,7Ser and the expanding tRNAArg gene cluster at this site could be due to misalignment of DNA. Such misalignment could be caused by the redundant nature of the genes themselves and could be further stabilized by the presence of many repeated elements in the region. The existence of many repetitive and repetitious elements at 12DE had been predicted earlier from cytological analyses (Rudkin and Tartof, 1973), and has in fact been corroborated by the present chromosomal walk, including the cluster of Ste sequences. Moreover, from casual inspection of the sequence obtained for pDt27R (Newton, 1984), numerous small repeats ranging from 20 to 50 bp can be easily identified. Some of these are composed of complex sequences, while others are simple alternating TG and CA nucleotides or more monotonous poly T and poly A tracts. The mispairing of DNA at the white locus (M. L. Goldberg et al., 1983; Davis et al., 1987), recombination within the rDNA arrays in both yeast (Szostak and Wu, 1980) and Drosophila (Coen and Dover, 1983; Gillings et al., 1987), in addition to the globin genes (Proudfoot and Maniatis, 1980) in humans suggest that DNA slippage at redundant loci could be a frequent contributing factor to chromosome aberrations and evolution of multigene families in general. The sequences obtained for the tRNASer and tRNAArg genes reported here from D. erecta, D. yakuba, D. melanogaster (Canton-S and Oregon-R) and D. simulans (New unpublished; Cribbs et al., 1987b) showed that their 5'-flanking sequences are always more constrained than their 3'-flanks.
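The divergence-time estimates discussed in this summary amount to dividing a pairwise sequence divergence by a substitution rate. The short Python sketch below works through that arithmetic for the simulans-melanogaster comparison only. It assumes, as an interpretation rather than something stated in the text, that the rate of 0.5-1.0% per million years applies to each lineage separately, so that two diverging lineages accumulate pairwise differences at twice that rate. Under that reading, a 7% pairwise divergence corresponds to 3.5-7 million years. No correction for multiple substitutions at the same site is applied, and the function and parameter names are illustrative.

```python
def divergence_time_range(pairwise_divergence_pct,
                          rate_low_pct_per_myr=0.5,
                          rate_high_pct_per_myr=1.0):
    """Return (youngest, oldest) divergence time in millions of years.

    Assumption (not stated in the thesis): the substitution rate applies
    to each lineage, so pairwise divergence grows at 2 * rate per Myr.
    """
    if rate_low_pct_per_myr <= 0 or rate_high_pct_per_myr < rate_low_pct_per_myr:
        raise ValueError("rates must be positive and ordered low <= high")
    # The fastest rate gives the most recent split, the slowest the oldest.
    youngest = pairwise_divergence_pct / (2.0 * rate_high_pct_per_myr)
    oldest = pairwise_divergence_pct / (2.0 * rate_low_pct_per_myr)
    return youngest, oldest


# simulans vs. melanogaster: approximately 7% divergence in 5'-flanking sequence
youngest, oldest = divergence_time_range(7.0)
print(f"Estimated divergence: {youngest}-{oldest} million years ago")
```

The rate bounds are exposed as parameters so that a different calibration can be substituted without changing the calculation.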
As well, the degrees of divergence as imparted by 5'-flanking comparisons are in general agreement with the phylogenetic relationships established for the different sibling species based on chromosomal aberrations (Ashburner et al., 1984). While on average simulans shows approximately 7% divergence from melanogaster, D. erecta and yakuba differ from each other and from melanogaster by an average of 15-30%. Using the estimate of 0.5-1.0% difference in nucleotide changes for every million years of divergence (Bodmer and Ashburner, 1984), this would suggest that simulans diverged from melanogaster approximately 3.5-7 million years ago. D. erecta and yakuba would have diverged from each other and from melanogaster between 7 and 15 million years ago. All these estimates are within the general limits given based on previous studies (see references cited in Chapter II, Part II). According to the topological relationships established by Ashburner et al. (1984), however, one would not expect the same degree of sequence divergence from pair-wise comparisons between yakuba and melanogaster, and between the latter species and erecta. One tenable hypothesis could be that there may be a wide range of overlap in sequence divergence contingent on the choice of genes (or regions of DNA) being investigated. It could be equally likely that evolution of the Drosophila species is a complicated plexus and not the linear process implied by the topological relationships.

APPENDIX

CHAPTER IV

tRNA3bVal Genes and Related Sequences

Seven tRNAVal isoacceptors have been resolved by RPC-5 chromatography in D. melanogaster. tRNA3aVal, tRNA3bVal and tRNA4Val constitute the major isoacceptors whereas tRNA1Val, tRNA2Val, tRNA5Val and tRNA6Val constitute the minor species (Dunn et al., 1978). tRNA3bVal is the second most abundant valine isoacceptor and binds to the ribosomes in response to the valine codon GUG (Dunn et
al., 1978; Addison, 1982). Its nucleotide sequence has been determined (Addison, 1982; Addison et al., 1984). In situ hybridization to polytene chromosomes using purified tRNA has localized the coding sequences for these tRNAs to two major sites at 84D3-4 and 92B1-9 and one minor site at 90BC on the right arm of chromosome 3 (Hayashi et al., 1980 and 1981). Grain distribution over these sites suggests a template ratio of 5:4:1, respectively. By measuring tRNA3bVal levels in Drosophila mutants with altered gene dosage at the two major loci, both Dunn et al. (1979a) and Larsen et al. (1982) have independently demonstrated that genes at 84D3-4 and 92B1-9 are actively transcribed in vivo. Moreover, each of the sites contributes approximately 50% of the total cellular tRNA3bVal pool, in direct proportion to gene dosage. In contrast, no substantial level of transcription was detectable from the tRNA3bVal genes located at 90BC. The results suggest that most, if not all, of the active genes reside at the two major loci. The lack of detectable transcription from 90BC would suggest that there may only be a small number of templates, in accordance with the reduced level of in situ hybridization at this site. Alternatively, it is also possible that the sequences at 90BC are gene vestiges that are transcriptionally silent, and this hypothesis could explain the reduced affinity for the tRNA probe. A homeostatic mechanism for compensating for the loss of a large portion of the tRNA3bVal templates appears to exist, since heterozygous deficiency for 84D caused no changes in total valine acceptance (Dunn et al., 1979a). It is not known how this adjustment is achieved, but compensation by selective gene amplification of the remaining tRNA3bVal templates has been ruled out (Larsen et al., 1982).
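The relationship between template number and expected contribution to the tRNA3bVal pool can be illustrated with a small calculation. The Python sketch below assumes that the 5:4:1 grain ratio reflects relative template numbers and that every template at an active site is transcribed equally. Both are simplifying assumptions made here for illustration, and the function name is hypothetical. With 90BC treated as silent, the two major sites would be expected to supply about 56% and 44% of the pool, which is compatible with the roughly equal contributions reported.

```python
def expected_pool_shares(templates, silent_sites=()):
    """Fraction of the tRNA pool expected from each active site.

    Assumption: output is strictly proportional to template number,
    and sites listed in silent_sites contribute nothing.
    """
    active = {site: n for site, n in templates.items() if site not in silent_sites}
    total = sum(active.values())
    if total <= 0:
        raise ValueError("at least one active site with templates is required")
    return {site: n / total for site, n in active.items()}


# Relative template numbers taken from the 5:4:1 grain distribution.
grain_ratio = {"84D3-4": 5, "92B1-9": 4, "90BC": 1}
shares = expected_pool_shares(grain_ratio, silent_sites=("90BC",))
for site, share in shares.items():
    print(f"{site}: {share:.0%}")
```

Strict proportionality is the null expectation only. The compensation observed in the 84D deficiency heterozygotes indicates that the real system departs from it.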
The wealth of available genetic data, unprecedented for any other Drosophila tRNA gene family, has in part prompted a further investigation of the expression of tRNA3bVal genes at the molecular level, and in particular of whether the sequences at 90BC are tRNA3bVal gene vestiges. The results presented here have been a collaborative effort with Drs. W. R. Addison, A. D. Delaney and R. M. MacKay, and have been published in Leung et al. (1984).

RESULTS

Sequence Analysis of pDt41R

A plasmid derived from 90BC, designated as pDt41R, was isolated from a pBR322 library containing HindIII-cut D. melanogaster DNA (Dunn et al., 1979b). The recombinant plasmid contains a 2.0 kb DNA fragment that hybridizes to tRNA3bVal (data not presented). An infrequently encountered restriction site, XmaI, was identified in the TΨC loop of the tRNA3bVal and was used as a marker to locate the gene within the Drosophila DNA. Two genes have been identified, corresponding to the XmaI sites in the insert: one for a gene similar, but not identical, to the expected tRNA3bVal and the other for a tRNAUGGPro. The sequenced portion of the plasmid is shown in fig. 56. A transcript of this tRNA3bVal-like gene would differ from the tRNA3bVal sequence at four sites: nucleotides C5, C16, G68 and G69 would be replaced by U5, U16, A68 and A69. To clarify the positions of the nucleotide replacements and the XmaI site, the cloverleaf structure of this putative tRNA is displayed in fig. 57. However, an in vivo tRNA species containing these nucleotide changes has not yet been reported. The proline tRNA gene in pDt41R is 276 bp downstream from the tRNA3bVal gene and is of opposite polarity. Neither of the two major tRNAPro species of Drosophila has been sequenced (White et al., 1973), so a comparison to the gene sequence is not yet possible.
However, the non-transcribed strand of the tRNAUGGPro gene in pDt41R is 95% homologous to the corresponding tRNA of mouse and chicken (Sprinzl et al., 1987).

Homologies With Another tRNA3bVal-Containing Plasmid

A second plasmid, designated as pDt48, containing a 2.4 kb Drosophila insert, also maps to the 90BC locus by in situ hybridization (Dunn et al., 1979b). It also contains an identical

Fig. 56. The sequenced segments of pDt41R, pDt48 and the corresponding region from Canton-S from region 1 of chromosomal site 90BC (see fig. 3) are shown. The genes encoding the tRNA3bVal-like and the tRNAUGGPro are boxed. The small arrowheads indicate the 3'-ends of the two tRNA genes and the direction of transcription. The top line is the sequenced region from pDt48. Aligned underneath are the sequenced portions of pDt41R and the corresponding segment derived from the Canton-S strain (Can-S) as reported by DeLotto and Schedl (1984). Dashed lines indicate identical sequences. Where differences do occur, they are indicated as (*) for deletions; substitutions are indicated by the base replacements.

[Figure: aligned nucleotide sequences of pDt48, pDt41R and Can-S.]