Open Collections

UBC Theses and Dissertations

A construct of overlapping yeast artificial chromosomes spanning a seven centimorgan region of human… Everson, Ted 1997


A CONSTRUCT OF OVERLAPPING YEAST ARTIFICIAL CHROMOSOMES SPANNING A SEVEN CENTIMORGAN REGION OF HUMAN CHROMOSOME 8p22

by

TED EVERSON

B.Sc., University of British Columbia, 1994

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES, MEDICAL GENETICS PROGRAMME

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
APRIL 1997
© Ted Everson 1997

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Medical Genetics
The University of British Columbia
Vancouver, Canada

ABSTRACT

An important goal of the Human Genome Project is the physical mapping of the human genome. This thesis describes the preparation of a set of overlapping yeast artificial chromosomes (YACs), spanning a region of approximately 7 centiMorgans (cM) of chromosome 8p22, delimited distally by D8S550 and proximally by D8S552. This set of overlapping YACs, or YAC contig, is a useful contribution to the human genome physical map. It will have valuable applications for the identification of sequence-ready clones and for the detection of disease genes that may be mapped to 8p22. To construct the YAC contig, the Whitehead Institute/Massachusetts Institute of Technology (MIT) Center for Genome Research genomic database was searched for YACs containing markers that were previously localized to the region of interest.
A singly-linked contig, WC8.1, was found, which identifies a set of 62 overlapping YACs within this region. Database information for these YACs was examined in order to exclude from analysis any YACs for which significant evidence of chimaerism was available. A subset of YACs was chosen for further analysis; these included eight YACs with no evidence for chimaerism, and two YACs with relatively weak evidence for chimaerism. DNA sequence from the insert ends (terminal sequence) of a number of these YACs was isolated by a modified bubble PCR protocol (Riley et al., 1990), a procedure that amplifies terminal sequences. These sequences were then used to develop new markers for the region. PCR was performed, using selected markers from WC8.1 and the new markers designed from terminal sequences. PCR amplification of markers in the set of ten YACs resulted in the identification of overlapping YACs, forming a contig that completely spanned the region of interest. In addition, terminal sequences from YAC 729e12 were found to be highly similar to the murine guanine nucleotide release factor 2 (GRF2) gene; a marker designed from this sequence was amplified in a human chromosome 5 somatic cell hybrid, localizing this putative human gene to chromosome 5.
TABLE OF CONTENTS

Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgements
Chapter 1: Introduction
  1.1 An introduction to the Human Genome Project
  1.2 Genetic mapping technologies
  1.3 Physical mapping technologies
    1.3.1 Cytogenetic mapping
    1.3.2 Overlapping clones
      Construction of a clone fingerprint for contig construction
      Modifications to the original fingerprinting strategy
      Single-copy DNA sequences and STS content mapping
    1.3.3 Human genome physical mapping
      Physical mapping of chromosome 8p22
  1.4 Thesis objectives
Chapter 2: Methods and Materials
  2.1 Collection of STS content mapping data
  2.2 Exclusion of chimaeric YACs
  2.3 Preparation of yeast DNA
    2.3.1 Protocol One: Large-scale DNA preparations
    2.3.2 Protocol Two: DNA preparation using glass beads
  2.4 Polymerase chain reaction (PCR) conditions
  2.5 Isolation of YAC terminal sequences using bubble PCR
    2.5.1 Restriction digest
    2.5.2 Preparation of bubble linker
    2.5.3 Ligation of bubble linker
    2.5.4 PCR amplification of YAC terminal sequence
  2.6 Isolation of bubble PCR products
  2.7 Cloning of bubble PCR products
  2.8 Transformation of ligation mixture
  2.9 Sequencing of YAC termini and design of PCR primers
  2.10 PCR using STSs corresponding to both YAC termini and genetic markers
  2.11 Determination of YAC order using the System for Assembling Markers
Chapter 3: Results
  3.1 Identification of a singly-linked contig encompassing 8p22
  3.2 Exclusion of chimaeric YACs
  3.3 Isolation and sequence analysis of YAC insert termini
  3.4 STS content mapping
    3.4.1 The localization of a human gene to chromosome 5
  3.5 Analysis of STS content mapping results using SAM
Chapter 4: Discussion
  4.1 The exclusion of chimaeric YACs
    4.1.1 STS content mapping data
      STS content mapping in YAC pools
      Resolving ambiguous results
      Examination of evidence for chimaerism in YAC 920d12
    4.1.2 CEPH Alu-PCR data
      Alu-PCR hybridization to somatic cell hybrids
      Examination of evidence for chimaerism in YAC 737e5
  4.2 A comparison of WC8.1 and the contig produced for this thesis
    4.2.1 Positive results not predicted by WC8.1
    4.2.2 Negative results not predicted by WC8.1
  4.3 YAC termini isolation for contig construction
  4.4 The effectiveness of employing YACs for contig construction
  4.5 Conclusions
    4.5.1 Construction of a minimum tiling path spanning 8p22
    4.5.2 Physical mapping of 8p22 and the Human Genome Project
  4.6 Future areas of research
Bibliography
Appendix 1 Contig WC8.1
Appendix 2 Published research completed during this thesis

LIST OF TABLES

Table 1 Bubble linker oligonucleotide sequences
Table 2 PCR primers used for amplification of terminal sequence from bubble PCR libraries
Table 3 STSs used for STS content mapping
Table 4 (a) YACs from WC8.1 for which no evidence of chimaerism was available
        (b) A subset of WC8.1 (W) containing YACs from the region of interest with no evidence of chimaerism
        (c) WC8.1 surrounding the YAC subset, and including two YACs (737e5, 920d12) with some suggestion of chimaerism
Table 5 A list of PCR products resulting from bubble PCR, and associated sizes
Table 6 Primer sequences of STS designed from YAC insert termini
Table 7 STS content mapping in YACs using STSs previously localized to the region of interest
Table 8 Contig WC8.1 from the region of interest

LIST OF FIGURES

Figure 1 The pYAC4 yeast artificial chromosome vector
Figure 2 Bubble PCR
Figure 3 The bubble PCR linker
Figure 4 The pKRX vector
Figure 5 The pKRX vector with insert DNA
Figure 6 The MIT/Whitehead Institute contig WC8.1, emphasizing the region of interest to this thesis
Figure 7 PCR amplification of YAC termini with primers 224 and 226 (for trp- termini amplification) or 228 (for ura- termini amplification)
Figure 8 Sequence analysis of bubble PCR products
Figure 9 Sequence alignments
Figure 10 STS content mapping in YACs
Figure 11 PCR amplification of YAC termini in YACs
Figure 12 PCR amplification in controls, using STSs designed from YAC termini
Figure 13 PCR amplification of an STS designed from ura-729e12, the terminal sequences with similarity to murine GRF2
Figure 14 A YAC contig of the region of interest
Figure 15 A YAC minimum tiling path of the region of interest

ACKNOWLEDGEMENTS

I would like to thank and express my gratitude to Dr. Stephen Wood, my supervisor for this thesis, for the opportunity to work on this project and for his support and guidance throughout its completion. I also would like to thank my thesis supervisory committee, Dr. Ann Rose, Dr. Carolyn Brown, and Dr. Robert McMaster, for their invaluable advice and their constructive comments. A particular note of thanks to my lab mate and friend Mike Schertzer for his sense of humour and his expert technical advice. I would like to express my appreciation to everyone in the lab, Dorota Kwasnicka, Tanya Nelson, Leah DeBella, Richard Bruskiewich, Gurjodh Singh, and Karim Damani, who were always available to help and to share in frustrations and successes. Finally, I owe a special debt of thanks to my friends Lingli Ma, Tom Milne, and David Dyment; it would not have been the same without their presence.

CHAPTER I

INTRODUCTION

1.1 An introduction to the Human Genome Project

The Human Genome Project is the most ambitious collaborative effort to have been attempted in biology. Its ultimate goal is knowledge of the complete genetic composition of the human genome, and the understanding of the genetic basis of human biology and disease. The U.S. Human Genome Project was initiated by the U.S.
National Research Council (NRC) Committee on the Mapping and Sequencing of the Human Genome (National Research Council, 1988), and the major goals were defined by the National Institutes of Health (NIH) and the Department of Energy (DOE) (Botstein et al., 1990). A five-year plan was implemented. The plan included the construction of high resolution (2-5 centiMorgan (cM)) genetic maps, continued construction of physical maps for the human genome and for the genomes of model organisms, and development of new and improved techniques for DNA sequencing and for genomic information storage. In addition, an examination of the ethical, social, and legal issues that would result from the acquisition of the genetic information in the genome was emphasized (Botstein et al., 1990).

In 1993, a revised five-year plan was proposed (Collins and Galas, 1993). This revised plan re-emphasized many of the goals of the original five-year plan, but also proposed an increased pace for the project in response to rapid growth in the field of genome research. More challenging goals were set for most areas, including genetic and physical mapping, sequencing, gene identification, technology development, and informatics.

1.2 Genetic mapping technologies

The concept of organising a cooperative attempt to map the human genome arose from successes in the development of many techniques for both genetic and physical mapping. The following is a brief introduction to some of the most important advances in molecular genetic technologies, and the resulting achievements in genome analysis.

Linkage analysis is a method used to determine the proximity of genetic markers based upon marker transmission patterns within a family. One of its earliest applications was by Bell and Haldane (1937) to demonstrate linkage of the genetic bases of color-blindness and haemophilia on the X chromosome.
Originally, however, the technique was not routinely practical in humans; individual human pedigrees are usually too small to provide statistically significant evidence that cosegregation of markers within a pedigree is due to physical proximity. The LOD score method of linkage analysis (Morton, 1955) was developed to overcome this difficulty. This method calculates the logarithm of the odds of observing particular marker distributions in a human pedigree, given linkage versus no linkage between the markers analyzed. The significance of the LOD score method is that scores for individual pedigrees can be combined, resulting in a much greater ability to detect marker linkage (Morton, 1955).

The discovery in humans of restriction fragment length polymorphisms (RFLPs) suggested a method by which linkage analysis could be used to construct a genome-wide genetic linkage map (Botstein et al., 1980). RFLPs, defined as variations in length between allelic restriction enzyme fragments, provided a new marker source, and were widely distributed in the human genome. RFLPs are inherited in simple Mendelian patterns; simple analysis of restriction-digested DNA samples from pedigrees is sufficient to identify these RFLPs and possibly determine linkage relationships. It was suggested that a human linkage map, based on RFLPs, be constructed at a 20 cM resolution, in order to be used as a framework to which human traits could be genetically mapped (Botstein et al., 1980).

The technology for performing linkage analysis has continued to improve dramatically. New computer programs have been developed that can rapidly evaluate linkage for multiple loci (Lander and Green, 1987). In addition, new and more informative markers have been discovered.
Variable number of tandem repeat (VNTR) minisatellites, which are sets of tandemly repeated DNA sequences (Jeffreys et al., 1985, Nakamura et al., 1987), and dinucleotide repeat microsatellites (Weber and May, 1989, Smeets et al., 1989), are two examples of such markers. These typically have a much larger number of alleles compared to RFLPs, so parents in pedigrees used for linkage analysis are more likely to be heterozygous for these markers. Thus, it is more likely that transmission patterns of marker alleles from parent to offspring can be determined, and linkage analysis can be performed as a result.

As linkage analysis is greatly aided by the availability of marker information from large, multigenerational families (White et al., 1985), many past and recent successes with linkage mapping have been possible due to the availability of a collection of reference families provided by the Centre d'Etude du Polymorphisme Humain (CEPH) (Dausset et al., 1990). This is a set of forty large Caucasian families, from Utah, France, Venezuela, and Pennsylvania. DNA samples are produced by CEPH from cultured lymphoblastoid cell lines generated from the members of these families. CEPH distributed these to members of the CEPH consortium. The samples are now generally available through a mutant cell repository located at the Coriell Institute for Medical Research. The collection has been invaluable for genome analysis. The availability of DNA from these families, and the continued advances in linkage analysis technologies, have aided the production of many large-scale genetic maps for specific chromosomes (White et al., 1990, Dracopoli et al., 1991, Weissenbach et al., 1992, Spurr et al., 1992, Bowcock et al., 1993) and for the entire genome (Donis-Keller et al., 1987, NIH/CEPH, 1992, Murray et al., 1994, Dib et al., 1996).
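
As an illustration of the LOD score method described in section 1.2, the following minimal sketch computes a two-point LOD score for the simplest textbook case of phase-known meioses, in which each informative meiosis can be scored as recombinant or non-recombinant. The function name, the pedigree counts, and the chosen recombination fraction are hypothetical and are not taken from this thesis; real pedigree analysis is considerably more involved. The sketch does show why scores from separate pedigrees can be added together.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """Two-point LOD score for phase-known meioses (illustrative only).

    LOD(theta) = log10( L(theta) / L(0.5) ), where
    L(theta) = theta**recombinants * (1 - theta)**(meioses - recombinants).
    """
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must be in the interval (0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must be between 0 and meioses")
    non_recombinants = meioses - recombinants
    log_l_linked = (recombinants * math.log10(theta)
                    + non_recombinants * math.log10(1.0 - theta))
    log_l_unlinked = meioses * math.log10(0.5)
    return log_l_linked - log_l_unlinked


# Hypothetical data: (recombinants, informative meioses) for three pedigrees.
pedigrees = [(1, 8), (0, 6), (2, 10)]
theta = 0.1  # assumed recombination fraction being tested

per_pedigree = [lod_score(r, n, theta) for r, n in pedigrees]
combined = sum(per_pedigree)  # LOD scores from independent pedigrees add

for (r, n), score in zip(pedigrees, per_pedigree):
    print(f"{r} recombinants in {n} meioses: LOD = {score:.3f}")
print(f"combined LOD at theta = {theta}: {combined:.3f}")
```

Because the score is a logarithm of a likelihood ratio, summing the per-pedigree scores gives the same value as pooling all meioses into one calculation, which is why small pedigrees that are individually uninformative can jointly provide significant evidence of linkage.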
1.3 Physical Mapping Technologies

In addition to the many successes in the field of linkage analysis and the development of genetic maps for the human genome, similar achievements have occurred in the area of physical mapping as a result of the strides taken in the development of physical mapping technologies. Physical mapping refers to the determination of the location and arrangement of chromosomal DNA, and physical mapping techniques are an indispensable tool for genome analysis.

1.3.1 Cytogenetic Mapping

Some of the earliest physical maps were cytogenetically based (DOE/NIH, 1990). Chromosomes can be stained to create a unique banding pattern reflecting the arrangement of repeat sequences (George, 1970), and the pattern allows for a chromosome to be subdivided into multiple regions. Cytogenetic mapping allows for the physical localization of genes or genetic markers relative to these chromosomal landmarks (Callen et al., 1992); as such, cytogenetic mapping can provide low-resolution physical localization of a marker on a chromosome.

Cytogenetic mapping is accomplished using such techniques as in situ hybridization (John et al., 1969, Gall and Pardue, 1970, Gerhard et al., 1981, McNeil et al., 1991) and somatic cell hybridization (Weiss and Green, 1967). In situ hybridization refers to the hybridization of labelled DNA to a chromosome spread, and results in a mapping resolution ranging from several hundred kilobases (kb) to several megabases (Mb), depending on the mitotic stage of the chromosomes to which the probes are hybridized (Callen et al., 1992, Trask, 1991). Somatic cell hybrid panels are generated by fusing human cell lines with either mouse or hamster cell lines deficient in a selectable marker, and selecting for the retention of specific human chromosomes or chromosome fragments (Weiss and Green, 1967). They can be used to localize markers to particular chromosomal regions, based on the presence or absence of the marker in individual hybrids.
Somatic cell hybrid panels have been used successfully to construct cytogenetic maps up to a resolution of approximately 2 Mb (Callen et al., 1992).

1.3.2 Overlapping Clones

Other valuable physical mapping strategies rely on the organization of a series of overlapping genomic DNA fragments, according to the order in which the sequences are normally found on a chromosome. Such a construct of contiguous sequence, or contig (Staden, 1980), is extremely valuable; it provides a very convenient means by which DNA from a particular region can be immediately accessed and manipulated, for such purposes as detection of genes and markers, and for DNA sequencing. DNA clones, consisting of DNA fragments inserted into a vector and propagated in either viruses, bacteria or yeast, are the most advantageous source for DNA fragments: they can be stably maintained, quickly propagated and easily manipulated.

Construction of a clone fingerprint for contig construction

Contig construction requires the identification of regions of overlap between DNA clones. A number of approaches have been used to determine such overlap, but all are based on the concept of developing a "fingerprint" for a DNA clone, and determining if fingerprint patterns between clones are sufficiently similar to suggest overlap (Coulson et al., 1986, Stallings et al., 1990). One of the first methods for creating a fingerprint of a clone was to define a restriction digest banding pattern. By digesting a number of clones with a selected restriction enzyme, and electrophoresing the digests in parallel on an agarose gel, restriction fragment banding patterns can be compared for each sample (Smith et al., 1987). Evidence of overlap between two clones can then be determined and used for contig construction.
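
The comparison of restriction fragment banding patterns described above can be sketched as a simple count of fragments that two clones appear to share. The fragment sizes, the 2% size tolerance, and the function names below are assumptions made for illustration; they are not values or methods from this thesis, and a raw shared-band count is not a substitute for a statistical test of overlap.

```python
def shared_fragments(sizes_a, sizes_b, tolerance=0.02):
    """Count fragments of clone A that match a fragment of clone B.

    Two fragments "match" when their sizes differ by no more than
    `tolerance` (a fraction) of the larger fragment, mimicking the
    limited resolution of a gel. Each fragment of B is matched once.
    """
    remaining = sorted(sizes_b)
    shared = 0
    for size in sorted(sizes_a):
        for index, other in enumerate(remaining):
            if abs(size - other) <= tolerance * max(size, other):
                shared += 1
                del remaining[index]  # do not reuse this band
                break
    return shared


def overlap_fraction(sizes_a, sizes_b, tolerance=0.02):
    """Shared fragments as a fraction of the smaller fingerprint."""
    if not sizes_a or not sizes_b:
        return 0.0
    count = shared_fragments(sizes_a, sizes_b, tolerance)
    return count / min(len(sizes_a), len(sizes_b))


# Hypothetical fingerprints (fragment sizes in base pairs).
clone_a = [5200, 3100, 2400, 1800, 950, 600]
clone_b = [4100, 3120, 2390, 1810, 700]
clone_c = [7600, 4500, 1300, 420]

print("A vs B:", shared_fragments(clone_a, clone_b), "shared bands")
print("A vs C:", shared_fragments(clone_a, clone_c), "shared bands")
```

With a single enzyme, unrelated clones will share some bands by chance, so a cut-off on the shared fraction has to be chosen conservatively.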
Modifications to the original fingerprinting strategy

Generally, a high degree of fingerprint similarity is required to provide sufficient evidence for physical overlap, and a simple restriction digest pattern often does not provide statistically significant evidence (Lander and Waterman, 1988, Barillot et al., 1991). Given this limitation, simple fingerprinting procedures based solely on restriction fragment patterns are a relatively inefficient method for creating contigs of large areas of a genome (Lander and Waterman, 1988). However, modifications of the basic fingerprinting technique, based on improving and increasing the amount of fingerprint information for a clone, have increased the efficiency of fingerprinting and contig construction.

Improvements to the restriction digest fingerprint include using multiple rather than single restriction enzymes (Olson et al., 1986, Coulson et al., 1986), and generating complete restriction maps for the clones under analysis (Kohara et al., 1987). Another modification to the fingerprinting strategy is achieved by combining restriction fragment patterns with the presence or absence of a repeat sequence (Stallings et al., 1990). A more informative fingerprint pattern is then established. Clones that share both a repeat sequence pattern and a restriction fragment pattern are much more likely to represent true overlapping genomic DNA sequences.

Such fingerprinting techniques have been used successfully to establish collections of overlapping clones spanning all or a large portion of the genomes of a variety of organisms, including Escherichia coli (Kohara et al., 1987), Saccharomyces cerevisiae (Olson et al., 1986), and Caenorhabditis elegans (Coulson et al., 1986). These fingerprinting strategies have also been used to develop human physical maps (Bellanne-Chantelot et al., 1992, Cohen et al., 1993), but the techniques have not been sufficient to create sets of large, continuous contigs of the human genome.
Single-copy DNA sequences and STS content mapping

A relatively recent modification of the fingerprinting strategy involves associating clones with single-copy DNA sequences that are unique in the genome (Arratia et al., 1991, Barillot et al., 1991, Torney, 1991). Clones can be examined for overlap based on concordance for sharing of these sequences. This strategy provides a number of advantages over previous fingerprinting techniques. It produces a completely informative fingerprint for a clone, and eliminates the chance of incorrectly deducing clone overlap; two clones that share a unique sequence are known without any doubt to overlap (Barillot et al., 1991). In addition, unique sequences that have defined locations in the genome can be used, such as markers that have been genetically mapped or physically localized by cytogenetic mapping strategies. This allows the placement of resulting contigs at defined positions relative to the genetic map or the chromosome.

A number of methods are available for the detection of unique DNA sequences in clones. These include using sequences as probes and hybridizing these to filters containing clones (Arratia et al., 1991), and using the polymerase chain reaction (PCR) (Saiki et al., 1985, Saiki et al., 1988) to amplify the sequences in clones. Sequences defined by PCR are called sequence-tagged sites, or STSs, and the technique is called STS content mapping (Olson et al., 1989, Green and Olson, 1990, Arratia et al., 1991). STS content mapping is an especially useful method, due to the relative ease and accuracy of PCR, and has become the most important tool for developing clone contigs.

1.3.3 Human genome physical mapping

The major physical mapping strategies for humans typically have relied on STS content mapping in conjunction with other techniques, such as those already described, to construct yeast artificial chromosome (YAC) (Burke et al., 1987) contigs and high-density STS-based maps for the human genome.
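
The logic of STS content mapping described above, in which two clones that share a unique STS are inferred to overlap, can be sketched as follows. The clone names, STS names, and function names are hypothetical placeholders rather than data from this thesis, and the sketch assumes error-free PCR results and non-chimaeric clones.

```python
from itertools import combinations

# Hypothetical STS content: clone name -> set of STSs amplified from it.
sts_content = {
    "yacA": {"sts1", "sts2"},
    "yacB": {"sts2", "sts3"},
    "yacC": {"sts3", "sts4"},
    "yacD": {"sts6"},
}


def overlapping_pairs(content):
    """Return {(clone1, clone2): shared STSs} for every overlapping pair."""
    pairs = {}
    for first, second in combinations(sorted(content), 2):
        shared = content[first] & content[second]
        if shared:
            pairs[(first, second)] = shared
    return pairs


def contigs(content):
    """Group clones into contigs (connected sets of overlapping clones)."""
    unassigned = set(content)
    groups = []
    while unassigned:
        seed = unassigned.pop()
        group, frontier = {seed}, [seed]
        while frontier:
            current = frontier.pop()
            linked = {c for c in unassigned if content[c] & content[current]}
            unassigned -= linked
            group |= linked
            frontier.extend(linked)
        groups.append(sorted(group))
    return sorted(groups)


print(overlapping_pairs(sts_content))
print(contigs(sts_content))
```

In this toy data set, pairs of neighbouring clones are joined by a single shared STS, which corresponds to the "singly-linked" kind of contig; a clone with no shared STS remains a contig of one.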
YACs (Figure 1) are useful clones for genome-wide physical mapping. Because mammalian genomes are so large, clones containing large inserts are valuable for large-scale contig construction. YACs are presently the largest clones available, containing inserts of an average size of 1 megabase (Chumakov et al., 1995). Human genome physical mapping has resulted in the production of a variety of human contigs, including relatively small YAC contigs with many gaps spanning the whole genome (Bellanne-Chantelot et al., 1992, Cohen et al., 1993), much larger YAC contigs spanning significant regions of specific chromosomes (Collins et al., 1995, Gemmill et al., 1995, Krauter et al., 1995, Doggett et al., 1995, Eki et al., 1996, Malaspina et al., 1996, Bardenheuer et al., 1996) and the whole genome (Chumakov et al., 1995), and an STS map with an average resolution of 199 kilobases (kb) (Hudson et al., 1995).

Figure 1: The pYAC4 Yeast Artificial Chromosome Vector (Burke et al., 1987). TEL- telomere; TRP1, URA3- yeast genes used to select for both arms of pYAC4; ARS1- autonomous replication sequence; CEN4- provides centromere function; SUP4- an ochre-suppressing allele of a tyrosine transfer RNA gene. SUP4 expression results in the production of white colonies for yeast containing pYAC4. When interrupted by insert DNA, loss of SUP4 expression results in the production of red colonies.

Physical mapping of chromosome 8p22

STS content mapping has been completed for chromosome 8p22 as part of a whole-genome physical mapping project (Chumakov et al., 1995). This region is of particular interest; it has been observed to be homozygously deleted in a variety of human cancers, including colorectal, prostatic, and hepatocellular carcinomas (Bergerheim et al., 1991, Cunningham et al., 1993, Emi et al., 1993). These observations suggest the presence of a tumour suppressor gene at 8p22.
A singly-linked YAC contig, WC8.1, consisting of clones singly linked by one STS, has been identified within 8p22 (Chumakov et al., 1995; Appendix 1). The contig spans the region of interest to this thesis, but is insufficient for use in the production of sequence-ready clones, for several reasons. First, it has been estimated that approximately 40-50% of CEPH mega-YACs are chimaeric (Green et al., 1991, Bellanne-Chantelot et al., 1992, Hudson et al., 1995); that is, 40-50% of CEPH mega-YACs contain inserts corresponding to non-continuous portions of the genome. Thus, it is probable that a similar portion of the YACs composing the contig for the region of interest are chimaeric. The presence of chimaeric YACs in a contig confounds the processes of gene localization and genetic sequencing. This complication is due to the presence of genetic sequences, and possibly genes, that may not originate from the region of the genome represented by the contig. Second, a minimal tiling path of YACs, defined as the minimum number of overlapping YACs that span a region of interest, has not been identified and confirmed; this is an important prerequisite to isolating sequence-ready clones. Third, contig WC8.1 was constructed by STS content mapping in complex three-dimensional arrays of YAC pools (see Discussion); individual YACs were not tested for the presence of STSs. Verification of STS content mapping results by testing individual YACs is a necessary component of ensuring the accuracy of the contig.

1.4 Thesis Objectives

The objectives of this research thesis are to independently identify a contig of overlapping YACs spanning a seven cM region of 8p22, to reduce to a minimum the possibility of chimaerism complicating the contig, and to define a minimum tiling path of YACs that span the region. To accomplish these objectives, a subset of YACs from the existing singly-linked contig has been chosen for analysis.
These YACs show little evidence of chimaerism, according to all public YAC data available. STS content mapping in these YACs has been completed as part of this thesis, using markers from WC8.1. STSs were also developed during the course of this thesis from insert terminal sequence of several YACs belonging to the subset of non-chimaeric YACs, but STS content mapping using these provided no additional data for contig construction. STS content mapping has enabled the identification of a minimal tiling path of YACs, with an anticipated low frequency of chimaerism, spanning the 7 cM region of 8p22. The contig will have many valuable uses, including the identification and ordering of sequence-ready clones and the identification of genes mapped to the region by positional cloning techniques.

CHAPTER II

METHODS AND MATERIALS

2.1 Collection of STS content mapping data

To determine the extent to which contig construction has been completed for the region of 8p22 of interest to this thesis, a database maintained by the Whitehead Institute/Massachusetts Institute of Technology (MIT) Center for Genome Research was queried with STSs that had been genetically mapped (Dib et al., 1996) between the distal marker D8S550 and the proximal marker D8S552. The database provides a summary of results, from the Whitehead Institute/MIT and from CEPH/Genethon, for STS content mapping in YACs and for any resulting contigs that have been constructed.

2.2 Exclusion of Chimaeric YACs

It is estimated that approximately 40-50% of the CEPH mega-YACs are chimaeric (Bellanne-Chantelot et al., 1992, Green et al., 1991, Hudson et al., 1995). The Baylor College of Medicine (BCM) database, which allows for examination of a variety of YAC data available from CEPH/Genethon and the Whitehead Institute/MIT, was searched using YACs from within the region of interest to determine if there existed suggestive evidence for YAC chimaerism.
Suggestive evidence of chimaerism for a YAC was assumed if one of several results was observed: first, Alu-PCR products (Nelson et al., 1989) from the YAC hybridized to chromosomes other than chromosome 8 in a somatic cell hybrid panel, or to YACs that had previously been localized to chromosomes other than chromosome 8 (CEPH Alu-PCR data); second, a PCR product was amplified from the YAC using STSs from chromosomes other than chromosome 8p (Whitehead/MIT STS content mapping data). Only YACs showing little or no suggestive evidence for chimaerism were selected for further analysis.

2.3 Preparation of yeast DNA

Yeast cells containing YACs of interest were obtained from Research Genetics as agar stabs. These were used to inoculate 1.2% agar plates containing an acid-hydrolyzed casein (AHC) minimal ura- trp- medium (Brownstein et al., 1989). Plates were incubated at 30°C for two days, and DNA was isolated using one of two protocols.

2.3.1 Protocol one: Large-scale DNA preparations

Red-pigmented colonies were used to inoculate 20 ml of yeast extract-peptone-dextrose (YPD) medium (Rose et al., 1990). The medium containing a colony was incubated with shaking at 30°C for two days. The resulting yeast cell suspensions were transferred to a 50 ml centrifuge tube, and DNA was prepared in solution using a modification of a protocol of Sherman et al. (1986). The suspended yeast cells were harvested by centrifugation, resuspended in a 2.4 ml volume containing 1 M sorbitol, 0.1 M EDTA (pH 7.5), and 0.1 mg zymolyase 60 T, and incubated for one hour at 37°C to spheroplast the cells. The mixture was centrifuged, harvested cells were resuspended in 2 ml of YTE buffer (50 mM Tris-HCl, pH 8.0, 20 mM EDTA), and lysed by adding 200 ul 10% sodium dodecyl-sulfate (SDS) and incubating at 65°C for 20 minutes. Residual cellular debris was precipitated by the addition of 4 mM potassium acetate, collected by centrifugation and discarded.
The supernatant was transferred to a 15 ml centrifuge tube, and two volumes of 95% ethanol were added to precipitate the DNA. The pellet was dried, resuspended in 475 ul TE buffer (10 mM Tris-Cl pH 8.0, 1 mM EDTA) and 25 ul of preboiled 1 mg/ml pancreatic ribonuclease A, and incubated at 37°C for one hour. One volume of isopropanol was added to precipitate the DNA, which was collected by centrifugation for 15 minutes, washed with 500 ul of 70% ethanol, dried and resuspended in 500 ul of TE buffer.

AHC Minimal Medium
  1.7 g yeast nitrogen base without amino acids (Difco #0919-15-3)
  10 g acid-hydrolyzed casein (AHC) (Difco #0230-01-1)
  20 g dextrose
  20 mg adenine hemisulfate (Sigma Cat.#A-3159)
  15 g bacto-agar
  distilled water to one liter
  concentrated hydrogen chloride to a final pH of 5.8

YPD medium
  10 g yeast extract (Difco #0127-01)
  20 g peptone (Difco #0118-01-8)
  20 g dextrose
  distilled water to one liter

2.3.2 Protocol two: DNA preparation using glass beads

Red-pigmented colonies were used to inoculate 5 ml of YPD, and were incubated overnight with shaking at 30°C. The resulting culture was pelleted by centrifugation, washed with 500 ul of distilled water, and resuspended in 500 ul of GDIS. The solution was mixed with 200 ul of phenol-chloroform-isoamyl alcohol (25:24:1) and 0.35 grams of 710-1180 um acid-washed glass beads (Sigma G9393). Samples were vortexed vigorously for 2.5 minutes, and 200 ul of distilled water was added. The aqueous layer was removed and mixed with 2 volumes of 95% ethanol. DNA was precipitated for 2 minutes at room temperature, and collected by centrifugation. The DNA pellet was resuspended in 1 ml of distilled water.

GDIS
  2% Triton X-100
  1% SDS
  100 mM sodium chloride
  10 mM Tris-Cl pH 8.0
  1 mM EDTA

Acid-washed glass beads
  200 ul of glass beads were washed with 5 ml of 0.1 N HCl, rinsed three times with 5 ml of distilled water, autoclaved for 25 minutes and dried.
2.4 Polymerase chain reaction (PCR) conditions

Unless otherwise stated, conditions for all PCR reactions were as follows: DNA samples were added to a 25 µl volume containing 1.25 U of Taq polymerase, 50 mM Tris(hydroxymethyl)aminomethane pH 8, 0.05% Tween-20, 0.05% NP-40, 2.0 mM MgCl2, 200 µM each of dATP, dGTP, dCTP and dTTP, and 0.5 µM of each PCR primer. Amplification consisted of 35 cycles of a one-minute denaturing step at 94°C, a one-minute annealing step at a temperature specific to the PCR primers used, and a one-minute extension at 72°C. A final ten-minute incubation at 72°C was also performed. PCR products were examined by adding a one-tenth volume of gel loading buffer (0.25% xylene cyanol, 0.25% bromophenol blue, and 40% sucrose), loading the products on a 2.0% agarose (Pharmacia) gel containing 0.1 mg ethidium bromide per 100 ml 1X Tris-borate/EDTA buffer (TBE) (0.090 M Tris-borate, 0.002 M EDTA), and electrophoresing at 60 volts for approximately 2 hours in 1X TBE. PCR product size was determined by simultaneously electrophoresing 500 ng of HaeIII-digested φX174 phage DNA (φX174 marker) and comparing band migration patterns.

2.5 Isolation of YAC terminal sequences using bubble PCR

Bubble PCR (Riley et al., 1990) is a method for isolating YAC termini. The technique involves a restriction digest of the YAC under analysis, using a restriction enzyme that cleaves near the insert cloning site. A universal bubble linker is then ligated to the restriction fragments, and PCR amplification is performed (Figure 2) with the following primers. Oligonucleotides (oligos) 226 and 228 are primers complementary to YAC vector sequence adjacent to the insert cloning site, on the left (trp-) and right (ura-) arms of the vector, respectively. Oligo 224 is a primer that is identical to one strand of the universal bubble linker, but not complementary to the opposite linker strand, due to a region of mismatch within the bubble linker. PCR using oligo 224 together with either oligo 226 or oligo 228 results in the exponential amplification of sequence corresponding to the trp- or ura- terminus, respectively. Linker-flanked fragments of vector or internal insert sequence are not amplified, because oligo 224 has no template to anneal to until an initial extension from a YAC vector primer has synthesized the complement of the linker strand; only then can exponential amplification with both the YAC vector primer and the bubble PCR primer proceed (Figure 2).

Figure 2. Bubble PCR. Amplification of the trp-terminus of the insert, adjacent to YAC vector primer 226, is shown. Steps illustrated: restriction digest; ligation of the linker to the restriction fragments; PCR using vectorette primer 224 and YAC vector primer 226, comprising (a) extension from primer 226, (b) annealing of the vectorette linker primer to the new strand, and (c) exponential amplification.

The following procedure is a modification of that originally described by Riley et al. (1990).

2.5.1 Restriction digest

Approximately 2 µg of each DNA sample was precipitated with 95% ethanol, washed with 70% ethanol, and resuspended in 16 µl distilled water. To each DNA sample, 2 µl 10X BSA (bovine serum albumin fraction V, 1 mg/ml) (Sigma), 2 µl 10X React buffer 2 (NEB), and 1 unit of DdeI restriction enzyme (NEB) were added, and the samples were incubated for 1.5 hours at 37°C. To confirm that the DNA had been digested, 1 µl of gel loading buffer was added to 5 µl of restriction digest, and the mixture was loaded on a 0.8% agarose gel and electrophoresed for approximately 1 hour at 60 V. The remainder of the restriction digest was mixed with two volumes of ethanol to precipitate the DNA, which was washed with 200 µl of 70% ethanol, dried and resuspended in 12 µl of distilled water.
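The digest step can be illustrated with a minimal Python sketch. It assumes the standard DdeI recognition site C^TNAG (cut after the C, leaving a 5'-TNA overhang on each fragment); the example sequence and the function names are hypothetical, and only the top strand is modelled.

```python
import re

# DdeI recognition site: C^TNAG (N = any base); the enzyme cuts after the C.
# A zero-width lookahead is used so that overlapping sites are all reported.
DDEI_SITE = re.compile(r"(?=CT[ACGT]AG)")

def ddei_cut_positions(seq):
    """Return 0-based indices at which the top strand is cut."""
    # m.start() is the index of the C; the cut falls immediately after it.
    return [m.start() + 1 for m in DDEI_SITE.finditer(seq.upper())]

def ddei_fragments(seq):
    """Split the top strand at every DdeI cut position."""
    cuts = [0] + ddei_cut_positions(seq) + [len(seq)]
    return [seq[a:b] for a, b in zip(cuts, cuts[1:])]

toy = "AAACTGAGTTTTCTTAGGGG"  # hypothetical sequence containing two DdeI sites
print(ddei_cut_positions(toy))
print(ddei_fragments(toy))
```

Each internal fragment in the output begins with T, any base, then A, which is the overhang that a linker strand starting 5'-TNA is designed to pair with.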
2.5.2 Preparation of bubble linker

A 1 µM vectorette linker solution was prepared (Figure 3), containing 0.5 µM bubble PCR top strand, with a 5' overhang compatible with the DdeI overhang of the YAC DNA restriction fragments, 0.5 µM bubble PCR bottom strand, and 25 mM NaCl. Sequences for the bubble PCR oligonucleotides are provided in Table 1 (Riley et al., 1990). The solution was boiled for 2 minutes and incubated for 5 minutes at 65°C.

Oligo number   Description                        Sequence
221            bottom strand                      5'-CTCTCCCTTCTCGAATCGTAACCGTTCGTACGAGAATCGCTGTCCTCTCCTTG-3'
222            top strand with 5' DdeI overhang   5'-TNACAAGGAGAGGACGCTGTCTGTCGAAGGTAAGGAACGGACGAGAGAAGGGAGAG-3'

Table 1: Bubble linker oligonucleotide sequences (Riley et al., 1990)

2.5.3 Ligation of bubble linker

The solution of DdeI-digested yeast DNA containing the YAC was added to 15 µl of 1 µM bubble linker solution. Three µl of 10X T4 DNA ligase buffer (NEB) and 100 units of T4 DNA ligase (NEB) were added to the mixture, which was left overnight at 16°C.

Figure 3. The bubble linker. Oligo 224 is identical in sequence to part of the bottom strand of the linker, but is not complementary to the top strand due to the region of mismatch.

2.5.4 PCR amplification of YAC terminal sequence

The ligation mix (bubble PCR library) was diluted to 100 µl with 1X TE, and 10 µl was used for PCR. PCR amplification of YAC terminal sequence was accomplished by using the universal bubble primer 224 and either primer 226 (for the trp- terminus) or primer 228 (for the ura- terminus). Primer sequences (Riley et al., 1990) are provided in Table 2. PCR conditions were as described, with an annealing temperature of 56°C for trp-terminus amplification and 60°C for ura-terminus amplification.
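The selectivity of the bubble linker can be checked directly from the oligonucleotide sequences. The following Python sketch is illustrative only: the sequences are transcribed from Tables 1 and 2, the helper names are hypothetical, and "annealing" is reduced to an exact reverse-complement match, which is a simplification of real primer binding.

```python
# Complement table; N (any base) is left as N.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def revcomp(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

BOTTOM_221 = "CTCTCCCTTCTCGAATCGTAACCGTTCGTACGAGAATCGCTGTCCTCTCCTTG"
TOP_222 = "TNACAAGGAGAGGACGCTGTCTGTCGAAGGTAAGGAACGGACGAGAGAAGGGAGAG"
PRIMER_224 = "CGAATCGTAACCGTTCGTACGAGAATCGCT"

def primer_can_anneal(primer, template):
    """Simplified test: the template must contain the primer's exact reverse complement."""
    return revcomp(primer) in template

# Oligo 224 has the same sequence as part of the bottom strand...
print(PRIMER_224 in BOTTOM_221)
# ...so it cannot anneal to the mismatched top strand of the ligated linker...
print(primer_can_anneal(PRIMER_224, TOP_222))
# ...but it can anneal to the complement of the bottom strand, which exists
# only after extension from a YAC vector primer has copied that strand.
print(primer_can_anneal(PRIMER_224, revcomp(BOTTOM_221)))
# The 12 nucleotides at each end of the bottom strand do pair with the top strand.
print(revcomp(BOTTOM_221[-12:]) in TOP_222, TOP_222.endswith(revcomp(BOTTOM_221[:12])))
```

With the sequences as transcribed, the flanking arms of the two strands are complementary while the central region is not, which is the "bubble" that restricts amplification to fragments carrying a vector primer site.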
Oligo number   Description                                               Sequence                                Annealing temperature (°C)
224            Universal bubble PCR primer                               5'-CGAATCGTAACCGTTCGTACGAGAATCGCT-3'    86
226            Primer for trp-terminus of bubble PCR terminal sequence   5'-GTTGGTTTAAGGCGCAAGAC-3'              56
228            Primer for ura-terminus of bubble PCR terminal sequence   5'-GTCGAACGCCCGATCTCAAG-3'              60

Table 2: PCR primers used for amplification of terminal sequence from bubble PCR libraries.

2.6 Isolation of bubble PCR products

PCR products were isolated from a gel using an agarose gel extraction kit (Qiagen). One-tenth of the resulting extract was diluted to 10 µl with distilled water, mixed with a one-tenth volume of agarose gel stop mix, and electrophoresed in 1X TBE at 90 V for approximately two hours on a 2% agarose gel alongside φX174 marker, to confirm the presence of the PCR product in the gel extract.

2.7 Cloning of bubble PCR products

pKRX (Schutte et al., 1996) is an artificial vector system designed for cloning PCR products (Figure 4). Digestion with XcmI produces a 3'-T overhang on each end of the vector, complementary to the single 3'-A overhang that Taq polymerase adds to PCR products. Approximately 100 ng of PCR product in 6 µL of distilled water was mixed with 1 µL of 10X T4 DNA ligase buffer (NEB), 50 ng of XcmI-digested pKRX vector, and 80 U T4 DNA ligase (NEB), for a total reaction volume of 10 µL. The ligation mixture was incubated at 16°C overnight.

2.8 Transformation of ligation mixture

Competent DH5α Escherichia coli cells (Inoue et al., 1990), stored at -70°C, were thawed on ice, and 50 µL was added to a prechilled 15 ml centrifuge tube containing 5 µL of ligation mixture. The mixture was chilled for 30 minutes, heat shocked for 45 seconds at 42°C, and chilled for 2 minutes. 400 µL of L-broth (5 g yeast extract, 10 g tryptone, 5 g NaCl, 1 g D-glucose in 1 L distilled water) was added, and the solution was incubated for 45 minutes at 37°C. Solutions were concentrated by centrifuging briefly and removing all but 100 µL of L-broth.
The solution was used to inoculate two XIA (X-gal, IPTG, ampicillin) plates. XIA plates were made by autoclaving 1.2% agar in L-broth media, allowing the media to cool to approximately 50°C, and adding 40 µg/ml of 5-bromo-4-chloro-3-indolyl-β-D-galactoside (X-Gal), 50 µg/ml of ampicillin, and 120 µg/ml of isopropyl-β-thiogalactopyranoside (IPTG). Plates were incubated at 37°C overnight, and white colonies were used to inoculate 1 ml of L-broth. Cultures were incubated in L-broth at 37°C overnight, and DNA was prepared following an alkaline lysis protocol (Sambrook et al., 1989).

Figure 4. The pKRX vector. XcmI digestion results in the release of a portion of the cloning site, and the production of a 3'-T overhang on each of the two ends. The overhang is complementary to the 3'-A overhang produced during PCR amplification, allowing for the efficient cloning of PCR products.

2.9 Sequencing of YAC termini and design of PCR primers

YAC termini were sequenced using an automated sequencer (ABI Model 373 Stretch). For each sequencing reaction, 500 ng of DNA prepared from DH5α E. coli cells transformed with pKRX vector containing YAC terminal inserts was used. The resulting sequence was examined for the presence of predicted pKRX, YAC vector, human insert and bubble linker sequences (Figure 5). Human insert sequences from clones confirmed to be YAC termini were searched using the Basic Local Alignment Search Tool (BLAST) at the National Center for Biotechnology Information (NCBI) (Altschul et al., 1990), to determine whether the insert DNA corresponded to unique human DNA sequence. Sequence-tagged sites (STSs) were designed from these sequences (see Results).

2.10 PCR using STSs corresponding to both YAC termini and genetic markers

STS primers designed from YAC terminal sequence, together with selected polymorphic, genetically mapped STS markers from the region of interest, were used for PCR against the subset of YACs chosen for analysis.
PCR conditions were as described, at an annealing temperature specific to the STS used for genetic markers (Table 3), and at an annealing temperature of 48°C for STSs designed from YAC termini. PCR results were analyzed by electrophoresis on a 2% agarose gel in 1X TBE for approximately two hours at 100 volts alongside φX174 marker DNA.

Figure 5. The pKRX vector with insert DNA. A sequencing reaction initiated from the T7 primer produces one of the two results shown, depending on the insert orientation. (i) Group 1 orientation: T7 - pKRX - YAC vector - human insert - vectorette linker - pKRX. (ii) Group 2 orientation: T7 - pKRX - vectorette linker - human insert - YAC vector - pKRX.

2.11 Determination of YAC order using the System for Assembling Markers

The System for Assembling Markers (SAM) program version 2.5a (Soderlund and Dunham, 1995) takes as input a set of markers and their associated clones, and outputs the most likely marker orders and clone contig, based on the minimization of the number of discontinuities in the clones' STS content. The program was used to determine the most likely order of the YAC subset based on STS content mapping results, and to define a minimum tiling path of YACs spanning the region of interest to this thesis.
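The criterion described above, choosing the marker order that minimizes discontinuities in each clone's STS content, can be illustrated with a toy brute-force search. This Python sketch is not SAM's algorithm or interface; the clone and marker names are hypothetical, and exhaustive enumeration is only practical for a handful of markers.

```python
from itertools import permutations

def count_discontinuities(order, clone_hits):
    """Total number of breaks in each clone's run of positive markers for a given order."""
    total = 0
    for hits in clone_hits.values():
        flags = [marker in hits for marker in order]
        # Count maximal runs of consecutive positive markers for this clone.
        runs = sum(1 for i, f in enumerate(flags) if f and (i == 0 or not flags[i - 1]))
        # A clone with one unbroken run has no discontinuity.
        total += max(runs - 1, 0)
    return total

def best_orders(markers, clone_hits):
    """Exhaustively score every marker order; an order and its reverse count as one."""
    scored = {}
    for order in permutations(markers):
        canonical = min(order, order[::-1])
        if canonical not in scored:
            scored[canonical] = count_discontinuities(canonical, clone_hits)
    best = min(scored.values())
    return best, sorted(o for o, s in scored.items() if s == best)

# Hypothetical STS content: three clones overlapping pairwise.
toy_hits = {
    "yacA": {"sts1", "sts2"},
    "yacB": {"sts2", "sts3"},
    "yacC": {"sts3", "sts4"},
}
score, orders = best_orders(["sts3", "sts1", "sts4", "sts2"], toy_hits)
print(score, orders)
```

For real data with false positives and false negatives, several orders may tie or differ by one discontinuity, which is why a program of this kind reports a set of likely orders rather than a single answer.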
Marker (D8S-)   Size (bp)   Primer sequences                     Annealing temperature (°C)
520             189         5'-CTGAAGAGCAAATGGCCCT-3'            54
                            5'-TAAGATCACATGGCCCCCT-3'
1755            254         5'-ACCGCATCTGGTCAACT-3'              55
                            5'-GCATCTCCATTGGAAGATTT-3'
550             263         5'-CAGGAGTCAATAACCCAAAGTCAT-3'       50
                            5'-TGGCACATCCCGAAGTC-3'
1593            302         5'-ACAAAACTAAGATGGACATTTCACA-3'      56
                            5'-AATTGACTAGGAAAATCTATGGCC-3'
265             212         5'-ACCTCTTTCCAGATAAGCCC-3'           54
                            5'-CCAATGGTTTCGGTTACTGT-3'
1695            261         5'-AACCCAGCATCCTACAAAG-3'            55
                            5'-CATCTGGAACCCATGAG-3'
1759            131         5'-GAGACTGACAATCTCCTCGTCTTAT-3'      55
                            5'-CTATTGCCTAGCTTAGCACATTTGA-3'
1946            102         5'-GCACAAGATCAGAGAGGTTGTG-3'         58
                            5'-GAGGAGAGATGGTGTTGGGA-3'
1130            144         5'-GAAGATTTGGCTCTGTTGGA-3'           54
                            5'-TGTCTTACTGCTATAGCTTTCATAA-3'
1640            175         5'-TGCAGTCTGCGGGAGTTC-3'             54
                            5'-AGCAGGGTGACTGTAAAGAAGG-3'
1619            226         5'-GTGGTGCAGTTCATCCTCTG-3'           58
                            5'-CCTTGCAAAGTATTTGGTACTAAGA-3'
552             175         5'-AGGATTGTAATTTCCTTGC-3'            48
                            5'-GGGACTTTTTGAAGGTTTG-3'
1106            148         5'-TTGTTTACCCCTGCATCACT-3'           54
                            5'-TTCTCAGAATTGCTCATAGTGC-3'
1109            240         5'-TTCTCAGAATTGCTCATAGTGC-3'         56
                            5'-TCAGCTCCTCTTCTGCTGAT-3'
1107            296         5'-CAGAAGGAGACCCTGTCTCA-3'           56
                            5'-AGTCAAGTTCTGTGCTCGCT-3'

Table 3: STSs used for STS content mapping.
CHAPTER 3 RESULTS

3.1 Identification of a singly-linked contig encompassing 8p22

A singly-linked STS contig in YACs, WC8.1 (Appendix 1), was obtained from the ninth release of the database maintained by the Whitehead Institute/MIT Center for Genome Research by searching the database with STSs that had been genetically mapped to the region of interest. The contig consists of YACs associated with an ordered set of STSs spanning approximately a 28 cM region of 8p21-8p22. It is linked to the genetic map by the inclusion of polymorphic, genetically mapped STS markers for STS content mapping. The contig is delimited distally by D8S503 and proximally by D8S282, located 17 cM and 45 cM, respectively, from the telomeric end of the genetic linkage map. It includes the region of interest to this thesis, and localises a set of 62 YACs to the region (Figure 6).

3.2 Exclusion of chimaeric YACs

An examination of MIT/Whitehead STS content mapping data and CEPH Alu-PCR data for all YACs within the region resulted in the identification of 10 YACs with no evidence of chimaerism (Table 4(a)). Examination of contig WC8.1 suggested that a potentially large gap would be present in a contig consisting only of these YACs (Table 4(b)). Therefore, two additional YACs, 737e5 (700 kb) and 920d12 (340 kb), with relatively little evidence of chimaerism, were added to the subset of ten YACs for further analysis (Table 4(c)). An analysis of the evidence for the absence or presence of chimaerism in these YACs is presented in the Discussion.
[Figure 6: map of chromosome 8p contig WC8.1, with columns for genetic distance (cM), radiation hybrid distance (cR), and the YACs containing each STS.]

Figure 6. The MIT/Whitehead Institute contig WC8.1, emphasizing the region of interest to this thesis. YACs adjacent to an STS are those in which the STS is contained.
YAC      SIZE (kb)   COMMENTS
691f8    310         no CEPH Alu-PCR data reported; one positive result (8p STS) reported by Whitehead/MIT STS content mapping
692c9    1210        no CEPH Alu-PCR data reported; one positive result (8p STS) reported by Whitehead/MIT STS content mapping
700d3    470         no CEPH Alu-PCR data reported; one positive result (8p STS) reported by Whitehead/MIT STS content mapping
710a5    550-690     no CEPH Alu-PCR data reported; one positive result (8p STS), two ambiguous results reported by Whitehead/MIT STS content mapping
729e12   400         no CEPH Alu-PCR data reported; 3 positive results (8p STSs), one ambiguous result reported by Whitehead/MIT STS content mapping
770e9    1690        two CEPH Alu-PCR results, hybridisation to chromosome 8; 7 positive results (8p STSs), one ambiguous result reported by Whitehead/MIT STS content mapping
799b1    480         one CEPH Alu-PCR result, hybridisation to chromosome 8; 6 positive results (8p STSs), one ambiguous result reported by Whitehead/MIT STS content mapping
871f3    850         no CEPH Alu-PCR data reported; 3 positive results (8p STSs) reported by Whitehead/MIT STS content mapping
915h4    740         no CEPH Alu-PCR data reported; six positive results (8p STSs) reported by Whitehead/MIT STS content mapping
937e1    1620        no CEPH Alu-PCR data reported; 8 positive results (8p STSs), 3 ambiguous results reported by Whitehead/MIT STS content mapping

Table 4(a). YACs from WC8.1 for which no evidence of chimaerism was available. All size data were obtained from CEPH. The comments summarize available data regarding the possibility of chimaerism.

YAC columns: 915h4, 770e9, 700d3, 710a5, 692c9, 729e12, 871f3, 937e1, 799b1, 691f8

STS        Positive YACs
D8S520     * *
D8S1755    * *
D8S550     * * *
D8S1593    * * * *
D8S265     * * *
D8S1695    * * *
D8S1759
D8S1946
D8S1130    *
D8S1640    * *
D8S1619    *
D8S552     * * *
D8S1106    *
D8S1109
D8S1107

Table 4(b). A subset of WC8.1 containing YACs from the region of interest with no evidence of chimaerism. An asterisk (*) indicates the presence of an STS in a YAC.
3.3 Isolation and sequence analysis of YAC insert termini

Bubble PCR was used to isolate sequence from both the left (trp-) and the right (ura-) termini of the twelve YACs chosen for analysis. Bubble PCR products for eleven insert termini from six YACs were obtained (Figure 7, Table 5), and sequence analysis (Figure 8) confirmed that ten of these represent true insert terminal sequence. Verification was dependent upon the identification of adjacent YAC vector, pKRX, and bubble linker sequence; these were present in all PCR products except the trp- terminus of 729e12 (not shown), which was excluded as a true terminus. Two of the isolated termini, the ura- termini of 700d3 and 770e9, were found to contain insufficient insert sequence for STS construction. BLAST analysis of the remaining eight terminal sequences revealed that the trp- termini of YACs 700d3 and 937e1 were contained entirely within Alu repeat sequences (Figure 9(a) and (b)), and that the ura- terminus of 920d12 was contained entirely within L1 repeat sequence (Figure 9(c)); these were consequently unsuitable for use in developing STS primers, and were excluded from analysis. BLAST analysis also revealed that the ura- terminus of 729e12 contained DNA sequence with similarity to the gene GRF2 from Mus musculus (Chen et al., 1993) (Figure 9(d)), but was not identical to any known human sequence. This result suggested the presence of coding sequences in this YAC. STS primers were designed from this and the remaining four YAC insert termini (Table 6).

Figure 7. PCR amplification of YAC termini with primers 224 and 226 (for trp-termini amplification) or 228 (for ura-termini amplification).

PCR product    Size (bp)
799b1 trp      171
920d12 trp     223
700d3 trp      170
770e9 trp      409
729e12 trp     275
937e1 trp      176
920d12 ura     280
700d3 ura      101
770e9 ura      138
729e12 ura     371
937e1 ura      160

Table 5. A list of PCR products resulting from bubble PCR, and associated sizes.
YAC terminus   STS primer sequences
729e12 ura-    5'-CTTCAAACACTATAGTCATTA-3'
               5'-TCAGGCATTTAGTTGATACA-3'
770e9 trp-     5'-AACATGGGGATCGATATTC-3'
               5'-AATGGGTAGTTTCAACAGC-3'
799b1 trp-     5'-TGAGAAACAACACTTGGTTA-3'
               5'-TCTTGCAATTGTATTACTTTC-3'
920d12 ura-    5'-TTGTTATGTCCTTTCCTGG-3'
               5'-TTCTACCAGACATCCAAAG-3'
937e1 ura-     5'-GAAAGAAAAAGCATTGCCC-3'
               5'-CAGAGATTAGAGGCACCA-3'

Table 6. Primer sequences of STSs designed from YAC insert termini.

Figure 8. Sequence analysis of bubble PCR products. All sequences in plain type are pKRX sequences; sequences in italics are bubble PCR sequences; large, bold sequences are from human insert DNA; underlined sequences are from YAC vector DNA. Sequences are grouped according to the orientation (group 1 or 2) of the PCR product in the pKRX vector (see Figure 5 in Methods and Materials). Any minor sequence differences (excluding human insert sequence) are presumed to be due to errors occurring during the sequencing reaction.

(a) YAC termini in the group 1 orientation.
ura-700d3 CACCAGCCTTGTCGAACGCCCGATCTCA A.GATTACGG A A T T r T C T Q T T ura-729el2 C A C C A G C C T T G T C G A A C G C C C G A T C T C A A G N T T A C G G A A T T C T C T T C A A A C A C T A T A G T C A T T A A A C C A T T A C A T C G T A T A T T G T C C A C A C A C T G T G A A A T A A A A A G T T G A C A A T T A C A A C C C A C A T G A A A G C C T A G G C A G A T C A T A T A T T C C C A C A A T T G T T T A T A G C T A A A A G A A T C C A T T C A A T T T C T C T C T C C C T G T C T C T T A C A C A C A C A C A C A C A C A C A C A C A C A C A C A C A C A C A C A C A C A G A G T T A G G T A A A T A T T A T A C A C T G A T G A A C A G A T T C A G T A T T C A G A C A C C A A A A C T G T A T C A A C T A A A T G C C T G A C A A A T ACCTGACAAGGAGAGGACGGCGAATCTCCGTTACGAAACGGTTACGAT rCCGACGTTCTGGTCGAACTT ura-770e9 C A C C A G C C T T G T C G A A C G C C C G A T C T C A A G N T T A C G G A A T T C T G C A A A C T C G T A G N C T A G C C G T G G T G A G T A T G T G A A A C C C A G C T T C T G A T C G C T C ACv4v4 GGA GA GGA CGGCGA TACGTCIGGTCG ACCTT 40 (b) YAC termini in the group 2 orientation. 
trp-770e9 CACCAGCCTTCGAATCGTAACCGTTCGTACGAGAATCGCCGTCCTCTCCTTGTG A G T T T T T T A A A C A T G G G G A T C G A T A T T C A T A A G T T T T G A A A A A T A C T A A T T T G G T A T T T C T A A A A T A T T G A T T T T G T C T T A T G T T T C T C T T C T C T T T C T G A A A C T A T A A T A C T A T A T A T A G A G T A C A T A T A C A C A C A T A T A T A G T A T T A T G T G A T A C A C A T T A T G A A T A T G T C C C A T G T G G C T C T T A T G C C A T G T A C G T T T T C T T T T T T T G T C T C T G G G C A C T T G A C T T C T C T T C T A G T G T T A A A A T T T T G T A T T C T G C T G T G C C T A A T C T G C T G T T G A A A G T A C C C A T T T C A T T A T T A A T T A C A A A C A T G T T T T T T T GAATTCCGGTAGTTGATAAATTAAAGTCCTTGCGCCCTTAAACCAACACG TTCTGGTCCGACCTT trp-700d3 CACCAGCCTTCGAATCGTAACCGTTCGTACGAGAATCGCCGTCCTCTCCTTTCA G C C T C C C G A G T A A C T G G G A T T A C A A G T G C G C A C C A C C A C A C C T G G C T A A T T T T T T G T A T T T T T A G T A G A G A T G G G G T T T C A C T A T G T T G G C C A G T C T G G T C T T G A A T T C C G T A G T G A T A A A T TAAAGTCTTGCGCCTTAAACCAACACGTCTGGTCGACCTT trp-920dl2 CACCAGCCTTCGAATCGTAACCGTTCGTACGAGAATCGCCGTCCTCTCCTTGTG A G A T T A C A G G C A T G A G G C A C C G T G C C T G C C C T G G G G A C A T A C A C T T T T A T C C A C T A G T T C T T T T T T G T T G T T G T T G A A C A G G C A T T G A G C T G T T T A A T T G C T G T T G T T T G A A T A G G C A A A G C A G G C A C G T G G T A C T G A A T T C C G T A G T G A T A A A T T A A A G T CTTGCGCCTTAAACCAACACGTCTGGTCGACCTT 41 ura-920dl2 C A C C A G C C T T C G ^ TCGTAA CCGTTCGTA CGA GAA TCGCCGTCCTCTCCTTGTT A G T T T T C T T T T T T T G T T A T G T C C T T T C C T G G T T T T G G G A T T A G G G T G A T G C T G G C T T C A T A C A A T G A A C T G G G G A G G G T T C C T T C T T T C T C T A T C T T G T G G A A C A G T G T C A A A A G G A T T G G T A C A A A T T C T T T G G A T G T C T G G T A G A A T T C C G T A A T C T T G ANATCGGGCGTTCGACACGTCTGGTCGACCTT trp-937el 
CACCAGCCTTCGAATCGTAACCGTTCGTACGAGAATCGCCGTCCTCCCGAGTA A C T G G G A T T A C A A G T G C G C A C C A C C A C A C C T G G C T A A T T T T T T G T A T T T T T A G T A G A G A T G G G G T T T C A C T A T G T T G G C C A G T C T G G T C T G A A T T C C G T A G T G A T A A A T T A A A G T C T T G C G C C T T A AACCAACACGTCTGGTCGACCTT

ura-937e1
CACCAGCCTTCGAATCGTAACCGTTCGTACGAGAATCGCCGTCCTCTCCTTGTA A G A A A G A A A A A G C A T T G C C C T C C T G T A G A G C A T G T G T C A C C T G T A A A T T T C C A T G A G A C A G T G G T G C C T C T A A T C T C T GAATTCCGTAATCTTGAGATCGGGCGTTCGACCGTCTGGTCGACCTT

trp-799b1
CACCAGCCTTCGAATCGTAACCGTTCGTACGAGAATCGCCGTCCTCTCCTTGTG A G A A A C A A C A C T T G G T T A C C T A T T T A T C A A A G T G A C A A T A A A A T A G T T T A A A A T A A G C A T A C C A A G A A A G T A A T A C A A T T G C A A G A A T T C C G T A G T G A T A A A T T A A A G T C T T G C G C C T T A A A C C A A CACGTCTGGTCGACCTT

(a)
comp700d3-trp:     1 CAAGACCAGACTGGCCAACATAGTGAAACCCCATCTCTACTAAAAATACAAAAAATTAGC
                     * ******* *********** ********** ********************* *****
alu:              77 CGAGACCAGCCTGGCCAACATGGTGAAACCCCGTCTCTACTAAAAATACAAAAA-TTAGC

comp700d3-trp:       CAGGTGTGGTGGTGCGCACTTGTAATCCCAGTTACTCGGGAG 102
                     * ** ******* **** * *********** **********
alu:                 CGGGCGTGGTGGCGCGCGCCTGTAATCCCAGCTACTCGGGAG 178

(b)
comp937e1-trp:     1 CAGACCAGACTGGCCAACATAGTGAAACCCCATCTCTACTAAAAATACAAAAAATTAGCCA
                      ******* *********** ********** ********************* ******
alu:              78 GAGACCAGCCTGGCCAACATGGTGAAACCCCGTCTCTACTAAAAATACAAAAA-TTAGCCG

comp937e1-trp:       GGTGTGGTGGTGCGCACTTGTAATCCCAGTTA 93
                     ** ******* **** * *********** **
alu:                 GGCGTGGTGGCGCGCGCCTGTAATCCCAGCTA 170

(c)
920d12-ura:        2 TAGTTTTCTTTTTTTGTTATGTCCTTTCCTGGTTTTGGGATTAGGGTGATG^
L1:             4029 TGTTTTTTTTTTTTGGTTATC^

920d12-ura:          TACAATGAACTGGGGAGGGTTCCTTCTTTCTCTATCTTGTGGAACAGTGTCAAAAGGATT
                     ** *****  * ****** ***  **************** *** * ****** ******
L1:                  TAGAATGATTTAGGGAGGATTCTCTCTTTCTCTATCTTGTAGAATACTGTCAATAGGATT

920d12-ura:          GGTACAAATTC---TTTGGATGTCTGGTAGAATT 149
                     ****  *****   **** ***************
L1:                  GGTATCAATTCTTCTTTGAATGTCTGGTAGAATT 4181

(d)
Ras-GRF2:       1753 CAGTGTGTGGACAATATACGATGTAACGGATTGATGACTATAGTCTTTGAAGAGAATT 1810
                     ************************** ** ** *********** *************
comp729e12-ura:   58 CaGTGTGTGGACAATATACGATGTAATGGTTTAATGACTATAGTGTTTGAAGAGAATT 1

Figure 9. Sequence alignments. An asterisk (*) denotes sequence identity. a) Comparison of the complement of the trp- terminus of 700d3 to the Alu consensus sequence. b) Comparison of the complement of the trp- terminus of 937e1 to the Alu consensus sequence. c) Comparison of the ura- terminus of 920d12 to the L1 consensus sequence. d) Comparison of the complement of the ura- terminus of 729e12 to the Mus musculus gene Ras-GRF2.

3.4 STS content mapping

Table 7 summarizes STS content mapping results (Figure 10(a) to (o)) in ten of the twelve YACs in the YAC subset using 15 STSs previously localised to the region of interest (Hudson et al., 1995), and compares results obtained for this thesis to contig WC8.1. Two YACs from the YAC subset, 691f8 and 692c9, were unavailable for analysis due to consistent bacterial contamination of yeast glycerol stocks. STS content mapping was performed for 5 STSs designed from YAC termini (Figure 11); two of these (from the trp-terminus of 770e9 and the ura-terminus of 729e12) could be amplified in the YAC of origin (Figure 11(a) and (b)); the remaining three could be amplified only in a positive control consisting of total human DNA (Figure 11(c) to (e)). These STSs were also tested for their ability to be amplified from several control DNAs.
An STS designed from ura-729e12, which contained DNA sequence with similarity to GRF2, could not be amplified in a chromosome 8p somatic cell hybrid, suggesting that this YAC is chimaeric (Figure 12(a)). An STS designed from trp-770e9 was amplified in all positive controls attempted, and STSs designed from the remaining termini could not be amplified efficiently (Figure 12).

3.4.1 The localization of a human gene to chromosome 5

The Mus musculus gene Ras-GRF2 has recently been mapped to chromosome 13 in the region C3-D1 (Fam et al., 1997). This region is syntenic with human chromosome 5; it was suspected, therefore, that the STS designed from the ura-terminus of YAC 729e12, and the associated novel human gene, may be located on chromosome 5. To test this hypothesis, PCR was used to amplify the STS in a somatic cell hybrid, GM10114, containing an intact chromosome 5 as its sole human component (Wasmuth et al., 1989). Results (Figure 13) confirmed the presence of the STS and the novel gene on chromosome 5.

3.5 Analysis of STS content mapping results using SAM

A comparison between contigs generated with the System for Assembling Markers (SAM; Figure 14) (Soderlund and Dunham, 1995) and contig WC8.1 shows few differences. A minimal tiling path with the following YAC order could be defined that was compatible with the most likely SAM order and with the contig WC8.1:

tel - 770e9 - 737e5 - 871f3 - 920d12 - 799b1 - cen

YAC columns: 915h4, 770e9, 700d3, 737e5, 710a5, 729e12, 871f3, 920d12, 937e1, 799b1

STS        Results
D8S520     f W * W *
D8S1755    W W f
D8S550     W W * *
D8S1593    W W * W W W
D8S265     W * W * W * W *
D8S1695    W * W * W * W *
D8S1759    W *
D8S1946    W f
D8S1130    * W *
D8S1640    • W * W * W *
D8S1619    * W
D8S552     W f W W f
D8S1106    f W * W *
D8S1109    W * *
D8S1107    *

Table 7. STS content mapping in YACs using STSs previously localised to the region of interest. W = positive according to WC8.1; * = positive for STS content mapping; f = faint PCR product.
Figure 10. STS content mapping in YACs. THD = total human DNA; Mq = milli-Q deionized water; M = φX174 marker DNA; the remaining lanes are YAC DNA. (a) D8S520 (b) D8S1755 (c) D8S550 (d) D8S1593 (e) D8S265 (f) D8S1695 (g) D8S1759 (h) D8S1946 (i) D8S1130 (j) D8S1640 (k) D8S1619 (l) D8S552 (m) D8S1106 (n) D8S1109 (o) D8S1107

Figure 11. PCR amplification of YAC termini in YACs. (a) Amplification of trp-terminus of 770e9 (b) amplification of ura-terminus of 729e12 (c) amplification of trp-terminus of 799b1 (d) amplification of trp-terminus of 920d12 (e) amplification of ura-terminus of 937e1

Figure 12. PCR amplification in controls, using STSs designed from YAC termini. end = terminal sequence cloned into pKRX vector; 8p = chromosome 8p somatic cell hybrid; THD = total human DNA. (a) STSs designed from ura-729e12, trp-770e9, and trp-937e1 (b) STSs designed from trp-799b1 and ura-920d12

Figure 13. PCR amplification of an STS designed from ura-729e12, the terminal sequence containing similarity to murine GRF2. 5, 8 = monosomic somatic cell hybrids containing chromosomes 5 and 8, respectively; THD = total human DNA; Mq = milli-Q distilled water; M = φX174 marker DNA.
CHAPTER 4 DISCUSSION

4.1 The exclusion of chimaeric YACs

For this thesis, STS content mapping was used to construct a contig of overlapping YACs, representing a 7 cM region of 8p22. Of the 10 YACs used for contig construction, eight exhibited no previous evidence of chimaerism; that is, there were no data available from Whitehead/MIT or from CEPH that suggested that these YACs contain DNA from any genomic location other than 8p22. One of these (YAC 729e12) was subsequently found to be chimaeric; an STS derived from the ura-terminus of this YAC could not be amplified from a chromosome 8p somatic cell hybrid. The remaining two YACs, 920d12 and 737e5, showed some evidence of chimaerism in CEPH or Whitehead/MIT STS content mapping data, or in CEPH Alu-PCR hybridization data.

A minimal tiling path, consisting of five YACs, was identified from the contig (Figure 15). This minimal tiling path can be used to estimate the maximum and minimum physical size of the contig. The maximum size assumes a minimum overlap between the five YACs, and is the sum of the YAC sizes (4160 kb). To calculate the minimum size, the overlap is assumed to be maximal and the non-overlapping YACs are assumed to be adjacent. The minimum size is then the sum of the sizes of the non-overlapping YACs (3020 kb). The contig is thus between 3020 and 4160 kb in length. One cM is, on average, equivalent to one Mb of DNA. This contig is shorter than the expected average distance of 7 Mb predicted by its 7 cM length.
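The size bounds follow from simple arithmetic on the YAC sizes. The Python sketch below is a minimal illustration that assumes the tiling path order given in section 3.5, the CEPH sizes quoted in Table 4(a) and section 3.2, and that the alternate YACs in the path (first, third and fifth) are the non-overlapping ones; the names are illustrative. With the sizes as they appear in this text the sum comes to 4060 kb, so the 4160 kb maximum quoted above may rest on a size that differs from one listed here.

```python
# CEPH sizes in kb, as quoted in Table 4(a) and section 3.2.
sizes_kb = {"770e9": 1690, "737e5": 700, "871f3": 850, "920d12": 340, "799b1": 480}

# Minimal tiling path from telomere to centromere (section 3.5).
tiling_path = ["770e9", "737e5", "871f3", "920d12", "799b1"]

def size_bounds(path, sizes):
    """Return (minimum, maximum) contig size in kb under the stated assumptions."""
    # Maximum: negligible overlap, so every YAC contributes its full length.
    maximum = sum(sizes[y] for y in path)
    # Minimum: maximal overlap, so only alternate (non-overlapping) YACs
    # contribute, and they are assumed to lie end to end.
    minimum = sum(sizes[y] for y in path[::2])
    return minimum, maximum

minimum, maximum = size_bounds(tiling_path, sizes_kb)
print(minimum, maximum)
# Physical distance per centimorgan over the 7 cM interval, for comparison
# with the genome-wide average of roughly 1000 kb per cM.
print(round(minimum / 7), round(maximum / 7))
```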
This result may not be significant, but it may imply a recombination hotspot in the vicinity of the contig; an increased rate of recombination in the region would make the genetic distance large relative to the physical distance.

The minimal tiling path contains the two YACs with previous evidence of chimaerism (920d12 and 737e5). Since a goal of this thesis was to minimize the possibility of chimaerism complicating contig construction, it is prudent to describe the methods used to detect chimaerism, the reliability of each method, and the reliability of the contig resulting from this thesis.

4.1.1 STS content mapping data

STS content mapping is a simple and fairly reliable method for detecting chimaerism; if a YAC contains two STSs from very different genomic locations, it is likely to be chimaeric. STS content mapping data were examined for all YACs from WC8.1 localized to the region of interest.

STS content mapping in YAC pools

Both CEPH and Whitehead/MIT test for the presence of STSs in a three-dimensional array of YACs. The CEPH YAC library is divided into 33 superpools, each containing DNA from YACs stored in a block of eight 96-well microtitre plates (768 YACs per block). Each of these 33 blocks is represented by 28 subpools. Eight of these are row subpools, which contain YACs pooled from the same row in the eight different plates of a block. Twelve are column subpools, containing pooled YACs all from the same column, and eight are plate subpools, consisting of YACs from the same plate in the block.
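The pool layout just described can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the counts given above; the total library size assumes every well is occupied, and the `pcr_reactions` helper is a hypothetical illustration of why pooling reduces the screening effort, not a description of the CEPH or Whitehead/MIT protocol.

```python
SUPERPOOLS = 33          # one superpool per block of plates
PLATES_PER_BLOCK = 8
ROWS_PER_PLATE = 8       # a 96-well plate has 8 rows ...
COLUMNS_PER_PLATE = 12   # ... and 12 columns

wells_per_plate = ROWS_PER_PLATE * COLUMNS_PER_PLATE
yacs_per_block = PLATES_PER_BLOCK * wells_per_plate
subpools_per_block = ROWS_PER_PLATE + COLUMNS_PER_PLATE + PLATES_PER_BLOCK

# Assumption: every well holds one YAC.
library_size = SUPERPOOLS * yacs_per_block


def pcr_reactions(positive_blocks):
    """PCRs needed for one STS: all superpools, then the subpools of each positive block."""
    return SUPERPOOLS + positive_blocks * subpools_per_block


print(yacs_per_block, subpools_per_block, library_size, pcr_reactions(1))
```

With one positive block, an STS is screened in 33 + 28 reactions rather than in one reaction per YAC.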
STS content mapping is first performed against superpools to identify the block containing a positive YAC (or YACs); STSs are then screened in the corresponding row, column and plate subpools to identify a positive row, column, and plate, i.e., a unique YAC address.

Resolving ambiguous results

It is possible that two positive YACs could be present in the same block; if this occurs, the subpool results could provide two rows, two columns and two plates. The exact YAC address is ambiguous in such a case. For example, a PCR result such as (770,774)(e,g)(5,7) could represent any two of the eight possible YAC addresses. Each candidate YAC from an ambiguous address therefore has only a certain probability of containing the STS; in the example just noted, there is one chance in four that 770e5 is a positive YAC.

Eight YACs used in this thesis displayed no evidence of chimaerism based on data from CEPH and Whitehead/MIT. Five of these, however, were members of ambiguous YAC addresses produced by STSs that have been localised to regions of the genome other than chromosome 8p22 (see Results, Table 4(a)). In other words, there is a certain probability that these YACs contain these STSs, and thus that they are chimaeric. However, it was decided for the purposes of this thesis that such observations were not adequate justification for excluding these YACs from analysis.

A number of techniques can be used to attempt to resolve ambiguous YAC addresses. First, each possible YAC can be tested individually, providing a direct method for resolution. Second, STSs doubly-linked to the STS used to identify the ambiguous YACs can be used for STS content mapping in the pools (doubly-linked STS data).
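The ambiguous pooling result used as an example above, (770,774)(e,g)(5,7), can be enumerated explicitly. The sketch below assumes that exactly two YACs in the block are positive and that every consistent pair is equally likely; both are simplifying assumptions, and the function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def candidate_addresses(plates, rows, columns):
    """Every plate/row/column combination consistent with the subpool readout."""
    return [f"{p}{r}{c}" for p, r, c in product(plates, rows, columns)]


def consistent_pairs(plates, rows, columns):
    """Pairs of addresses that together explain two plates, two rows and two columns."""
    (p1, p2), (r1, r2), (c1, c2) = plates, rows, columns
    pairs = []
    for r, c in product(rows, columns):
        other_r = r2 if r == r1 else r1
        other_c = c2 if c == c1 else c1
        pairs.append((f"{p1}{r}{c}", f"{p2}{other_r}{other_c}"))
    return pairs


plates, rows, columns = (770, 774), ("e", "g"), (5, 7)
candidates = candidate_addresses(plates, rows, columns)
pairs = consistent_pairs(plates, rows, columns)

# Chance that 770e5 is one of the two positive YACs.
chance = Fraction(sum("770e5" in pair for pair in pairs), len(pairs))
print(len(candidates), len(pairs), chance)
```

The eight candidates fall into four mutually exclusive pairs, so each candidate address, including 770e5, has one chance in four of being positive.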
Resolution can be achieved if one of these STSs is contained within one of the possible YACs from the ambiguous results; or, if STS content mapping using one of these STSs itself produces an ambiguous set of YACs, resolution can be achieved if the intersection of the two ambiguous sets identifies one unique YAC in common. Finally, other YACs from the region containing the STS can be examined for overlap with any of the YACs from the ambiguous set (fingerprint data). Each of these techniques was used by Whitehead/MIT to resolve ambiguous results and identify truly positive YACs from sets of ambiguous YACs. Whitehead/MIT reports all YAC results, including ambiguous and resolved ambiguous results.

Examination of evidence for chimaerism in YAC 920d12

YAC 920d12, a YAC in the minimal tiling path defined by this thesis, was identified by Whitehead/MIT as containing D17S1717, from chromosome 17. The result was determined by resolving the ambiguous YAC pooling result (917,918,920)(c,d)(7,12). This was accomplished by identifying an STS doubly-linked to D17S1717 that identified an ambiguous set of YACs also containing YAC 920d12. Details regarding this doubly-linked STS and its associated ambiguous YAC set are not provided by Whitehead/MIT. It is difficult to determine the reliability of this result. Whitehead/MIT states that, generally, such results are considered reliable, but they treat such an observation as an unconfirmed suspicion rather than a confirmed result. In addition, no other evidence of chimaerism is suggested for 920d12 except this result. Therefore, YAC 920d12 was tentatively included in the analysis.

4.1.2 CEPH alu-PCR data

CEPH alu-PCR data describe results obtained by CEPH from using alu-PCR products for a YAC as probes against the CEPH mega-YAC library and against monosomic somatic cell hybrids.
The use of alu-PCR products as probes against the YAC library is not a very informative indicator of chimaerism; if a probe consisting of alu-PCR products from the YAC in question hybridizes to a YAC known to be localized to a genomic location other than 8p, this could indicate chimaerism in the YAC from which the probes originated, or chimaerism could be present in the target YAC. Therefore, results of alu-PCR hybridization experiments against the YAC library were not used to identify chimaeric YACs in this thesis.

Alu-PCR hybridization to somatic cell hybrids

More informative data are provided by alu-PCR probe hybridization against somatic cell hybrids; evidence of chimaerism is provided by hybridization to hybrids containing chromosomes other than chromosome 8. Probes are hybridized to two copies of somatic cell hybrids, allowing for verification of hybridization results. Hybridization signals are described by CEPH as faint, medium, or strong.

Of the eight YACs used in this thesis that displayed no evidence of chimaerism, no CEPH alu-PCR data were reported for six. This lack of data reduces the certainty that these YACs are not chimaeric. However, the data that are available do not suggest chimaerism, so these YACs were included in the analysis. In addition, no alu-PCR data were reported for YAC 920d12, which displayed some evidence for chimaerism based on Whitehead/MIT STS content mapping data. Again, this lack of data was not deemed sufficient to exclude the YAC from analysis.

Examination of evidence for chimaerism in YAC 737e5

YAC 737e5 exhibited some evidence for chimaerism, according to CEPH alu-PCR data. Alu-PCR probes from the YAC hybridized to both copies of a chromosome 16 somatic cell hybrid. The validity of this result may be somewhat questionable, for two reasons: the hybridization signals were reported as faint for each result, and no other data from CEPH or from Whitehead/MIT provided evidence for chimaerism.
Nevertheless, the result is suggestive of chimaerism; the YAC was included for analysis, however, because it was the best candidate to use to attempt to overlap other YACs in this region of WC8.1.

4.2 A comparison of WC8.1 and the contig produced for this thesis

4.2.1 Positive results not predicted by WC8.1

Eight positive STS content mapping results obtained during the construction of the YAC contig presented in this thesis were not predicted by WC8.1 (Results, Table 6); for example, YAC 737e5 was found to contain D8S1130, resulting in the establishment of overlap between YACs 737e5 and 871f3. This overlap was sufficient, in conjunction with data from WC8.1 that were confirmed in this thesis, to report a continuous set of overlapping clones spanning the region of interest.

The frequency of false negative results is not known for contig WC8.1. Therefore, it is difficult to determine the expected frequency of obtaining an unpredicted STS content mapping result. Given that STS content mapping at Whitehead/MIT is performed in a complex collection of YAC pools, it is not unreasonable to assume that false negative results are a significant possibility for WC8.1.

Another result observed during STS content mapping but not included in WC8.1 is that YAC 799b1 contains STS D8S1107. In fact, this positive result is reported by CEPH, according to the database maintained by the BCM. WC8.1 typically includes data from CEPH that have not been confirmed by Whitehead/MIT. The absence of this positive result from WC8.1 appears to be an error in the contig.

4.2.2 Negative results not predicted by WC8.1

A number of positive STS content mapping results were reported in WC8.1 that were not confirmed by STS content mapping during the completion of this thesis (Results, Table 6). Such a result occurred 9 times out of 33, a frequency of 27%. This frequency is considerably higher than the frequency of false positives reported by Whitehead/MIT (4%).
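As a rough illustration of how far 9 discrepancies out of 33 lies from the reported 4% false-positive rate, the sketch below computes a binomial tail probability. It assumes that the 33 results are independent and that every discrepancy would have to be a Whitehead/MIT false positive; neither assumption is made in the thesis, so the figure is only indicative.

```python
from math import comb


def binomial_tail(n, k, p):
    """Probability of k or more successes in n independent trials with success rate p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


unconfirmed, reported = 9, 33      # counts quoted in the text
false_positive_rate = 0.04         # rate reported by Whitehead/MIT

observed_frequency = unconfirmed / reported
tail = binomial_tail(reported, unconfirmed, false_positive_rate)

print(f"observed frequency: {observed_frequency:.0%}")
print(f"P(at least {unconfirmed} of {reported} at a 4% rate): {tail:.1e}")
```

Under these assumptions the tail probability is very small, which is consistent with the view that false positives alone are unlikely to account for the disparity.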
This disparity may be due to a number of factors, including YAC rearrangements, false positives from Whitehead/MIT, and false negatives resulting from this thesis. However, it is relevant to note that STS content mapping for this thesis relied on examining individual YACs, and that all positive results were confirmed by at least two independent experiments (data not shown). Whitehead/MIT verify only a small number of positive results by testing YACs individually. In fact, none of the results for contig WC8.1 in the region of interest have been verified on individual YACs (Table 7). Furthermore, only two results (D8S520 and D8S265 in YAC 770e9) have been verified in two independent experiments (CEPH and Whitehead/MIT both reported positive results for these), while one result (D8S520 in YAC 700d3) was reported by CEPH and not confirmed by Whitehead/MIT. The majority of results (56%) were described as definite; the remainder had been ambiguous and were subsequently resolved by fingerprinting or doubly-linked STS data (Table 7). Therefore, it is not surprising that differences were observed between contig WC8.1 and the results of this thesis.

4.3 YAC termini isolation for contig construction

For the ten YACs from the region of interest, ten of the twenty possible YAC termini (50%) were successfully isolated using the bubble PCR protocol. Difficulties with the protocol are hard to explain; one possibility is that yeast DNA prepared in a liquid preparation (Sherman et al., 1986) is not as suitable for this procedure as DNA prepared by embedding yeast cells in agarose plugs (Schwartz and Cantor, 1984). The former typically shears chromosomal DNA, while the latter leaves chromosomes intact. The probability that YAC DNA was sheared at locations that interfered with bubble PCR is probably low; shearing would have to occur between the YAC vector primer and the first DdeI restriction site in a YAC insert to prevent successful isolation of a YAC terminus (see Methods, Figure 2).
However, this possibility could explain some of the negative results.

Table 8. Contig WC8.1 from the region of interest. C = CEPH result, verified by Whitehead/MIT; c = CEPH result, not verified; D = definite Whitehead/MIT result; F = an ambiguous result, resolved using fingerprinting data; S = an ambiguous result, resolved using doubly-linked STS data.
YACs: 915h4, 770e9, 700d3, 737e5, 710a5, 729e12, 871f3, 920d12, 937e1, 799b1
D8S520: C c
D8S1755: D D
D8S550: F F
D8S1593: F D F D D
D8S265: D C D D
D8S1695: D D D D
D8S1759: D
D8S1946: D
D8S1130: F
D8S1640: S S F
D8S1619: D
D8S552: D F D
D8S1106: D F
D8S1109: D
D8S1107:

Of the ten termini isolated, eight provided sufficient insert DNA to design STS primers. Of these, five contained unique sequences according to a BLAST search. An STS designed from trp-770e9 amplified in a chromosome 8p somatic cell hybrid and in the YAC from which it originated (770e9). The observation that this STS did not amplify in any YACs except 770e9 is consistent with the fact that YAC 770e9 is located at the telomeric end of the contig. There are currently no known YACs that are telomeric to, and overlap with, YAC 770e9; however, there are several apparently non-overlapping YACs telomeric to 770e9 within contig WC8.1. It may be valuable as a future exercise to attempt to amplify sequences in these YACs using this telomeric STS from YAC 770e9.

Ura-729e12 contained sequence similarity to the murine gene GRF2. An STS designed from this terminus did not amplify in a chromosome 8p somatic cell hybrid, but did amplify in a cell hybrid containing human chromosome 5. This observation has resulted in the localization of these coding sequences to chromosome 5. The lack of amplification of an STS designed from ura-729e12 in a chromosome 8 somatic cell hybrid suggests that 729e12 is chimaeric, and explains the lack of amplification of the STS in the YAC subset.
This result also suggests that a larger proportion of YACs are chimaeric than is suggested by examination of data from Whitehead/MIT and from CEPH.

4.4 The effectiveness of employing YACs for contig construction

YACs provide a means by which large DNA inserts can be cloned. They are useful for creating physical maps of large regions of the genome, and they are used extensively for the initial physical organization of DNA within a genomic region of interest. Nevertheless, significant difficulties arise during YAC manipulation that reduce their effectiveness. The high frequencies of chimaerism and internal deletions in CEPH mega-YACs have been particular hindrances in many examples of physical mapping and detailed genomic analysis (Bronson et al., 1991, Green et al., 1991, Bates et al., 1992, Cohen et al., 1993, Chumakov et al., 1995, Collins et al., 1995, Gemmill et al., 1995). Although chimaerism does not seriously interfere with STS content mapping using STSs that are known to map independently to a region of interest (Green et al., 1991), deletions can be a serious source of error in STS content mapping. Also, only one copy of a YAC is present in a yeast cell, unlike the high copy number possible for cloning systems such as cosmids and plasmids. However, these latter vectors do not allow for the cloning of inserts large enough to be useful for large-scale physical mapping.

Bacterial artificial chromosomes (BACs) are cloning vectors that allow the insertion of DNA fragments approximately 200 kb in length, about five times larger than can be maintained in cosmids. BACs are propagated in bacteria, and DNA preparation allows the collection of significantly larger quantities of DNA than is available using YACs. Also, unlike YACs, BACs do not form chimaeras.
BACs are a valuable alternative or supplement to YACs for the purposes of contig construction, as they represent an intermediate between YACs and cosmids, and they provide some important advantages compared to these other vectors.

4.5 Conclusions

4.5.1 Construction of a minimum tiling path spanning 8p22

A minimum tiling path consisting of five YACs has been identified that spans a 7 cM region of chromosome 8p22; the contig was delimited distally by D8S520 and proximally by D8S552, located 22 and 29 cM, respectively, from the top of the chromosome 8 genetic map (Dib et al., 1996). The YAC order is:

tel - 770e9 - 737e5 - 871f3 - 920d12 - 799b1 - cen

The contig was constructed by STS content mapping in a subset of YACs, identified from the Whitehead/MIT contig WC8.1 as being localized to the region of interest. An examination of STS content mapping data from Whitehead/MIT and CEPH and alu-PCR data from CEPH revealed no evidence of chimaerism for three of the five YACs. The other two YACs have some suggestion of chimaerism, but the evidence is not substantial. This contig will be valuable for examining this region of 8p22, both to refine the physical map and to isolate any genes of interest that may be mapped to the region by positional cloning techniques.

4.5.2 Physical mapping of 8p22 and the Human Genome Project

An important impetus behind the construction of the minimum tiling path presented in this thesis has been to contribute to the completion of the Human Genome Project. One of the major goals of the project is to construct long-range physical maps of large genomic regions, to be used as a framework for the generation of sequence-ready clones. The research performed for this thesis is a significant contribution to this goal. Physical mapping has been completed for substantial regions of the genome, including chromosome 8p22, as part of the Human Genome Project (Chumakov et al., 1995).
However, these results were obtained using STS content mapping in YAC pools, and they are not as reliable as STS content mapping in individual YACs. The results of this thesis provide an essential confirmation and refinement of the existing STS content mapping data for this region, and have also provided significant additional information, including the exclusion of a large number of chimaeric YACs, the identification of a minimum tiling path, and an estimation of the physical size of this genomic region. The STS content mapping completed for this thesis complements similar STS content mapping data produced for other regions of chromosome 8p (Bookstein et al., 1994, Ranta et al., 1996, Chaffanet et al., 1996). Together, these YAC contigs are a valuable resource for the eventual sequencing and identification of genes within the short arm of chromosome 8.

4.6 Future areas of research

YAC contig construction within this region has provided a tool that can be used for future genomic analysis of this region. A number of worthwhile areas of research are immediately evident from the results of this thesis, including: 1) constructing a BAC contig of the region by STS content mapping; 2) identifying pools of cosmids associated with each YAC of the contig, and organizing these into contigs; 3) large-scale sequencing of this region once cosmid contigs are constructed.

BIBLIOGRAPHY

Albertsen,H.M., Abderrahim,H., Cann,H.M., Dausset,J., Le Paslier,D., and Cohen,D. (1990) Construction and characterization of a yeast artificial chromosome library containing seven haploid human genome equivalents. Proc. Natl. Acad. Sci. USA 87:4256-4260.
Altschul,S.F., Gish,W., Miller,W., and Lipman,D.J. (1990) The basic local alignment search tool. J. Mol. Biol. 215:403-410.
Arratia,R., Lander,E.S., Tavare,S., and Waterman,M.S. (1991) Genomic mapping by anchoring random clones: A mathematical analysis. Genomics 11:806-827.
Bardenheuer,W., Michaelis,S., Lux,A., Vieten,L., Brocker,F., Julicher,K., Willers,C., Siebert,R., Smith,D.I., van der Hout,A.H., Buys,C., Schutte,J., Opalka,B. (1996) Construction of a consistent YAC contig for human chromosome region 3p14.1. Genome Research 6:176-186.
Barrett,J.H. (1992) Genetic mapping based on radiation hybrid data. Genomics 13:95-103.
Barillot,E., Dausset,J., and Cohen,D. (1991) Theoretical analysis of a physical mapping strategy using random single-copy landmarks. Proc. Natl. Acad. Sci. USA 88:3917-3921.
Bell,J. and Haldane,J.B.S. (1937) The linkage between the genes for colour-blindness and haemophilia in man. Proc. Roy. Soc. Lon. Ser. B. 123:119-151.
Bellanne-Chantelot,C., Lacroix,B., Ougen,P., Billault,A., Beaufils,S., Bertrand,S., Georges,I., Glibert,F., Gros,I., Lucotte,G., Susini,L., Codani,J.-J., Gesnouin,P., Pook,S., Vaysseix,G., Lu-Kuo,J., Ried,T., Ward,D., Chumakov,I., Le Paslier,D., Barillot,E., and Cohen,D. (1992) Mapping the whole human genome by fingerprinting yeast artificial chromosomes. Cell 70:1059-1068.
Botstein,D., White,R.L., Skolnick,M., and Davis,R.W. (1980) Construction of a genetic linkage map in man using restriction fragment length polymorphisms. Am. J. Hum. Genet. 32:314-331.
Botstein,D., Cantor,C., Carbonell,J.G., Carrano,A., Lerman,L., Moyzis,R., Pardue,M., Olson,M.V., Pearson,M.L., and Wexler,N.S. (1990) Understanding our genetic inheritance. The U.S. Human Genome Project: The first five years FY 1991-1995. No. DOE/ER-0452P, No. NIH 90-1590.
Bowcock,A.M., Gerken,S.C., Barnes,R.I., Shiang,R., Wang Jabs,E., Warren,A.C., Antonarakis,S., Retief,A.E., Vergnaud,G., Leppert,M., Lalouel,J.-M., White,R.L., and Cavalli-Sforza,L.L. (1993) The CEPH consortium linkage map of human chromosome 13. Genomics 16:486-496.
Brenner,S. and Livak,K.J. (1989) DNA fingerprinting by sampled sequencing. Proc. Natl. Acad. Sci. USA 86:8902-8906.
Bronson,S.K., Pei,J., Taillon-Miller,P., Chorney,M.J., Geraghty,D.E., and Chaplin,D.D.
(1991) Isolation and characterization of yeast artificial chromosome clones linking the HLA-B and HLA-A loci. Proc. Natl. Acad. Sci. USA 88:1676-1680.
Brownstein,B.H., Silverman,G.A., Little,R.D., and Olson,M.V. (1989) Isolation of single-copy human genes from a library of yeast artificial chromosome clones. Science 244:1348-1351.
Burke,D.T., Carle,G.F., and Olson,M.V. (1987) Cloning of large segments of exogenous DNA into yeast by means of artificial chromosome vectors. Science 236:806-812.
Callen,D.F., Doggett,N.A., Stallings,R.L., Chen,L.Z., Whitmore,S.A., Lane,S.A., Nancarrow,J.K., Apostolou,S., Thompson,A.D., Lapsys,N.M., Eyre,H.J., Baker,E.G., Shen,Y., Holman,K., Phillips,H., Richards,R.I., and Sutherland,G.R. (1992) High-resolution cytogenetic-based physical map of human chromosome 16. Genomics 13:1178-1185.
Carle,G.F., Frank,M., and Olson,M.V. (1986) Electrophoretic separations of large DNA molecules by periodic inversion of the electric field. Science 232:65-68.
Caskey,C.T. (1986) Summary: A milestone in human genetics. Cold Spring Harbor Symposia on Quantitative Biology, Vol. LI.
Chen,L., Zhang,L.J., Greer,P., Tung,P.S., and Moran,M.F. (1993) A murine CDC25/ras-GRF-related protein implicated in Ras regulation. Dev. Genet. 14:339-346.
Chumakov,I.M., Le Gall,I., Billault,A., Ougen,P., Soularue,P., Guillou,S., Rigault,P., Bui,H., De Tand,M.-F., Barillot,E., Abderrahim,H., Cherif,D., Berger,R., Le Paslier,D., and Cohen,D. (1992) Isolation of chromosome 21-specific yeast artificial chromosomes from a total human genome library. Nature Genetics 1:222-225.
Chumakov,I.M., Rigault,P., Le Gall,I., Bellanne-Chantelot,C., Billault,A., Guillou,S., Soularue,P., Guasconi,G., Poullier,E., Gros,I., Belova,M., Sambucy,J.-L., Susini,L., Gervy,P., Glibert,F., Beaufils,S., Bui,H., Massart,C., De Tand,M.-F., Dukasz,F., Lecoulant,S., Ougen,P., Perrot,V., Saumier,M., Soravito,C., Bahouayila,R., Cohen-Akenine,A., Barillot,E., Bertrand,S., Codani,J.-J., Caterina,D., Georges,I., Lacroix,B., Lucotte,G., Sahbatou,M., Schmit,C., Sangouard,M., Tubacher,E., Dib,C., Faure,S., Fizames,C., Gyapay,G., Millasseau,P., Nguyen,S., Muselet,D., Vignal,A., Morissette,J., Menninger,J., Lieman,J., Desai,T., Banks,A., Bray-Ward,P., Ward,D., Hudson,T., Gerety,S., Foote,S., Stein,L., Page,D.C., Lander,E.S., Weissenbach,J., Le Paslier,D., and Cohen,D. (1995) A YAC contig map of the human genome. Nature Suppl. 377:174-294.
Chumakov,I., Rigault,P., Guillou,S., Ougen,P., Billault,A., Guasconi,G., Gervy,P., Le Gall,I., Soularue,P., Grinas,L., Bougueleret,L., Bellanne-Chantelot,C., Lacroix,B., Barillot,E., Gesnouin,P., Pook,S., Vaysseix,G., Frelat,G., Schmitz,A., Sambucy,J.-L., Bosch,A., Estivill,X., Weissenbach,J., Vignal,A., Riethman,H., Cox,D., Patterson,D., Gardiner,K., Hattori,M., Sakaki,Y., Ichikawa,H., Ohki,M., Le Paslier,D., Heilig,R., Antonarakis,S., and Cohen,D. (1992) Continuum of overlapping clones spanning the entire human chromosome 21q. Nature 359:380-386.
Cohen,D., Chumakov,I., and Weissenbach,J. (1993) A first-generation physical map of the human genome. Nature 366:698-701.
Collins,J.E., Cole,C.G., Smink,L.J., Garrett,C.L., Laversha,M.A., Soderlund,C.A., Maslen,G.L., Everrett,A.A., Rice,K.M., Coffey,A.J., Gregory,S.G., Gwilliam,R., Dunham,A., Davies,A.F., Hassock,S., Todd,C.M., Lehrach,H., Hulsebos,T.J.M., Weissenbach,J., Morrow,B., Kucherlapati,R.S., Wadey,R., Scambler,P.J., Kim,U.-J., Simon,M.I., Peyrard,M., Xie,Y.-G., Carter,N.P., Durbin,R., Dumanski,J.P., Bentley,D.R., and Dunham,I. (1995) A high-density YAC contig map of human chromosome 22. Nature Suppl. 377:367-371.
Collins,F.S.
(1995) Positional cloning moves from perditional to traditional. Nature Genetics 9:347-350.
Collins,F. and Galas,D. (1993) A new five-year plan for the U.S. Human Genome Project. Science 262:43-46.
Coulson,A., Sulston,J., Brenner,S., and Karn,J. (1986) Toward a physical map of the genome of the nematode Caenorhabditis elegans. Proc. Natl. Acad. Sci. USA 83:7821-7825.
Cox,D.R., Burmeister,M., Price,E.R., Kim,S., and Myers,R.M. (1990) Radiation hybrid mapping: A somatic cell genetic method for constructing high-resolution maps of mammalian chromosomes. Science 250:245-250.
Cuticchia,A.J., Chipperfield,M.A., Porter,C.J., Kearns,W., and Peterson,P.L. (1993) Managing all those bytes: The Human Genome Project. Science 262:47-48.
Dausset,J., Cann,H., Cohen,D., Lathrop,M., Lalouel,J.-M., and White,R. (1990) Centre d'Etude du Polymorphisme Humain (CEPH): Collaborative genetic mapping of the human genome. Genomics 6:575-577.
Dib,C., Faure,S., Fizames,C., Samson,D., Drouot,N., Vignal,A., Millasseau,P., Marc,S., Hazan,J., Seboun,E., Lathrop,M., Gyapay,G., Morissette,J., and Weissenbach,J. (1996) A comprehensive genetic map of the human genome based on 5,264 microsatellites. Nature 380:152-154.
Doggett,N.A., Goodwin,L.A., Tesmer,J.G., Meincke,L.J., Bruce,D.C., Clark,L.M., Altherr,M.R., Ford,A.A., Chi,H.-C., Marrone,B.L., Longmire,J.L., Lane,S.A., Whitmore,S.A., Lowenstein,M.G., Sutherland,R.D., Mundt,M.O., Knill,E.H., Bruno,W.J., Macken,C.A., Torney,D.C., Wu,J.-R., Griffith,J., Sutherland,G.R., Deaven,L.L., Callen,D.F., and Moyzis,R.K. (1995) An integrated physical map of human chromosome 16. Nature Suppl. 377:335-340.
Donis-Keller,H., Green,P., Helms,C., Cartinhour,S., Weissenbach,B., Stephens,K., Keith,T.P., Bowden,P.W., Smith,D.R., Lander,E.S., Botstein,D., Akots,G., Rediker,K.S., Gravius,T., Brown,V.A., Rising,M.B., Parker,C., Powers,J.A., Watt,D.E., Kauffman,E.R., Bricker,A., Phipps,P., Muller-Kahle,H., Fulton,T.R., Ng,S., Schumm,J.W., Braman,J.C., Knowlton,R.G., Barker,D.F., Crooks,S.M., Lincoln,S.E., Daly,M.J., and Abrahamson,J. (1987) A genetic linkage map of the human genome. Cell 51:319-337.
Dracopoli,N.C., O'Connell,P., Elsner,T.I., Lalouel,J.-M., White,R.L., Buetow,K.H., Nishimura,D.Y., Murray,J.C., Helms,C., Mishra,S.K., Donis-Keller,H., Hall,J.M., Lee,M.K., King,M.-C., Attwood,J., Morton,M.E., Robson,E.B., Mahtani,M., Willard,H.F., Royle,N.J., Patel,I., Jeffreys,A.J., Verga,V., Jenkins,T., Weber,J.L., Mitchell,A.L., and Bale,A.E. (1991) The CEPH consortium linkage map of human chromosome 1. Genomics 9:686-700.
Duyk,G.M., Kim,S., Myers,R.M., and Cox,D. (1990) Exon trapping: A genetic screen to identify candidate transcribed sequences in cloned mammalian genomic DNA. Proc. Natl. Acad. Sci. USA 87:8995-8999.
Eki,T., Abe,M., Furuya,K., Ahmad,I., Fujishima,N., Kishida,H., Shiratori,A., Onazaki,T., Yokoyama,K., Le Paslier,D., Cohen,D., Hanaoka,F., Murakami,Y. (1996) A long range physical map of human chromosome 21q22.1 band from the YAC continuum. Mammalian Genome 7:303-311.
Gall,J.G. and Pardue,M.L. (1969) Formation and detection of RNA-DNA hybrid molecules in cytological preparations. Proc. Natl. Acad. Sci. USA 63:378.
Gemmill,R.M., Chumakov,I., Scott,P., Waggoner,B., Rigault,P., Cypser,J., Chen,Q., Weissenbach,J., Gardiner,K., Wang,H., Pekarsky,Y., Le Gall,I., Le Paslier,D., Guillou,S., Li,E., Robinson,L., Hahner,L., Todd,S., Cohen,D., Drabkin,H.A. (1995) A second-generation YAC contig map of human chromosome 3. Nature Suppl. 377:299-303.
George,K.P. (1970) Cytochemical differentiation along human chromosomes. Nature 226:80-81.
Gerhard,D.S., Kawasaki,E.S., Bancroft,F.C., and Szabo,P. (1981) Localization of a unique gene by direct hybridization in situ. Proc. Natl. Acad. Sci. USA 78:3755.
Green,E.D. and Olson,M.V. (1990) Systematic screening of yeast artificial-chromosome libraries by use of the polymerase chain reaction. Proc. Natl. Acad. Sci. USA 87:1213-1217.
Green,E.D. and Olson,M.V. (1990) Chromosomal region of the cystic fibrosis gene in yeast artificial chromosomes: A model for human genome mapping. Science 250:94-98.
Green,E.D., Riethman,H.C., Dutchik,J.E., and Olson,M.V. (1991) Detection and characterization of chimeric yeast artificial-chromosome clones. Genomics 11:658-669.
Hoheisel,J.D. and Lehrach,H. (1993) Use of reference libraries and hybridisation fingerprinting for relational genome analysis. FEBS Letters 325:118-122.
Hudson,T.J., Stein,L.D., Gerety,S.S., Ma,J., Castle,A.B., Silva,J., Slonim,D.K., Baptista,R., Kruglyak,L., Xu,S.-H., Hu,X., Colbert,A.M.E., Rosenberg,C., Reeve-Daly,M.P., Rozen,S., Hui,L., Wu,X., Vestergaard,C., Wilson,K.M., Bae,J.S., Maitra,S., Ganiatsas,S., Evans,C.A., DeAngelis,M.M., Ingalls,K.A., Nahf,R.W., Horton,L.T.Jr., Anderson,M.O., Collymore,A.J., Ye,W., Kouyoumjian,V., Zemsteva,I.S., Tam,J., Devine,R., Courtney,D.F., Renaud,M.T., Nguyun,H., O'Connor,T.J., Fizames,C., Faure,S., Gyapay,G., Dib,C., Morissette,J., Orlin,J.B., Birren,B.W., Goodman,N., Weissenbach,J., Hawkins,T.L., Foote,S., Page,D.C., and Lander,E.S. (1995) An STS-based map of the human genome. Science 270:1945-1954.
Imai,T. and Olson,M.V. (1990) Second-generation approach to the construction of yeast artificial-chromosome libraries. Genomics 8:297-303.
Inoue,H., Nojima,H., and Okayama,H. (1990) High efficiency transformation of Escherichia coli with plasmids. Gene 96:23-28.
John,H., Birnstiel,M.L., and Jones,K.N. (1969) RNA-DNA hybrids at the cytological level. Nature 223:582.
Johnston,M., Andrews,S., Brinkman,R., Cooper,J., Ding,H., Dover,J., Du,Z., Favello,A., Fulton,L., Gattung,S., Geisel,C., Kirsten,J., Kucaba,T., Hillier,L., Jier,M., Johnston,L., Langston,Y., Latreille,P., Louis,E.J., Macri,C., Mardis,E., Menezes,S., Mouser,L., Nhan,M., Rifkin,L., Riles,L., St.Peter,H., Trevaskis,E., Vaughan,K., Vignati,D., Wilcox,L., Wohldman,P., Waterston,R., Wilson,R., and Vaudin,M. (1994) Complete nucleotide sequence of Saccharomyces cerevisiae chromosome VIII. Science 265:2077-2082.
Jeffreys,A.J., Wilson,V., and Thein,S.W. (1985) Hypervariable 'minisatellite' regions in human DNA. Nature 314:67-73.
Kohara,Y., Akiyama,A., and Isono,K. (1987) The physical map of the whole E. coli chromosome: Application of a new strategy for rapid analysis and sorting of a large genomic library. Cell 50:495-508.
Lander,E.S. and Green,P. (1987) Construction of multilocus linkage maps in humans. Proc. Natl. Acad. Sci. USA 84:2363-2367.
Lander,E.S. and Waterman,M.S. (1988) Genomic mapping by fingerprinting random clones: A mathematical analysis. Genomics 2:231-239.
Larin,Z., Monaco,A.P., and Lehrach,H. (1991) Yeast artificial chromosome libraries containing large inserts from mouse and human DNA. Proc. Natl. Acad. Sci. USA 88:4123-4127.
Lindsay,S. and Bird,A.P. (1987) Use of restriction enzymes to detect potential gene sequences in mammalian DNA. Nature 327:336-338.
Malaspina,P., Roetto,A., Trettel,F., Jodice,C., Blasi,P., Frontali,M., Carella,M., Franco,B., Camaschella,C., Novolletto,A. (1996) Construction of a YAC contig covering human chromosome 6p22. Genomics 36:399-407.
McCormick,M.K., Campbell,E., Deaven,L., and Moyzis,R. (1993) Low-frequency chimeric yeast artificial chromosome libraries from flow-sorted human chromosomes 16 and 21. Proc. Natl. Acad. Sci. USA 90:1063-1067.
McNeil,J.A., Johnson,C.V., Carter,K.C., Singer,R.H., and Lawrence,J.B. (1991) Localizing DNA and RNA within nuclei and chromosomes by fluorescent in situ hybridization. GATA 8:41-58.
Merriam,J., Ashburner,M., Hartl,D.L., and Kafatos,F.C.
(1991) Toward cloning and mapping the genome of Drosophila. Science 254:221-225.
Morton,N.E. (1955) Sequential tests for the detection of linkage. Am. J. Hum. Genet. 7:277-319.
Murray,J.C., Buetow,K.H., Weber,J.L., Ludwigsen,S., Scherpbier-Heddema,T., Manion,F., Quillon,J., Sheffield,V.C., Sunden,S., Duyk,G.M., Weissenbach,J., Gyapay,G., Dib,C., Morissette,J., Lathrop,G.M., Vignal,A., White,R., Matsunami,N., Gerken,S., Melis,R., Albertsen,H., Plaetke,R., Odelberg,S., Ward,D., Dausset,J., Cohen,D., and Cann,H. (1994) A comprehensive human linkage map with centimorgan density. Science 265:2049-2054.
Nakamura,Y., Leppert,M., O'Connell,P., Wolff,R., Holm,T., Culver,M., Martin,C., Fujimoto,E., Hoff,M., Kumlin,E., and White,R. (1987) Variable number of tandem repeat (VNTR) markers for human gene mapping. Science 235:1616-1622.
National Research Council. Mapping and Sequencing the Human Genome. (National Academy Press, Washington, D.C., 1988).
Nelson,D.L., Ledbetter,S.A., Corbo,L., Victoria,M.F., Ramirez-Solis,R., Webster,T.D., Ledbetter,D.H., and Caskey,C.T. (1989) Alu polymerase chain reaction: A method for rapid isolation of human-specific sequences from complex DNA sources. Proc. Natl. Acad. Sci. USA 86:6686-6690.
NIH/CEPH Collaborative Mapping Group (1992) A comprehensive genetic linkage map of the human genome. Science 258:67-86.
Olson,M., Hood,L., Cantor,C., and Botstein,D. (1989) A common language for physical mapping of the human genome. Science 245:1344-1345.
Olson,M.V., Dutchik,J.E., Graham,M.Y., Brodeur,G.M., Helms,C., Frank,M., MacCollin,M., Scheinman,R., and Frank,T. (1986) Random-clone strategy for genomic restriction mapping in yeast. Proc. Natl. Acad. Sci. USA 83:7826-7830.
Olson,M.V. (1995) A time to sequence. Science 270:394-396.
Riley,J., Butler,R., Ogilvie,G., Finniear,R., Jenner,D., Powell,S., Anand,R., Smith,J.C., and Markham,A.F. (1990) A novel, rapid method for the isolation of terminal sequences from yeast artificial chromosome (YAC) clones. Nuc. Acids Res. 18:2887-2890.
Rommens,J.M., Iannuzzi,M.C., Kerem,B.-S., Drumm,M.L., Melmer,G., Dean,M., Rozmahel,R., Cole,J.L., Kennedy,D., Hidaka,N., Zsiga,M., Buchwald,M., Riordan,J.R., Tsui,L.-C., and Collins,F.S. (1989) Identification of the Cystic Fibrosis gene: Chromosome walking and jumping. Science 245:1059-1065.
Rose,M.D., Winston,F., and Hieter,P. Methods in Yeast Genetics: A Laboratory Course Manual. (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1990).
Sambrook,J., Fritsch,E.F., and Maniatis,T. Molecular Cloning: A Laboratory Manual. (Second Edition) (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1989).
Saiki,R.K., Gelfand,D.H., Stoffel,S., Scharf,S.J., Higuchi,R., Horn,G.T., Mullis,K.B., and Erlich,H.A. (1988) Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science 239:487-491.
Saiki,R.K., Scharf,S., Faloona,F., Mullis,K.B., Horn,G.T., Erlich,H.A., and Arnheim,N. (1985) Enzymatic amplification of β-globin genomic sequences and restriction site analysis for diagnosis of sickle-cell anemia. Science 230:1350-1354.
Scharf,S.J., Horn,G.T., and Erlich,H.A. (1986) Direct cloning and sequence analysis of enzymatically amplified genomic sequences. Science 233:1076-1078.
Schuler,G.D., Boguski,M.S., Stewart,E.A., Stein,L.D., Gyapay,G., Rice,K., White,R.E., Rodriguez-Tome, Aggarwal,A., Bajorek,E., Bentolila,S., Birren,B.B., Butler,A., Castle,A.B., Chiannilkulchai,N., Chu,A., Clee,C., Cowles,S., Day,P.J.R., Dibling,T., Drouot,N., Dunham,I., Duprat,S., East,C., Edwards,C., Fan,J.-B., Fang,N., Fizames,C., Garrett,C., Green,L., Hadley,D., Harris,M., Harrison,P., Brady,S., Hicks,A., Holloway,E., Hui,L., Hussain,S., Louis-Dit-Sully,C., Ma,J., MacGilvery,A., Mader,C., Maratukulam,A., Matise,T.C., McKusick,K.B., Morissette,J., Mungall,A., Muselet,D., Nusbaum,H.C., Page,D.C., Peck,A., Perkins,S., Piercy,M., Qin,F., Quackenbush,J., Ranby,S., Reif,T., Rozen,S., Sanders,C., She,X., Silva,J., Slonim,D.K., Soderlund,C., Sun,W.-L., Tabar,P., Thangarajah,T., Vega-Czarney,N., Vollrath,D., Voyticky,S., Goodfellow,P.N., Houlgatte,R., Hudson,J.R.Jr., Ide,S.E., Iorio,K.R., Lee,W.Y., Seki,N., Nagase,T., Ishikawa,K., Nomura,N., Phillips,C., Polymeropoulos,M.H., Sandusky,M., Schmitt,K., Berry,R., Swanson,K., Torres,R., Venter,J.C., Sikela,J.M., Beckmann,J.S., Weissenbach,J., Myers,R.M., Cox,D.R., James,M.R., Bentley,D., Deloukas,P., Lander,E.S., and Hudson,T.J. (1996) A gene map of the human genome. Science 274:540-546.
Schutte,B.C., Ranade,K., Pruessner,J., and Dracopoli,N. (1996) Optimized conditions for cloning PCR products into an XcmI T-vector. Genbank Accession Number U61229.
Schwartz,D.C. and Cantor,C.R. (1984) Separation of yeast chromosome-sized DNAs by pulsed-field gradient gel electrophoresis. Cell 37:67-75.
Sherman,F., Fink,G.R., and Hicks,J.B. Laboratory Course Manual for Methods in Yeast Genetics. (Cold Spring Harbor Press, Cold Spring Harbor, New York, 1986).
Silverman,G.A., Ye,R.D., Pollock,K.M., Sadler,J.E., and Korsmeyer,S.J. (1989) Use of yeast artificial chromosome clones for mapping and walking within human chromosome segment 18q21.3. Proc.Nat.Acad.Sci. USA 86:7485-7489.
Smithies,O. (1995) Early days of gel electrophoresis. Genetics 139:1-4.
Soderlund,C. and Dunham,I. (1995) SAM: A system for iteratively building marker maps. Computer Applications in the Biosciences 11:645-655.
Spurr,N.K., Cox,S., Bryant,S.P., Attwood,J., Robson,E.B., Shields,D.C., Steinbrueck,J., Jenkins,T., Murray,J.C., Kidd,K.K., Summar,M.L., Tsipouras,P., Retief,A.E., Kruse,T.A., Bale,A.E., Vergnaud,G., Weber,J.L., McBride,O.W., Donis-Keller,H., and White,R.L. (1992) The CEPH consortium linkage map of human chromosome 2. Genomics 14:1055-1063.
Stallings,R.L., Torney,D.C., Hildebrand,C.E., Longmire,J.L., Deaven,L.L., Jett,J.H., Doggett,N.A., and Moyzis,R.K. (1990) Physical mapping of human chromosomes by repetitive sequence fingerprinting. Proc.Nat.Acad.Sci. USA 87:6218-6222.
Torney,D. (1991) Mapping using unique sequences. Journal of Mol.Biol. 217:259-264.
Trask,B.J. (1991) Fluorescence in situ hybridization: Application in cytogenetics and gene mapping. Trends Genet. 7:149-154.
Weber,J.L. and May,P.E. (1989) Abundant class of human DNA polymorphisms which can be typed using the polymerase chain reaction. Am.J.Hum.Genet. 44:388-396.
Weber,J.L. (1990) Informativeness of (dC-dA)n·(dG-dT)n polymorphisms. Genomics 7:524-530.
Weiss,M.C. and Green,H. (1967) Human-mouse hybrid cell lines containing partial complements of human chromosomes and functioning human genes. Proc.Nat.Acad.Sci. USA 58:1104-1111.
Weissenbach,J., Gyapay,G., Dib,C., Vignal,A., Morissette,J., Millasseau,P., Vaysseix,G., and Lathrop,M. (1992) A second-generation linkage map of the human genome. Nature 359:794-803.
White,R.L., Lalouel,J.-M., Nakamura,Y., Donis-Keller,H., Green,P., Bowden,D.W., Mathew,C.G.P., Easton,D.F., Robson,E.B., Morton,M.E., Gusella,J.F., Haines,J.L., Retief,A.E., Kidd,K.K., Murray,J.C., Lathrop,G.M., and Cann,H.M. (1990) The CEPH consortium primary linkage map of chromosome 10. Genomics 6:393-412.
White,R., Leppert,M., Bishop,T., Barker,D., Berkowitz,J., Brown,C., Callahan,P., Holm,T., and Jerominski,L. (1985) Construction of linkage maps with DNA markers for human chromosomes. Nature (London) 313:101-105.
Wilson,R., Ainscough,R., Anderson,K., Baynes,C., Berks,M., Bonfield,J., Burton,J., Connell,M., Copsey,T., Cooper,J., Coulson,A., Craxton,M., Dear,S., Du,Z., Durbin,R., Favello,A., Fraser,A., Fulton,L., Gardner,A., Green,P., Hawkins,T., Hillier,L., Jier,M., Johnston,L., Jones,M., Kershaw,J., Kirsten,J., Laisster,N., Latreille,P., Lightning,J., Lloyd,C., Mortimore,B., O'Callaghan,M., Parsons,J., Percy,C., Rifken,L., Roopra,A., Saunders,D., Shownkeen,R., Sims,M., Smaldon,N., Smith,A., Smith,M., Sonnhammer,E., Staden,R., Sulston,J., Thierry-Mieg,J., Thomas,K., Vaudin,M., Vaughan,K., Waterston,R., Watson,A., Weinstock,L., Wilkinson-Sproat,J., and Wohldman,P. (1994) 2.2 Mb of contiguous nucleotide sequence from chromosome III of C.elegans. Nature 368:32-38.
Yu,C.-E., Oshima,J., Fu,Y.-H., Wijsman,E.M., Hisama,F., Alisch,R., Matthews,S., Nakura,J., Miki,T., Ouais,S., Martin,G.M., Mulligan,J., and Schellenberg,G.D. (1996) Positional cloning of the Werner's Syndrome gene. Science 272:258-262.

Appendix 1. WC 8*1

The following is a modified representation of the Whitehead/MIT singly-linked contig WC 8*1. YACs are listed vertically, with associated positive markers. The YACs are ordered from the most telomeric to the most centromeric. Letters in brackets () following each marker define the method by which YACs were found to be positive for the particular marker (see the Discussion for a detailed description of these methods): C = CEPH result, verified by Whitehead/MIT; c = CEPH result, not verified; D = definite Whitehead/MIT result; A = ambiguous result; F = ambiguous result resolved using fingerprinting data; S = ambiguous result resolved using doubly-linked STS data; V = positive result verified by testing individual YAC.
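The evidence codes in the legend above lend themselves to a small lookup table, and rows of the listing can be read into marker sets so that the STS content shared by two YACs (the basis for inferring overlap) is easy to inspect. The following is only a minimal sketch: it assumes each row has the form "YAC marker (code) marker (code) ...", and the function names `parse_row` and `shared_markers` are hypothetical helpers, not part of any Whitehead/MIT software.

```python
import re

# Evidence codes exactly as defined in the Appendix 1 legend.
EVIDENCE = {
    "C": "CEPH result, verified by Whitehead/MIT",
    "c": "CEPH result, not verified",
    "D": "definite Whitehead/MIT result",
    "A": "ambiguous result",
    "F": "ambiguous result resolved using fingerprinting data",
    "S": "ambiguous result resolved using doubly-linked STS data",
    "V": "positive result verified by testing individual YAC",
}

# One hit is a marker name followed by a single-letter code in brackets.
HIT = re.compile(r"(\S+)\s*\(([A-Za-z])\)")


def parse_row(row):
    """Split 'YAC marker (code) ...' into (yac_name, {marker: code})."""
    yac, _, rest = row.strip().partition(" ")
    return yac, {marker: code for marker, code in HIT.findall(rest)}


def shared_markers(row_a, row_b):
    """Return the sorted markers for which both YACs are positive."""
    _, hits_a = parse_row(row_a)
    _, hits_b = parse_row(row_b)
    return sorted(set(hits_a) & set(hits_b))


# Two rows taken from the listing in this appendix.
row_1 = "849_H_4 D8S1825 (S) CHLC.GCT18C02 (D) CHLC.GATA67F10 (D)"
row_2 = "785_D_6 CHLC.GCT18C02 (D) D8S516 (C)"

yac, hits = parse_row(row_1)
for marker, code in hits.items():
    print(yac, marker, EVIDENCE[code])
print(shared_markers(row_1, row_2))
```

A shared marker is only suggestive of physical overlap; rows supported solely by codes A or c would presumably deserve less weight than rows supported by D, C or V.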
YACS  STS CONTENT
918_B_5  D8S1825 (D)
849_H_4  D8S1825 (S) CHLC.GCT18C02 (D) CHLC.GATA67F10 (D)
871_A_8  D8S1825 (D) CHLC.GCT18C02 (S) CHLC.GATA67F10 (F)
776_F_4  D8S1825 (D) CHLC.GCT18C02 (D) CHLC.GATA67F10 (D) D8S503 (D)
785_D_6  CHLC.GCT18C02 (D) D8S516 (C)
742_D_12  CHLC.GCT18C02 (D) CHLC.GATA67F10 (D) D8S503 (V) D8S516 (C)
966_B_5  CHLC.GCT18C02 (D) CHLC.GATA67F10 (D) D8S503 (D) D8S516 (C) WI-5840 (F)
889_B_7  CHLC.GCT18C02 (F) CHLC.GATA67F10 (D) D8S503 (D)
765_F_11  CHLC.GCT18C02 (S) CHLC.GATA67F10 (D)
727_C_12  CHLC.GCT18C02 (F) CHLC.GATA67F10 (F)
725_A_3  CHLC.GATA67F10 (S) D8S503 (D)
845_A_9  D8S503 (F)
626_G_9  D8S503 (C) D8S516 (C)
825_A_1  D8S516 (C) D8S542 (C) D8S1721 (D)
843_E_1  D8S516 (C) WI-5840 (A)
786_B_12  WI-5840 (F)
821_G_5  WI-5840 (D)
807_F_11  D8S1721 (D)
915_H_4  D8S1721 (D) D8S1755 (D) D8S550 (F) D8S1593 (F) D8S265 (D) D8S1695 (D)
722_F_6  D8S1721 (D)
908_B_5  D8S1721 (D)
770_E_9  D8S520 (C) D8S1755 (D) D8S550 (F) D8S1593 (F) D8S265 (C) D8S1695 (D)
700_D_3  D8S520 (c)
749_H_6  D8S520 (c)
723_F_10  D8S1755 (D) D8S550 (S) D8S1593 (D)
692_C_9  D8S550 (C)
793_F_12  D8S550 (c)
737_E_5  D8S1593 (F) D8S265 (F) D8S1695 (D) D8S1759 (D) D8S1946 (D)
821_E_1  D8S1593 (D) D8S265 (D) D8S1759 (D) D8S1130 (S)
710_A_5  D8S1593 (D)
751_E_7  D8S1593 (D) D8S265 (F) D8S1695 (D) D8S1759 (D)
729_E_12  D8S1593 (D) D8S265 (D) D8S1695 (D)
937_H_5  D8S265 (C)
715_C_10  D8S265 (S) D8S1695 (D) D8S1759 (D)
773_G_4  D8S1695 (D) D8S1759 (D) D8S1946 (D) D8S1130 (F) D8S1640 (D) D8S1619 (D) D8S1993 (D)
937_G_5  D8S1695 (D)
894_G_12  D8S1759 (D)
809_H_8  D8S1759 (D) D8S1946 (D) D8S1130 (D) D8S1640 (D) D8S1993 (F)
725_C_12  D8S1946 (D)
802_C_3  RK1363_1367 (S)
842_A_9  RK1363_1367 (D)
959_F_7  RK1363_1367 (F)
949_D_12  RK1363_1367 (D) D8S1640 (S)
889_D_10  RK1363_1367 (D) D8S1640 (D)
946_D_6  RK1363_1367 (D)
928_E_7  RK1363_1367 (D)
794_E_11  RK1363_1367 (D) D8S1640 (F)
894_G_4  RK1363_1367 (D)
870_F_10  RK1363_1367 (S)
919_F_10  RK1363_1367 (F)
929_B_6  D8S1130 (D) D8S1640 (S)
871_F_3  D8S1130 (F) D8S1640 (S) D8S1993 (F)
894_G_9  D8S1130 (D) D8S1640 (D)
820_B_4  D8S1640 (D) D8S1619 (D) D8S1993 (D)
764_C_7  D8S1640 (F) D8S1619 (F) D8S1993 (F)
787_H_1  D8S1640 (S) D8S1619 (S)
945_H_8  D8S1640 (S)
711_B_9  D8S1640 (D)
920_D_12  D8S1640 (S) D8S1993 (D) D8S552 (D) D8S1106 (D) D8S1107 (D)
794_D_12  D8S1640 (D)
764_D_9  D8S1640 (D)
954_E_7  D8S1640 (S)
937_E_1  D8S1640 (F) D8S1993 (S) D8S552 (F) D8S1106 (F) D8S1754 (F) WI-6240 (F) WI-2428 (F) D8S1790 (F)
880_F_6  D8S1640 (D) D8S1619 (D)
915_C_1  D8S1640 (D)
715_F_5  D8S1619 (D)
799_B_1  D8S1619 (D) D8S1993 (S) D8S552 (D) D8S1106 (S) D8S1107 (S) D8S1109 (S)
750_F_10  D8S1619 (F) D8S1993 (S)
896_F_7  D8S1619 (D) D8S1993 (S)
717_G_11  D8S1993 (D)
929_E_12  D8S1993 (F)
729_H_6  D8S1993 (D)
943_C_7  D8S1993 (F)
748_C_9  D8S1993 (S)
744_E_6  D8S552 (D) D8S1106 (F) D8S1107 (D) D8S1109 (D)
691_F_8  D8S552 (C)
932_H_3  D8S552 (D) D8S1106 (S) D8S1107 (S) D8S1754 (D) WI-6240 (D)
875_B_7  D8S1106 (D) D8S1107 (S) D8S1109 (S) D8S1754 (D) WI-6240 (S) WI-2428 (S) D8S1790 (D)
762_G_9  D8S1109 (D) WI-6240 (D) WI-2428 (D) D8S1790 (D)
946_A_6  D8S1109 (D) D8S1754 (F) WI-6240 (S) WI-2428 (S) D8S1790 (S)
888_E_10  D8S1754 (D) WI-6240 (F) WI-2428 (F) D8S1790 (S)
947_E_6  D8S1790 (F)
944_F_2  D8S511 (C) D8S1827 (F)
732_E_3  D8S511 (C)
886_C_8  D8S1827 (D)
832_A_10  D8S1827 (S) WI-5790 (D) D8S549 (D)
768_D_1  D8S1827 (F)
821_F_7  D8S1827 (D) WI-5790 (S) D8S549 (F)
840_G_7  D8S1827 (D) WI-5790 (D) D8S549 (D)
816_C_3  D8S1827 (F)
754_D_4  D8S1827 (F) D8S549 (C)
810_G_9  D8S1827 (F)
721_C_7  WI-5790 (D)
946_C_9  WI-5790 (D) D8S549 (D) WI-7228 (D)
931_A_1  WI-5790 (S) D8S549 (F) WI-7228 (F) WI-5397 (F)
752_D_3  WI-5790 (D)
877_F_2  WI-5790 (D) D8S1993 (A)
802_F_11  WI-5790 (D)
850_C_5  D8S549 (c)
932_E_9  D8S549 (F) WI-7228 (F) WI-5397 (F) AFM177XB10 (F) CHLC.GATA29A08 (D)
632_D_12  D8S549 (C)
856_E_11  D8S549 (F) WI-7228 (F)
767_H_8  D8S549 (D)
639_G_4  D8S549 (C)
766_A_12  WI-7228 (F) WI-5397 (F) AFM177XB10 (D) CHLC.GATA29A08 (F)
874_A_6  WI-7228 (D) WI-5397 (F)
935_F_8  WI-5397 (D) AFM177XB10 (D)
958_C_7  WI-5397 (F) AFM177XB10 (D)
798_G_3  AFM177XB10 (F)
780_F_7  AFM177XB10 (F)
903_H_1  AFM177XB10 (D)
722_C_9  AFM177XB10 (F)
813_E_4  AFM177XB10 (D)
857_H_5  AFM177XB10 (F)
751_D_3  AFM333TH1 (D)
772_E_7  AFM333TH1 (D)
947_G_11  AFM333TH1 (S) D8S261 (S)
841_D_2  AFM333TH1 (D)
852_F_10  AFM333TH1 (F) D8S261 (F) WI-6514 (S)
955_H_2  AFM333TH1 (D) D8S261 (D) WI-6514 (S)
727_F_3  AFM333TH1 (S) WI-6514 (F)
801_A_9  D8S261 (S) WI-6514 (D) NIB1769 (D) CHLC.GATA72C10 (F)
860_D_5  D8S261 (D)
753_E_4  D8S261 (F) WI-6514 (F) CHLC.GGAT12E04 (D) WI-5962 (D)
868_B_1  D8S261 (C)
812_G_7  D8S261 (S) WI-6514 (D)
947_D_11  WI-6514 (D) WI-4688 (F) WI-4873 (F) WI-9031 (F) D8S1715 (F)
769_G_1  WI-6514 (D) NIB1769 (D) CHLC.GATA72C10 (D)
847_F_8  WI-6514 (S) NIB1769 (D) CHLC.GATA72C10 (F)
915_D_8  NIB1769 (D) CHLC.GATA72C10 (D) WI-6738 (D) WI-5962 (D)
859_A_7  NIB1769 (D) CHLC.GATA72C10 (D) WI-6738 (D) CHLC.GGAT12E04 (S) WI-5962 (D) WI-4688 (S) WI-5355 (S) WI-4873 (D) WI-9031 (D) D8S1715 (S)
757_A_2  NIB1769 (D) CHLC.GATA72C10 (F) WI-6738 (F) CHLC.GGAT12E04 (D)
756_C_9  NIB1769 (D) CHLC.GATA72C10 (F)
948_D_2  NIB1769 (D) CHLC.GATA72C10 (F) WI-4688 (F)
895_A_10  NIB1769 (D)
757_D_2  CHLC.GATA72C10 (F)
895_A_12  CHLC.GATA72C10 (D) WI-5962 (D)
948_A_2  WI-6738 (D)
908_G_7  WI-6738 (D) CHLC.GGAT12E04 (D) WI-5962 (D)
792_B_2  CHLC.GGAT12E04 (S) WI-5962 (F) WI-4688 (F)
787_B_2  WI-5962 (D)
791_A_7  WI-5962 (F) WI-4688 (F)
723_C_12  WI-4688 (F)
769_E_3  WI-4688 (D) D8S1715 (F)
932_A_6  WI-4688 (D)
857_A_1  WI-4688 (F)
739_B_2  WI-4688 (D) WI-5355 (D) WI-4873 (D) WI-9031 (D) D8S1715 (D)
791_C_8  WI-4688 (F)
742_B_9  WI-4688 (D)
943_D_12  WI-4688 (S) WI-5355 (S) WI-4873 (F) WI-9031 (F) D8S1715 (F) D8S258 (S) D8S280 (F) WI-9078 (F)
936_C_3  WI-4688 (D) WI-5355 (D) WI-4873 (D) WI-9031 (D) D8S1715 (D) D8S258 (D) WI-9078 (D)
720_F_1  WI-5355 (D) WI-4873 (D) WI-9031 (D) D8S1715 (D)
755_F_9  WI-5355 (S) WI-4873 (D) WI-9031 (F) D8S1715 (F) D8S258 (S) D8S280 (C) WI-9078 (F)
732_E_10  WI-5355 (D) WI-4873 (D) WI-9031 (D) D8S1715 (D)
970_B_6  WI-5355 (S) WI-4873 (D) D8S258 (D) D8S280 (D)
911_G_7  WI-5355 (D) WI-4873 (D) WI-9031 (D) D8S1715 (D) D8S258 (S)
947_A_12  WI-4873 (F)
948_D_5  WI-4873 (F) WI-9031 (F) D8S1715 (F) D8S258 (S) D8S280 (F)
755_B_1  WI-9031 (F) D8S1715 (F) D8S258 (S) D8S280 (F)
770_C_5  D8S1715 (F) D8S258 (C) D8S280 (F)
771_D_12  D8S1715 (F) D8S258 (C) D8S280 (F)
904_D_7  D8S258 (C) WI-10327 (D)
913_F_10  D8S258 (C) D8S280 (F) WI-10327 (A)
844_F_12  D8S258 (D) D8S280 (D)
654_G_7  D8S258 (C)
746_H_6  WI-10327 (D) CHLC.ATA18B10 (S) D8S282 (D) WI-6088 (D)
713_G_1  CHLC.ATA18B10 (S) D8S282 (D)
900_G_12  CHLC.ATA18B10 (D) D8S282 (D) WI-6088 (D)
690_F_5  D8S282 (C)

Appendix 2. Published research completed during this thesis.

The following paper summarizes research completed during this thesis, which resulted in the genetic mapping of the beta-3-adrenergic receptor (ADRB3) to a 17.9 cM region of chromosome 8p.

Cytogenet Cell Genet 73:331-333 (1996)
Cytogenetics and Cell Genetics

Analysis of CA repeat polymorphisms places three human gene loci on the 8p linkage map

R. Bruskiewich,1 T. Everson,1 L. Ma,1 L. Chan,1 M. Schertzer,1 J.-P. Giacobino,2 P. Muzzin,2 and S. Wood1
1 Department of Medical Genetics, University of British Columbia, Vancouver, BC (Canada), and 2 Département de Biochimie médicale, Centre Médical Universitaire, Geneva (Switzerland)

Abstract. The gene loci for luteinizing hormone-releasing hormone (LHRH), the beta-3 adrenergic receptor (ADRB3), and heregulin (HGL) have been assigned to the short arm of human chromosome 8, but the positions of these loci on the human genetic linkage map have not been previously reported. We have isolated simple tandem repeat polymorphisms (STRPs) for these loci. These STRPs enabled us to determine the genetic map locations for these genes.
Supported by the Canadian Genome Analysis and Technology Program of the Medical Research Council of Canada (GO12753).
Received 26 October 1995; revision accepted 22 March 1996.
Request reprints from Dr. Stephen Wood, Department of Medical Genetics, 6174 University Boulevard, Vancouver, British Columbia (Canada) V6T 1Z3; telephone: 604-822-6830; fax: 604-822-5348.
© 1996 S. Karger AG, Basel. 0301-0171/96/0734-0331 $10.00/0

Luteinizing hormone-releasing hormone (LHRH) is a key neuroendocrine molecule in the hypothalamic-pituitary-gonadal hormonal system controlling human reproduction. Impaired function of this hormone may underlie such reproductive phenotypes as hypogonadism and precocious puberty (Cattanach et al., 1977; Mason et al., 1986). LHRH may also inhibit tumor-cell proliferation (Harris et al., 1991; Szende et al., 1991; Limonta et al., 1993; Irmer et al., 1994, 1995). It is interesting to note that a tumor suppressor inactivated in certain reproductive cancers is postulated to reside on chromosome 8p (Kerangueven et al., 1995). Yang-Feng et al. (1985) assigned the LHRH locus to 8p21→p11.2 by in situ hybridization and somatic cell hybrid analysis using a cloned cDNA probe (Seeburg and Adelman, 1984; Adelman et al., 1986). Oshima et al. (1994) have also recently placed LHRH on a radiation hybrid map.

The beta-3 adrenergic receptor (ADRB3) is a member of a family of adrenergic receptors involved in the signal transduction of the hormones epinephrine and norepinephrine. These adrenergic receptors are G-protein-coupled catecholamine receptors with seven membrane-spanning domains (Emorine et al., 1987). ADRB3 is expressed in a variety of tissues, including adipocytes (Granneman et al., 1993). Its high expression in murine adipose tissue suggests a possible involvement of ADRB3 in obesity and diabetes (Muzzin et al., 1991; Nahmias et al., 1991). ADRB3 is located within a chromosomal region that is consistently amplified in human breast cancer (Dib et al., 1995). Human ADRB3 has been shown to map to 8p12→8p11.1, and its murine homolog to mouse chromosome 8, by in situ hybridization (Nahmias et al., 1991).

Heregulin (HGL), or neu differentiation factor (NDF), is a ligand which interacts with the Neu/ErbB-2 receptor tyrosine kinase (Holmes et al., 1992). It is a 44-kDa glycoprotein that is similar in amino acid sequence to epidermal growth factor (EGF). Alternative splicing produces at least 10 isoforms, classified into two groups, α and β, which differ in their EGF-like domains (Peles and Yarden, 1993). The receptors for HGL and EGF are encoded by related protooncogenes that are associated with a variety of human malignancies (Yarden and Ullrich, 1988). Lee and Wood (1993) mapped HGL to 8p22→p11 using somatic cell hybrids. Orr-Urtreger et al. (1993) localized the HGL gene to 8p21→p12 by in situ hybridization. Thomas et al. (1993) excluded HGL as a candidate for the Werner syndrome locus (WRN), positioning HGL on the linkage map relative to WRN with a maximum lod of 5.32 at a recombination fraction of 0.017.

In this paper, we report the identification of simple tandem repeat polymorphism (STRP) gene markers for LHRH, HGL, and ADRB3 and the results of genotyping eight CEPH families to place these loci on the 8p linkage map.

Materials and methods

Families
Eight CEPH reference families (102, 884, 1331, 1332, 1347, 1362, 1413, and 1416) were genotyped for each of the three markers (LHRH, HGL, and ADRB3). Heterozygosities were estimated from parents and grandparents of the pedigrees.

STRP locus identification
Derived or published primer sequences with minor modifications were used as gene-specific sequence tagged site (STS) reagents for identifying cosmids containing the target loci from a genomic library, LA08NC01, of flow-sorted chromosome 8 DNA (Wood et al., 1992). The STS primers were as follows: LHRH: 5'-CCTTGTCTGGATCTAATTTGATTG-3' and 5'-TCACCTGGAGCATCTAGGGTACA-3' (exon #2 primers, Nakayama et al., 1990); HGL: 5'-CCTTTTCAGGATGTGGTCATTG-3' and 5'-CTGTCTGCCTGAATAGGAGC-3' (primers oher21.11.5 and oher21.11.1; Lee and Wood, 1992); ADRB3: 5'-AGCACGTTGGCCAGAAAGAAG-3' and 5'-TCCTCCGTCTCCTTCTACCTT-3' (derived from a sequence reported by Emorine et al., 1989). Cosmids were screened for (dC-dA)n simple tandem repeat sequences (STRs) using poly-GT oligonucleotide hybridization. Positive restriction fragments were subcloned into a Bluescript II KS vector (Stratagene) by either "shotgun" subcloning or band isolation from agarose gels onto DEAE membranes. Dideoxynucleotide sequencing reactions on double-stranded templates were carried out using vector primers and Sequenase (US Biochemical) to identify the STR sequences. Primers for STR detection by PCR amplification were designed from unique flanking sequences.

The STR primers and specific PCR amplification conditions were as follows. For LHRH, the CA strand primer used was 5'-GACTTATCCTCCTTGTTTCCC-3'; the GT strand primer, 5'-ATAAAGGACAGTCATTCTGGAG-3'; the final concentration of MgCl2, 2.0 mM; and the annealing temperature (Tm), 58 °C. For ADRB3, the CA strand primer was 5'-GCAATGCTTTGTGCCTGTGC-3'; the GT strand primer, 5'-ATGCTATAATCCAGTGGTGCC-3'; the final concentration of MgCl2, 2.5 mM; and the Tm, 58 °C. For HGL, the CA strand primer was 5'-CATTGATTATGGAATGCC-3'; the GT strand primer, 5'-GTTGAAAAAAATTGTGTTCA-3'; the final concentration of MgCl2, 2.5 mM; and the Tm, 46 °C.

STRP genotyping
Amplification reactions of 40 cycles consisted of 1 min denaturation at 95 °C, 30 s annealing at the specified Tm, and extension for up to 2 min at 72 °C. A 40-ng sample of genomic DNA was used with 10 pmol of each primer in a 25-µl reaction mixture. The reaction buffer contained 50 mM Tris-Cl (pH 8.3), 0.02% NP 40, 0.02% Tween, and the MgCl2 concentration indicated above. Each dNTP was present in a concentration of 200 µM. One selected primer in each system was 5' end-labeled with [γ-32P]ATP, and 0.25 pmol (0.125 µCi) was added to each reaction with one unit of Taq polymerase (BRL). PCR products were run out on 5% Long Ranger modified polyacrylamide denaturing gels (AT Biochemicals) and detected by autoradiography on Kodak X-OMAT RP film. To determine STRP allele sizes, a radiolabeled M13mp18 plasmid sequence ladder was run on the gel alongside the genotyping reactions.

Linkage analysis
The three STRP loci were positioned on the linkage map by two-point and multipoint linkage analysis using the CRIMAP linkage program and genotypes in Version 7.1 of the CEPH database.

Results

LHRH
Cosmids 37F12 and 145G1 were identified as containing a portion of the LHRH gene. A polymorphic STR with a structure of (CA)23TTA3(CT)n was subcloned and characterized from cosmid 145G1. The cloned allele yielded a 237-nucleotide product upon amplification. All eight CEPH families were informative, giving 10 additional alleles ranging in size from 219 to 243 nucleotides and one smaller allele of 187 nucleotides. Heterozygosity for this system was 0.86. Pairwise lod scores for LHRH against the reference set of 8p markers were calculated, with no recombination observed between LHRH and D8S5, with a Zmax of 9.33. Multipoint linkage analysis placed LHRH with equal likelihood in either of the genetic intervals flanking D8S5, bounded proximally by D8S137 and distally by D8S136 (Fig. 1).

ADRB3
Cosmid 92G1 was found to contain both the ADRB3 gene and a CA dinucleotide STR. This STR, whose structure is (AC)nGC(AC)9, resides in the 3' untranslated region of the gene. Labeling the CA strand primer in the specified PCR system yields a cloned STR product 96 nucleotides in length. Three CEPH pedigrees were informative, giving a total of 48 genotyped individuals available for linkage analysis. Two alleles were noted, 96 and 94 nucleotides in size, with estimated frequencies of 0.814 and 0.186, respectively. The calculated heterozygosity for this system is 0.303. Pairwise lod scores for ADRB3 against the reference set of 8p markers were calculated, with zero recombination observed with D8S87 (Zmax = 3.61) and FGFR1 (Zmax = 8.43). Multipoint analysis placed ADRB3 in the genetic interval bounded proximally by D8S255 and distally by D8S137 (Fig. 1).

HGL
PCR screening identified cosmids 45A2, 67D8, and 91C2 containing the published heregulin STS. Cosmid 91C2 contained a polymorphic STR with a structure of (GT)12TC(GT)5N9(TG)4. Labeling the CA strand primer gave a 118-nucleotide cloned STR amplification product. Six of eight CEPH pedigrees were informative for the STRP. A second allele, 104 nucleotides in length, was also observed. The frequencies of the first and second alleles were 0.24 and 0.76, respectively. The heterozygote frequency was 0.365. Pairwise lod scores for HGL against the reference set of 8p markers were calculated, with zero recombination observed between HGL and D8S87 (Zmax = 9.03), whereas 6% recombination was observed between HGL and FGFR1 (Zmax = 9.77). Multipoint linkage analysis placed HGL in the genetic interval bounded proximally by FGFR1 and distally by D8S137 (Fig. 1).

[Figure 1: ideogram of 8p (bands 23.3, 23.2, 21.3, 21.2, 21.1) beside the sex-average map, showing reference markers D8S136, D8S5, D8S137, D8S87, FGFR1 and D8S255, intervals of 4.7, 4.3 and 4.3 cM, and the placements of HGL, LHRH and ADRB3.]
Fig. 1. Chromosome 8p linkage map with three new simple tandem repeat polymorphisms (STRPs). The thin horizontal lines connect each of the three novel gene-associated STRP markers with the corresponding reference map markers exhibiting no recombination with the gene marker; the bold vertical lines indicate the 1:1,000 likelihood intervals containing the novel gene markers.
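The heterozygosities reported for the two-allele systems can be checked from the allele frequencies with the usual expected-heterozygosity formula, H = 1 - Σp². The sketch below does that, and also shows the textbook two-point lod score for phase-known, fully informative meioses. That second function is a simplification offered only for illustration: the paper's lod scores came from CRIMAP pedigree analysis, and the meiosis count used in the example (30) is a hypothetical value, not a figure reported in the paper.

```python
import math


def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity H = 1 - sum(p_i^2), assuming Hardy-Weinberg proportions."""
    return 1.0 - sum(p * p for p in allele_freqs)


def two_point_lod(theta, n_meioses, n_recombinants):
    """Lod score at recombination fraction theta.

    Assumes phase-known, fully informative meioses, so the likelihood is
    (1 - theta)^(non-recombinants) * theta^(recombinants), compared with
    the likelihood under free recombination (theta = 0.5).
    """
    non_recombinants = n_meioses - n_recombinants
    likelihood = (1.0 - theta) ** non_recombinants * theta ** n_recombinants
    if likelihood == 0.0:
        # theta = 0 is impossible once any recombinant has been observed.
        return float("-inf")
    return math.log10(likelihood) - n_meioses * math.log10(0.5)


# Allele frequencies as reported in the Results.
adrb3_h = expected_heterozygosity([0.814, 0.186])
hgl_h = expected_heterozygosity([0.24, 0.76])
print(f"ADRB3 heterozygosity: {adrb3_h:.3f}")
print(f"HGL heterozygosity:   {hgl_h:.3f}")

# Hypothetical example: 30 informative meioses, no recombinants, theta = 0.
print(f"lod at theta = 0: {two_point_lod(0.0, 30, 0):.2f}")
```

Rounded to three decimals, the two heterozygosities agree with the reported values of 0.303 and 0.365. With no recombinants the lod at theta = 0 grows by log10(2), about 0.30, per informative meiosis, which is why modest family sets can reach the conventional threshold of 3.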
Discussion

We have placed three genes on the chromosome 8p linkage map by analyzing STRPs in CEPH families. The order of HGL and ADRB3 is uncertain, since the 1,000:1 likelihood intervals for these loci overlap. The order cen-ADRB3-HGL-tel is more likely by a factor of 2.3:1. While no recombination was observed between D8S87 and either HGL or ADRB3, the more proximal locus FGFR1 recombines with HGL but not ADRB3. The proximity of FGFR1 and ADRB3 is supported by physical map data placing them within 900 kb on a single YAC (Dib et al., 1995). LHRH maps close to D8S5, with which it shows no recombinants, while its map interval places it distal to both HGL and ADRB3. Thus, the order of these three loci is cen-ADRB3-HGL-LHRH-tel.

Current linkage maps are largely based on anonymous DNA markers, whereas it is the genes that are the elements of biological interest and whose map locations are of primary interest. This study adds three genes to the 8p linkage map and provides a highly polymorphic marker for the LHRH locus.

References

Adelman JP, Mason AJ, Hayflick JS, Seeburg PH: Isolation of the gene and hypothalamic cDNA for the common precursor of gonadotropin-releasing hormone and prolactin release-inhibiting factor in human and rat. Proc natl Acad Sci, USA 83:179-183 (1986).
Cattanach BM, Iddon CA, Charlton HM, Chiappa SA, Fink G: Gonadotrophin-releasing hormone deficiency in a mutant mouse with hypogonadism. Nature 269:338-340 (1977).
Dib A, Adelaide J, Chaffanet M, Imbert A, Lepaslier D, Jacquemier J, Gaudray P, Theillet C, Birnbaum D, Pebusque MJ: Characterization of the region of the short arm of chromosome 8 amplified in breast carcinoma. Oncogene 10:995-1001 (1995).
Emorine LJ, Marullo S, Briend-Sutren MM, Patey G, Taka K, Delavier-Klutchko C, Strosberg AD: Molecular characterization of the human beta 3-adrenergic receptor. Science 245:1118-1121 (1987).
Granneman JG, Lahners KN, Chaudhry A: Characterization of the human beta-3 adrenergic receptor gene. Molec Pharm 44:264-270 (1993).
Harris N, Dutlow C, Eidne K, Dong KW, Roberts J, Millar R: Gonadotropin-releasing hormone gene expression in MDA-MB-231 and ZR-75-1 breast carcinoma cell lines. Cancer Res 51:2577-2581 (1991).
Holmes WE, Sliwkowski MX, Akita RW, Henzel WJ, Lee J, Park JW, Yansura D, Abadi N, Raab H, Lewis GD, Shepard HM, Kuang W-J, Wood WI, Goeddel DV, Vandlen RL: Identification of heregulin, a specific activator of p185erbB2. Science 256:1205-1210 (1992).
Irmer G, Burger C, Muller R, Ortmann O, Peter U, Kakar SS, Neill JD, Schulz KD, Emons G: Expression of the messenger RNAs for luteinizing hormone-releasing hormone (LHRH) and its receptor in human ovarian epithelial carcinoma. Cancer Res 55:817-822 (1995).
Irmer G, Burger C, Ortmann O, Schulz KD, Emons G: Expression of luteinizing hormone releasing hormone and its mRNA in human endometrial cancer cell lines. J clin Endocrin Metab 79:916-919 (1994).
Kerangueven F, Essioux L, Dib A, Noguchi T, Allione F, Geneix J, Longy M, Lidereau R, Eisinger F, Pebusque MJ, Jacquemier J, Bonaiti-Pellie C, Sobol H, Birnbaum D: Loss of heterozygosity and linkage analysis in breast carcinoma: indication for a putative third susceptibility gene on the short arm of chromosome 8. Oncogene 10:1023-1026 (1995).
Lee J, Wood WI: Assignment of heregulin (HGL) to human chromosome 8p22→p11 by PCR analysis of somatic cell hybrid DNA. Genomics 16:790-791 (1993).
Limonta P, Dondi D, Moretti RM, Fermo D, Garattini E, Motta M: Expression of luteinizing hormone-releasing hormone mRNA in the human prostatic cancer cell line LNCaP. J clin Endocrin Metab 76:797-800 (1993).
Mason AJ, Hayflick JS, Zoeller RT, Young WS III, Phillips HS, Nikolics K, Seeburg PH: A deletion truncating the gonadotropin-releasing hormone gene is responsible for hypogonadism in the "hpg" mouse. Science 234:1366-1371 (1986).
Muzzin P, Revelli J-P, Kuhne F, Gocayne JD, McCombie WR, Venter JC, Giacobino JP, Fraser CM: An adipose tissue-specific β-adrenergic receptor: molecular cloning and down-regulation in obesity. J biol Chem 266:24053-24058 (1991).
Nahmias C, Blin N, Elalouf J-M, Mattei MG, Strosberg AD, Emorine LJ: Molecular characterization of the mouse beta-3 adrenergic receptor: relationship with the atypical receptor of adipocytes. EMBO J 10:3721-3727 (1991).
Orr-Urtreger A, Trakhtenbrot L, Ben-Levy R, Wen D, Rechavi G, Lonai P, Yarden Y: Neural expression and chromosomal mapping of Neu differentiation factor to 8p12→p21. Proc natl Acad Sci, USA 90:1867-1871 (1993).
Oshima J, Yu C, Boehnke M, Weber J, Edelhoff S, Wagner M, Wells DE, Wood S, Disteche C, Martin G, Schellenberg G: Integrated mapping analysis of the Werner syndrome region of chromosome 8. Genomics 23:100-113 (1994).
Peles E, Yarden Y: Neu and its ligands: from oncogene to neural factors. Bioessays 15:815-824 (1993).
Seeburg PH, Adelman JP: Characterization of cDNA for precursor of human luteinizing hormone releasing hormone. Nature 311:666-668 (1984).
Szende B, Srkalovic G, Timar J, Mulchahey JJ, Neill JD, Lapis K, Csikos A, Szepeshazi K, Schally AV: Localization of receptors for luteinizing hormone-releasing hormone in pancreatic and mammary cancer cells. Proc natl Acad Sci, USA 88:4153-4156 (1991).
Thomas W, Rubenstein M, Goto M, Drayna D: A genetic analysis of the Werner Syndrome region on chromosome 8p. Genomics 16:685-690 (1993).
Wood S, Schertzer M, Drabkin H, Patterson D, Longmire JL, Deaven LL: Characterization of a human chromosome 8 cosmid library constructed from flow-sorted chromosomes. Cytogenet Cell Genet 59:243-247 (1992).
Yang-Feng TL, Seeburg PH, Francke U: Human luteinizing hormone-releasing hormone gene (LHRH) is located on short arm of chromosome 8 (region 8p11.2→p21). Somat Cell molec Genet 12:95-100 (1986).
Yarden Y, Ullrich A: Growth factor receptor tyrosine kinases. A Rev Biochem 57:443-478 (1988).

