UBC Theses and Dissertations


The identification and characterization of genes in a candidate region for the Huntington’s disease gene Collins II, Colin Conrad 1993


THE IDENTIFICATION AND CHARACTERIZATION OF GENES IN A CANDIDATE REGION FOR THE HUNTINGTON’S DISEASE GENE

by

COLIN CONRAD COLLINS II

B.S., Western New England College, 1982

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

in

THE FACULTY OF GRADUATE STUDIES
GENETICS PROGRAMME

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
July, 1993

© Colin Conrad Collins II, 1993

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

(Signature)

Department of ________
The University of British Columbia
Vancouver, Canada

Date________________

DE-6 (2/88)

ABSTRACT

Huntington’s disease (HD) is an autosomal dominant neurodegenerative disease affecting approximately 1/10,000 individuals. Progressive psychiatric disturbances, involuntary movement disorders, termed chorea, and dementia are clinical hallmarks of HD, while neuropathologically a progressive loss of neurons throughout the central nervous system, and in particular within the caudate nucleus of the basal ganglia, is observed. The gene responsible for HD was localized to chromosome band 4p16.3 in 1983, by genetic linkage and in situ hybridization studies. This thesis describes the delineation of a 50 kb HD gene candidate region, within chromosome 4p16.3, and the identification and characterization of two novel human genes mapping within that 50 kb interval.

The chromosomes of a patient reported to carry a new mutation at the HD locus and those from 11 of her siblings were haplotyped, revealing a distal 4p16.3 recombination event in the affected individual.
Because both new mutations at the HD locus and distal 4p16.3 recombination events are rare, a hypothesis was formulated relating the recombination event to the occurrence of HD in the patient. In a test of this hypothesis, a chromosome walk was performed to identify the site of postulated unequal recombination. In the course of the chromosome walk, two restriction fragment length polymorphisms, informative in the sibship, were identified, enabling a 500 kb HD candidate region, approximately 700 kb from the 4p telomere, to be narrowed to 80 kb.

The molecular characterization of the 80 kb region ultimately reduced the interval to 50 kb and, moreover, revealed three putative CpG islands. Zoo blot analysis identified phylogenetically conserved DNA associated with two of these CpG islands, suggesting the presence of encoded genes. The subsequent probing of cDNA libraries and Northern blots led to the identification and characterization of two genes, one encoding the β-subunit of rod photoreceptor cGMP phosphodiesterase (PDEB) and the other a novel regulatory myosin light chain (MYL5). No evidence for an unequal recombination within the 50 kb interval was revealed by Southern blot analysis. Similarly, single strand conformational polymorphism (SSCP) analysis of the PDEB and MYL5 genes failed to reveal evidence for the postulated mutation.
These findings strongly suggest that the 4p16.3 recombination event is unrelated to the occurrence of HD in this sporadic case.

TABLE OF CONTENTS

ABSTRACT
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
DEDICATION
ACKNOWLEDGMENTS
1.0 HUNTINGTON’S CHOREA
1.1.0 INTRODUCTION
1.1.1 History
1.1.2 Pathology
1.2.0 THE GENETICS OF HUNTINGTON’S DISEASE
1.2.1 Dominance
1.2.2 Homozygosity
1.2.3 Penetrance and Expressivity
1.2.4 Anticipation
1.2.5 Heterogeneity
1.3.0 POSITIONAL CLONING
1.3.1 Overview
1.3.2 Genetic Localization
1.3.3 Physical Mapping and Cloning
1.3.4 Identification and Assessment of Candidate Genes
1.3.5 Zoo Blot Analysis
1.3.6 CpG Islands
1.3.7 cDNA Library Screening
1.3.8 Mutation Assessment
1.4.0 TOWARD THE HD GENE
1.5.0 STATEMENT OF PURPOSE
1.6.0 REFERENCES
2.0 MATERIALS AND METHODS
2.1.0 MATERIALS
2.2.0 METHODS
2.2.1 Cosmid Library Plating and Screening
2.2.2 cDNA Library Plating and Screening
2.2.3 cDNA Library Screening by the PCR
2.2.4 Northern Blot Analysis
2.2.5 Hybridization Probes
2.2.6 DNA Isolation and Southern Blot Analysis
2.2.7 Sequencing
2.3.0 REFERENCES
3.0 CLONING A 126 KB SEGMENT OF HD CANDIDATE REGION II
3.1.0 INTRODUCTION
3.2.0 RESULTS
3.2.1 Chromosome Walking
3.2.2 Identification Of Informative RFLPs
3.2.3 Identifying CpG Islands
3.2.4 The Identification of Phylogenetically Conserved DNA
3.2.5 Transcript Identification
3.3.0 DISCUSSION
3.4.0 REFERENCES
4.0 IDENTIFICATION AND CHARACTERIZATION OF THE GENE ENCODING FOR THE β-SUBUNIT OF ROD PHOTORECEPTOR cGMP PHOSPHODIESTERASE
4.1.0 INTRODUCTION
4.2.0 RESULTS
4.2.1 Transcript Identification and Isolation of cDNAs
4.2.2 The Full Length Sequence of the PDEB cDNA
4.2.3 Genomic Organization of the PDEB Gene
4.2.4 Evidence for Alternate Processing
4.2.5 Identification of Polymorphisms
4.2.6 Amino Acid Alignments
4.3.0 DISCUSSION
4.4.0 REFERENCES
5.0 THE GENOMIC ORGANIZATION OF A NOVEL REGULATORY MYOSIN LIGHT CHAIN GENE (MYL5) THAT MAPS TO CHROMOSOME 4p16.3 AND SHOWS DIFFERENT PATTERNS OF EXPRESSION BETWEEN PRIMATES
5.1.0 INTRODUCTION
5.2.0 RESULTS
5.2.1 Transcript Identification
5.2.2 Isolation of cDNAs
5.2.3 cDNA Sequence
5.2.4 Genomic Sequence
5.3.0 DISCUSSION
5.3.1 Genomic Organization
5.3.2 Amino Acid Alignments
5.3.3 Expression
5.3.4 Assessment as an HD Candidate Gene
5.4.0 REFERENCES
6.0 CONCLUSIONS
6.1.0 SUMMARY OF MAJOR FINDINGS AND CONCLUSIONS
6.2.0 FUTURE STUDIES
6.3.0 REFERENCES

LIST OF TABLES

Table 1-1 Disease loci cloned by positional cloning
Table 3-1 CpG island mapping
Table 5-1 MYL5 cDNA Sequencing and PCR Primers
Table 5-2 Exon-Intron Organization of MYL5

LIST OF FIGURES

Figure 1-1 Striatal atrophy in HD
Figure 1-2 Age matched normal and HD affected brains
Figure 1-3 A typical HD pedigree
Figure 1-4 Anticipation in an HD pedigree
Figure 1-5 Positional cloning flow chart
Figure 1-6 Map of chromosome 4p16.3
Figure 1-7 An apparent new mutation at the HD locus
Figure 3-1 DNA haplotype analysis on the pedigree with a new HD mutation
Figure 3-2 Map of the cosmids comprising the 126 kb contig
Figure 3-3 Five stable cosmids forming the 126 kb contig
Figure 3-4 Alu fingerprint of the five stable cosmids
Figure 3-5 (GT/CA)10 fingerprint of the five stable cosmids
Figure 3-6 (CCCTAA)4 fingerprint of the five stable cosmids
Figure 3-7 Haplotype analysis on the siblings from the sporadic HD pedigree
Figure 3-8 Mapped CpG Islands
Figure 3-9 Zoo blot with cosmid cQ
Figure 3-10 Southern blot with DpE2
Figure 3-11 Bios somatic cell hybrid mapping panel
Figure 3-12 Zoo blot with DpE2
Figure 3-13 Zoo blot with c26H1
Figure 3-14 Schematic representation of major findings
Figure 4-1 Results of screening five cDNA libraries
Figure 4-2 Northern analysis with probe Ki
Figure 4-3 Map of isolated PDEB cDNAs
Figure 4-4 Full length PDEB cDNA sequence
Figure 4-5 Mapping PDEB cDNA AR-i
Figure 4-6 Localization of the PDEB gene’s exon 1
Figure 4-7 Genomic organization of the PDEB gene
Figure 4-8 The first 12 nucleotides of the PDEB gene’s exon 10
Figure 4-9 Alternate splicing in exon 21 of the PDEB gene
Figure 4-10 Amino acid sequence alignments
Figure 5-1 Genomic organization of the MYL5 gene
Figure 5-2 Northern analysis with BS-i
Figure 5-3 Restriction map of the MYL5 cDNA
Figure 5-4 cDNA library screening by the PCR
Figure 5-5 Sequence of the MYL5 cDNA
Figure 5-6 Amino acid sequence alignments

DEDICATION

To the memory of Dorothy Lee Scott

ACKNOWLEDGMENTS

I would like to thank my supervisor, and friend, Dr. Michael Hayden for his unending support during my stay in his laboratory. I know that I will continue to learn from Michael and look forward to many fruitful discussions and collaborations in the future. I would also like to thank the members of my supervisory committee, Dr. Connie Eaves, Dr. Dixie Mager and Dr. Steven Wood, for their time and advice during my doctoral training.

I can never thank the many members of the Hayden laboratory enough for their support and friendship. As I write this in San Francisco, my mind returns to the many incredible runs, beach parties and pot luck dinners in Vancouver, and of course to the hunt for the HD gene. I would especially like to express my gratitude to Susan Andrew, Jane Theilman, Gordon Hutchinson, Biaoyang Lin, Lorne Clarke, Olaf Riess, David Kowbel, Howard Henderson, Amy Hedrik, Robbin Ma and Bruce Wilson. I will never forget Vancouver.

Finally, I would like to thank my family and especially my wife Stacey, who embarked on this adventure with me ten years ago, in San Francisco, at great self sacrifice.
Stacey has been a constant source of encouragement and support and as such really made this doctoral thesis a reality.

This thesis was supported in part by studentships from the Huntington Society of Canada and the Medical Research Council of Canada.

Chapter 1

Huntington’s Chorea

1.1.0 Introduction

1.1.1 History

The word chorea is derived from the Greek word χορεία, meaning dance, and describes an affliction characterized by irregular writhing and grimacing due to a loss of voluntary muscle control. The term chorea was first used to describe movement disorders by professor Paracelsus, at Basel, in the sixteenth century (Hayden, 1981). Paracelsus argued, contrary to prevailing Galenic psychophysiology, that chorea was of a physiological rather than a supernatural aetiology. However, unfortunately for the witches of Salem, this theory did not gain acceptance for 200 years. In mid-seventeenth century New England it was widely believed that chorea manifested as the consequence of an individual having renounced God. The horrible suffering of the choreic individual was believed to represent the suffering of Christ during the crucifixion. As a consequence of this superstition, women displaying choreiform movement disorders, some of whom may have suffered from hereditary chorea, were branded witches and executed during the Salem witch-hunt (Maltsberger, 1961).

In the year 1832 the first clinical description of hereditary chorea was published in the medical journal Lancet (Elliotson, 1832). Elliotson wrote that the chorea “appears to arise for the most part from something in the original constitution of the body, for I’ve often seen it hereditary” (Elliotson, 1832). Forty years later, in 1872, the first comprehensive description of hereditary chorea appeared in the Medical and Surgical Reporter, authored by the physician Dr. George Huntington (Huntington, 1872).
Huntington’s paper was abstracted into German (Kussmaul, 1872) and consequently, in the European tradition, hereditary chorea acquired the eponymic name Huntington’s chorea.

1.1.2 Pathology

Huntington’s chorea, now referred to as Huntington disease (HD), is an autosomal dominantly inherited neuropsychiatric disorder characterized clinically by chorea and intellectual decline and neuropathologically by progressive and selective neuronal cell death, particularly in the basal ganglia (Hayden 1981). The atrophy of the caudate nucleus, putamen and nucleus accumbens, which together comprise the basal ganglia, is considered the most characteristic pathological feature of HD (Bruyn, 1968).

Postmortem neuropathological and neurochemical analyses have revealed that, within these structures, it is the medium and small sized spiny neurons that are the most severely depleted in the brains of HD patients (Martin and Gusella, 1986). It is evident from figures 1-1 and 1-2 that, although the basal ganglia are the most profoundly affected region of the HD brain, neuronal cell loss occurs throughout the central nervous system (CNS), resulting in atrophy of the cerebral cortex, globus pallidus, thalamus, subthalamus, substantia nigra, cerebellum, brain stem, spinal cord, and lateral tuberal nucleus of the hypothalamus (reviewed in Harper, 1991). In advanced cases this global neuronal atrophy can reduce total brain mass by as much as 25% (Fig. 1-2). Concomitant with the neuronal cell loss is a compensatory enlargement of the ventricles and sulci of the CNS (Fig. 1-1).
Despite numerous studies investigating neurochemical and neuroanatomical changes associated with HD, no knowledge concerning the molecular basis for these changes has emerged.

Three cardinal features heralding the onset of HD, and which manifest as a consequence of neuronal cell death, are disturbances in personality and mentation and the presence of involuntary movements. Cognitive impairments can precede chorea by many years and often include loss of memory, particularly short term, and of organizational skills. In this regard it is significant that neuronal cell loss has been found in a presymptomatic individual at increased risk for having inherited the HD gene (Albin et al., 1992). As the disease progresses, the chorea becomes progressively more severe and for the majority of patients leads to dystonia and bradykinesia. Speech is initially impaired by dysarthria, leading eventually to mutism. A particularly cruel aspect of HD is that patients will usually maintain orientation of time and place and also maintain recognition of their own identities and those of family and friends (Hayden 1981).

On average the first clinical features of this disease manifest in the third to fourth decade of life and evolve steadily over a period of 15-20 years, inevitably culminating in death. Death is usually due to aspiration pneumonia, choking, heart attack, hematomas or suicide (Wexler, 1988). Perhaps the most insidious aspect of HD is that its symptoms generally manifest at an age after which many asymptomatic carriers of the HD mutation will have started families. For these individuals, the onset of symptoms brings the realization that not only are they themselves affected with HD but, moreover, that each of their children is at 50% risk for developing this devastating disease.

Figure 1-1. Striatal atrophy in Huntington’s disease. Severe neuronal atrophy of the caudate and putamen is evident in comparing the HD brain (left) to the control brain (right).
Note also the compensatory enlargement and deepening of the ventricles and sulci in the diseased brain.

Figure 1-2. The two brains shown in this figure were removed from age matched females. The brain at the bottom was removed from a patient who died as a consequence of HD. Note the dramatic reduction in total brain mass resulting from global neuronal atrophy (Harper, 1991).

1.2.0 The Genetics of Huntington Disease

1.2.1 Dominance

The HD mutation is one of approximately 1200 human genetic diseases displaying an autosomal dominant mode of inheritance (McKusick, 1986). This, combined with the fact that there are no reported cases of incomplete penetrance (Hayden, 1981; Harper, 1991), means that if a child inherits the HD mutation from a single affected parent that child will inevitably develop HD.

Figure 1-3. A typical HD pedigree displaying an autosomal dominant mode of inheritance and 100% penetrance.

1.2.2 Homozygosity

Wexler et al. (1987) identified a family consisting of two HD affected parents and 14 children. DNA haplotype analysis revealed that four of the 14 siblings display haplotypes consistent with their being homozygous at the HD locus. The probability that, as a consequence of genetic recombination, none of the four offspring is in fact homozygous for the HD mutation was calculated to be 1/500,000 (Wexler et al., 1987). Subsequent reexamination using closer markers has confirmed the original assessment of homozygosity. Significantly, this study concluded that the phenotype displayed by the putative HD homozygotes was indistinguishable from that of HD heterozygotes (Wexler et al., 1987). Further, in an independent study of additional suspected HD homozygotes, Myers et al.
(1989) found no evidence for the homozygous condition being more severe than that of the heterozygote. On the basis of these studies it seems reasonable to conclude that the HD mutation is fully dominant.

The observation that HD heterozygotes and homozygotes are phenotypically indistinguishable makes the HD mutation unique in being the only genetically determined human trait known to show true Mendelian-dominant inheritance (Wexler et al., 1987; Jenkins and Conneally, 1989). For every other dominant genetic disorder the homozygote, because of dosage effects, is more severely affected than the heterozygote and frequently does not survive to birth (Pauli, 1983; Jenkins and Conneally, 1989). Achondroplasia (Pauli et al., 1983), Marfan syndrome (Chemke et al., 1984), camptobrachydactyly (Edwards and Gale, 1972) and aniridia (Hodgson and Saunders, 1980) are examples of genetic diseases which result from dominant mutations displaying profound dosage effects in the homozygote.

The observation of complete dominance is important because it may provide insights into the nature of the HD mutation. One would not, for example, hypothesize that the HD mutation results in the loss of an essential structural protein or enzymatic function, since the homozygote is viable and there is no apparent dosage effect. Similarly, the phenotypic equivalence of HD homozygotes and heterozygotes is inconsistent with mutations which result in a dominant-negative effect (Muller, 1932; Herskowitz, 1987). Interestingly, in Drosophila, fully dominant mutations almost exclusively result in either gain of function (Bender et al., 1983; White and Akam, 1985) or transinactivation, such as that observed at the Drosophila brown locus (Henikoff et al., 1989). Regulatory mutations resulting in ectopic expression, either spatial or temporal, account for the vast majority of fully dominant gain of function mutations in Drosophila (Dr.
Hugh Brock, personal communication).

1.2.3 Penetrance and Expressivity

While the HD mutation is 100% penetrant it displays variable expressivity. Therefore all individuals who inherit the disease gene inevitably develop the disease; however, because the phenotype has a variable age of onset, the disease manifests over a wide range of ages (Farrer and Conneally, 1985; Hayden, 1981; Hayden et al., 1985; Hodge et al., 1980). In approximately 35% of cases, clinical presentation occurs either prior to the age of 20, classified as juvenile HD, or after the age of 50, termed late onset HD. Juvenile HD accounts for approximately 10% of all HD cases (Bruyn, 1968), with the earliest reported age at onset being two years. In contrast, some late onset patients remain asymptomatic into the eighth decade of life.

The observation that age at onset can be highly variable suggests the possibility that genetic background and/or environmental influences can serve to either ameliorate or exacerbate the rate at which the disease progresses. Kindred studies have consistently revealed that variation for age at onset is greater between families than within families, which has been interpreted as evidence for a role of genetic background (Reed et al., 1958). A role of genetic background is further supported by numerous monozygotic twin studies which have consistently shown concordance between co-twins for age at onset (reviewed in Hayden, 1981).

1.2.4 Anticipation

Another striking, and perhaps telling, property of the HD mutation is anticipation. Anticipation denotes a genetic phenomenon where the disease phenotype is expressed with increasing severity and/or at earlier ages at onset in successive generations.
Anticipation was once rejected on ideological grounds and ascribed to ascertainment bias (Penrose, 1948). However, the recent molecular characterization of the myotonic dystrophy (Harley et al., 1992; Brook et al., 1992; Fu et al., 1992; Mahadevan et al., 1992) and fragile X mutations (Kremer et al., 1991; Fu et al., 1991) has not only established a biological basis for anticipation but has identified a new class of human mutation called dynamic mutations (Richards et al., 1992). Dynamic mutations result from the inherent genetic instability associated with simple tandem trinucleotide repeats (reviewed in Richards and Sutherland, 1992; Sutherland and Richards 1992).

An example of anticipation is seen in the HD pedigree shown in figure 1-4. In this pedigree the transmitting grandfather has onset at 80 yrs, his son has onset at 35 yrs and his grandson has juvenile HD. Intriguingly, the father has been found to be the transmitting parent in 80% of juvenile HD cases (Merrit et al., 1969; Ridley et al., 1988; Ridley et al., 1991), which has led to speculation that some epigenetic phenomenon such as imprinting (Reik et al., 1987; Sapienza et al., 1987) is acting at the HD locus (Ridley et al., 1991). However, the recent discovery that dynamic mutations cause the anticipation associated with both fragile X (Kremer et al., 1991; Fu et al., 1991) and myotonic dystrophy (Harley et al., 1992; Brook et al., 1992; Fu et al., 1992; Mahadevan et al., 1992) has fueled speculation that an unstable trinucleotide repeat may be causing anticipation in HD.

Figure 1-4. The phenomenon of anticipation is apparent in this family which sought predictive testing in Vancouver. The deceased grandfather developed mild chorea in his eighth decade, whereas his son experienced onset of classical HD at the age of 50, and his grandson is severely affected at the age of 20.

1.2.5 Heterogeneity

If a genetic disease can manifest as a consequence of mutations in multiple genes then that disease is said to display genetic heterogeneity.
Retinitis pigmentosa (RP), for example, describes a genetically heterogeneous group of human retinopathies displaying autosomal dominant, autosomal recessive and X-linked modes of inheritance (Humphries et al., 1992). Moreover, in resolving an apparent paradox arising from conflicting genetic mapping data, Ott et al. (1989) determined that the X-linked form of RP is itself genetically heterogeneous, with mutations in either of two genes separated by an estimated 10 Mb resulting in RP.

Unrecognized genetic heterogeneity can complicate positional cloning strategies and have devastating consequences for families undergoing predictive testing protocols. For these reasons Conneally et al. (1989) performed an intensive assessment of the HD locus to determine if locus heterogeneity is a factor in HD. No evidence for locus heterogeneity was revealed, although a second rare locus could not be ruled out (Conneally et al., 1989). In this regard it is interesting to note a recent report by Pritchard et al. (1992) in which haplotype analyses on a family segregating HD suggest the possibility of a second unlinked HD locus.

1.3.0 Positional Cloning

1.3.1 Overview

The absence of pre-existing knowledge concerning the biochemical basis of HD precluded the utilization of gene cloning strategies based on either peptide sequence or functional complementation. Fortunately, a genetic approach termed “positional cloning” has emerged as a highly successful strategy for the identification of disease genes where no biochemical knowledge exists a priori.
A few of the disease genes identified by positional cloning include those for chronic granulomatous disease (Royer-Pokora et al., 1986), Duchenne muscular dystrophy (Monaco et al., 1986), retinoblastoma (Friend et al., 1986), cystic fibrosis (Rommens et al., 1989; Riordan et al., 1989; Kerem et al., 1989), neurofibromatosis type 1 (Wallace et al., 1990; Cawthon et al., 1990) and myotonic dystrophy (Harley et al., 1992; Buxton et al., 1992; Aslanidis et al., 1992; Brook et al., 1992; Fu et al., 1992; Mahadevan et al., 1992) (see table 1-1). Because positional cloning strategies follow a systematic protocol in which one moves from the phenotype to the mutant gene, they can be conveniently divided into three distinct phases. In the first phase a heritable disease locus is genetically mapped; in the second phase physical mapping and molecular cloning are performed; and in the third phase candidate genes are identified and assessed by sequence analysis for the presence of mutations. This process is diagrammed as a flow chart in figure 1-5.

Figure 1-5. Positional cloning flow chart illustrating steps typically required to clone a disease gene for which no information concerning the biochemical defect is available. Note that genes identified and excluded in a positional cloning strategy may subsequently become candidate genes for other genetic diseases.

1.3.2 Genetic Localization

In the first phase of positional cloning the disease gene is localized to either an entire chromosome or a chromosome region by performing linkage analyses. This is accomplished by first identifying pedigrees in which the disease gene is segregating, followed by the screening of these pedigrees with polymorphic markers (Botstein et al., 1980; Weber and May, 1989) and the subsequent calculation of the logarithm of the odds ratio, or LOD score (Morton, 1955). A LOD score equal to or greater than 3 means that the odds in favour of linkage are 1000:1 and is interpreted as evidence for linkage.
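The relationship between a LOD score and the odds of linkage can be illustrated with a minimal sketch. It assumes the simplest textbook case of phase-known, fully informative meioses scored against a single marker; the function name `two_point_lod` and its parameters are hypothetical and are not taken from the thesis, which does not describe the software used for its linkage calculations.

```python
import math


def two_point_lod(n_meioses: int, n_recombinants: int, theta: float) -> float:
    """LOD score for phase-known, fully informative meioses (illustrative only).

    Compares the likelihood of the observations at recombination fraction
    `theta` with the likelihood under free recombination (theta = 0.5).
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    if not 0 <= n_recombinants <= n_meioses:
        raise ValueError("n_recombinants must lie between 0 and n_meioses")

    non_recombinants = n_meioses - n_recombinants
    # Python evaluates 0.0 ** 0 as 1.0, so theta = 0 with no recombinants is fine.
    likelihood_linked = (theta ** n_recombinants) * ((1.0 - theta) ** non_recombinants)
    likelihood_unlinked = 0.5 ** n_meioses

    if likelihood_linked == 0.0:
        # A recombinant was observed although theta = 0 forbids one.
        return float("-inf")
    return math.log10(likelihood_linked / likelihood_unlinked)


# Ten informative meioses with no recombinants, evaluated at theta = 0.
lod = two_point_lod(10, 0, 0.0)
odds = 10 ** lod  # odds in favour of linkage implied by the LOD score
```

Under these assumptions, ten non-recombinant meioses are the fewest that reach the conventional threshold of 3, since each such meiosis contributes log10(2), roughly 0.3, to the score.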
Evidence suggesting the possibility of linkage is given by a LOD score of 2, whereas a LOD score of -2 is considered evidence against linkage.

Having established chromosomal linkage, refined genetic mapping is required to identify the smallest possible “candidate region” likely to encode the disease locus. Initially this may simply entail repeating the linkage analysis with additional genetically and/or physically localized informative polymorphic markers, with the goal of localizing the disease locus to a single chromosome band. Ultimately, however, the resolution of genetic linkage mapping is limited by the number of informative meioses in the pedigree material. As a consequence the maximum resolution of linkage mapping is at best restricted to 1-2 cM, which precludes the mapping of neighboring loci.

Commonly employed methodologies supplanting genetic linkage for refined genetic localization may include in situ hybridization, somatic cell hybrid mapping, the study of recombinant chromosomes, the assessment of nonrandom allelic association, and the search for cytogenetically resolvable chromosomal rearrangements. The last of these approaches has been especially productive, facilitating the identification of virtually every human disease gene isolated by positional cloning, with only two exceptions: the myotonic dystrophy gene (Harley et al., 1992; Buxton et al., 1992; Aslanidis et al., 1992; Brook et al., 1992; Fu et al., 1992; Mahadevan et al., 1992) and the cystic fibrosis gene (Rommens et al., 1989; Riordan et al., 1989; Kerem et al., 1989).

1.3.3 Physical Mapping and Cloning

Once a heritable disease locus has been genetically mapped, the next phase in most positional cloning strategies involves the construction of physical maps and chromosome walking. A necessary prerequisite for both the construction of physical maps and chromosome walking is the development of molecular markers from the region of interest.
This process is accelerated by the isolation of cloned DNA from genetic libraries enriched for the chromosomal region under investigation. Such libraries include chromosome-specific libraries (Deaven et al., 1986), libraries constructed from somatic cell hybrids carrying limited human chromosomal content (Midgeon and Miller, 1968) and, recently, chromosome band specific libraries (Guan et al., 1992). While the isolation of regionally localized markers is crucial for constructing physical maps, it also contributes to the continuous process of refined genetic mapping (Fig. 1-6).

Physical mapping involves the construction of long-range restriction maps and the subsequent ordering of molecular markers relative to each other and to known chromosomal landmarks. This is accomplished by digesting genomic DNA with rare-cutting restriction endonucleases such as NotI, BssHII and EagI. These enzymes cleave mammalian genomic DNA infrequently because they recognize relatively long recognition sequences containing one or two CpG dinucleotides, which are underrepresented in genomic DNA. In addition, methylation of the cytosine at CpG dinucleotides inhibits rare-cutter restriction endonuclease cleavage, further reducing the frequency at which these enzymes cleave genomic DNA. The digested chromosomal DNA is then fractionated by pulsed field gel electrophoresis (PFGE) (Schwartz and Cantor 1984; Carle and Olson, 1984), which can resolve DNA molecules as large as 10 Mb. Following pulsed field fractionation the cleaved DNA is Southern blotted and probed with available markers for the subsequent derivation of long-range restriction maps (Gemmill et al., 1987).

The significance of PFGE to physical mapping is that it bridges the resolution gap between the capabilities of genetic mapping and molecular biology (Barlow and Lehrach, 1987).
This is important for four reasons: genetically determined intervals can frequently be assigned precise physical distances; PFGE provides the resolution required to detect chromosomal abnormalities not detectable by either Southern blot or cytogenetic analysis; CpG islands are identified, which facilitates the identification of genes; and chromosome walking can proceed with reference to mapped chromosomal landmarks.

A chromosome walk (Bender et al., 1983) across a candidate region is performed so that a focused search for genes as well as small chromosomal rearrangements may proceed. A chromosome walk is initiated by screening a large insert genomic library constructed in either cosmid (Collins and Hohn, 1978), yeast artificial chromosome (YAC) (Burke et al., 1987), P1 (Sternberg, 1990) or bacterial artificial chromosome (BAC) (Shizuya et al., 1992) cloning vectors with a mapped probe, followed by the isolation of hybridizing recombinants. End fragments are then prepared from the isolated large insert clones and used to rescreen the library so as to clone adjacent DNA. Because of the bidirectional nature of most chromosome walks, reference to the physical map is essential for orienting the contig and establishing a directed chromosome walk. Chromosome walking generally proceeds for multiple rounds, culminating in a series of overlapping contiguous clones which span the candidate region and are conceptually arranged in the same order as they occur on the chromosome.

1.3.4 Identification and Assessment of Candidate Genes

In the last phase of positional cloning “candidate genes” are identified and assessed for mutations.
Several techniques currently exist for the identification of genes in genomic DNA, including classical cDNA library screening, exon trapping (Duyk et al., 1990), region-specific cDNA libraries (Corbo et al., 1990; Liu et al., 1989), direct cDNA selection (Parimoo et al., 1991; Lovett et al., 1991; Lovett et al., 1992) and the use of computer programs to predict the presence of exons in genomic sequence (Hutchinson, G.B. and Hayden, M.R. 1992; Uberbacher, E.C. and Mural, R.J., 1992). Each of these techniques brings very specific advantages and limitations to the identification of genes in naive genomic DNA (reviewed in Hochgeschwender and Brennan, 1991; Hochgeschwender, 1992).

1.3.5 Zoo Blot Analysis

In assessing a candidate region for the presence of putative candidate genes, methods are required that are both efficient and independent of the transcriptional status of any encoded genes. Genomic fragments of 40 kb can be rapidly assessed for the presence of exons by zoo blot analysis (Monaco et al., 1986; Fig. 3-9). The principle of zoo blot analysis is that coding DNA will be conserved between phylogenetically related species whereas noncoding DNA, for example introns and repetitive elements, will not be conserved. Because zoo blot analysis is a rapid, technically simple assay not influenced by a gene’s transcriptional status, it remains the method of choice for identifying the presence of exons in naive genomic DNA.

1.3.6 CpG Islands

The search for genes in naive genomic DNA, by zoo blot analysis, is focused by the identification of landmarks on the physical map called CpG islands (Bird, 1986; 1987). CpG islands are regions of DNA greater than 200 bp in length that display a G+C content in excess of 50% with a ratio of observed to expected CpG dinucleotides close to that statistically predicted (Bird, 1987).
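The sequence-based criteria just listed can be sketched in a few lines, assuming a DNA sequence is already in hand (the thesis itself maps CpG islands with rare-cutting restriction enzymes, not by computation). The function names are hypothetical, and the default observed/expected cutoff of 0.6 is an assumption borrowed from the commonly used Gardiner-Garden and Frommer criteria; the text above says only that the ratio should be close to the statistically predicted value.

```python
from typing import Tuple


def cpg_stats(sequence: str) -> Tuple[float, float]:
    """Return (G+C fraction, observed/expected CpG ratio) for a DNA sequence."""
    seq = sequence.upper()
    length = len(seq)
    if length == 0:
        raise ValueError("sequence must not be empty")

    c_count = seq.count("C")
    g_count = seq.count("G")
    gc_fraction = (c_count + g_count) / length

    observed_cpg = seq.count("CG")
    # Number of CpG dinucleotides expected if C and G were placed independently.
    expected_cpg = c_count * g_count / length
    ratio = observed_cpg / expected_cpg if expected_cpg > 0 else 0.0
    return gc_fraction, ratio


def looks_like_cpg_island(sequence: str,
                          min_length: int = 200,
                          min_gc: float = 0.5,
                          min_obs_exp: float = 0.6) -> bool:
    """Apply length, G+C content and CpG-ratio thresholds (assumed defaults)."""
    gc_fraction, ratio = cpg_stats(sequence)
    return (len(sequence) > min_length
            and gc_fraction > min_gc
            and ratio >= min_obs_exp)
```

In bulk mammalian DNA the observed/expected ratio is well below 1 because CpG dinucleotides are depleted, so a window that passes all three thresholds stands out from its surroundings.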
As a consequence of the base composition of CpG islands, recognition sites for restriction endonucleases such as BssHII, SacII, EagI and NotI, which occur infrequently in bulk genomic DNA, tend to cluster in CpG islands. CpG islands have been found to be associated with all housekeeping genes and 40% of tissue specific genes (Bird, 1987; Gardiner-Garden and Frommer, 1987; Larson et al., 1992). Because of this strong association one can move rapidly from the identification of CpG islands, on the physical map, to the identification of associated genes through zoo blot analysis.

1.3.7 cDNA Library Screening

The identification of genomic fragments encoding phylogenetically conserved DNA often represents the critical first step toward the isolation of candidate disease genes. While sequencing and subsequent analysis of such genomic fragments may reveal encoded exons (Hutchinson and Hayden, 1992) and, in some cases, identities to known genes, this approach is severely limited by the average length and organization of most mammalian genes, the paucity of available sequence data and by current sequencing technologies.

The screening of cDNA libraries by hybridization with radiolabeled genomic fragments, encoding phylogenetically conserved DNA, has proved to be a highly productive alternative to direct genomic sequencing. However, uncertainties concerning a gene’s spatial and temporal pattern of expression can complicate the isolation of corresponding cDNAs. For many disease genes, the associated pathology has correctly suggested the primary site of expression (Monaco et al., 1986; Riordan et al., 1989; Vulpe et al., 1993); however, for other disease genes, such as HD, the best assumptions may prove misleading.

Transcript abundance and sequence complexity can also complicate the isolation of cDNA clones.
Consider, for example, that of the estimated 50,000 mammalian genes only 10,000 are thought to be expressed in any given cell type, and at levels which range from 200,000 mRNA molecules per cell to less than one mRNA molecule per cell (Bishop et al., 1974; Galau et al., 1977). Moreover, one third of these will be expressed as rare transcripts having an abundance of only one to ten mRNA molecules per cell. Provided that several hundred thousand recombinants are screened, it is possible to isolate cDNAs corresponding to rare transcripts (Sambrook et al., 1989). However, if one is working with a complex tissue having a heterogeneous cell population then alternatives to cDNA library screening, including direct sequencing, may be required. Clearly, the importance of zoo blot analysis to this process cannot be overstated. Phylogenetic conservation implies the presence of encoded genes and consequently dictates that alternative methodologies be considered should cDNA library screening not be productive. Moreover, it serves to focus resources on small defined loci within the larger chromosomal region delineated by positional cloning.

1.3.8 Mutation Assessment

Evidence that a gene, identified through positional cloning, is indeed responsible for disease is ascertained through the characterization of structural lesions within the gene. Moreover, the demonstration, in affected families, that the lesion and disease cosegregate, combined with a negative finding in a control population, suggests a causal relationship between the lesion and subsequent pathology.

A growing number of techniques exist for mutation detection. The identification of gross structural rearrangements, such as those associated with Charcot-Marie-Tooth disease (Lupski et al., 1991) and lipoprotein lipase deficiency (Devlin et al., 1990), can be accomplished by Southern blot analysis.
Mutations involving one to several base pairs can be detected by denaturing gradient gel electrophoresis (Myers et al., 1985), RNase mismatch cleavage (Myers et al., 1985; Winter et al., 1985), chemical cleavage (Saleeba et al., 1992), mismatch repair enzyme cleavage (MREC) (Lu and Hsu, 1992), single-strand conformation polymorphism (SSCP) assays (Orita et al., 1989; Orita et al., 1989) and by direct sequencing of PCR amplification products (Wong et al., 1987). Ultimately, the mutational assessment of a candidate gene may involve a combination of these methodologies, with chromosomal rearrangements being scrutinized by Southern analysis and more subtle mutational events detected by SSCP and confirmed by direct sequencing.

To date every human disease gene identified by positional cloning has been found to contain mutations in the transcribed portion of the gene. However, it is reasonable to expect that mutations in regulatory elements quite distant from the transcription unit will be identified in some disease genes. It has been noted by Collins (1992) that dominant disorders such as HD represent “particularly good candidates for this worrisome scenario”.

It is “worrisome” because although expression studies utilizing Northern blot, RNA slot blot and quantitative PCR analyses may elevate the status of a candidate gene, ultimate proof, in positional cloning, requires sequence analysis and the demonstration of mutations. This requirement will add a significant degree of difficulty to the mutational assessment of candidate disease genes encoding defective regulatory elements.

1.4 Toward the HD Gene

In 1983 the HD mutation was localized to chromosome 4 by genetic linkage analysis and somatic cell hybrid mapping using a random genomic fragment G8 (D4S10) (Gusella et al., 1983).
D4S10 was subsequently localized to chromosome band 4p16.1-4p16.3 by in situ hybridization (Wang et al., 1986), while multipoint linkage analysis in concert with meiotic mapping (Gilliam et al., 1987) localized the HD mutation to the 6 Mb region flanked by D4S10 and the 4p telomere.

In a historical context, the mapping of HD is extremely significant as it represents the first linkage of a disease locus to an autosome using RFLP analysis. Prior to Gusella’s success many in the field estimated it would require 10-50 years and the testing of hundreds of markers to establish genetic linkage using RFLP analysis (Harper, 1991).

The failure to identify cytogenetically resolvable chromosome rearrangements associated with HD restricted searches for the gene to the use of nonrandom allelic association and the identification of informative crossovers in HD families. Paradoxically, both these methods have consistently revealed not one but rather two mutually exclusive candidate regions separated by approximately 3 Mb (MacDonald et al., 1989; Skraastad et al., 1989; Whaley et al., 1991; Bates et al., 1991; Snell et al., 1992; Weber et al., 1993; Theilmann et al., 1989; Snell et al., 1989; Andrew et al., 1992). These studies have led to speculation that there may be two closely linked HD genes or alternatively a single HD gene having an enormous physical expanse. Neither of these situations is without precedent.

Figure 1-6. Diagrammatic representation of chromosome 4p16.3 illustrating HD candidate regions I and II. HD candidate region II has been defined through the analysis of recombinant chromosomes (shaded area), nonrandom allelic association (star), and by the discovery of a 4p16.3 recombination in a patient suffering from a sporadic case of HD (darkened area). Similarly, the study of recombinant chromosomes has delineated a second 2.2 Mb HD candidate region flanked by D4S98 and D4S10 (shaded area).
Moreover, nonrandom allelic association studies suggest that a mutation causing HD likely maps to the 700 kb interval flanked by D4S182 and D4S180 (darkened area). Multiple independent studies have consistently revealed evidence for nonrandom allelic association within the 700 kb interval at D4S95 (star). See text for references.

Analysis of recombinant HD chromosomes has defined candidate region I as the 2.2 Mb interval flanked by D4S10 and D4S98 (Snell et al., 1992), illustrated in Figure 1-6. Supporting this candidate region are multiple independent assessments of nonrandom allelic association between the HD mutation and D4S95 (Theilmann et al., 1989; Snell et al., 1989; Andrew et al., 1992; Adam et al., 1991; Barron et al., 1991; Novelletto et al., 1991; MacDonald et al., 1991). These studies have consistently revealed a significant deviation from Hardy-Weinberg equilibrium, suggesting that D4S95 and the HD mutation lie in close physical proximity to each other.

In an attempt to delineate the shortest segment within candidate region I likely to encode the HD gene, MacDonald et al. (1992) analyzed 78 HD chromosomes by haplotype analysis using several multi-allele markers spanning region I. Twenty-six haplotypes emerged, including two putative core haplotypes. The authors of this study correctly concluded that, if true ancestral core haplotypes had indeed been identified, the HD gene most likely maps to the 700 kb interval flanked by D4S180 and D4S182 (MacDonald et al., 1992).

Compelling evidence for a second, distal HD candidate region II has been obtained by Weber et al. (1992) upon reexamination of an apparent new mutation for HD (Wolff et al., 1989). Wolff et al. (1989) reported a sporadic case of HD in a very large three generation family with no prior history of HD (Fig. 1-7).
Long range haplotyping of DNA from the patient and 11 of her siblings, with polymorphic markers spanning the 6 Mbp between D4S10 and D4S141, revealed that the affected individual had inherited a recombinant chromosome 4p16.3 from one of her unaffected parents (Weber et al., 1992). Moreover, this study localized the site of recombination to a 50 kb interval 700 kb from the 4p telomere in region II.

Figure 1-7. A five generation 80 member pedigree constructed by Wolff et al. (1989). The proband suffering from an apparent new mutation at the HD locus is indicated by an arrow. The proband’s parents died of cardiac arrest at ages 87 and 81. None of the proband’s 16 siblings, whose ages range from 48 to 69, exhibit symptoms of HD. Numbers at the lower right of pedigree members indicate year of birth. Age at death is indicated by a number at the lower left of the deceased pedigree members. Numbers directly below pedigree members indicate their age at the time of the original report. A black dot at the upper right of pedigree members indicates that serological and DNA typing was performed by Wolff et al. (1989).

Additional corroborating evidence for candidate region II has been obtained through the study of recombinant HD chromosomes and nonrandom allelic association. The characterization of three recombinant HD chromosomes (MacDonald et al., 1989; Weber et al., 1992) places the HD gene distal to the marker D4S111 in candidate region II. Furthermore, Andrew et al.
(1992) have revealed significant nonrandom allelic association between the HD mutation and two independent loci located within 700 kb of the 4p telomere in region II.

The genetic data led to the conclusion that there are two candidate regions for the HD gene separated by 3 Mbp. Toward the resolution of this seeming paradox, detailed genetic (MacDonald et al., 1989; Buetow et al., 1991) and physical maps have been constructed for both regions I and II (Bucan et al., 1990; Bates et al., 1991; Trask et al., 1992), employing numerous molecular markers spanning the whole of 4p16.3 (Wasmuth et al., 1988; Pohl et al., 1988; Richards et al., 1988; Smith et al., 1988; Whaley et al., 1988; MacDonald et al., 1989; Pritchard et al., 1989; Youngman et al., 1988; Lin et al., 1991; Weber et al., 1992) and culminating in the molecular cloning of each region as either YAC or cosmid contigs (Bates et al., 1992; Pritchard et al., 1992; Weber et al., 1991).

The cloning of HD candidate regions I and II in cosmid and YAC contigs shifted efforts to the identification and characterization of candidate HD genes in these regions. Recent large scale sequencing projects on chromosomes 4p16.3 (McCombie et al., 1992) and 19q13.3 (Martin-Gallardo et al., 1992) indicate that HD candidate regions I and II may have the capacity to encode one gene every 20 kb, suggesting that HD region I could encode 100 or more genes. Despite these formidable numbers the HD Collaborative Group (1993) recently identified the HD gene in candidate region I, where it spans 200 kb including D4S95. Moreover, mutation analysis has revealed that the HD gene encodes a trinucleotide repeat that is expanded in all affected individuals assessed. Ironically, because huntingtin shares no sequence similarity to any known proteins, its cellular function remains a complete mystery.

We shall not cease from exploration
And the end of all our exploring
Will be to arrive where we started
And know the place for the first time
T.S. Eliot
Four Quartets

1.5.0 Statement of Purpose

It has been the objective of the research described herein to identify and characterize genes mapping within a candidate region for the HD gene. The discovery, in an individual with spontaneous HD, of a recombinant chromosome 4p16.3 (Weber et al., 1992) focused this research on the distal candidate region. The dual observations of an extremely low mutation rate at the HD locus (Hayden, 1981) and a lower than average recombination rate in the terminal 1 Mbp of chromosome 4p16.3 (Buetow et al., 1991) led to the formal hypothesis that a single event, namely an unequal recombination at the HD locus in one of the parental meioses, had itself caused HD in the affected individual. This hypothesis is given credence by the well established causal link between unequal recombination and subsequent human pathology (Hu and Worton, 1992). In reaching the stated objective a chromosome walk was performed, a pair of informative RFLPs were isolated, an 80 kb candidate region was defined and two genes in a candidate region for the HD gene were identified and characterized.

Table 1-1
Disease Loci Cloned By Positional Cloning*

Year Cloned   Disease Gene                     A   B   C   Reference(s)
1986          Chronic granulomatous disease    +   -       1
1986          Duchenne muscular dystrophy      +   -   -   2
1986          Retinoblastoma                   +   -   -   3
1989          Cystic fibrosis                  -   -   +   4
1990          Wilms tumor                      +   -   -   5
1990          Testis determining factor        +   -   -   6
1990          Choroideremia                    +   -   -   7
1990          Neurofibromatosis                +   -   -   8
1991          Fragile X syndrome               +   +       9
1991          Familial polyposis coli          +   -   -   10
1991          Kallmann syndrome                +           11
1991          Aniridia                         +           12
1992          Myotonic dystrophy               -   +   +   13
1993          Norrie disease                   +   -   -   14
1993          Menkes disease                   +   -   -   15
1993          Huntington’s disease             -   +   +   16

* Table modified from Collins (1992).
A = Cytogenetic rearrangement.
B = Trinucleotide repeat.
C = Nonrandom allelic association.

1. Royer-Pokora et al., 1986; 2. Monaco et al., 1986; 3. Friend et al., 1986; 4. Rommens et al., 1989; Riordan et al., 1989; Kerem et al., 1989; 5. Rose et al., 1990; 6.
Sinclair et al., 1990; 7. Cremers et al., 1990; 8. Wallace et al., 1990; Cawthon et al., 1990; 9. Verkerk et al., 1991; Fu et al., 1991; 10. Kinzler et al., 1991; Groden et al., 1991; 11. Franco et al., 1991; Legouis et al., 1991; 12. Ton et al., 1991; 13. Harley et al., 1992; Buxton et al., 1992; Aslanidis et al., 1992; Brook et al., 1992; Fu et al., 1992; Mahadevan et al., 1992; 14. Berger et al., 1993; Chen et al., 1993; 15. Vulpe et al., 1993; 16. The Huntington’s Disease Collaborative Group, 1993.

1.6.0 References

ADAM, S., THEILMANN, J., BUETOW, K., HEDRICK, A., COLLINS, C., WEBER, B., HUGGINS, M., AND HAYDEN, M.R. (1991). Linkage disequilibrium and modification of risk for Huntington disease. Am. J. Hum. Genet. 48: 595-603.

ALBIN, R.L., REINER, A., ANDERSON, K.D., DURE, L.S., HANDELIN, B., BALFOUR, R., WHETSELL, W.O., PENNEY, J.B., AND YOUNG, A.B. (1992). Preferential loss of striato-external pallidal projection neurons in presymptomatic Huntington’s disease. Ann. Neurol. 31: 425-430.

ANDREW, S., THEILMANN, J., HEDRICK, A., MAH, D., WEBER, B., AND HAYDEN, M.R. (1992). Nonrandom association between Huntington disease and two loci separated by about 3 Mb on 4p16.3. Genomics 13: 301-311.

ASLANIDIS, C., JANSEN, G., AMEMIYA, C., SHUTLER, G., MAHADEVAN, M., TSILFIDIS, C., CHEN, C., ALLEMAN, J., WORMSKAMP, N.G.M., VOOIJS, M., BUXTON, J., JOHNSON, K., SMEETS, H.J.M., LENNON, G.G., CARRANO, A.V., KORNELUK, R.G., WIERINGA, B., AND DE JONG, P.J. (1992). Cloning the essential myotonic dystrophy region and mapping of the putative defect. Nature 355: 548-551.

BARLOW, D.P., AND LEHRACH, H. (1987). Genetics by gel electrophoresis: The impact of pulsed field gel electrophoresis on mammalian genetics. Trends Genet. 3: 167-171.

BARRON, L., CURTIS, A., SHRIMPTON, A.E., HOLLOWAY, S., MAY, H., SNELL, R.G., AND BROCK, D.J. (1991). Linkage disequilibrium and recombination make a telomeric site for the Huntington’s disease gene unlikely. J. Med. Genet.
28: 520-522.

BATES, G.P., VALDES, J., HUMMERICH, H., BAXENDALE, S., LE PASLIER, D.L., MONACO, A.P., TAGLE, D., MACDONALD, M.E., ALTHERR, M., ROSS, M., BROWNSTEIN, B.H., BENTLEY, D., WASMUTH, J.J., GUSELLA, J.F., COHEN, D., COLLINS, F., AND LEHRACH, H. (1992). Characterization of a yeast artificial chromosome contig spanning the Huntington’s disease gene candidate region. Nature (Genetics) 1: 180-187.

BATES, G.P., MACDONALD, M.E., BAXENDALE, S., YOUNGMAN, S., LIN, C., WHALEY, W.L., WASMUTH, J.J., GUSELLA, J.F., AND LEHRACH, H. (1991). Defined physical limits of the Huntington disease gene candidate region. Am. J. Hum. Genet. 49: 7-16.

BENDER, W., AKAM, M., KARCH, F., BEACHY, P.A., PEIFER, M., SPIERER, P., LEWIS, E.B., AND HOGNESS, D.S. (1983). Molecular genetics of the bithorax complex in Drosophila melanogaster. Science 221: 23-29.

BENDER, W., SPIERER, P., AND HOGNESS, D.S. (1983). Chromosomal walking and jumping to isolate DNA from the ace and rosy loci and the bithorax complex in Drosophila melanogaster. J. Mol. Biol. 168: 17-33.

BIRD, A.P. (1987). CpG islands as gene markers in the vertebrate nucleus. Trends Genet. 3: 342-347.

BISHOP, J.O., MORTON, J.G., ROSBASH, M., AND RICHARDSON, M. (1974). Three abundance classes in HeLa cell messenger RNA. Nature 250: 199-204.

BOTSTEIN, D., WHITE, R.L., SKOLNICK, M., AND DAVIS, R.W. (1980). Construction of a genetic map in man using restriction fragment length polymorphisms. Am. J. Hum. Genet. 32: 314-331.

BROOK, J.D., MCCURRACH, M.E., HARLEY, H.G., BUCKLER, A.J., CHURCH, D., ABURATANI, H., HUNTER, K., STANTON, V.P., THIRION, J.-P., HUDSON, T., SOHN, R., ZEMELMAN, B., SNELL, R.G., RUNDLE, S.A., CROW, S., DAVIES, J., SHELBOURNE, P., BUXTON, J., JONES, C., JUVONEN, V., JOHNSON, K., HARPER, P.S., SHAW, D.J., AND HOUSMAN, D.E. (1992). Molecular basis of myotonic dystrophy: expansion of a trinucleotide repeat at the 3’ end of a transcript encoding a protein kinase family member. Cell 68: 799-808.

BRUYN, G.W. (1968). Huntington’s chorea.
Historical, clinical and laboratory synopsis, in: Vinken PJ, Bruyn GW (eds): Diseases of the basal ganglia. Handbook of clinical neurology, Vol. 6. Amsterdam: North Holland, pp. 298-378.

BUCAN, M., ZIMMER, M., WHALEY, W.L., POUSTKA, A., YOUNGMAN, S., ALLITTO, B.A., ORMONDROYD, E., SMITH, B., POHL, T.M., MACDONALD, M., BATES, G.P., RICHARDS, J., VOLINIA, S., GILLIAM, T.C., SEDLACEK, Z., COLLINS, F.S., WASMUTH, J.J., SHAW, D.J., GUSELLA, J.F., FRISCHAUF, A.M., AND LEHRACH, H. (1990). Physical maps of 4p16.3, the area expected to contain the Huntington disease mutation. Genomics 6: 1-15.

BUETOW, K.H., SHIANG, R., YANG, P., NAKAMURA, Y., LATHROP, G.M., WHITE, R., WASMUTH, J.J., WOOD, S., BERDAHL, L.D., LEYSENS, N.J., RITTY, T.M., WISE, M.E., AND MURRAY, J.C. (1991). A detailed multipoint map of human chromosome 4 provides evidence for linkage heterogeneity and position-specific recombination rates. Am. J. Hum. Genet. 48: 911-925.

BURKE, D.T., CARLE, G.F., AND OLSON, M.V. (1987). Cloning of large segments of exogenous DNA into yeast by means of artificial chromosome vectors. Science 236: 806-811.

BUXTON, J., SHELBOURNE, P., DAVIES, J., JONES, C., VAN TONGEREN, T., ASLANIDIS, C., DE JONG, P., JANSEN, G., ANVRET, M., RILEY, B., WILLIAMSON, R., AND JOHNSON, K. (1992). Detection of an unstable fragment of DNA specific to individuals with myotonic dystrophy. Nature 355: 547-548.

CARLE, G.F., AND OLSON, M.V. (1984). Separation of chromosomal DNA molecules from yeast by orthogonal-field-alternation gel electrophoresis. Nucleic Acids Res. 12: 5647-5664.

CAWTHON, R.M., WEISS, R., XU, G., VISKOCHIL, D., CULVER, M., STEVENS, J., ROBERTSON, M., DUNN, D., GESTELAND, R., O’CONNEL, P., AND WHITE, R. (1990). A major segment of the neurofibromatosis type 1 gene: cDNA sequence, genomic structure, and point mutations. Cell 62: 193-201.

CHEMKE, J., NISANI, R., FEIGL, A., GARTY, R., COOPER, M., BARASH, Y., AND DUKSIN, D. (1984). Homozygosity for autosomal dominant Marfan syndrome. J. Med. Genet.
21: 173-177.

COLLINS, F.S. (1992). Positional cloning: Let’s not call it reverse anymore. Nature (Genetics) 1: 3-6.

COLLINS, J., AND HOHN, B. (1978). Cosmids: a type of plasmid gene-cloning vector that is packageable in vitro. Proc. Natl. Acad. Sci. USA 75: 4242-4250.

CONNEALLY, P.M., HAINES, J.L., TANZI, R.E., WEXLER, N.S., PENCHASZADEH, G.K., HARPER, P.S., FOLSTEIN, S.E., CASSIMAN, J.J., MYERS, R.H., YOUNG, A.B., HAYDEN, M.R., FALEK, A., TOLOSA, E.S., CRESPI, S., DI MAIO, L., HOLMGREN, G., ANVRET, M., KANAZAWA, I., AND GUSELLA, J.F. (1989). Huntington disease: no evidence for locus heterogeneity. Genomics 5: 304-308.

CORBO, L., MALEY, A.M., NELSON, D.L., AND CASKEY, C.T. (1990). Direct cloning of human transcripts with HnRNA from hybrid cell lines. Science 249: 652-654.

CREMERS, F.P., VAN DE POL, D.J., VAN KERKHOFF, L.P., WIERINGA, B., AND ROPERS, H.H. (1990). Cloning of a gene that is rearranged in patients with choroideraemia. Nature 347: 674-677.

DEAVEN, L.L., VAN DILLA, M.A., BARTHOLDI, M.F., CARRANO, A.V., CRAM, L.S., FUSCOE, J.C., GRAY, J.W., HILDEBRAND, C.E., MOYZIS, R.K., AND PERLMAN, J. (1986). Construction of human chromosome-specific DNA libraries from flow sorted chromosomes. Cold Spring Harbor Symposia on Quantitative Biology 51: 159-167.

DEVLIN, R.H., DEEB, S., BRUNZELL, J., AND HAYDEN, M.R. (1990). Partial gene duplication involving exon-Alu interchange results in lipoprotein lipase deficiency. Am. J. Hum. Genet. 46: 112-119.

DUYK, G.M., KIM, S., MYERS, R.M., AND COX, D.R. (1990). Exon trapping: A genetic screen to identify candidate transcribed sequences in cloned mammalian genomic DNA. Proc. Natl. Acad. Sci. USA 87: 8995-8999.

EDWARDS, J.A., AND GALE, R.P. (1972). Camptobrachydactyly: a new autosomal dominant trait with two probable homozygotes. Am. J. Hum. Genet. 24: 464-474.

ELLIOTSON, J. (1832). St Vitus’s dance. Lancet 1: 162-165.

FARRER, L.A., AND CONNEALLY, P.M. (1985). A genetic model for age at onset in Huntington disease. Am. J. Hum. Genet.
37: 350-357.

FRANCO, B., GUIOLI, S., PRAGLIOLA, A., INCERTI, B., BARDONI, B., TONLORENZI, R., CARROZZO, R., MAESTRINI, E., PIERETTI, M., AND TAILLON-MILLER, P. (1991). A gene deleted in Kallmann’s syndrome shares homology with neural cell adhesion and axonal path-finding molecules. Nature 353: 529-536.

FRIEND, S.H., BERNARDS, R., ROGELJ, S., WEINBERG, R.A., RAPAPORT, J.M., ALBERT, D.M., AND DRYJA, T.P. (1986). A human DNA segment with properties of the gene that predisposes to retinoblastoma and osteosarcoma. Nature 323: 643-646.

FU, Y.-H., PIZZUTI, A., FENWICK, R.G., KING, J., RAJNARAYAN, S., DUNNE, P.W., DUBEL, J., NASSAR, G.A., ASHIZAWA, T., DE JONG, P., WIERINGA, B., KORNELUK, R., PERRYMAN, M.B., EPSTEIN, H.F., AND CASKEY, C.T. (1992). An unstable triple repeat in a gene related to myotonic muscular dystrophy. Science 255: 1256-1258.

FU, Y.-H., KUHL, D.P.A., PIZZUTI, A., PIERETTI, M., SUTCLIFFE, J.S., RICHARDS, S., VERKERK, A.J.M.H., HOLDEN, J.J.A., FENWICK, R.G., WARREN, S.T., OOSTRA, B.A., NELSON, D.L., AND CASKEY, C.T. (1991). Variation of the CGG repeat at the fragile X site results in genetic instability: resolution of the Sherman paradox. Cell 68: 1047-1058.

GALAU, G.A., KLEIN, W.H., BRITTEN, R.J., AND DAVIDSON, E.H. (1977). Significance of rare mRNA sequences in liver. Arch. Biochem. Biophys. 179: 585-599.

GARDINER-GARDEN, M., AND FROMMER, M. (1987). CpG islands in vertebrate genomes. J. Mol. Biol. 196: 261-262.

GEMMILL, R.M., COYLE-MORRIS, J.F., MCPEEK, JR., F.D., WARE-URIBE, L.F., AND HECHT, F. (1987). Construction of long-range restriction maps in human DNA using pulsed field gel electrophoresis. Gen. Anal. Techn. 4: 119-131.

GILLIAM, T.C., TANZI, R.E., HAINES, J.L., BONNER, T.I., FARYNIARZ, A.G., HOBBS, W.J., MACDONALD, M.E., CHENG, S.V., FOLSTEIN, S.E., CONNEALLY, P.M., WEXLER, N.S., AND GUSELLA, J. (1987). Localization of the Huntington’s disease gene to a small segment of chromosome 4 flanked by D4S10 and the telomere.
Cell 50: 565-571.

GRODEN, J., THLIVERIS, A., SAMOWITZ, W., CARLSON, M., GELBERT, L., ALBERTSEN, H., JOSLYN, G., STEVENS, J., SPIRIO, L., AND ROBERTSON, M. (1991). Identification and characterization of the familial adenomatous polyposis coli gene. Cell 66: 589-600.

GUSELLA, J.F., WEXLER, N.S., CONNEALLY, P.M., NAYLOR, S.L., ANDERSON, M.A., TANZI, R.E., WATKINS, P.C., OTTINA, K., WALLACE, M.R., SAKAGUCHI, A.Y., YOUNG, A.B., SHOULSON, I., BONILLA, E., AND MARTIN, J.B. (1983). A polymorphic DNA marker genetically linked to Huntington’s disease. Nature 306: 234-238.

HARLEY, H.G., BROOK, J.D., RUNDLE, S.A., CROW, S., REARDON, W., BUCKLER, A.J., HARPER, P.S., HOUSMAN, D.E., AND SHAW, D.J. (1992). Expansion of an unstable DNA region and phenotypic variation in myotonic dystrophy. Nature 355: 545-546.

HARPER, P.S. (1991). Major Problems in Neurology, Huntington Disease, 22nd ed. W.B. Saunders Company.

HAYDEN, M.R., SOLES, J.A., AND WARD, R.H. (1985). Age of onset in siblings of persons with juvenile Huntington disease. Clin. Genet. 28: 100-105.

HAYDEN, M.R. (1981). Huntington’s chorea. Berlin Heidelberg: Springer-Verlag.

HERSKOWITZ, I. (1987). Functional inactivation of genes by dominant negative mutations. Nature 329: 219-222.

HOCHGESCHWENDER, U. (1992). Toward a transcriptional map of the human genome. Trends Genet. 8: 41-44.

HODGSON, S.V., AND SAUNDERS, K.E. (1980). A probable case of the homozygous condition of the aniridia gene. J. Med. Genet. 17: 478-480.

HUMPHRIES, P., KENNA, P., AND FARRAR, G.J. (1992). On the molecular genetics of retinitis pigmentosa. Science 256: 804-808.

HUNTINGTON, G. (1872). On chorea. Med. Surg. Rep. 26: 317-321.

HUTCHINSON, G.B., AND HAYDEN, M.R. (1992). The prediction of exons through analysis of spliceable open reading frames. Nucleic Acids Res. 20: 3453-3462.

HU, X., AND WORTON, R.G. (1992). Partial gene duplication as a cause of human disease. Human Mutation 1: 3-12.

JENKINS, J.B., AND CONNEALLY, P.M. (1989). The paradigm of Huntington disease. Am. J. Hum. Genet.
45: 169-175.

KEREM, B.-S., ROMMENS, J.M., BUCHAN, J.A., MARKIEWICZ, D., COX, T.K., CHAKRAVARTI, A., BUCHWALD, M., AND TSUI, L.-C. (1989). Identification of the cystic fibrosis gene: genetic analysis. Science 245: 1073-1080.

KINZLER, K.W., NILBERT, M.C., SU, L.K., VOGELSTEIN, B., BRYAN, T.M., LEVY, D.B., SMITH, K.J., PREISINGER, A.C., HEDGE, P., AND MCKECHNIE, D. (1991). Identification of FAP locus genes from chromosome 5q21. Science 253: 661-665.

KREMER, E.J., PRITCHARD, M., LYNCH, M., YU, S., HOLMAN, K., BAKER, E., WARREN, S.T., SCHLESSINGER, D., SUTHERLAND, G.R., AND RICHARDS, R.I. (1991). Mapping of DNA instability at the fragile X to a trinucleotide repeat sequence p(CCG)n. Science 252: 1711-1714.

KUSSMAUL, A., AND NOTHNAGEL, C.W.H. (1872). Virchow-Hirsch’s Jahrbuch für 1872. New York p. 175.

LARSON, F., GUNDERSEN, G., LOPEZ, R., AND PRYDZ, H. (1992). CpG islands as gene markers in the human genome. Genomics 13: 1095-1107.

LEGOUIS, R., HARDELIN, J.P., LEVILLIERS, J., CLAVERIE, J.M., COMPAIN, S., WUNDERLE, V., MILLASSEAU, P., LE PASLIER, D., COHEN, D., AND CATERINA, D. (1991). The candidate gene for the X-linked Kallmann syndrome encodes a protein related to adhesion molecules. Cell 67: 423-435.

LIN, C.S., ALTHERR, M., BATES, G., WHALEY, W.L., READ, A.P., HARRIS, R., LEHRACH, H., WASMUTH, J.J., GUSELLA, J.F., AND MACDONALD, M.E. (1991). New DNA markers in the Huntington’s disease gene candidate region. Somat. Cell Mol. Genet. 17: 481-488.

LIU, P., LEGERSKI, R., AND SICILIANO, M.J. (1989). Isolation of human transcribed sequences from human-rodent somatic cell hybrids. Science 246: 813-815.

LU, A.-L., AND HSU, I.-C. (1992). Detection of single DNA base mutations with mismatch repair enzymes. Genomics 14: 249-255.

LUPSKI, J.R., DE OCA-LUNA, R.M., SLAUGENHAUPT, S., PENTAO, L., GUZZETTA, V., TRASK, B., SAUCEDO-CARDENAS, O., BARKER, D.F., KILLIAN, J.M., GARCIA, C.A., CHAKRAVARTI, A., AND PATEL, P.I.
(1991). DNA duplication associated with Charcot-Marie-Tooth disease type 1A. Cell 66: 219-232.

MACDONALD, M.E., CHENG, S.V., ZIMMER, M., HAINES, J.L., POUSTKA, A., ALLITTO, B., SMITH, B., WHALEY, W.L., ROMANO, D.M., JAGADEESH, J., LEHRACH, H., WASMUTH, J.J., FRISCHAUF, A.-M., AND GUSELLA, J.F. (1989). Clustering of multiallele DNA markers near the Huntington’s disease gene. J. Clin. Invest. 84: 1013-1015.

MACDONALD, M.E., LIN, C.S., SRINIDHI, L., BATES, G., ALTHERR, M., WHALEY, W.L., LEHRACH, H., WASMUTH, J., AND GUSELLA, J.F. (1991). Complex patterns of linkage disequilibrium in the Huntington disease region. Am. J. Hum. Genet. 49: 723-734.

MACDONALD, M.E., NOVELLETTO, A., LIN, C., TAGLE, D., BARNES, G., BATES, G., TAYLOR, S., ALLITTO, B., ALTHERR, M., MYERS, R., LEHRACH, H., COLLINS, F., WASMUTH, J.J., FRONTALI, M., AND GUSELLA, J.F. (1992). The Huntington’s disease candidate region exhibits many different haplotypes. Nature (Genetics) 1: 99-103.

MACDONALD, M.E., HAINES, J.L., ZIMMER, M., CHENG, S.V., YOUNGMAN, S., WHALEY, W.L., WEXLER, N., BUCAN, M., ALLITTO, B.A., SMITH, B., LEAVITT, J., POUSTKA, A., HARPER, P., LEHRACH, H., WASMUTH, J.J., FRISCHAUF, A.-M., AND GUSELLA, J.F. (1989). Recombination events suggest potential sites for the Huntington’s disease gene. Neuron 3: 183-190.

MAHADEVAN, M., TSILFIDIS, C., SABOURIN, L., SHUTLER, G., AMEMIYA, C., JANSEN, G., NEVILLE, C., NARANG, M., BARCELO, J., O’HOY, K., LEBLOND, S., EARLE-MACDONALD, J., DE JONG, P.J., WIERINGA, B., AND KORNELUK, R.G. (1992). Myotonic dystrophy mutation: an unstable CTG repeat in the 3’ untranslated region of the gene. Science 255: 1253-1255.

MALTZSBERGER, J.T. (1961). Even unto the twelfth generation - Huntington’s chorea. Journal of the History of Medicine and Allied Sciences.
16: 1-17.

MARTIN-GALLARDO, A., MCCOMBIE, W.R., GOCAYNE, J.D., FITZGERALD, M.G., WALLACE, S., LEE, B.M.B., LAMBERDIN, J., TRAPP, S., KELLEY, J.M., LIU, L.-I., DUBNICK, M., JOHNSTON-DOW, L.A., KERLAVAGE, A.R., DE JONG, P., CARRANO, A., FIELDS, C., AND VENTER, J.C. (1992). Automated DNA sequencing and analysis of 106 kilobases from human chromosome 19q13.3. Nature (Genetics) 1: 34-39.

MCCOMBIE, W.R., MARTIN-GALLARDO, A., GOCAYNE, J.D., FITZGERALD, M., DUBNICK, M., KELLEY, J.M., CASTILLA, L., LIU, L.-I., WALLACE, S., TRAPP, S., TAGLE, D., WHALEY, W.L., CHENG, S., GUSELLA, J., FRISCHAUF, A.-M., POUSTKA, A., LEHRACH, H., COLLINS, F.S., KERLAVAGE, A.R., FIELDS, C., AND VENTER, J.C. (1992). Expressed genes, Alu repeats and polymorphisms in cosmids sequenced from chromosome 4p16.3. Nature (Genetics) 1: 348-353.

MCKUSICK, V.A. (1986). Mendelian inheritance in man: Catalogs of autosomal dominant, autosomal recessive, and X-linked phenotypes (Baltimore: The Johns Hopkins University Press). p. xi.

MELTZER, P.S., GUAN, X.-Y., BURGESS, A., AND TRENT, J.M. (1992). Rapid generation of region specific probes by chromosome microdissection and their application. Nature (Genetics) 1: 24-28.

MIGEON, B.R., AND MILLER, S.C. (1968). Human-mouse somatic cell hybrids with single human chromosome (group E): link with thymidine kinase activity. Science 162: 1005-1006.

MONACO, A.P., NEVE, R.L., COLLETTI-FEENER, C., BERTELSON, C.J., KURNITT, D.M., AND KUNKEL, L.M. (1986). Isolation of candidate cDNAs for portions of the Duchenne muscular dystrophy gene. Nature 323: 646-650.

MORTON, N.E. (1955). Sequential tests for the detection of linkage. Am. J. Hum. Genet. 7: 277-318.

MYERS, R.M., LUMELSKY, N., LERMAN, L.S., AND MANIATIS, T. (1985). Detection of single base substitutions in total genomic DNA. Nature 313: 495-498.

MYERS, R.H., LEAVITT, J., FARRER, L.A., JAGADEESH, J., MCFARLANE, H., MASTROMAURO, C.A., MARK, R.J., AND GUSELLA, J.F. (1989). Homozygote for Huntington disease. Am. J. Hum. Genet.
45: 615-618.

NOVELLETTO, A., MANDICH, P., BELLONE, E., MALASPINA, P., VIVONA, G., AJMAR, F., AND FRONTALI, M. (1991). Non-random association between DNA markers and Huntington disease locus in the Italian population. Am. J. Med. Genet. 40: 374-376.

OBERLE, I., ROUSSEAU, F., HEITZ, D., KRETZ, C., DEVYS, D., HANAUER, A., BOUE, J., BERTHEAS, M.F., AND MANDEL, J.L. (1991). Instability of a 550-base pair DNA segment and abnormal methylation in fragile X syndrome. Science 252: 1097-1102.

ORITA, M., IWAHANA, H., KANAZAWA, H., HAYASHI, K., AND SEKIYA, T. (1989). Detection of polymorphisms of human DNA by gel electrophoresis as single-strand conformation polymorphisms. Proc. Natl. Acad. Sci. USA 86: 2766-2770.

ORITA, M., SUZUKI, Y., SEKIYA, T., AND HAYASHI, K. (1989). Rapid and sensitive detection of point mutations and DNA polymorphisms using the polymerase chain reaction. Genomics 5: 874-879.

OTT, J., BHATTACHARYA, S., CHEN, J.D., DENTON, M.J., DONALD, J., DUBAY, C., FARRAR, G.J., FISHMAN, G.A., FREY, D., GAL, A., HUMPHRIES, P., JAY, B., JAY, M., LITT, M., MACHLER, M., MUSARELLA, M., NEUGEBAURER, M., NUSSBAUM, R.L., TERWILLINGER, J.D., WELEBER, R.G., WIRTH, B., WONG, F., WORTON, R.G., AND WRIGHT, A.F. (1990). Localizing multiple X chromosome-linked retinitis pigmentosa loci using multilocus homogeneity tests. Proc. Natl. Acad. Sci. USA 87: 701-704.

PAULI, R.M. (1983). Dominance and homozygosity in man. Am. J. Med. Genet. 16: 455-458.

PAULI, R.M., CONROY, M.M., LANGER, L.O., MCLONE, D.G., NAIDICH, T., FRANCIOSI, R., RATNER, I.M., AND COPPS, S.C. (1983). Homozygous achondroplasia with survival beyond infancy. Am. J. Med. Genet. 16: 459-473.

PENROSE, L.S. (1948). The problem of anticipation in pedigrees of dystrophia myotonica. Ann. Eugenics 14: 125-132.

POHL, T.M., ZIMMER, M., MACDONALD, M.E., SMITH, B., BUCAN, M., POUSTKA, A., VOLINIA, S., SEARLE, S., ZEHETNER, G., WASMUTH, J.J., GUSELLA, J., LEHRACH, H., AND FRISCHAUF, A.-M. (1988).
Construction of a NotI linking library and isolation of new markers close to the Huntington’s disease gene. Nucleic Acids Res. 16: 9185-9198.

PRITCHARD, C., ZHU, N., ZUO, J., BELL, L., PERICAK-VANCE, M.A., VANCE, J.M., ROSES, A.D., MILATOVITCH, A., FRANCKE, U., COX, D.R., AND MYERS, R.M. (1992). Recombination of 4p16 DNA markers in an unusual family with Huntington disease. Am. J. Hum. Genet. 50: 1218-1230.

PRITCHARD, C.A., CASHER, D., UGLUM, E., COX, D.R., AND MYERS, R.M. (1989). Isolation and field-inversion gel electrophoresis analysis of DNA markers located close to the Huntington disease gene. Genomics 4: 408-418.

REED, T.E., CHANDLER, J.H., HUGHES, E.M., AND DAVIDSON, R.T. (1958). Huntington’s chorea in Michigan. I. Demography and genetics. Am. J. Hum. Genet. 10: 201-225.

REIK, W. (1988). Genomic imprinting: a possible mechanism for the parental origin effect in Huntington’s chorea. J. Med. Genet. 25: 805-808.

RICHARDS, J.E., GILLIAM, T.C., COLE, J.L., DRUMM, M.L., WASMUTH, J.J., GUSELLA, J.F., AND COLLINS, F.S. (1988). Chromosome jumping from D4S10 (G8) toward the Huntington disease gene. Proc. Natl. Acad. Sci. USA 85: 6437-6441.

RICHARDS, R.I., AND SUTHERLAND, G.R. (1992). Dynamic mutations: a new class of mutations causing human disease. Cell 70: 709-712.

RIDLEY, R.M., FRITH, C.D., CROW, T.J., AND CONNEALLY, P.M. (1988). Anticipation in Huntington’s disease is inherited through the male line but may originate in the female. J. Med. Genet. 25: 589-595.

RIDLEY, R.M., FRITH, C.D., FARRER, L.A., AND CONNEALLY, P.M. (1991). Patterns of inheritance of the symptoms of Huntington’s disease suggestive of an effect of genomic imprinting. J. Med. Genet. 28: 224-231.

RIORDAN, J.R., ROMMENS, J.M., KEREM, B.-S., ALON, N., ROZMAHEL, R., GRZELCZAK, Z., ZIELENSKI, J., LOK, S., PLAVSIC, N., CHOU, J.-L., DRUMM, M.L., IANNUZZI, M.C., COLLINS, F.S., AND TSUI, L.-C. (1989). Identification of the cystic fibrosis gene: cloning and characterization of complementary DNA.
Science 245: 1066-1073.

ROMMENS, J.M., IANNUZZI, M.C., KEREM, B.-S., DRUMM, M.L., MELMER, G., DEAN, M., ROZMAHEL, R., COLE, J.L., KENNEDY, D., HIDAKA, N., ZSIGA, M., BUCHWALD, M., RIORDAN, J.R., TSUI, L.-C., AND COLLINS, F.S. (1989). Identification of the cystic fibrosis gene: chromosome walking and jumping. Science 245: 1059-1064.

ROSE, E.A., GLASER, T., JONES, C., SMITH, C.L., LEWIS, W.H., CALL, K.M., MINDEN, M., CHAMPAGNE, E., BONETTA, L., AND YEGER, H. (1990). Complete physical map of the WAGR region of 11p13 localizes a candidate Wilms tumor gene. Cell 60: 495-508.

ROYER-POKORA, B., KUNKEL, L.M., MONACO, A.P., GOFF, S.C., NEWBURGER, P.E., BAEHNER, R.L., COLE, F.S., CURNUTTE, J.T., AND ORKIN, S.H. (1986). Cloning the gene for an inherited human disorder-chronic granulomatous disease-on the basis of its chromosomal location. Nature 322: 32-37.

SALEEBA, J.A., RAMUS, S.J., AND COTTON, R.G.H. (1992). Complete mutation detection using unlabeled chemical cleavage. Human Mutation 1: 63-69.

SAMBROOK, J., FRITSCH, E.F., AND MANIATIS, T. (1989). “Molecular cloning: A laboratory manual,” 2nd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York.

SAPIENZA, C., PETERSON, A.C., ROSSANT, J., AND BALLING, R. (1987). Degree of methylation of transgenes is dependent on gamete of origin. Nature 328: 248-251.

SCHWARTZ, D.C., AND CANTOR, C.R. (1984). Separation of yeast chromosome-sized DNAs by pulsed field gel electrophoresis. Cell 37: 67-75.

SHIZUYA, H., BIRREN, B., KIM, U.-J., MANCINO, V., SLEPAK, T., TACHIIRI, Y., AND SIMON, M. (1992). Cloning and stable maintenance of 300-kilobase-pair fragments of human DNA in Escherichia coli using an F-based vector. Proc. Natl. Acad. Sci. USA 89: 8794-8797.

SINCLAIR, A.H., BERTA, P., PALMER, M.S., HAWKINS, J.R., GRIFFITHS, B.L., SMITH, M.J., FOSTER, J.W., FRISCHAUF, A.M., LOVELL-BADGE, R., AND GOODFELLOW, P.N. (1990).
A gene from the human sex-determining region encodes a protein with homology to a conserved DNA-binding motif. Nature 346: 240-244.

SKRAASTAD, M.I., BAKKER, E., DE LANGE, L.E., VEGTER-VAN DER VLIS, M., KLEIN-BRETELER, E.G., VAN OMMEN, G.J., AND PEARSON, P.L. (1989). Mapping of recombinants near the Huntington disease locus by using G8 (D4S10) and newly isolated markers in the D4S10 region. Am. J. Hum. Genet. 44: 560-566.

SMITH, B., SKARECKY, D., BENGTSSON, U., MAGENIS, R.E., CARPENTER, N., AND WASMUTH, J.J. (1988). Isolation of DNA markers in the direction of the Huntington disease gene from the G8 locus. Am. J. Hum. Genet. 42: 335-344.

SNELL, R.G., THOMPSON, L.M., TAGLE, D.A., HOLLOWAY, T.L., BARNES, G., HARLEY, H.G., SANDKUIJL, L.A., MACDONALD, M.E., COLLINS, F.S., GUSELLA, J.F., HARPER, P.S., AND SHAW, D.J. (1992). A recombination event that redefines the Huntington disease region. Am. J. Hum. Genet. 51: 357-362.

SNELL, R.G., LARAZOU, L.P., YOUNGMAN, S., QUARRELL, O.W., WASMUTH, J.J., SHAW, D.J., AND HARPER, P.S. (1989). Linkage disequilibrium in Huntington’s disease: an improved localisation for the gene. J. Med. Genet. 26: 673-675.

STERNBERG, N. (1990). Bacteriophage P1 cloning system for the isolation, amplification, and recovery of DNA fragments as large as 100 kilobase pairs. Proc. Natl. Acad. Sci. USA 87: 103-107.

SUTHERLAND, G.R., AND RICHARDS, R.I. (1992). Anticipation legitimized: unstable DNA to the rescue. Am. J. Hum. Genet. 51: 7-9.

TARTOF, K.D., AND HENIKOFF, S. (1991). Trans-sensing effects from Drosophila to humans. Cell 65: 201-203.

THE HUNTINGTON’S DISEASE COLLABORATIVE GROUP. (1993). A novel gene containing a trinucleotide repeat that is expanded and unstable on Huntington’s disease chromosomes. Cell 72: 971-983.

THEILMANN, J., KANANI, S., SHIANG, R., ROBBINS, C., QUARRELL, O., HUGGINS, M., HEDRICK, A., WEBER, B., COLLINS, C., AND WASMUTH, J.J. (1989). Non-random association between alleles detected at D4S95 and D4S98 and the Huntington’s disease gene. J. Med. Genet.
26: 676-681.

TON, C.C., HIRVONEN, H., MIWA, H., WEIL, M.M., MONAGHAN, P., JORDAN, T., VAN HEYNINGEN, V., HASTIE, N.D., MEIJERS-HEIJBOER, H., DRECHSLER, M. (1991). Positional cloning and characterization of a paired box- and homeobox-containing gene from the aniridia region. Cell 67: 1059-1074.

UBERBACHER, E.C., AND MURAL, R. (1991). Locating protein-coding regions in human DNA sequences by a multiple sensor-neural network approach. Proc. Natl. Acad. Sci. USA 88: 11261-11265.

VAN DEN ENGH, G., SACHS, R., AND TRASK, B. (1992). Estimating genomic distance from DNA sequence location in cell nuclei by a random walk model. Science 257: 1410-1411.

VERKERK, A.J., PIERETTI, M., SUTCLIFFE, J.S., FU, Y.H., KUHL, D.P., PIZZUTI, A., REINER, O., RICHARDS, S., VICTORIA, M.E., AND ZHANG, E.P. (1991). Identification of a gene (FMR-1) containing a CGG repeat coincident with a breakpoint cluster region exhibiting length variation in fragile X syndrome. Cell 65: 905-914.

VULPE, C., LEVINSON, B., WHITNEY, S., PACKMAN, S., AND GITSCHIER, J. (1993). Isolation of a candidate gene for Menkes disease and evidence that it encodes a copper-transporting ATPase. Nature (Genetics) 3: 7-13.

WALLACE, M.R., MARCHUK, D.A., ANDERSEN, L.B., LETCHER, R., ODEH, H.M., SAULINO, A.M., FOUNTAIN, J.W., BRERETON, A., NICHOLSON, J., MITCHELL, A.L., BROWNSTEIN, B.H., AND COLLINS, F.S. (1990). Type 1 neurofibromatosis gene: identification of a large transcript disrupted in three NF1 patients. Science 249: 181-186.

WANG, H.S., GREENBERG, C.R., HEWITT, J., KALOUSEK, D., AND HAYDEN, M.R. (1986). Subregional assignment of the linked marker G8 (D4S10) for Huntington disease to chromosome 4p16.1-16.3. Am. J. Hum. Genet. 39: 392-396.

WASMUTH, J.J., HEWITT, J., SMITH, B., ALLARD, D., HAINES, J.L., SKARECKY, D., PARTLOW, E., AND HAYDEN, M.R. (1988). A highly polymorphic locus very tightly linked to the Huntington’s disease gene. Nature 332: 734-736.

WEBER, B., COLLINS, C., KOWBEL, D., RIESS, O., AND HAYDEN, M.R.
(1991). Identification of multiple CpG islands and associated conserved sequences in a candidate region for the Huntington disease gene. Genomics 11: 1113-1124.

WEBER, B., HEDRICK, A., ANDREW, S., RIESS, O., COLLINS, C., KOWBEL, D., AND HAYDEN, M.R. (1992). Isolation and characterization of new highly polymorphic DNA markers from the Huntington disease region. Am. J. Hum. Genet. 50: 382-393.

WEBER, B., RIESS, O., WOLFF, G., ANDREW, S., COLLINS, C., GRAHAM, R., THEILMANN, J., AND HAYDEN, M.R. (1992). Delineation of a 50 kilobase DNA segment containing the recombination site in a sporadic case of Huntington disease. Nature (Genetics) 2: 216-222.

WEBER, J.L., AND MAY, P.E. (1989). Abundant class of human DNA polymorphisms which can be typed using the polymerase chain reaction. Am. J. Hum. Genet. 44: 388-396.

WEXLER, N.S., YOUNG, A.B., TANZI, R.E., TRAVERS, H., STAROSTA-RUBINSTEIN, S., PENNEY, J.B., SNODGRASS, S.R., SHOULSON, I., GOMEZ, F., RAMOS ARROYO, M.A., PENCHASZADEH, G.K., MORENO, H., GIBBONS, K., FARNIARZ, A., HOBBS, W., ANDERSON, M.A., BONILLA, E., CONNEALLY, P.M., AND GUSELLA, J.F. (1987). Homozygotes for Huntington’s disease. Nature 326: 194-197.

WHALEY, W.L., BATES, G.P., NOVELLETTO, A., SEDLACEK, Z., CHENG, S., ROMANO, D., ORMONDROYD, E., ALLITTO, B., LIN, C., YOUNGMAN, S., BAXENDALE, S., BUCAN, M., ALTHERR, M., WASMUTH, J.J., WEXLER, N.S., FRONTALI, M., FRISCHAUF, A.-M., LEHRACH, H., MACDONALD, M.E. (1991). Mapping of cosmid clones in Huntington’s disease region of chromosome 4. Somat. Cell Mol. Genet. 17: 83-91.

WHALEY, W.L., MICHIELS, F., MACDONALD, M.E., ROMANO, D., ZIMMER, M., SMITH, B., LEAVITT, J., BUCAN, M., HAINES, J.L., GILLIAM, T.C., ZEHETNER, G., SMITH, C., CANTOR, C.R., FRISCHAUF, A.M., WASMUTH, J.J., LEHRACH, H., AND GUSELLA, J.F. (1988). Mapping of D4S98/S114/S113 confines the Huntington’s defect to a reduced physical region at the telomere of chromosome 4. Nucleic Acids Res. 16: 11769-11780.

WHITE, R.A.H., AND AKAM, M.E. (1985).
Contrabithorax mutations cause inappropriate expression of Ultrabithorax products in Drosophila. Nature 318: 567-569.

WINTER, E., YAMAMOTO, F., ALMOGUERA, C., AND PERUCHO, M. (1985). A method to detect and characterize point mutation in transcribed genes: amplification and overexpression of the mutant c-Ki-ras allele in human tumor cells. Proc. Natl. Acad. Sci. USA 82: 7575-7579.

WOLFF, G., DEUSCHL, G., WIENKER, T.F., HUMMEL, K., BENDER, K., LUCKING, C.H., SCHUMACHER, M., HAMMER, J., AND OEPEN, G. (1989). New mutation to Huntington’s disease. J. Med. Genet. 26: 18-27.

WONG, C., DOWLING, C.E., SAIKI, R.K., HIGUCHI, R.G., ERLICH, H.A., AND KAZAZIAN, H.H., JR. (1989). Characterization of β-thalassemia mutations using direct genomic sequencing of amplified single copy DNA. Nature 330: 384-386.

YOUNGMAN, S., SARFARAZI, M., BUCAN, M., MACDONALD, M., SMITH, B., ZIMMER, M., GILLIAM, C., FRISCHAUF, A.M., WASMUTH, J.J., GUSELLA, J.F., LEHRACH, H., HARPER, P., AND SHAW, D. (1989). A new DNA marker (D4S90) is located terminally on the short arm of chromosome 4, close to the Huntington disease gene. Genomics 5: 802-809.

YU, S., PRITCHARD, M., KREMER, E., LYNCH, M., NANCARROW, J., BAKER, E., HOLMAN, K., MULLEY, J.C., WARREN, S.T., SCHLESSINGER, D., SUTHERLAND, G.R., AND RICHARDS, R.I. (1991). Fragile X genotype characterized by an unstable region of DNA. Science 252: 1179-1181.

ZUO, J., ROBBINS, C., TAILLON-MILLER, P., COX, D.R., AND MYERS, R.M. (1992). Cloning of the Huntington disease region in yeast artificial chromosomes. Hum. Mol. Genet. 1: 149-159.

Chapter 2

Materials and Methods

2.1.0 Materials

Two cosmid libraries were used in this study. A flow-sorted chromosome 4 cosmid library (cell source: UV20 HL21-27, a hamster-human hybrid line containing human chromosomes 4, 8 and 21) was kindly provided by Los Alamos National Laboratory.
This library was constructed in the cosmid vector sCos-1 (Evans et al., 1989) and propagated in the Escherichia coli host strain HB101 (hsdS20, supE44, ara14, galK12, lacY1, proA2, rpsL20, xyl-5, mtl-1, recA13, mcrB, mcrA, mrr). A second cosmid library, constructed in the vector pWE15 (Wahl et al., 1987) and propagated in the Escherichia coli host strain AG1 (recA1, endA1, gyrA96, thi-1, hsdR17 (rk-, mk+), supE44, relA1), was purchased from Stratagene.

Eight cDNA libraries were used in this study. The fetal brain (14 wk) and the fetal eye (14 wk) cDNA libraries were the gift of Dr. David Kurnit (University of Michigan). The adult retina cDNA library was constructed by Dr. Jeremy Nathans (Johns Hopkins University), and the caudate and putamen cDNA libraries were purchased from Clontech. The basal ganglia, cerebellum and mouse total brain cDNA libraries were obtained through the American Type Culture Collection. All of these libraries were constructed in the phage lambda vector gt11 and propagated in E. coli Y1090r- (Δ(lac)U169, araD139, strA, supF, mcrA, trpC22::Tn10, [pMC9], mcrB, hsdR), except the adult retina cDNA library, which was cloned in the phage lambda vector gt10 and propagated in E. coli C600 (supE44, thi-1, thr-1, leuB6, lacY1, tonA21, mcr).

2.2.0 Methods

2.2.1 Cosmid Library Plating and Screening

To isolate cosmids for chromosome walking, the cosmid libraries were plated at high density (50,000 colony forming units per 150 mm × 15 mm plate) on Luria-Bertani (LB) media supplemented with 200 µg/ml ampicillin and incubated at 37°C until the colonies were approximately 1 mm in diameter. Three sets of replica filters were then made from each plate using 150 mm Hybond N (Amersham) discs exactly as described (Sambrook et al., 1989), and the master plates were regrown for short term harvesting (< 1 month). For each library, one pair of replica filters was prepared for hybridization as described (Sambrook et al., 1989).
Briefly, replica filters were immersed in denaturing solution (1.5 M NaCl, 0.5 M NaOH) for 30 seconds and then immersed in neutralizing solution (1.5 M NaCl, 0.5 M Tris-HCl [pH 8.0]) for 5 minutes. Bacterial debris was then scrubbed from the filters in 2X SSCP with a gloved hand. The remaining sets of replica filters were overlaid on plates containing 25% glycerol and regrown for subsequent archiving at -70°C. Cosmid library prehybridizations and hybridizations were performed in 0.5 M sodium phosphate buffer, pH 7.2, 7% SDS and 1 mM EDTA at 65°C (Church and Gilbert, 1984). Post-hybridization washes were at a final stringency of 1X SSC, 0.1% SDS at 65°C for 1 hour. Autoradiography was for 12-24 hours.

2.2.2 cDNA Library Plating and Screening

Approximately one million plaque forming units of each library were plated onto five 25 cm × 25 cm bioassay trays (Gibco), and either two or four sets of replica filters were made using Hybond N+ nylon filters (Amersham) by standard methods (Sambrook et al., 1989). The replica filters were prepared for hybridization as described for cosmid library screening. cDNA library prehybridizations and hybridizations were done in 0.5 M sodium phosphate buffer, pH 7.2, 7% SDS and 1 mM EDTA at 65°C (Church and Gilbert, 1984). Following overnight hybridization, the cDNA filters were washed to a final stringency of 0.1X SSC at 65°C. Autoradiography was done for 24-72 hours. Plaques producing duplicate hybridization signals were plaque purified and subcloned into the EcoRI site of Bluescript II KS+ (Stratagene).

2.2.3 cDNA Library Screening by the Polymerase Chain Reaction

One million phage particles were lysed at 100°C for 3 min and then used in a 50 µl PCR reaction containing Promega Taq polymerase buffer, 0.5 U BRL Taq polymerase, 200 µM each of dTTP, dCTP, dGTP and dATP (Promega), and 30 pmoles each of primer p2 and primer p3 (Table 5-1). Thermal cycling (94°C 1 min, 72°C 1 min; 30 or 40 cycles) was performed in an Ericomp Twin Block thermal cycler.
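The reagent quantities quoted for this PCR screen can be cross-checked with simple unit arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes the 30 pmoles applies to each primer and that the 50 µl figure is the final reaction volume. It relies on the identity 1 µM = 1 pmol/µl.

```python
# Minimal sketch of the unit conversions behind the PCR recipe.
# Function names are hypothetical; values are taken from the text above.

def amount_pmol(conc_um: float, volume_ul: float) -> float:
    """Amount of solute in pmol, using 1 uM == 1 pmol/ul."""
    return conc_um * volume_ul

def conc_um(amount_pmol_: float, volume_ul: float) -> float:
    """Final concentration in uM for a given amount (pmol) and volume (ul)."""
    return amount_pmol_ / volume_ul

REACTION_VOLUME_UL = 50.0   # assumed to be the final reaction volume

# 200 uM of each dNTP in the reaction volume
dntp_pmol_each = amount_pmol(200.0, REACTION_VOLUME_UL)

# 30 pmol of a primer diluted into the reaction volume
primer_um = conc_um(30.0, REACTION_VOLUME_UL)

print(f"each dNTP: {dntp_pmol_each / 1000:.1f} nmol per reaction")
print(f"each primer: {primer_um:.2f} uM final concentration")
```

Under these assumptions each reaction holds 10 nmol of each dNTP and each primer is present at 0.6 µM.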
Identical conditions with 30 cycles of amplification were employed to screen the adult retinal and fetal muscle cDNA libraries with primers p1a and p3. The position and sequence of the primers in figure 5-3 and table 5-1 were chosen to minimize sequence similarity to other myosin light chain genes.

The inside-out PCR (Rosenberg et al., 1991) used to isolate the 5’ end of the fetal muscle MLC was performed in a reaction identical to those described above except for the inclusion of lambda gt11 vector primer (A) 5’-TCCTGGAGCCCGTCAGTATC-3’ (Dahl et al., 1990) and primer p1 (Table 5-1). Thermal cycling (94°C 1 min, 62°C 1 min, 72°C 1 min; 30 cycles) was performed in a Perkin Elmer Cetus DNA thermal cycler. The products of this reaction were subcloned directly into the TA cloning vector (Invitrogen) and transformed into subcloning efficiency E. coli DH5α competent cells (BRL), in both cases according to the manufacturer’s protocol.

To demonstrate that the PDEB gene’s primary transcript is alternatively spliced, adult retinal, fetal retinal, putamen, temporal cortex, fetal brain, and mouse cDNA libraries were screened as described for the MYL5 gene. Primers Ki (5’-GGCTCCTCGATTTTGGAGCG-3’) and F (5’-GCCCAGCACCCAAGTCTTCAACCTG-3’) were used in PCR reactions (94°C 1 min, 62°C 1 min, 72°C 1 min; 40 cycles) in an Ericomp Twin Block thermal cycler. The products of this reaction were fractionated on a 1.5% agarose gel and photographed (Fig. 4-9).

2.2.4 Northern Blot Analysis

RNA was isolated using the single step method of homogenization in guanidinium isothiocyanate as described (Chomczinski and Sacchi, 1987). PolyA+ RNA was prepared using oligo(dT) drip columns (Stratagene) and fractionated on 1% agarose gels containing 2.2 M formaldehyde. The structural integrity of the fractionated RNA was confirmed by the presence of intact 28S and 18S RNA bands and the distinct banding of other abundant messages.
Alkaline transfer was done to a Hybond N+ membrane according to the manufacturer’s instructions, and transfer was confirmed by direct observation of the transferred RNA under ultraviolet irradiation. Hybridization of the Northern blots with a β-actin internal control probe provided final confirmation that the RNA was intact and had transferred. The probes used in the Northern analyses were used immediately following denaturation without preannealing. Prehybridization and hybridization were done in 0.5 M sodium phosphate buffer, pH 7.2, 7% SDS and 1 mM EDTA (Church and Gilbert, 1984) at 65°C. Following hybridization, the Northern blot was washed in 0.1X SSC, 0.1% SDS at 65°C for one hour. Autoradiography was overnight for the MYL5 Northern blot and 14 days for the PDEB Northern blot.

2.2.5 Hybridization Probes

DNA restriction fragments used in the screening of cosmid libraries, Southern blots, zoo blots, Northern blots and cDNA libraries were prepared either by electroelution from agarose gels or by isolation from low melting-point agarose gel (BRL) slices. All DNA probes were labeled by the methods of Feinberg and Vogelstein (1983, 1984) and purified on a G-25 spin column (Sambrook et al., 1989). Single copy probes were denatured by boiling for 2 minutes prior to hybridization. Uncharacterized probes and those known to contain repetitive elements were blocked for repetitive elements prior to hybridization by boiling the probe together with 300 µg of sonicated total human DNA for 2 minutes, followed by preannealing at 65°C for 15 minutes to 1 hour.

Oligonucleotide probes were synthesized on a PCR-Mate 391 DNA synthesizer (Applied Biosystems, Inc.) and purified by reverse phase chromatography (Sep-Pak C18, Waters) as described (Atkinson and Smith, 1984).
Labeling was carried out with T4 polynucleotide kinase and [γ-32P]ATP for 30 minutes at 37°C as described in Sambrook et al. (1989).

2.2.6 DNA Isolation and Southern Blot Analysis

Genomic DNAs were prepared from either white blood cells or muscle tissue according to published methods (Kunkel et al., 1977), and cosmid DNAs were prepared using a 30X scale up of the alkaline lysis method described by Birnboim and Doly (1979). To produce Southern blots, five micrograms of each DNA was digested to completion with the indicated restriction endonuclease (BRL), fractionated on a 0.8% agarose gel by electrophoresis and transferred to a Hybond N+ membrane (Amersham) as described by Southern (1975). Southern blots were prehybridized and hybridized in 0.5 M sodium phosphate buffer, pH 7.2, 7% SDS, and 1 mM EDTA (Church and Gilbert, 1984) at 65°C, or at 42°C for oligonucleotide probes. After hybridization, human genomic and cosmid Southern blots were washed to a final stringency of 0.1X SSC, 0.1% SDS at 65°C, except for blots hybridized with oligonucleotide probes, which were washed to a final stringency of 1X SSC, 0.1% SDS at 42°C. Zoo blots were washed to a final stringency of either 1X SSC, 0.1% SDS at 65°C (figures 3-9, 3-12, 3-13) or 1X SSC, 0.1% SDS at 50°C (figure 3-11). Autoradiography was from overnight to 72 hours.

2.2.7 Sequencing

For sequencing, cDNAs were first subcloned into the EcoRI site of pBluescript KS(+) (Stratagene) and transformed into subcloning efficiency E. coli DH5α host cells (BRL) according to the manufacturer’s protocol. Sequencing templates were prepared by the method of Lee and Rasheed (1990).

Double-stranded DNA sequencing was carried out by the dideoxynucleotide chain termination method (Sanger et al., 1977) using a Sequenase Kit (USB). Briefly, 2-5 µg of supercoiled template was combined with 0.5 pmol of sequencing primer in 40 mM Tris-HCl pH 7.5, 20 mM MgCl2, 50 mM NaCl, heated to 65°C for 2 minutes and then cooled to -70°C.
The annealed primer was then extended and labeled with Sequenase in the presence of 1.5 µM dGTP, dTTP, dCTP, and 5 µCi of α-[35S]-dATP for 5 minutes at room temperature. Chain termination was accomplished by adding 3.5 µl of the labeling reaction mixture to 2.5 µl of 80 µM dNTPs and 8 µM of one of the dideoxynucleotides. Termination reactions were performed at 37°C for 5 minutes and stopped by addition of stop solution (95% formamide, 20 mM EDTA, 0.05% bromphenol blue, 0.05% xylene cyanol FF). Labeled products were fractionated through a 6% polyacrylamide gel containing 1X TBE and 8 M urea at 32 W. Following fractionation, the gel was transferred to Whatman 3MM chromatography paper, vacuum dried and exposed to film (Kodak XAR) for 16-72 hours at 23°C.

Initial sequencing reactions were performed using T3 and T7 primers, and sequence obtained from these reactions was used to design internal cDNA oligonucleotide primers for continued sequencing. The synthesis and purification of sequencing primers was accomplished exactly as described for the synthesis of oligonucleotide probes. Sequence analysis was carried out using MacVector (IBI) sequence analysis software, and the amino acid alignments were performed using programs developed by Hein (1990) for the PDEB protein and Higgins and Sharp (1988) for the MYL5 protein.

2.3 References

ATKINSON, T., AND SMITH, M. (1984). Solid phase synthesis of oligodeoxyribonucleotides by the phosphitetriester method. In M.J. Gait (Ed.), “Oligonucleotide synthesis: a practical approach”. IRL Press, Oxford, pp. 35-81.

CHOMCZINSKI, P., AND SACCHI, N. (1987). Single-step method of RNA isolation by acid guanidinium thiocyanate-phenol-chloroform extraction. Anal. Biochem. 162: 156-159.

CHURCH, G.M., AND GILBERT, W. (1984). Genomic sequencing. Proc. Natl. Acad. Sci. USA 81: 1991-1995.

DAHL, H.M.D., BROWN, R.M., HUTCHINSON, W.M., MARAGOS, C., AND BROWN, G.K. (1990).
A testis-specific isoform of the human pyruvate dehydrogenase E1α subunit is coded for by an intronless gene on chromosome 4. Genomics 8: 225-232.

EVANS, G.A., LEWIS, K., AND ROTHENBERG, B.E. (1989). High efficiency vectors for cosmid microcloning and genomic analysis. Gene 79: 9-20.

FEINBERG, A.P., AND VOGELSTEIN, B. (1983). A technique for radiolabelling DNA restriction endonuclease fragments to high specific activity. Anal. Biochem. 132: 6-13.

FEINBERG, A.P., AND VOGELSTEIN, B. (1984). Addendum: A technique for radiolabeling DNA restriction endonuclease fragments to high specific activity. Anal. Biochem. 137: 266-267.

HEIN, J. (1990). Statistical tests of molecular phylogenies. Methods in Enzymology 183: 626-645.

HIGGINS, D.G., AND SHARP, P.M. (1988). Clustal: a package for performing multiple sequence alignments on a microcomputer. Gene 73: 237-244.

LEE, S.Y., AND RASHEED, S. (1990). A simple procedure for maximum yield of high-quality plasmid DNA. BioTechniques 9: 676-679.

ROSENBERG, H.F., CORRETTE, S.E., TENEN, D.G., AND ACKERMAN, S.J. (1991). Rapid cDNA library screening using the polymerase chain reaction. BioTechniques 10: 53-54.

SAMBROOK, J., FRITSCH, E.F., AND MANIATIS, T. (1989). “Molecular Cloning: A Laboratory Manual,” 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, NY.

SANGER, F., NICKLEN, S., AND COULSON, A.R. (1977). DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74: 5463-5467.

SOUTHERN, E.M. (1975). Detection of specific sequences among DNA fragments separated by gel electrophoresis. J. Mol. Biol. 98: 503-517.

WAHL, G.M., LEWIS, K.A., RUIZ, J.C., ROTHENBERG, B., AND ZHAO, J. (1987). Cosmid vectors for rapid genomic walking, restriction mapping, and gene transfer. Proc. Natl. Acad. Sci.
USA 84: 2160-2164.

Chapter 3

Cloning a 126 kb Segment of HD Candidate Region II

3.1.0 Introduction

New mutations at the HD locus are rare (Hayden, 1981), with the majority of reported cases resulting from misdiagnosis, non-paternity, or inaccurate family histories. However, a true de novo mutation resulting from a chromosomal rearrangement would provide the critical substrate from which to finally delineate the HD gene. Consequently, the discovery of a recombination within chromosome 4p16.3 in a patient with an apparent new mutation at the HD locus was deemed worthy of further study (Weber et al., 1992).

In their analysis, Weber et al. (1992) screened this sporadic patient and her eleven siblings with 14 markers spanning chromosome 4p16.3, identifying 6 informative RFLPs at the loci D4S10, D4S125, D4S95, D4S115, D4S111 and D4S141. Five of these markers (D4S10, D4S125, D4S95, D4S115 and D4S111) identified four distinct haplotypes distributed amongst the twelve siblings, enabling inference of the parental genotypes (Fig. 3-1). Moreover, three siblings (#12, #13 and #15) were found to share identical (a/d) haplotypes at the five informative markers; however, inclusion of the most distal marker, D4S141, revealed a recombinant (d-c) haplotype in the patient (Fig. 3-1). This observation suggests that the patient inherited a chromosome 4, from one of her unaffected parents, that had undergone a meiotic recombination between D4S111 and D4S141 (Weber et al., 1992).

The discovery of a terminal 4p16.3 recombination event in a patient suffering from an apparent new mutation at the HD locus led to speculation that perhaps the two events were related. However, prior

Figure 3-1

The results of DNA haplotyping on a pedigree reported to contain an apparent new mutation (III-15) at the HD locus using markers between D4S10 and the 4p telomere. For the order of the markers used in this study see figure 1-6.
Shown below each family member are the four constructed haplotypes, each represented by an individual vertical bar. The bracketed haplotypes are inferred. Distal to D4S228, patient III-15 shares the ‘c’ haplotype with seven siblings, whereas proximal to marker D4S227 both haplotypes of patient III-15 are identical to those of siblings III-12 and III-13 (‘a’ and ‘d’ haplotypes). Assignment of the haplotypes corresponds to previously established haplotype data obtained at D4S10 (Wolff et al., 1989).

to formulating a hypothesis linking these two events, the possibility of either non-paternity or misdiagnosis was considered. Wolff et al. (1989), in reporting this sporadic case, assessed multiple polymorphic markers including blood groups, red cell and serum proteins, HLA antigens and DNA markers, concluding that there was no indication of non-paternity. Similarly, long-range haplotyping by Weber et al. (1992) provided no evidence for non-paternity. Together these analyses provide robust evidence that the proband is indeed a full sib, thus excluding non-paternity.

Stevens and Parsonage (1969) established four criteria which must be met before the report of a new mutation is given credibility. First, the clinical features and neuropathology must be typical of HD. Second, non-paternity must be excluded. Third, the parents must be confirmed as either being or as having been free of HD well beyond the average age of onset. Fourth, transmission of HD to the next generation should occur consistent with an autosomal dominant mode of inheritance. Importantly, the sporadic case reported by Wolff et al. (1989) meets these stringent criteria, with the exception that two offspring, below the average age at onset, remain unaffected (Fig. 1-7). Because this sporadic case largely conforms to the strict criteria set by Stevens and Parsonage (1969), we concurred with Wolff et al.
(1989) in concluding that this case did indeed result from a de novo mutation at the HD locus.

Confident of the patient’s paternity and clinical phenotype, we postulated a hypothesis relating the terminal 4p16.3 recombination event to the occurrence of a new mutation at the HD locus. It was deemed improbable that two rare events, namely a de novo mutation at the HD locus (Hayden, 1981) and a terminal 4p16.3 recombination event (Buetow et al., 1991), had occurred independently in the sporadic patient (Weber et al., 1992). Instead, a hypothesis was formulated linking these two events in a causal relationship. Importantly, the foundation of this hypothesis rests on the experimentally testable prediction that an unequal recombination at the HD locus is directly responsible for HD in the sporadic patient.

3.2.0 RESULTS

3.2.1 Chromosome Walking

To test this hypothesis, a chromosome walk was initiated by screening a cosmid library with the probe D4S133 (Pritchard et al., 1989). This resulted in the isolation of three cosmid clones: cQ, cD and cN. As a first step in establishing the extent of collinearity between the cosmids, restriction end-fragments had to be identified for each cosmid. To obtain this information, Southern blots containing EcoRI-restricted DNA from each cosmid were hybridized with radiolabeled T3 and T7 oligonucleotides (see Methods).

These experiments revealed that cosmids cQ, cD and cN each produced similar size 25-26 kb end-fragments upon digestion with EcoRI, suggesting that the cosmids overlap within these EcoRI end-fragments.
The subsequent mapping of D4S113 to each of the three EcoRI end-fragments clearly established a collinear relationship between the three cosmids within the large EcoRI end-fragments. Hybridization of isolated cosmid end-fragments to Southern blots of the three EcoRI-digested cosmids revealed that the cosmids overlapped to form a contig with the general organization depicted in figure 3-2. Moreover, because cosmid end-fragments cDT7 and cQT7 hybridized only to their cognate cosmids, it was determined that these restriction fragments defined the terminal boundaries of the contig.

To extend the chromosomal walk into adjacent chromosomal DNA, the contig end-fragments cDT7 and cQT7 were used to screen a chromosome 4 specific cosmid library (Los Alamos National Laboratory), resulting in the isolation of several additional cosmids. This process of chromosome walking was repeated until the 126 kb contig consisting of the 7 cosmids shown in figure 3-2 was established. Two of the cosmids in this contig, cN and cQ, were found to be unstable in liquid culture.

In an effort to align and characterize the five stable cosmids forming the 126 kb contig, each cosmid was double digested with EcoRI/MluI and EcoRI/NotI and fingerprinted as shown in figures 3-3 to 3-6 (Stallings et al., 1990). Fingerprinting was performed with four probes: an Alu repetitive element (Blur8); a (GT/CA)10 oligonucleotide; a telomeric repeat (CCCTAA)4 oligonucleotide; and a long interspersed repeat (LINE) 3’-L1 probe. These experiments provided valuable insights into the chromosomal organization of the cloned region while simultaneously facilitating the alignment of the individual cosmids comprising the contig. For example, fingerprinting revealed that cosmids cD and cDl3 share a common 10 kb fragment which is not cleaved by MluI and contains an interstitial telomeric repeat, whereas cosmids cDp and cL3 share a similar sized fragment that is cleaved by MluI and lacks the telomeric repeat.
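The logic of that fingerprinting example, that fragments of similar size are treated as the same fragment only when their other properties also agree, can be sketched as a grouping by fingerprint signature. This is an illustration rather than the procedure actually used: the `Fragment` type and `group_by_signature` function are hypothetical, and the 10 kb size given to the cDp/cL3 fragment is an assumption, since the text says only that it is of similar size.

```python
# Hypothetical sketch: group cosmids whose fragments share a full fingerprint
# signature (size, MluI cleavage, telomeric-repeat signal), not size alone.
from collections import defaultdict
from typing import NamedTuple

class Fragment(NamedTuple):
    size_kb: float            # approximate EcoRI fragment size
    cut_by_mlui: bool         # cleaved in the EcoRI/MluI double digest?
    telomeric_repeat: bool    # hybridizes to the (CCCTAA)4 probe?

# Observations as described in the text; the size for cDp and cL3 is assumed.
observations = {
    "cD":   Fragment(10, False, True),
    "cDl3": Fragment(10, False, True),
    "cDp":  Fragment(10, True, False),
    "cL3":  Fragment(10, True, False),
}

def group_by_signature(obs):
    """Return lists of cosmids whose fragments have identical signatures."""
    groups = defaultdict(list)
    for cosmid, fragment in obs.items():
        groups[fragment].append(cosmid)
    return [sorted(names) for names in groups.values()]

for group in group_by_signature(observations):
    print(" + ".join(group), "share a fragment")
```

Grouping on size alone would merge all four cosmids, which is why the MluI and telomeric-repeat signals are informative for ordering the contig.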
Similarly, fingerprinting with the (GT/CA)10 oligonucleotide not only revealed and/or confirmed overlapping restriction fragments but also served to localize a number of potentially highly polymorphic microsatellites within the contig (Weber, 1990). The combined results of the fingerprinting experiments are shown in figure 3-2.

The restriction map of the cosmid contig illustrated in figure 3-2 was derived by performing single and double digests with several enzymes, including EcoRI, MluI, NotI and NruI, followed by hybridization with T3, T7 and multiple internal and end-fragment probes. This process, combined with the fingerprinting, revealed that the cosmid contig spans 126 kb of chromosome 4p16.3 approximately 700 kb from the 4p telomere.

Figure 3-2
[Diagram: cosmids cDp, cQBI, cD, cL4, cN, cQ and cDl3 with their T3 and T7 ends; EcoRI map oriented telomere (tel.) to centromere (cen.); CpG islands I, II and III]
A. Seven overlapping cosmids comprising the 126 kb contig. B. The EcoRI restriction map of the contig with the positions of three CpG islands indicated. C. The result of fingerprinting the cosmids forming the contig. Open boxes indicate (GT/CA)n microsatellites, the stippled box indicates the single-copy probe BS-1, and the striped box indicates the position of the interstitial telomeric repeat. Black arrowheads indicate that the corresponding EcoRI fragment contains a minimum of one Alu element.

Figure 3-3
Ethidium bromide stained gel containing the five stable cosmids forming the 126 kb contig digested with EcoRI+NotI and EcoRI+MluI, respectively. Lanes 1+2 cDl3, 3+4 cD, 5+6 cDp, 7+8 cQBI and 9+10 cL4. A Southern blot of this gel was used to obtain the autoradiographic images seen in figures 3-4, 3-5, 3-6, 4-5 and 4-6.

Figure 3-4
The five stable cosmids forming the 126 kb contig, digested with EcoRI+NotI and EcoRI+MluI, respectively, and fingerprinted with the Alu element (Blur8). Lanes 1+2 cD, 3+4 cDl3, 5+6 cDp, 7+8 cQBI and 9+10 cL4. Hybridization and washing were performed at high stringency.
A high density of Alu elements throughout the 126 kb is apparent, as are overlapping restriction fragments shared between contiguous cosmids.

Figure 3-5
The five stable cosmids forming the 126 kb contig, digested with EcoRI+NotI and EcoRI+MluI, respectively, and fingerprinted with the (GT/CA)10 oligonucleotide probe. Reduced stringency was used here to identify all possible sources of polymorphism. Lanes 1+2 cDl3, 3+4 cD, 5+6 cDp, 7+8 cQBI and 9+10 cL4. Signals confirmed by sequence analysis and/or higher stringency washing (Weber et al., 1991) are indicated by arrows.

Figure 3-6
The five stable cosmids forming the 126 kb contig, digested with EcoRI+NotI and EcoRI+MluI, respectively, and fingerprinted with the (CCCTAA)4 oligonucleotide probe. A single signal is observed at approximately 10 kb in cosmids cDl3 and cD.

3.2.2 Identification of Informative RFLPs

As each cosmid was mapped and assigned to the contig, it was analyzed for the presence of single-copy restriction fragments that would detect informative RFLPs in the sibship under investigation. This was performed by isolating mapped EcoRI fragments from cosmids within the contig and digesting these fragments with multiple frequent-cutting enzymes. Single-copy restriction fragments were subsequently identified by probing Southern blots containing the restricted EcoRI fragments with radiolabeled total human DNA. Putative single-copy fragments not hybridizing to the probe were then isolated and used to screen Southern blots of control DNA and/or sibship DNA digested with multiple enzymes.

This methodology was first applied to the 10 kb EcoRI fragment contained within cosmids cD and cDl3. Several micrograms of cosmid cD DNA were digested with EcoRI and fractionated through an agarose gel.
The 10 kb EcoRI fragment (cDE2) was then cut from the gel, purified by electroelution and recleaved with the enzymes RsaI, HincII, MboI and AccI, followed by electrophoretic separation and Southern blotting. Hybridization of the Southern blot with radiolabeled human genomic DNA identified a 2.8 kb RsaI fragment, cDE2Rs1 (D4S227), which appeared to be single-copy. Hybridization of this fragment to Southern blots of control DNA digested with multiple enzymes not only confirmed that this fragment was indeed single-copy but, moreover, identified a four-allele BamHI polymorphism. Screening the twelve available members of the affected sibship with D4S227 revealed that the BamHI polymorphism is informative in this family. Significantly, individuals #12, #13 and #15 were found to share the same allele, suggesting that the recombination in the sporadic case (#15) had in all probability occurred distal to D4S227 (Fig. 3-7).

The next informative polymorphism identified by this protocol was detected with the single-copy probe cDpE3H14 (D4S228). This probe was isolated from the 7 kb T3 EcoRI end-fragment of cosmid cDp exactly as described for D4S227 (Fig. 3-7). Screening the twelve available members of the affected sibship with D4S228 revealed an informative RsaI polymorphism. At this locus individuals #12 and #13 were found to be homozygous for allele 2 of D4S228, whereas D4S227 detected both alleles 1 and 2 in the sporadic case #15. These findings, combined with the long-range haplotype study, indicated that the recombination event hypothesized to underlie HD in the sporadic case mapped to the 70 kb interval flanked by D4S228 and D4S227.

Because localization of the site of recombination is crucial to testing the hypothesis, Dr. B. Weber cloned the interval flanked by D4S227/D4S228 from each chromosome 4 of the affected individual and searched for informative sequence variants by sequence analysis (Weber et al., 1992).
This led to the identification of an informative polymorphism, delMC, which narrowed the critical interval from 70 kb to 50 kb (Fig. 3-13). Within this 50 kb interval, 13 additional polymorphisms were identified, including three RFLPs, a single insertion/deletion polymorphism, three highly polymorphic (GT/CA) microsatellites, and six sequence variants identified by single strand conformational polymorphism (SSCP) analysis (Orita et al., 1989). Unfortunately, none of these variants proved informative in the affected sibship.

Figure 3-7
[Autoradiographs: markers DpE3H14 (D4S228), delMC (D4S228) and E2Rs1 (D4S227), with fragment sizes in kb]
Haplotype analysis on siblings from the sporadic HD pedigree using informative markers at D4S10, D4S227 and D4S228. The recombination event is clearly mapped to the 50 kb interval flanked by DpE3H14 and delMC in patient III-15. Proximal markers D4S227 and D4S10 detect identical alleles in siblings III-12, III-13 and III-15, whereas the distal markers DpE3H14/delMC (D4S228) detect a different allele in patient III-15.

Unable to further refine the site of recombination, Weber et al. (1992) assessed the 50 kb interval for evidence of an unequal recombination event. To identify the predicted rearrangement, Weber et al. (1992) hybridized mapped subcontig restriction fragments to Southern blots containing double-digested cosmid DNA spanning the 50 kb segment on each of the patient’s chromosome 4s. Differences between the two alleles were identified; however, Southern blot analysis of the affected sibship revealed these to be simple polymorphisms. Because this analysis utilized eight different restriction endonucleases, which cleave frequently within the 50 kb interval, it was concluded that a rearrangement of greater than 200 bp had been excluded.

Narrowing the interval containing the site of recombination from over 1 Mb to approximately 50 kb had significant implications for the chromosome walk.
First, because the interval containing the site of recombination was entirely contained within the cosmid contig, the chromosome walk was terminated. Second, because D4S227 and D4S228 flank the site of recombination and detect haplotypes d and c, respectively, it was possible to determine the orientation of the contig within chromosome 4p16.3. Finally, because the site of recombination had been reduced to a 50 kb interval lacking any detectable chromosomal rearrangement, the focus of my investigation turned to the identification of individual genes. It was reasoned that a mutational event resulting from an unequal recombination would be readily identifiable if it disrupted a gene.

3.2.3 Identifying CpG Islands

Three CpG islands were discovered upon characterizing the 50 kb interval flanked by the markers D4S227/D4S228 and encompassing the site of recombination in the sporadic case (Fig. 3-14). To identify these CpG islands, contiguous cosmids spanning the 50 kb interval were double digested with EcoRI and BssHII, EagI, MluI, NotI or SacII and electrophoretically fractionated through an agarose gel. Analysis of the resulting ethidium bromide restriction patterns revealed a 14 kb EcoRI fragment in cQBI that is cleaved to nearly identical 7 kb restriction fragments by BssHII, EagI and SacII, suggesting the presence of a CpG island. Moreover, because the T7 EcoRI end-fragment of cQ is not cleaved by these enzymes whereas the T3 EcoRI end-fragment of cDp is, the CpG island could be localized to the distal 2 kb of cosmid cDp (Table 3-1 and Fig. 3-8).

Hybridization of a Southern blot made from the above gel with a radiolabeled T3 oligonucleotide probe confirmed the presence of the CpG island in cosmid cQBI and revealed the presence of two additional putative CpG islands (Table 3-1 and Fig. 3-8). The first of this pair, CpG island II, maps within 2 kb of the T3 terminus of cosmid cD, where restriction sites for SacII, EagI and BssHII cluster within a 1 kb interval.
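The clustering criterion used here (sites for several rare-cutting enzymes falling within about 1 kb) can also be applied to sequence data. The following is a minimal sketch, assuming a DNA sequence is available as a plain string; the function names, the 1 kb window and the three-enzyme threshold are illustrative choices, not part of the experimental protocol, which relied on double digests and Southern blotting.

```python
import re

# Recognition sequences of the rare-cutting enzymes named in the text.
# All five sites are palindromic, so scanning one strand is sufficient.
RARE_CUTTERS = {
    "BssHII": "GCGCGC",
    "EagI": "CGGCCG",
    "SacII": "CCGCGG",
    "MluI": "ACGCGT",
    "NotI": "GCGGCCGC",
}


def find_sites(seq):
    """Return a sorted list of (position, enzyme) for every site in seq."""
    seq = seq.upper()
    hits = []
    for enzyme, site in RARE_CUTTERS.items():
        # A lookahead pattern reports overlapping occurrences as well.
        for match in re.finditer("(?=%s)" % site, seq):
            hits.append((match.start(), enzyme))
    return sorted(hits)


def candidate_islands(seq, window=1000, min_enzymes=3):
    """Flag regions where sites for >= min_enzymes distinct enzymes
    fall within `window` bp of one another (hypothetical thresholds)."""
    hits = find_sites(seq)
    islands = []
    for i, (start, _) in enumerate(hits):
        in_window = [(p, e) for p, e in hits[i:] if p - start < window]
        enzymes = {e for _, e in in_window}
        if len(enzymes) < min_enzymes:
            continue
        last = max(p for p, _ in in_window)
        if islands and start <= islands[-1][1]:
            # Overlaps the previous candidate: merge the two regions.
            prev = islands[-1]
            islands[-1] = (prev[0], max(prev[1], last), prev[2] | enzymes)
        else:
            islands.append((start, last, enzymes))
    return islands


# Toy sequence (not real 4p16.3 data): three rare-cutter sites in ~200 bp.
toy = "A" * 500 + "GCGCGC" + "A" * 100 + "CGGCCG" + "A" * 100 + "CCGCGG" + "A" * 500
islands = candidate_islands(toy)
```

Because a NotI site (GCGGCCGC) contains an EagI site, a single NotI site is counted for both enzymes in this sketch, which mirrors the fact that EagI cleaves at NotI sites.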
The presence of CpG island III was suggested by a pair of tightly linked restriction sites for EagI and BssHII within 2 kb of cosmid cQ’s T3 terminus. In addition, restriction sites for NruI and MluI were localized to within this CpG island.

Table 3-1

          cQ        cD        cQBI     cDp
EcoRI     25.0 kb   26.0 kb   .4 kb    7.8 kb
BssHII    1.70 kb   1.3 kb    .4 kb    2.0 kb
EagI      1.6 kb    .6 kb     .4 kb    2.0 kb
SacII     9.5 kb    .4 kb     .4 kb    2.5 kb
MluI      -         24.0 kb   .4 kb    7.8 kb
NotI      25.0 kb   26.0 kb   .4 kb    7.8 kb

Figure 3-8
[Schematic restriction maps of the cosmids; RI = EcoRI]

Table 3-1 summarizes the data obtained following hybridization of a Southern blot containing the indicated cosmids, double-digested with EcoRI plus BssHII, EagI, MluI, NotI and SacII, with a radiolabeled T3 oligonucleotide. Figure 3-8 is a schematic representation of the data presented in table 3-1. Stars indicate the radiolabeled T3 cosmid end-fragments. The 14 kb EcoRI fragment found to be cleaved into nearly identical 10 kb and 4 kb restriction fragments by BssHII, EagI and SacII is labelled (A). Mapped CpG islands are numbered with Roman numerals. Note that cosmid cD is truncated in this figure.

3.2.4 The Identification of Phylogenetically Conserved DNA

To initially survey the DNA in the vicinity of the three CpG islands for the presence of exons, cosmid cQ, which approximately spans the CpG islands, was used to probe a zoo blot as shown in figure 3-9. It is evident from the results of this zoo blot analysis that phylogenetically conserved DNA, suggesting the presence of exons, is encoded within the region spanned by cosmid cQ. Although this experiment provided evidence for the presence of exons within the 50 kb interval, it gave no positional information as to where in the interval the exons map, because a 48 kb cosmid probe was used.

To more precisely localize the phylogenetically conserved DNA detected by cosmid cQ, a zoo blot analysis was performed utilizing subcontig restriction fragments.
The first probe tested was the 10 kb EcoRI fragment DpE2 common to cosmids cDp, cQBI and cQ (Fig. 3-13). The rationale for assessing DpE2 was based on the dual observations that DpE2 is flanked by CpG islands and that DpE2 detects weak noncognate restriction fragments on Southern blots (Fig. 3-10). From these observations it was hypothesized that DpE2 might encode a member of a multigene family.

As a preliminary test of this hypothesis, a Bios somatic cell hybrid mapping panel was hybridized at reduced stringency with DpE2 (Fig. 3-11). The results of this experiment revealed that DpE2 detects cross-species conservation in hamster and, moreover, provided evidence that DpE2 does indeed encode a member of a multigene family. Furthermore, hybridization of DpE2 to a zoo blot identified cross-species conservation in rat, hamster, cat, bovine and whale (Fig. 3-12).

A second subcontig restriction fragment tested for phylogenetic conservation, c26H1 (Fig. 3-14), also detected strong cross-species conservation upon zoo blot analysis (Fig. 3-13). Significantly, c26H1 was found to map approximately 8 kb proximal to CpG island II, adjacent to the 3.3 kb single-copy probe BS-1, which detects significant nonrandom allelic association with the HD mutation (Andrew et al., 1992).

Figure 3-9
A zoo Southern blot of EcoRI-digested genomic DNA from lane 1 cat; lane 2 rabbit; and lane 3 quail, probed with the whole radiolabeled cosmid cQ. The location of cosmid cQ in the 126 kb contig is shown in figure 3-2.

Figure 3-10
[Autoradiograph labels: 10 kb, DpE2]
Southern blot of human genomic DNA digested with EcoRI and hybridized with the genomic probe DpE2. Hybridization signals emanating from a pair of noncognate restriction fragments are indicated by arrows.

Figure 3-11
Two lanes from a Bios somatic cell hybrid mapping panel hybridized with DpE2 subsequent to blocking for repetitive elements. Both hybridization and washing were performed at reduced stringency.
A 10 kb human 4p16.3 specific signal, in lane 2, is indicated by the upper arrow, whereas the corresponding murine homolog seen in both lanes is indicated by the lower arrow. Numerous additional murine and/or human hybridization signals are evident below the murine-specific signal in both lanes.

Figure 3-12
Detection of phylogenetically conserved DNA in five species of vertebrates using the hybridization probe DpE2. Cross-species conservation is observed in rat, hamster, cat, bovine and whale DNA. The genomic DNA on this zoo blot was digested with HincII. Hybridization and washing were performed under stringent conditions. Lanes (1) rat, (2) hamster, (3) cat, (4) bovine and (5) whale.

Figure 3-13
Detection of phylogenetic conservation using the hybridization probe c26H1. The map position of c26H1 is shown in figure 3-14. The genomic DNA on this zoo blot was digested with HincII and hybridized and washed under stringent conditions. Lanes (1) hamster, (2) cat, (3) bovine, (4) whale, (5) salmon, and (6) quail.

3.2.5 Transcript Identification

The sub-localization of phylogenetically conserved DNA to a pair of restriction fragments, DpE2 and c26H1, served to focus the search for genes. Because DpE2 detected cross-species conservation and appeared to encode a member of a gene family, DpE2 was used to screen a human caudate cDNA library. This resulted in the isolation of six cDNAs which were revealed by sequence analysis to encode the human homolog of the bovine and murine β-subunits of rod photoreceptor cGMP phosphodiesterase (Bowes et al., 1990; Lipkin et al., 1990; Collins et al., 1992). The identification and characterization of this gene forms the basis of chapter 4.

Northern blot analysis using the 3.3 kb single-copy probe BS-1 (Andrew et al., 1992) identified an abundant transcript specific to African Green Monkey skeletal muscle.
The subsequent screening of a human fetal muscle cDNA library with BS-1 resulted in the identification of a second gene, encoding a novel regulatory myosin light chain (Collins et al., 1992). Chapter 5 describes the identification and characterization of this gene.

Despite the screening of multiple cDNA libraries, including caudate, putamen, basal ganglia, fetal whole brain, fetal retina and adult retina, no evidence for additional genes within the 50 kb interval was obtained. Similarly, Northern blot analysis performed with probes spanning the interval produced no evidence for additional genes.

3.3 DISCUSSION

A hypothesis was proposed to explain the occurrence of a terminal 4p16.3 recombination (Weber et al., 1992) in a patient reported to be suffering from a spontaneous mutation at the HD locus (Wolff et al., 1989). This hypothesis predicted that an unequal recombination at the HD locus was directly responsible for the occurrence of HD in the sporadic patient.

As a first step toward testing this hypothesis, a chromosome walk was performed, culminating in the molecular cloning of 126 kb of chromosome 4p16.3. The cosmids comprising the contig were fingerprinted to facilitate the alignment of the cosmids, to identify sources of potential polymorphism and to gain insights into the structural organization of the chromosomal region. Fingerprinting localized a minimum of three (GT/CA)n microsatellites, a high density of Alu repeats, no L1 elements and an interstitial telomeric repeat within the contig. The high density of Alu elements and lack of L1 elements is consistent with the finding of Korenberg and Rykowski (1988) that Alu elements cluster in Giemsa light bands, whereas L1 elements tend to predominate in Giemsa dark bands.

The identification of an interstitial telomeric repeat within the contig (Fig. 3-6) was interesting in view of a hypothesis proposed by Laird (1990) to explain the genetics of HD.
In this model the HD gene becomes associated with telomeric or facultative heterochromatin, effecting cis-inactivation of the gene’s transcription. Subsequent tissue-specific somatic pairing between the HD and HD+ alleles leads to the trans-inactivation of the wild type allele through the spread of heterochromatic proteins. Precedent for such a mechanism (termed dominant position-effect variegation) derives from the brown locus in Drosophila melanogaster (Henikoff and Dreesen, 1989). However, this finding is most likely a chance association, since the telomeric repeat was identified in cosmids cloned from normal human DNA.

Initially we had expected to exploit the polymorphic nature of the microsatellites and Alu elements for refining the site of recombination in the sporadic patient, but the elements characterized proved to be uninformative in the sibship. However, a pair of RFLPs, D4S227 and D4S228, isolated from the contig and informative in the affected sibship, did narrow the site of recombination from over 1 Mb to 50 kb. Despite intense efforts, this 50 kb interval could not be narrowed further and, in addition, no evidence for an unequal recombination event detectable by Southern blot analysis was identified.

A search for genes encoded within the 50 kb interval was conducted because it was hypothesized that if the predicted unequal recombination involved even a single nucleotide it would have detectable consequences in a gene. To identify genes, restriction fragments associated with phylogenetically conserved DNA were used to screen cDNA libraries and/or Northern blots. This resulted in the isolation and characterization of the PDEB (Weber et al., 1991; Collins et al., 1992) and MYL5 (Collins et al., 1992) genes.
The identification of two genes within the 50 kb interval is consistent with recent studies on chromosome 4p16.3 (Carlock et al., 1992; McCombie et al., 1992) in HD candidate region I, 6p (Kendall et al., 1990) and 19q13.3 (Martin-Gallardo et al., 1992) revealing coding capacities of approximately one gene per 20 kb.

Figure 3-14
Schematic Representation of Major Findings
[Diagram labels: PDEB cDNAs, MYL5 cDNAs, D4S228, HD?, D4S227, CpG islands]
The major findings of this chapter are summarized in figure 3-14. A region of chromosome 4p16.3 known to be recombinant in a patient with a new mutation at the HD locus has been cloned by chromosome walking and an EcoRI (RI) restriction map constructed. Informative markers (arrows) at the loci D4S227 and D4S228 were isolated and used to define a 50 kb interval containing the site of recombination in the patient. Characterization of this chromosomal region revealed three CpG islands associated with phylogenetically conserved DNA. The screening of cDNA libraries with cDpE2 and BS-1 led to the identification of multiple cDNAs encoded by the PDEB and MYL5 genes. An interstitial telomeric repeat was localized within the 50 kb interval (striped box).

3.4 References

ANDREW, S., THEILMANN, J., HEDRICK, A., MAH, D., WEBER, B., AND HAYDEN, M.R. (1992). Nonrandom association between the Huntington disease and two loci separated by about 3 MBP. Genomics 13: 301-311.

BOWES, C., TIANSEN, L., DANCIGER, M., BAXTER, L.C., APPLEBURY, M.L., AND FARBER, D.B. (1990). Retinal degeneration in the rd mouse is caused by a defect in the β-subunit of rod cGMP phosphodiesterase. Nature 347: 677-680.

BUETOW, K.H., SHIANG, R., YANG, P., NAKAMURA, Y., LATHROP, G.M., WHITE, R., WASMUTH, J.J., WOOD, S., BERDAHL, L.D., LEYSENS, N.J., RITTY, T.M., WISE, M.E., AND MURRAY, J. (1991). A detailed multipoint map of human chromosome 4 provides evidence for linkage heterogeneity and position-specific recombination rates. Am. J. Hum. Genet. 48: 911-925.

CARLOCK, L., WISNIEWSKI, M., PANDRANGI, A., AND VO, T.
(1992). An estimate of the number of genes in the Huntington disease gene region and the identification of 13 transcripts in the 4p16.3 segment. Genomics 13: 1108-1118.

COLLINS, C., HUTCHINSON, G., KOWBEL, D., RIESS, O., WEBER, B., AND HAYDEN, M.R. (1992). The human β-subunit of rod photoreceptor cGMP phosphodiesterase: complete retinal cDNA sequence and evidence for expression in brain. Genomics 13: 698-704.

COLLINS, C., SCHAPPERT, K., AND HAYDEN, M.R. (1992). The genomic organization of a novel regulatory myosin light chain gene (MYL5) that maps to chromosome 4p16.3 and shows different patterns of expression between primates. Hum. Mol. Genet. 1: 727-733.

HAYDEN, M.R. (1980). Huntington’s chorea. Springer-Verlag, New York.

HENIKOFF, S., AND DREESEN, T.D. (1989). Trans-inactivation of the Drosophila brown gene: evidence for transcriptional repression and somatic pairing dependence. Proc. Natl. Acad. Sci. USA 86: 6704-6708.

KENDALL, E., SARGENT, C.A., AND CAMPBELL, R.D. (1990). Human major histocompatibility complex contains a new cluster of genes between the HLA-D and complement C4 loci. Nucleic Acids Res. 18: 7251-7257.

KORENBERG, J.R., AND RYKOWSKI, M.C. (1988). Human genome organization: Alu, Lines and the molecular structure of metaphase chromosome bands. Cell 53: 391-400.

LAIRD, C.D. (1990). Proposed genetic basis of Huntington’s disease. Trends Genet. 6: 242-247.

LIPKIN, V.M., KHRAMTSOV, N.V., VASILEVSKAYA, I.A., ATABEKOVA, K.G., MURADOV, Li, T., JOHNSTON, J.P., VOLPP, K.J., AND APPLEBURY, M.L. (1990). β-subunit of bovine rod photoreceptor cGMP phosphodiesterase. J. Biol. Chem. 263: 12955-12959.

MARTIN-GALLARDO, A., MCCOMBIE, W.R., GOCAYNE, J.D., FITZGERALD, M.G., WALLACE, S., LEE, B.M.B., LAMBERDIN, J., TRAPP, S., KELLEY, J.M., LIU, L-I., DUBNICK, M., JOHNSTON-DOW, L.A., KERLAVAGE, A.R., DE JONG, P., CARRANO, A., FIELDS, C., AND VENTER, J.C. (1992). Automated DNA sequencing and analysis of 106 kilobases from human chromosome 19q13.3.
Nature (Genetics) 1: 34-39.

McCOMBIE, W.R., MARTIN-GALLARDO, A., GOCAYNE, J.D., FITZGERALD, M., DUBNICK, M., KELLEY, J.M., CASTILLA, L., LIU, L.I., WALLACE, S., TRAPP, S., TAGLE, D., WHALEY, W.L., CHENG, S., GUSELLA, J., FRISCHAUF, A.-M., POUSTKA, A., LEHRACH, H., COLLINS, F.S., KERLAVAGE, A.R., FIELDS, C., AND VENTER, J.C. (1992). Expressed genes, Alu repeats and polymorphisms in cosmids sequenced from chromosome 4p16.3. Nature (Genetics) 1: 348-353.

ORITA, M., SUZUKI, Y., SEKIYA, T., AND HAYASHI, K. (1989). Rapid and sensitive detection of point mutations and DNA polymorphisms using the polymerase chain reaction. Genomics 5: 874-879.

PRITCHARD, C.A., CASHER, D., UGLUM, E., COX, D.R., AND MYERS, R.M. (1989). Isolation and field inversion gel electrophoresis analysis of DNA markers located close to the Huntington’s disease gene. Genomics 4: 408-418.

STALLINGS, R.L., TORNEY, D.C., HILDEBRAND, G.E., LONGMIRE, J., DEAVEN, L., JETT, J., DOGGET, N., AND MOYZIS, R. (1990). Physical mapping of human chromosomes by repetitive sequence fingerprinting. Proc. Natl. Acad. Sci. USA 87: 6218-6222.

STEVENS, D.L., AND PARSONAGE, M. (1969). Mutation in Huntington’s chorea. J. Neurol. Neurosurg. Psychiatry 32: 140-143.

WEBER, B., COLLINS, C., KOWBEL, D., RIESS, O., AND HAYDEN, M.R. (1991). Identification of multiple CpG islands and associated conserved sequences in a candidate region for the Huntington disease gene. Genomics 11: 1113-1124.

WEBER, B., HEDRICK, A., ANDREW, S., RIESS, O., COLLINS, C., KOWBEL, D., AND HAYDEN, M.R. (1991). Isolation and characterization of new highly polymorphic DNA markers from a candidate region for the Huntington disease gene. Am. J. Hum. Genet. 50: 382-393.

WEBER, B., RIESS, O., HUTCHINSON, G., COLLINS, C., BIAQYANG, L., KOWBEL, D., ANDREW, S., SCHAPPERT, K., AND HAYDEN, M.R. (1991). Genomic organization and complete sequence of the human gene encoding the β-subunit of the cGMP phosphodiesterase and its localisation to 4p16.3. Nucleic Acids Res.
19: 6263-6268.

WEBER, B., RIESS, O., WOLFF, G., ANDREW, S., COLLINS, C., GRAHAM, R., THEILMANN, J., AND HAYDEN, M.R. (1992). Delineation of a 50 kilobase DNA segment containing the recombination site in a sporadic case of Huntington’s disease. Nature (Genetics) 2: 216-222.

WOLFF, G., DEUSCHL, G., WIENKER, T.F., HUMMEL, K., BENDER, K., LUCKING, C.H., SCHUMACHER, M., HAMMER, J., AND OEPEN, G. (1989). New mutation to Huntington’s disease. J. Med. Genet. 26: 18-27.

Chapter 4
Identification and Characterization of the Gene Encoding for the β-subunit of Rod Photoreceptor cGMP Phosphodiesterase

4.1 Introduction

Rod cell cGMP PDE is a heterotrimeric peripheral membrane-bound protein composed of a catalytic α- and β-subunit and two inhibitory γ-subunits (Baehr et al., 1979; Deterre et al., 1988; Fung et al., 1990). Expression of the cGMP phosphodiesterase β-subunit gene has previously been described in rod cells of the mouse and bovine retina (Bowes et al., 1990; Lipkin et al., 1990). Rods are highly specialized retinal cells which function in the detection of photons and are able to convert a single photon into a neural signal. The rod proteins rhodopsin, transducin and cGMP phosphodiesterase function together in modulating the phototransduction cascade. The absorption of a photon causes photoexcited rhodopsin to interact with the GTP-binding protein transducin, which in turn disinhibits cGMP phosphodiesterase. Activated cGMP phosphodiesterase then hydrolyzes 3’,5’-cyclic GMP to 5’-GMP, and intracellular cGMP levels decline. Reduced cytoplasmic cGMP results in the closure of rod plasma membrane cation channels and a transient hyperpolarization of the plasma membrane (Hurley, 1987; Stryer, 1986).

In the course of characterizing the 50 kb HD candidate region II (Fig. 1-6), delineated by Weber et al. (1992), the gene encoding the β-subunit of the rod photoreceptor cGMP phosphodiesterase (PDEB) was identified.
I report here on the isolation of multiple cDNAs spanning this gene, the full length retinal cDNA sequence, evidence for alternate splicing and the genomic organization of the PDEB gene. Moreover, it is shown for the first time that this gene is expressed in brain, suggesting that the encoded protein may play a role in cGMP-mediated signal transduction in the brain. Putative functional domains within the human protein have been identified by comparing the primary structure of the PDEB protein to the primary structures of other photoreceptor PDE proteins.

4.2.0 RESULTS

4.2.1 Transcript Identification and Isolation of cDNAs

A 50 kb segment of chromosome 4p16.3 thought to encode the HD gene has been genetically defined (Weber et al., 1992; Andrew et al., 1992) and cloned by chromosome walking (Weber et al., 1991). Subsequently, the 50 kb interval was scrutinized for the presence of encoded genes through the identification of CpG islands and by zoo blot analysis (Chapter 3). Significantly, a 10 kb EcoRI fragment DpE2, probe (E) in figure 4-1, located approximately 3 kb distal to CpG island II, was found to detect phylogenetic conservation (Chapter 3). Moreover, this probe (E), together with a probe isolated from fragment (E) by inter-Alu PCR (Weber et al., 1991b), was hybridized to a caudate cDNA library, and six positive clones were identified and isolated.

One of these cDNA clones (K1) was used as a hybridization probe on a Northern blot of human RNA isolated from retina, frontal cortex, basal ganglia, caudate, skeletal muscle, lung and adrenal gland. A prominent signal of 3.5 kb and a less abundant 4.5 kb transcript were observed in retinal RNA following an overnight exposure (Fig. 4-2). Following a 14 day exposure, hybridization signals became evident in frontal cortex, basal ganglia and caudate at approximately 2.9 kb. No signal was observed in RNA from skeletal muscle, lung or adrenal gland (Fig.
4-2).

In order to isolate additional cDNAs corresponding to the transcripts observed on the Northern blot, an adult retinal cDNA library, a fetal eye (14 wk) cDNA library, a fetal brain (14 wk) cDNA library, an adult caudate cDNA library and an adult putamen cDNA library were screened with the genomic probes shown in figure 4-1. The number of cDNAs identified upon screening all five cDNA libraries is shown in figure 4-1. Six cDNAs were isolated from the caudate cDNA library, and five of these were determined to be identical (data not shown). Two cDNAs were isolated from the fetal brain cDNA library, and restriction mapping revealed that these were independent but similar clones. The largest of these was sequenced for mapping relative to the retinal cDNA sequence. Screening of the adult retina cDNA library with the indicated probes resulted in a set of 6 overlapping cDNAs which together span the PDEB gene (Fig. 4-3).

Figure 4-1
The Number of Hybridization Signals Identified Per 1 x 10^6 PFU Hybridized With Genomic Fragment A, B, C & D, E (shown below)

cDNA Libraries Hybridized    A     B     C & D    E
Adult Retina                 35    0     35       153
Fetal Eye                    -     0     0        30
Fetal Brain                  -     0     0        2
Caudate                      -     0     0        6
Putamen                      0     0

[Map: genomic fragments A, B, C, D and E, ordered from telomere to centromere]
The results of screening 5 human cDNA libraries with radiolabelled genomic fragments A, B, C+D and E. Libraries that were not screened with a particular probe are indicated by (-). Hybridization probe B, a 7.8 kb cosmid end-fragment, is indicated by a bold line.

4.2.2 The Full Length Sequence of the PDEB cDNA

The sequence of the full length cDNA for the PDEB gene is shown in figure 4-4. The translation initiation codon is underlined.
Support for this being the true translation initiation site is derived from the alignment of the full length human β-subunit cDNA sequence to the bovine and mouse β-subunit cDNA sequences, from the alignment of the deduced human β-subunit amino acid sequence to those of mouse and cow, and from the nucleotide context, which conforms to the Kozak consensus sequence (Bowes et al., 1990; Lipkin et al., 1990; Kozak, 1987). A single open reading frame extends from the proposed start codon to the first termination codon at position 2565. An Alu element is present 130 bp downstream from the translation termination codon and in the opposite orientation to the transcription of the PDEB gene.

Figure 4-2
Northern blot of total human RNA from retina (5 µg) and caudate (20 µg) and 2 µg each of polyA+ RNA from frontal cortex, skeletal muscle, adrenal glands, lung and basal ganglia. The signals observed were obtained following a 14 day exposure. A photograph of the ethidium bromide stained gel is shown below the autoradiograph.

Figure 4-3
[Diagram: adult retina cDNAs (A.R.), brain cDNAs (caudate E2 and K1; fetal brain FB1) and fetal retina cDNAs (FR1, FR2) aligned beneath the restriction map]
Restriction map of the complete PDEB coding region (box) and the 5’ and 3’ untranslated regions (line) determined from overlapping adult retina cDNAs, and the relationship of the adult retinal, fetal retinal, caudate and fetal brain cDNAs to the restriction map. The cDNA E2 is incompletely processed and contains intronic sequence indicated by the zig-zag line. The cDNA FB1 was mapped by sequencing both its ends. One end maps to the 3’ untranslated region and one end represents a cDNA cloning artifact (zig-zag). Triangles in FR1 and FR2 indicate sites of apparent alternate splicing.

No direct repeats flanking the Alu element are discernible in the sequence. The Alu element is underlined in figure 4-4.
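The two sequence-based criteria mentioned above, an open reading frame running from the proposed start codon to the first in-frame termination codon and a start-codon context resembling the Kozak consensus, can be checked with a few lines of code. The following is a minimal sketch on a toy sequence, not the PDEB cDNA; the function names are hypothetical, and the Kozak check is reduced to the two positions usually considered most important (a purine at -3 and a G at +4, counting the A of ATG as +1). Positions in the code are 0-based, unlike the 1-based positions quoted in the text.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_orf(cdna, start=None):
    """Return (atg_index, stop_index) for the reading frame that begins at
    `start` (default: the first ATG) and ends at the first in-frame stop
    codon. Indices are 0-based; returns None if no ATG or stop is found."""
    cdna = cdna.upper()
    if start is None:
        start = cdna.find("ATG")
    if start < 0:
        return None
    # Walk codon by codon in the frame set by the start codon.
    for i in range(start, len(cdna) - 2, 3):
        if cdna[i:i + 3] in STOP_CODONS:
            return start, i
    return None


def kozak_like(cdna, atg):
    """Report (purine at -3, G at +4) for the ATG at index `atg`."""
    cdna = cdna.upper()
    purine_at_minus3 = atg >= 3 and cdna[atg - 3] in "AG"
    g_at_plus4 = atg + 3 < len(cdna) and cdna[atg + 3] == "G"
    return purine_at_minus3, g_at_plus4


# Toy sequence (an assumption for illustration): GCCACC ATG GCT TGA AAA
toy = "GCCACCATGGCTTGAAAA"
orf = first_orf(toy)
context = kozak_like(toy, orf[0])
```

A check of this kind only supplements the comparative evidence cited in the text (alignment with the bovine and mouse sequences), since many ATG codons in a cDNA satisfy the context rule by chance.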
A potentialpolyadenylation signal, AATAAA, at position 3047 is underlined;however, it is unlikely that this is a functional polyadenylation signalsince no polyA tail has been identified. A computer generatedrestriction map of the full length PDEB cDNA is shown in figure 4- Genomic Organization of the PDEB GeneHybridization of cDNA clones AR-i and AR-2, which togethercomprise the full length PDEB cDNA sequence, to Southern blotscontaining the cosmids forming the 126 kb contig revealed that thePOEB gene spans approximately 40 kb of chromosome 4p16.3. cDNA AR2was found to be encoded for by the 10 kb EcoRl fragment DpE2 (data notshown), whereas, cDNA AR-i expands telomeric, from the 4 kb EcoRlfragment shared by cosmids Dp and cL3, to the 9 kb EcoRl fragment ofcosmid cL4 (Fig. 4-5). Importantly, exon one of the PDEB gene waslocalized to within this 9 kb fragment by hybridization of anoligonucleotide that spans the PDEB gene’s initiation codon to aSouthern blot of the contig cosmids (Fig. 4-6). Restriction mapping andsequencing of this 9 kb EcoRl fragment further localized exon 1 of thePDEB gene to a 1.8 kb BamHl end-fragment and, moreover, revealed thatthis exon is 469 bp in length. The subsequent sequencing of the PDEBlocus by Weber et al. (1991) led to the identification of 22 exonsspanning 43 kb. The complete genomic organization of the PDEB gene isdiagrammatically represented in figure 4-7.92Figure 4-4HSLSEEQARSPLDQNPDPARQYPGKKLSPENVGRGCEDGCPPDCDSLRDLCQVEESTALLELVQDMQESINHERVVFKVLRRLCTLLQADRCSLPHYRQRNGVAELATRLFSVQPDSVLEDCLVPPDSEIVPPLDIGVVGHVAQTKKHVNVEDVAEC PH PS S FADELTDYKTKHI4LATP II4HGK DVVAV IHAVHKLHGPFFTSEDEDVPLKYLHFATLYLKIYHLSYLHHCETKRGQVLLW SAHKVFEE LTD I EHQPHKAFYYVRAYLHC ER?SVGL LDHTKEKEPPDVWSVLHCESQ PYSOP RTP DOES I VPYKV I DY LI HOSES 1KV! PTP SADHWALRSOL P SYVASSOPIC H! 
9(NASADEHPK PQ KOAL ODSOWL IKHVLSHP I VOlES £EIVOVATFYHRKDOKPPDEQDEVLHESL?QPLOWSVHHTDTYDKHHKLKHRKDIAQDHVLYHVKCDRDEIQLILPTRARLOKEPADCDEDELOEILKEELPOPTTPDIYEPHPSDLECTELDLVKCO I QSOYYELOVVRKFQ I PQEVLVRFLPS I 050TH HITYHHHHHGFHVAQTHPTLLHTOKLKSYYTDLEAPAHVYAOLCHDIDHKGTHHLYQNKSQHPLAKLHGSS ILEHHHLKFOKPLLSEETLHIYQHLHRKQHEHVIHLHDIAI IATDLALYPKSRAHPQKIVDESKHYQDKKSWVEVLSLETYRKEIVHAMHHTACDLSAITKPHEVQSEVALLVAAEPWEQODLEHTVLDQOP I PHHDRHKAAELPKLQVGPIDPVCTFVYKEPSRFHEE ILPHPDRLQHHRKEHKALADEYEAKVKALKEKEEEEKVAAEEVOTE I CHOOPAPKSSTC CI LGfltTGcCIOOCYATrIOCTACAAOAOOTTAOOAAGCCCHAOAMAItACItAAORItAntItOATADrPFAAnnrj-nin-n-nipziiiiuacA1-an’.C7CTOItACComplete nucleotide sequence of the fi—subunit of the human rod cGMPPDE cDNA with the encoded amino acid sequence shown below thenucleotide sequence. The translation initiation codon, terminationcodon, Alu element insertion and potential polyadenylation signal areunderlined.12134241743611144HZ1546011947212346412749HZ314109135412013941321434144147410615141HH155419015,41921434294167421617142241754240179425219342041276129813001312193Figure 4-52 3 4 5 6 7 8 9 109.0 kb.‘—‘4.2 kb—4.0 kb1*—3.0kb2.4 kbI1.5 kbRI RI RIRI RI RI RII 1111111 II il IMiul MIul NotiAR-i cDNA hybridized to a Southern blot of the five stable cosmids(Fig. 3-3), forming the 126 kb contig, digested with EcoRl+Notl andEcoRl +Mlul. Lane 2 cDl3, 3÷4 cD, 5+6 cDp, 7+8 cQBI and 9+10 cL4. Thesizes of the hybridizing restriction fragments are shown in kilobases.Two hybridizing cosmid end-fragments are demarcated with stars. Adiagramatic representation of the Southern blot data is shown belowthe autoradiograph. The minimum number and locations of the PDEBexons, suggested by the autoradiograph, are indicated by bold verticalbars.S94Figure 4-612345678910h•_.. .LLocalization of the PDEB genes first retinal exon to within the 9 kbEcoRl fragment of cosmid cL4. 
Hybridization of an end-labeled oligonucleotide (5’-CCTCACTGAGGCTCATGGTGCTGCC-3’), that spans the initiation codon of the PDEB cDNA, to a Southern blot of the contig cosmids. Single hybridization signals of approximately 9 kb (EcoRI+NotI) and 8.5 kb (EcoRI+MluI) are observed in the lanes containing cosmid cL4. Signals remaining from a previous hybridization are indicated.

Figure 4-7
Diagrammatic representation of the genomic organization of the PDEB gene. The 22 exons of the PDEB gene are indicated by vertical bars. The largest exons, 1 and 22, are represented as bold vertical bars. EcoRI sites are indicated by (RI). Restriction mapping the 9 kb EcoRI fragment, encoding exon 1, revealed restriction sites for EagI (E), SacII (S), MluI (M) and BamHI (B). Note that only one of three BamHI sites is shown. Arrows demarcate the interval hypothesized to encode the HD gene. Mapped interstitial telomeric repeats are represented by a stippled box.

4.2.4 Evidence for Alternate Processing
Alignment of the sequenced adult retinal cDNAs provides evidence that the primary PDEB gene transcript may be alternately processed in adult retina cells. Sequence comparison between cDNAs AR-1 and AR-3 identified twelve nucleotides (5’-TCCCTGACACA.-3’) present in cDNA AR-3 but deleted from cDNA AR-1 (Fig. 4-8). Interestingly, these twelve nucleotides conform to the acceptor splice site consensus sequence [(Y)6XAG] (Padgett et al., 1986) in 6 of 9 positions whereas the murine (Bowes et al., 1990) and bovine (Lipkin et al., 1990) sequences conform in 7 of 9 positions (Fig. 4-8). Alignment of the twelve nucleotides to the genomic PDEB sequence revealed that the 12 nucleotides encode the first four codons of exon 10 (Weber et al., 1991b).
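The counts of conforming positions quoted above can be reproduced with a short sketch. The Python below is illustrative only and is not part of the original analysis. It assumes that the nine compared bases are the last nine printed in Figure 4-8 (CCTGAACAG, as read from the figure text), that Y stands for C or T, and that the unconstrained X position always counts as conforming. The function name `acceptor_matches` is hypothetical.

```python
# Sketch: count the positions of a 9-mer that agree with the acceptor
# splice site consensus (Y)6XAG. Y = pyrimidine (C or T); X = any base.
PYRIMIDINES = {"C", "T"}


def acceptor_matches(nine_mer: str) -> int:
    """Return how many of the 9 positions conform to (Y)6XAG."""
    seq = nine_mer.upper()
    if len(seq) != 9:
        raise ValueError("expected exactly 9 nucleotides")
    matches = sum(base in PYRIMIDINES for base in seq[:6])  # six Y positions
    matches += 1                  # X position: any base conforms (assumption)
    matches += seq[7] == "A"      # invariant A
    matches += seq[8] == "G"      # invariant G
    return matches


# Assumed inputs, read from Figure 4-8: the human 9-mer, and the same
# 9-mer with the bracketed G replaced by C (murine and bovine sequences).
human = "CCTGAACAG"
murine_bovine = "CCTCAACAG"
print(acceptor_matches(human), acceptor_matches(murine_bovine))
```

Under these assumptions the sketch returns 6 for the human sequence and 7 for the murine and bovine sequence, in agreement with the counts given in the text.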
Cumulatively, these findings suggest that the in-frame deletion, identified in cDNA AR-1, resulted from utilization of an alternate acceptor splice site encoded in the first 12 nucleotides of exon 10 of the PDEB gene.

Additional evidence for alternate processing of the primary PDEB gene transcript was obtained upon alignment of the adult retinal and fetal retinal cDNA sequences. This alignment revealed a 65 bp deletion in exon 21 of the fetal retinal cDNA sequence, predicted to cause exon 22 to be translated out of frame. To further investigate this finding, PCR reactions were performed on adult retinal, fetal retinal, putamen, temporal cortex, fetal brain, and mouse brain cDNA libraries utilizing primers flanking the deleted interval. Two populations of molecules differing by approximately 65 bp were amplified from each library (Fig. 4-9). This result strongly suggests that exon 21 of the PDEB gene is alternately spliced in the human central nervous system including the retina and, moreover, that this alternate processing has been conserved in phylogeny.

Figure 4-8
The first 12 nucleotides of exon 10 predicted to encode a functional splice site acceptor sequence.
TCCCT[G]AACAG
The first 4 codons of exon 10 encoding 9 bp (bold) which conform to the consensus [(Y)6XAG] (Padgett et al., 1986) splice site acceptor in 6 (underlined) of 9 positions. Comparison of the 9 nucleotides to the murine and bovine sequences revealed a substitution of a C for the bracketed G in both these species. Moreover, these sequences conform to the consensus sequence in 7 of 9 positions.

Figure 4-9
Demonstration of alternate splicing in exon 21 of the PDEB gene.
[Gel image, lanes 1 to 10; products at 710 bp and 645 bp; schematic of exons 15, 21 and 22]
Result of PCR amplification from 7 different cDNA libraries utilizing primers flanking exon 21. Lane(s) 1. adult retina, 2. fetal retina, 3. fetal retina cDNA #3, 4. adult retina cDNA #2, 5. temporal cortex, 6. putamen, 7. fetal brain, 8. fetal brain (Clontech) and 9. mouse brain. Two amplification products, in each lane, differing by 65 bp are evident at 710 bp and 645 bp reflecting alternate splicing of the PDEB gene’s primary transcript. The position of the primers in exons 15 and 22 is indicated by arrows. The segment of exon 21 deleted by alternate splicing is indicated by the bent arrow.

4.2.5 Identification of Polymorphisms
A comparison of the sequenced PDEB cDNAs with one another and with the PDEB genomic sequence (Weber et al., 1991b) has identified 2 polymorphisms in the coding portion of the cDNA. The two silent substitutions are a CGG to CGT transversion in codon 553 (which codes for arginine) and a GTG to GTA transition in codon 835 (which codes for valine). In the 3’ untranslated region, three polymorphisms have been revealed. An insertion of a G occurs at position 2592, a G to A transition is found at position 2598, and the insertion of a T occurs within the polyT tail of the Alu element.

4.2.6 Amino Acid Alignments
The amino acid sequence deduced from the PDEB cDNA sequence is shown in figure 4-4. The predicted protein is composed of 854 amino acids and has a predicted molecular mass of 98.4 kDa. The amino acid alignment of the β-subunit of the human rod cGMP PDE with the α-subunit of human rod cGMP PDE (Pittler et al., 1990), the α-subunit of the bovine rod cGMP PDE (Ovchinnikov et al., 1987; Pittler et al., 1990), the β-subunit of bovine rod cGMP PDE (Lipkin et al., 1990) and the β-subunit of the mouse rod cGMP PDE (Bowes et al., 1990) is shown in figure 4-10. The amino acid sequence of the PDEB shares 90% identity with the β-subunit of the bovine rod cGMP PDE and 91% identity with the β-subunit of the mouse rod cGMP PDE. The least conservation occurs at the N- and C-terminal regions of the protein. The PDEB also shares 71% identity with the α-subunits of the human and bovine cGMP PDE.

Figure 4-10
[Multiple sequence alignment; not recoverable from the extracted text]
Amino acid sequence alignment of the human β-subunit of rod cGMP PDE (humcgprb), the mouse β-subunit of the rod cGMP PDE (mmpde), the cow β-subunit of rod cGMP PDE (bovcgmp), the human α-subunit of rod cGMP PDE (humcgpra), and the cow α-subunit of rod cGMP PDE. Shaded regions indicate amino acid identities in three or more members of the cGMP phosphodiesterases. Dashes represent spaces introduced to facilitate alignments.

4.3 DISCUSSION
Multiple overlapping PDEB cDNAs from a retinal cDNA library have been isolated and characterized. Together these span the full length of the PDEB gene. This gene is expressed at high levels in the retina with rare transcripts being detected in the frontal cortex and basal ganglia including the caudate nucleus. The deduced amino acid sequence of this protein has 91% and 90% identity with the β-subunits of the cGMP PDE of mouse (Bowes et al., 1990) and bovine (Lipkin et al., 1990) origin, respectively.
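Percent identity values of this kind are derived from a pairwise alignment. The sketch below is a minimal illustration and not the program used in this work. It assumes two pre-aligned sequences of equal length with "-" marking gaps, and it assumes that identity is reported over the columns in which neither sequence carries a gap (other conventions exist). The function name and the two toy sequences are hypothetical and are not taken from Figure 4-10.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue              # skip columns containing a gap
        compared += 1
        identical += a == b       # True counts as 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Toy alignment (hypothetical): 9 ungapped columns, 8 of them identical.
toy_a = "MSLSEE-QAR"
toy_b = "MSLTEEAQAR"
print(round(percent_identity(toy_a, toy_b), 1))
```

Counting only ungapped columns keeps terminal or internal gaps from lowering the score, which matters when the divergent N- and C-terminal regions are aligned with many inserted spaces.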
This similarity strongly supports the conclusion that the identified cDNAs encode the human PDEB gene.

Hybridization of cDNAs AR-1 and AR-2 to Southern blots of the cosmids forming the 126 kb contig revealed that the retinal exons encoded by the PDEB gene are distributed over approximately 40 kb of chromosome 4p16.3. Restriction mapping the 9 kb EcoRI fragment of cosmid cL4, encoding exon 1, localized exon 1 to within a 1.8 kb BamHI end-fragment and identified a putative CpG island, defined by single recognition sequences for SacII and EagI approximately 4.5 kb distal to exon 1 (Fig. 4-7). Sequencing of the PDEB locus by Weber et al. (1991b) led to the identification of 22 retinal exons ranging from 48 bp to 469 bp and 21 introns ranging from 80 bp to > 17 kb (Fig. 4-7). Interestingly, this 17 kb intron encodes CpG island II, which suggests the possibility that an as yet unidentified gene is encoded within this intron of the PDEB gene.

The overall amino acid identity between the α- (Pittler et al., 1990) and β-subunits of the human rod cGMP PDE is 71%. However, the homology between the proposed catalytic domain and the two proposed noncatalytic cGMP binding domains (Lipkin et al., 1990) for these subunits is 86% and 76%, respectively. This similarity further supports the conclusion that the reported gene is part of the phosphodiesterase gene family and suggests that the α- and β-subunit genes are homologous, having evolved from a common ancestral gene through a gene duplication event (Haldane, 1932; Muller, 1935; Ohno, 1970). A comparison of the genomic organization between the genes coding for the α-subunit of bovine rod cGMP PDE and the β-subunit of mouse cGMP PDE suggests that these two genes did indeed evolve from a common ancestral PDE gene (Pittler and Baehr, 1991b).

The greatest dissimilarity between the α- and β-subunits of the human rod cGMP PDE occurs at the N- and C-terminal regions (amino acid residues 1-85 and 824-854, respectively).
A possible functional significance for the divergence at the N-terminus is suggested by the recent finding that the inhibitory γ-subunit of the holoenzyme interacts with this region of the α-subunit (Oppert et al., 1991). The functional significance of the corresponding region in the β-subunit protein has not been biochemically defined.

The position of the functional domains in the PDEB protein can be derived from their corresponding position in the β-subunit of the bovine cGMP PDE protein (Lipkin et al., 1990). In this model there are two proposed noncatalytic cGMP binding domains, cGMP binding domain I and cGMP binding domain II (Charbonneau et al., 1990; Li et al., 1990), and a single proposed catalytic domain (Charbonneau et al., 1986; Ovchinnikov et al., 1987; Stroop et al., 1989; Oppert et al., 1991). The first noncatalytic binding domain is coded for by residues 89-251 and the second noncatalytic domain by residues 295-464. It has been suggested that the two noncatalytic domains are related as a tandem repeat (Charbonneau et al., 1990; Li et al., 1990) and in support of this we find a homology at the protein level, between the two domains, of 21%.

The position of the catalytic domain, which is the site of cGMP hydrolysis, has been assigned in the bovine protein to residues 555-790 (Lipkin et al., 1990) based upon strong sequence conservation between all members of the eucaryote PDE family (Charbonneau et al., 1986; Li et al., 1990; Ovchinnikov et al., 1987; Sass et al., 1986). The catalytic domain in the human homolog would then be assigned to residues 555-790 of the human protein based on this homology.

Finally, residues 850-854 form a conserved CAAX motif, in which C is cysteine, A is a mostly aliphatic residue, and X corresponds to the carboxy terminal amino acid.
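The motif definition can be expressed as a simple check on the four C-terminal residues. The sketch below is illustrative only. The function name is hypothetical, and only the cysteine four residues from the C-terminus is tested, since the two A positions are described as mostly, not strictly, aliphatic. The mapping from the X residue to a prenyl group (leucine favouring a geranylgeranyl group, other residues a farnesyl group) is a commonly cited rule of thumb that is assumed here; it is not a result of this thesis.

```python
from typing import Optional, Tuple


def caax_prenylation(protein: str) -> Optional[Tuple[str, str]]:
    """Return (motif, assumed prenyl group), or None if no CAAX-like tail.

    Only the cysteine at position -4 is enforced. The choice of prenyl
    group from the X residue is a simplified rule of thumb (assumption).
    """
    tail = protein.upper()[-4:]
    if len(tail) < 4 or tail[0] != "C":
        return None
    group = "geranylgeranyl (C20)" if tail[3] == "L" else "farnesyl (C15)"
    return tail, group


# Toy peptides (hypothetical), written in one-letter amino acid code.
print(caax_prenylation("MKTAYCCIL"))   # tail CCIL, X = L
print(caax_prenylation("MKTAYCVLS"))   # tail CVLS, X = S
print(caax_prenylation("MKTAY"))       # no cysteine at position -4
```

A check of this kind only flags candidate sequences; whether a given polypeptide is actually prenylated has to be established biochemically.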
These motifs have been shown to trigger post-translational enzymatic prenylation of the cysteine residue, thereby facilitating membrane association of the modified polypeptide (Vorburger et al., 1989; reviewed in Sinensky and Lutz, 1992). Prenylated polypeptides acquire hydrophobic isoprenoid tails consisting of either a C15 farnesyl moiety (Farnsworth et al., 1989) or a C20 geranylgeranyl moiety (Farnsworth et al., 1990; Rilling et al., 1990). Significantly, both the murine and human rod photoreceptor PDE β-subunits have identical C-terminal CAAX motifs (CCIL) which, in the mouse, trigger geranylgeranylation and subsequent pdeb membrane anchoring (Qin et al., 1992). It seems highly probable that the human PDEB polypeptide is similarly geranylgeranylated.

Northern blot hybridization clearly demonstrates that the PDEB gene is predominantly expressed in the retina but also revealed a smaller size transcript expressed in the frontal cortex and basal ganglia including the caudate. Two transcripts are observed in the retina, one giving a prominent band at 3.5 kb and the other a minor signal at approximately 4.5 kb. This larger, less abundant, transcript may represent a differentially or incompletely processed β-subunit transcript.

The observation, on a Northern blot, of a population of rare transcripts in brain RNA is supported by the identification of cDNAs in both caudate and fetal brain cDNA libraries. Fifty-nine base pairs of the caudate derived cDNA (E2) are identical to the last 60 coding nucleotides of the human PDEB gene. The single nucleotide difference corresponds to the aforementioned silent polymorphism in codon 835. In addition this cDNA spans 153 bp of the 3’-untranslated region. A second caudate cDNA (K1) shares complete sequence identity with nucleotides 1767-2485 of the proposed catalytic domain.

To the author’s knowledge this is the first direct evidence for the expression of the PDEB gene in a tissue other than retina.
The low level of expression observed on the Northern blot and reflected in the caudate and fetal brain cDNA libraries may be due to low levels of cellular expression. Alternatively, the low signal intensity may reflect a moderate to high level of expression in a very small subset of cells within a specific tissue.

The observation that PDEB transcripts are smaller in brain than in retina may reflect tissue specific processing of the primary PDEB gene transcript. However, if the population of smaller transcripts observed in brain RNA is indeed the product of brain specific alternate splicing, then nucleotides 1767-2485 and 2524-2588 are not involved, since the sequences of the two caudate derived cDNAs are identical to the retinal PDEB sequence except for the single polymorphism. This would exclude most of the catalytic domain and the C-terminal domain from brain specific alternate splicing. It is noteworthy, however, that the first three exons of the PDEB gene map distal to CpG island I and encode the noncatalytic cGMP binding domain I (Fig. 4-7). Transcription from a brain specific promoter at CpG island I ought to produce a transcript at least 700 bp shorter than the retinal message and lacking cGMP binding domain I. Because cGMP binding by the two noncatalytic cGMP binding domains increases PDEB catalytic activity approximately thirty-fold, a neuronal isozyme lacking domain I might exhibit a modified cGMP hydrolytic activity.

Sequence analysis suggests that alternate processing of the primary PDEB gene transcript may generate multiple PDEB isoenzymes throughout the CNS, including the retina. Alignment of the adult retinal and fetal retinal cDNA sequences identified a 65 bp deletion in exon 21 of the fetal retinal cDNA sequence. PCR from multiple independent cDNA libraries utilizing primers flanking the deletion identified a population of similarly deleted PDEB cDNAs in adult retina, fetal retina, human brain and mouse brain.
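Whether a deletion preserves the reading frame depends only on its length modulo three. The sketch below, with a hypothetical helper name, applies that arithmetic to the two deletions discussed in this chapter (12 bp in exon 10 and 65 bp in exon 21) and to the PCR product sizes reported in Figure 4-9. It is offered as an illustration of the reasoning, not as part of the original analysis.

```python
def frame_effect(deleted_bp: int) -> str:
    """Classify a deletion by its effect on the downstream reading frame."""
    if deleted_bp <= 0:
        raise ValueError("deletion length must be positive")
    remainder = deleted_bp % 3
    if remainder == 0:
        return f"in frame ({deleted_bp // 3} codons removed)"
    return f"frameshift ({remainder} nt beyond a whole number of codons)"


# Deletion sizes taken from the text of this chapter.
for label, size in (("exon 10 deletion", 12), ("exon 21 deletion", 65)):
    print(label, size, "bp:", frame_effect(size))

# Expected size of the shorter PCR product in Figure 4-9.
full_length_product = 710
print(full_length_product - 65)
```

The 12 bp deletion removes exactly four codons and leaves the downstream frame intact, whereas 65 is not a multiple of three, so every codon downstream of the exon 21 deletion, including those of exon 22, is read in a shifted frame.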
The exon 21 deletion is predicted to result in exon 22 being read out of frame with the subsequent loss of the exon 22 encoded isoprenylation signal. This finding raises the intriguing possibility that alternate splicing of the primary PDEB gene transcript may generate both membrane associated and cytoplasmic PDEB isoenzymes.

Sequence alignments of cDNA AR-1 to cDNA AR-3 and to the murine and bovine PDEB homologs revealed a 12 bp deletion in AR-1 cDNA. Significantly, these 12 nucleotides conform to the acceptor splice site consensus sequence [(Y)6XAG] (Padgett et al., 1986) in 6 of 9 positions. This finding suggests that the putative PDEB isoform may result from utilization of an alternate splice site acceptor sequence encoded within the first 12 nucleotides of exon 10 of the PDEB gene. Support for this hypothesis comes from analysis of cDNAs from the prosaposin gene where the low frequency utilization of an intronic cryptic splice site leads to the inclusion of 9 nucleotides in the mature message (Rorman et al., 1992). No biological significance is postulated for this apparent PDEB isozyme, which may simply reflect the probabilistic nature of the eucaryotic splicing machinery.

It is noteworthy that the PDEB gene is expressed in both retina and brain. Recent indirect evidence that the PDEB gene may be expressed in the brain comes from the finding of a reduced number of granule cells in the hippocampal dentate gyrus of the rd mouse (Wimer et al., 1991). The PDEB gene maps to one of the candidate regions which, based on genetic analysis, have been shown to likely contain the HD gene (MacDonald et al., 1989; Andrew et al., 1992; Weber et al., 1992). It is of interest to consider the significance this could have for the underlying pathogenesis of HD.

In retina, reduced PDE activity leads to the accumulation of cyclic GMP with subsequent neurotoxic effects culminating in selective photoreceptor cell death (Farber et al., 1974; reviewed in Pittler and Baehr, 1991a).
Theoretically, defective caudate PDEB activity could also lead to an increase in cyclic GMP and the subsequent selective neuronal death observed in HD (reviewed in Vonstattel et al., 1985). One might then expect patients with HD to present with retinal problems; however, this has not been found. Nevertheless, the absence of retinal pathology in patients with HD could reflect different tissue vulnerabilities or the effects of alternatively spliced gene products with differential cytotoxic effects.

To test the hypothesis that an unequal recombination in the PDEB gene was, itself, responsible for the sporadic case of HD reported by Wolff et al. (1992), the PDEB locus was exhaustively assessed for structural alterations in the sporadic case and in numerous other HD patients (Riess et al., 1992). Initially, Riess et al. (1992) screened Southern blots containing genomic DNA digested with EcoRI, BamHI, SstI, and PstI from 200 unrelated patients with cosmid subclones spanning the entire PDEB gene. No genomic rearrangements were revealed by Southern blot analysis; therefore, SSCP analysis was employed to investigate the PDEB gene’s 22 exons including 196 bp of 5’ flanking sequence.

No structural rearrangements were identified upon assessing all 22 exons from 16 unrelated HD patients, comprising 11 well defined ethnic ancestries, and 31 controls (Riess et al., 1992). Furthermore, the first three PDEB exons map distal to delMC (D4S228) and, therefore, outside the 50 kb HD candidate region defined by Weber et al. (1992). The significance of this observation is twofold. First, it excludes the hypothesized unequal recombination from having occurred in a 5’ regulatory element of the PDEB gene.
Second, it similarly excludes the postulated mutation from having disrupted either a brain specific promoter element or a brain specific exon associated with CpG island I in the 17 kb intron.

The absence of any HD-specific structural rearrangements within the PDEB gene strongly supports the exclusion of mutations in the PDEB gene as the cause for HD (Riess et al., 1992; Weber et al., 1992). Unfortunately, Riess et al. (1992) could not absolutely rule out mutations in either a brain specific PDEB exon(s) or an intronic regulatory element. If, however, the PDEB gene were in fact the HD locus, it seems reasonable to expect HD causing mutations to have occurred in at least one of the 22 assessed retinal exons.

The mutation causing autosomal recessive retinal degeneration (rd) in the mouse has recently been mapped to the gene encoding the β-subunit of rod cGMP PDE (Bowes et al., 1990; Pittler and Baehr, 1991b). Significantly, the cloning of the PDEB gene will facilitate its assessment, through the candidate gene approach, for a role in the molecular pathology of retinal disease.

4.4 REFERENCES

ANDREW, S., THEILMANN, J., HEDRICK, A., MAH, D., WEBER, B., AND HAYDEN, M.R. (1992). Nonrandom association between Huntington disease and two loci separated by about 3 Mbp on 4p16.3. Genomics 13: 301-311.

BAEHR, W., CHAMPAGNE, M.S., LEE, A.K., AND PITTLER, S.J. (1991). Complete cDNA sequence of mouse rod photoreceptor cGMP phosphodiesterase α- and β-subunits, and identification of β’-, a putative β-subunit isozyme produced by alternative splicing of the β-subunit gene. FEBS Lett. 278: 107-114.

BAEHR, W., DEVLIN, M.J., AND APPLEBURY, M.L. (1979). Isolation and characterization of cGMP phosphodiesterase from bovine rod outer segments. J. Biol. Chem. 245: 11669-11677.

BOWES, C., TIANSEN, L., DANCIGER, M., BAXTER, L.C., APPLEBURY, M.L., AND FARBER, D.B. (1990). Retinal degeneration in the rd mouse is caused by a defect in the β-subunit of rod cGMP-phosphodiesterase. Nature 347: 677-680.

CHARBONNEAU, H., BEIER, N., WALSH, K.A., AND BEAVO, J.A. (1986). Identification of a conserved domain among cyclic nucleotide phosphodiesterases from diverse species. Proc. Natl. Acad. Sci. USA 83: 9308-9312.

CHARBONNEAU, H., PRUSTI, R.K., LETRONG, H., SONNENBURG, W.K., MULLANEY, P.J., WALSH, K.A., AND BEAVO, J.A. (1990). Identification of a noncatalytic cGMP-binding domain conserved in both the cGMP-stimulated and photoreceptor cyclic nucleotide phosphodiesterases. Proc. Natl. Acad. Sci. USA 87: 288-292.

DAVIS, R.L., AND DAUWALDER, B. (1991). The Drosophila dunce locus: learning and memory genes in the fly. Trends Genet. 7: 224-229.

DAVIS, R.L., TAKAYASU, H., EBERWINE, M., AND MYRES, J. (1989). Cloning and characterization of mammalian homologs of the Drosophila dunce gene. Proc. Natl. Acad. Sci. USA 86: 3604-3608.

DETERRE, P., BIGAY, J., FORQUET, F., ROBERT, M., AND CHABRE, M. (1988). cGMP phosphodiesterase of retinal rods is regulated by two inhibitory subunits. Proc. Natl. Acad. Sci. USA 85: 2424-2428.

FARNSWORTH, C.C., WOLDA, S.L., GELB, M.H., AND GLOMSET, J.A. (1989). Human lamin B contains a farnesylated cysteine residue. J. Biol. Chem. 264: 20422-20429.

FARNSWORTH, C.C., GELB, M.H., AND GLOMSET, J.A. (1990). Identification of geranylgeranyl-modified proteins in HeLa cells. Science 247: 320-322.

FUNG, B.K., YOUNG, J.F., YAMANE, H.K., AND GRISWOLD-PRENNER, I. (1990). Subunit stoichiometry of retinal rod cGMP phosphodiesterase. Biochemistry 29: 2657-2664.

HALDANE, J.B.S. (1932). The causes of evolution. Longmans and Green, London.

HURLEY, B.H. (1987). Molecular properties of the cGMP cascade of vertebrate photoreceptors. Ann. Rev. Physiol. 49: 793-812.

KOZAK, M. (1987). An analysis of 5’-noncoding sequences from 699 vertebrate messenger RNAs. Nucleic Acids Res. 15: 8125-8148.

LI, T., VOLPP, K., AND APPLEBURY, M.L. (1990). Bovine cone photoreceptor cGMP phosphodiesterase structure deduced from a cDNA clone. Proc. Natl. Acad. Sci. USA 87: 293-297.

LIPKIN, V.M., KHRAMTSOV, N.V., VASILEVSKAYA, I.A., ATABEKOVA, K.G., MURADOV, Li, T., JOHNSTON, J.P., VOLPP, K.J., AND APPLEBURY, M.L. (1990). β-subunit of bovine rod photoreceptor cGMP phosphodiesterase. J. Biol. Chem. 263: 12955-12959.

MACDONALD, M.E., HAINES, J.L., ZIMMER, M., CHENG, S.V., YOUNGMAN, S., WHALEY, W.L., WEXLER, N., BUCAN, M., ALLITO, B.A., SMITH, B., LEAVITT, J., POUSTKA, A., HARPER, P., LEHRACH, H., WASMUTH, J.J., FRISCHAUF, A.M., AND GUSELLA, J.F. (1989). Recombination events suggest potential sites for the Huntington’s disease gene. Neuron 3: 183-190.

MULLER, H.J. (1935). The origination of chromatin deficiencies as minute deletions subject to insertion elsewhere. Genetics 17: 237-252.

OHNO, S. (1970). Evolution by gene duplication. Springer-Verlag, Berlin.

OPPERT, B., CUNNICK, J.M., HURT, D., AND TAKEMOTO, D.J. (1991). Identification of the retinal cyclic GMP phosphodiesterase inhibitory γ-subunit interaction sites on the catalytic α-subunit. J. Biol. Chem. 266: 16607-16613.

OVCHINNIKOV, Y.A., GUBANOV, N.V., KHRAMTSOV, K.A., ISCHENKO, V.E., ZAGRANICHNY, V.E., MURADOV, K.G., SHUVAEVA, T.M., AND LIPKIN, V.M. (1987). Cyclic GMP phosphodiesterase from bovine retina: Amino acid sequence of the α-subunit and nucleotide sequence of the corresponding cDNA. FEBS Lett. 223: 169-173.

PADGETT, R.A., GRABOWSKI, P.J., KONARSKA, M.M., SEILER, S., AND SHARP, P.A. (1986). Splicing of messenger RNA precursors. Ann. Rev. Biochem. 55: 1119-1150.

PITTLER, S.J., AND BAEHR, W. (1991a). The molecular genetics of retinal photoreceptor proteins involved in cGMP metabolism. Prog. Clin. Biol. Res. 362: 33-66.

PITTLER, S.J., AND BAEHR, W. (1991b). Identification of a nonsense mutation in the rod photoreceptor cGMP phosphodiesterase β-subunit gene of the rd mouse. Proc. Natl. Acad. Sci. USA 88: 8322-8326.

PITTLER, S.J., BAEHR, W., WASMUTH, J.J., MCCONNELL, D.G., CHAMPAGNE, M.S., VANTUINEN, P., LEDBETTER, D., AND DAVIS, R.L. (1990). Molecular characterization of human and bovine rod photoreceptor cGMP phosphodiesterase α-subunit and chromosomal localization of the human gene. Genomics 6: 272-283.

QIN, N., PITTLER, S.J., AND BAEHR, W. (1992). In vitro isoprenylation and membrane association of mouse rod photoreceptor cGMP phosphodiesterase α and β subunits expressed in bacteria. J. Biol. Chem. 267: 8458-8463.

RIESS, O., NOERREMOELLE, A., COLLINS, C., MAH, D., WEBER, B., AND HAYDEN, M.R. (1992). Exclusion of DNA changes in the β-subunit of the cGMP phosphodiesterase gene as the cause for Huntington’s disease. Nature (Genetics) 1: 104-108.

RILLING, H.C., BREUNGER, E., EPSTEIN, W.W., AND CRAIN, P.F. (1990). Prenylated proteins, the structure of the isoprenoid group. Science 247: 318-320.

RORMAN, E.G., SCHEINKER, V., AND GRABOWSKI, G.A. (1992). Structure and evolution of the human prosaposin chromosomal gene. Genomics 13: 312-318.

SAMBROOK, J., FRITSCH, E.F., AND MANIATIS, T. (1989). “Molecular Cloning: A Laboratory Manual,” 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor, NY.

SASS, P., FIELD, J., NIKAWA, J., TODA, T., AND WIGLER, M. (1986). Cloning and characterization of the high-affinity cAMP phosphodiesterase of Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. USA 83: 9303-9307.

STROOP, S.D., CHARBONNEAU, H., AND BEAVO, J.A. (1989). Direct photolabelling of the cGMP-stimulated cyclic nucleotide phosphodiesterase. J. Biol. Chem. 264: 13718-13725.

STRYER, L. (1986). Cyclic GMP cascade of vision. Annu. Rev. Neurosci. 9: 87-119.

VONSTATTEL, J.P., MYERS, R.H., STEVEN, T.J., FERRANTE, R.J., BIRD, E.D., AND RICHARDSON, E.P. (1985). Neuropathological classification of Huntington’s disease. J. Neuropathol. Exp. Neurol. 44: 559-577.

VORBURGER, K., KITTEN, G.T., AND NIGG, E.A. (1989). Modification of nuclear lamin proteins by a mevalonic acid derivative occurs in reticulocyte lysates and requires the cysteine residue of the C-terminal CXXM motif. EMBO J. 8: 4007-4013.

WEBER, B., COLLINS, C., KOWBEL, D., RIESS, O., AND HAYDEN, M.R. (1991a). Identification of multiple CpG islands and associated conserved sequences in a candidate region for the Huntington disease gene. Genomics 11: 1113-1124.

WEBER, B., RIESS, O., HUTCHINSON, G., COLLINS, C., BIAOYANG, L., KOWBEL, D., ANDREW, S., SCHAPPERT, K., AND HAYDEN, M.R. (1991b). Genomic organization and complete sequence of the human gene encoding the β-subunit of the cGMP phosphodiesterase and its localization to 4p16.3. Nucleic Acids Res. 19: 6263-6268.

WEBER, B., HEDRICK, A., ANDREW, S., RIESS, O., COLLINS, C., KOWBEL, D., AND HAYDEN, M.R. (1992). Isolation and characterization of new highly polymorphic DNA markers from a candidate region for the Huntington disease gene. Am. J. Hum. Genet. 50: 382-393.

WEBER, B., RIESS, O., WOLFF, G., ANDREW, S., COLLINS, C., GRAHAM, R., THEILMANN, J., AND HAYDEN, M.R. (1992). Delineation of a 50 kilobase DNA segment containing the recombination site in a sporadic case of Huntington’s disease. Nature (Genetics) 2: 216-222.

WIMER, E.R., WIMER, C.C., ALAMEDDINE, L., AND COHEN, A.J. (1991). The mouse gene retinal degeneration (rd) may reduce the number of neurons present in the adult hippocampal dentate gyrus. Brain Res. 547: 275-278.

Chapter 5
The Genomic Organization of a Novel Regulatory Myosin Light Chain Gene (MYL5) that Maps to Chromosome 4p16.3 and Shows Different Patterns of Expression Between Primates

5.1.0 Introduction
The cellular motor protein myosin is a hexameric ATPase composed of two heavy chains (MHC), two nonphosphorylated alkali light chains (MLC-1/3) and two phosphorylatable regulatory light chains (MLC-2). Myosin has generally been classified as being of skeletal, smooth, cardiac or nonmuscle origin (Harrington and Rogers, 1984).
Each of the three myosin subunits is encoded by a multigene family that exhibits tissue specific, developmental, and physiologically regulated patterns of expression (Barton and Buckingham, 1985; Emerson and Bernstein, 1987). In humans the skeletal isoforms MYH1, MYH2, MYH3, MYH4 and MYH8 have been assigned to the short arm of chromosome 17; MYH5 to 7cen-q11.2; cardiac isoforms MYH6 and MYH7 to chromosome 14q11.2-q13; fast skeletal MLC1/3 (MYL1) to chromosome 2q32.1-qter; the ventricular muscle isoform MYL3 to chromosome 3p; the atrial muscle isoform MYL4 to chromosome 17; and the cardiac muscle-like MYLL1 to chromosome 8 (McAlpine et al., 1991). The gene encoding ventricular myosin regulatory light chain (MYL2) has recently been assigned to chromosome 12q23-q24.3 (Macera et al., 1992). The structurally different isoforms encoded by each of the multigene families presumably reflect the functional complexity of the myosin motor protein.

The myosin complex is found in all eucaryotic cells. In muscle cells myosin plays a central role in the contractile process (Emerson and Bernstein, 1987) and in nonmuscle cells it has been implicated in cellular locomotion, cytoplasmic streaming, cytokinesis, the mobility of cell surface receptors, photoreceptor disc morphogenesis and synaptic plasticity (Pollard, 1981; Pasternak et al., 1989; Morales and Fifkova, 1989; Miller et al., 1992; Williams, 1991). Perhaps as a consequence of the stoichiometric assembly of these isomorphic myosin complexes, myosin mutations are often dominant (Geisterfer-Lowrance et al., 1990; Emerson and Bernstein, 1987). Significantly, in the process of characterizing the 50 kb interval of chromosome 4p16.3, hypothesized to encode the HD gene, a novel myosin regulatory light chain was identified.

5.2.0 RESULTS

5.2.1 Transcript Identification

Characterization of the cosmids spanning the 50 kb HD candidate region, defined by Weber et al. (1992), revealed phylogenetically conserved DNA associated with CpG island II (Figs.
3-2 and 3-13). Moreover, Northern analysis with the single copy restriction fragment BS-1 (Andrew et al., 1992) (Fig. 5-1), which detects the cross-species conservation, revealed an abundant 600 bp transcript in RNA extracted from African Green Monkey skeletal muscle (Fig. 5-2). Interestingly, and in striking contrast to the simian expression pattern, the transcript was not found to be expressed at detectable levels in human adult muscle. It was, however, found to be weakly expressed in human fetal skeletal muscle (Fig. 5-2). Additional transcripts are evident in each lane of figure 5-2. These were interpreted to represent probe cross hybridization to transcripts encoded by related genes. Nonspecific cross hybridization to the 28S RNA is also evident. In addition to the tissues assessed in figure 5-2, Northern blot analysis revealed no evidence for a detectable level of transcription in the cerebellum, occipital and parietal cortex, adrenal, thyroid, stomach, heart, testis, or jejunum of the African Green Monkey (data not shown).

Figure 5-1. The 15 kb cosmid end fragment within which the MYL5 gene was localized. A CpG island occurs approximately 3.5 kb telomeric to the MYL5 gene. The seven encoded exons are represented by vertical bars with the noncoding regions of exons 1 and 7 unshaded. The genomic fragment BS-1 used for both Northern analysis and cDNA library screening is drawn as a line below the exons. Mapped restriction sites are shown.

Figure 5-2. Northern blot of total RNA from (1) adult African Green Monkey skeletal muscle, (2) human 20 wk fetal skeletal muscle, and (3) human adult skeletal muscle. Lanes 2 and 3 are overloaded relative to lane 1. Transcripts are evident in both lanes 1 and 2 at approximately 600 bp. Nonspecific cross hybridization to the 28S RNA is seen in each lane.

5.2.2 Isolation of cDNAs

A human fetal (15 wk) skeletal muscle cDNA library was screened with the probe BS-1 to isolate cDNAs corresponding to the transcript observed in the human fetal skeletal muscle RNA. Two positive cDNA clones, fetal muscle-1A and fetal muscle-3A (Fig. 5-3), were isolated from the fetal muscle cDNA library and sequenced. Compared to the transcript observed in figure 5-2 it is apparent that these cDNAs are not full length. Therefore, to obtain a full length cDNA, inside out PCR (Rosenberg et al., 1991) was performed utilizing primer p1 (Table 5-1 and Fig. 5-3) in combination with a lambda gt11 primer (see Materials and Methods). This resulted in the isolation of 5’ extension cDNAs, one of which, FM-PCR10, spans the predicted start codon (Fig. 5-3). In order to determine the representation of this gene in different cDNA libraries, adult retinal, basal ganglia, and cerebellum cDNA libraries were screened by PCR using primers p2 and p3 (Table 5-1 and Fig. 5-3). Thirty cycles of amplification resulted in a strong 311 bp amplification product from the retinal cDNA library and, following 40 cycles, amplification products of the expected size were also obtained from the cerebellum and basal ganglia cDNA libraries (Fig. 5-4). The adult retinal cDNA library was subsequently hybridized with the probe BS-1 and 30 positive cDNAs were identified. One of these cDNAs, adult retina-3, was determined to be 661 bp in length and was presumed to be full length.

Figure 5-3. Restriction map of the MYL5 cDNA. The coding region is drawn as a box and the noncoding regions as lines. Arrowheads indicate the polarity of each PCR and sequencing primer (Table 5-1 and Fig. 5-5). Sequenced cDNAs (Adult Retina-3, Fetal Muscle-1A, Fetal Muscle-3A, FM-PCR10, FM-PCR12 and FM-PCR11) are shown as lines.

Table 5-1. cDNA Sequencing and PCR Primers

Primer   Sequence                         Annealing Temp. (°C)
p1a      5’ GAGCTTAAGATGGGCAAGACCTGGGG    72
p1       3’ GTGGATACGGAGGGACCCGTTCTGG     62
p2       5’ CACCTATGCCTCCCTGGGCAAGACC     72
p3       3’ CCCGTTGGACCTGATGTTCCGCGAG     72

Figure 5-4. The result of screening human cDNA libraries by PCR using MYL5 primers p2 and p3 (Table 5-1, Figs. 5-3 and 5-5). Lanes: 1, 123 bp ladder (BRL); 2, adult retina; 3, fetal muscle; 4, basal ganglia; 5, cerebellum.

5.2.3 cDNA Sequence

The retinal cDNA sequence is shown in figure 5-5. Alignment of the fetal muscle and the adult retinal cDNA sequences showed that these cDNAs are identical and revealed no polymorphisms. A single open reading frame extending from an initiation codon at base 106 to a termination codon at base 625 and capable of encoding a 173 amino acid peptide was identified. The predicted initiation codon conforms to the Kozak consensus sequence (Kozak, 1987) and is preceded by two in frame termination codons. A polyadenylation signal AATAAA is encoded 15 bp 5’ from the polyadenylation site and is underlined in figure 5-5.
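The reading frame coordinates given above can be checked with a few lines of arithmetic. The following Python sketch is illustrative only and is not part of the original analysis: the helper name `orf_summary` is hypothetical, and it assumes that positions are 1-based and that base 625 is the first base of the termination codon, so that the coding sequence occupies bases 106 to 624.

```python
# Illustrative check of the open reading frame coordinates reported for the
# adult retina-3 cDNA. Assumptions (not stated explicitly in the text):
# positions are 1-based, and "termination codon at base 625" refers to the
# first base of the stop codon.

CDNA_LENGTH = 661   # reported length of adult retina-3 (bp)
START_POS = 106     # first base of the ATG initiation codon
STOP_POS = 625      # first base of the termination codon (assumed)


def orf_summary(start_pos: int, stop_pos: int, cdna_length: int) -> dict:
    """Return coding length, residue count and UTR lengths (hypothetical helper)."""
    coding_bp = stop_pos - start_pos  # coding bases, excluding the stop codon
    if coding_bp <= 0 or coding_bp % 3 != 0:
        raise ValueError("start and stop codons are not in the same frame")
    return {
        "coding_bp": coding_bp,
        "amino_acids": coding_bp // 3,
        "five_prime_utr_bp": start_pos - 1,
        # the 3' untranslated region begins after the three stop codon bases
        "three_prime_utr_bp": cdna_length - (stop_pos + 2),
    }


if __name__ == "__main__":
    for key, value in orf_summary(START_POS, STOP_POS, CDNA_LENGTH).items():
        print(f"{key}: {value}")
```

Under these assumptions the coding length is 519 bp, or 173 codons, in agreement with the 173 amino acid peptide predicted from the sequence; the remaining 105 bp and 34 bp would then correspond to the 5’ and 3’ untranslated regions.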
A Genbank database search found that the nucleotide sequence of these cDNAs shares significant sequence identity with both the human ventricular and rat cardiac myosin regulatory light chain genes (Libera et al., 1989; Kumar et al., 1989).

Figure 5-5. [Nucleotide sequence of the cDNA with the deduced amino acid sequence; the sequence itself is not legible in this copy.] Sequence of the MYL5 cDNA adult retina-3. The amino acid sequence of the encoded protein is shown below the nucleotide sequence. The four EF-hand motifs are boxed. Exon boundaries are demarcated by vertical lines and exons are numbered. Two in frame 5’ UTR stop codons, the ATG initiation codon and the polyadenylation signal are underlined and the termination codon is marked as a star. The location, in the cDNA sequence, of each primer (Table 5-1 and Fig. 5-3) is overlined.

Figure 5-6. [Alignment of the MYL5, HUMMLC2, RATMLC2 and DROMYLA protein sequences; the alignment itself is not legible in this copy.] Amino acid sequence alignment of the predicted MYL5 protein, human ventricular regulatory myosin light chain (HUMMLC2) protein, rat cardiac
regulatory myosin light chain (RATMLC2) protein, and Drosophila regulatory myosin light chain (DROMYLA) protein. Bold letters indicate amino acid identities in at least three of the proteins. A star indicates the serine at which phosphorylation is predicted to occur and the four EF-hand motifs are overlined and numbered.

5.2.4 Genomic Sequence

The entire gene was localized to a 15 kb EcoRI cosmid end fragment by hybridization of the adult retina-3 cDNA to a Southern blot of cosmids isolated during a chromosome walk in this region (Weber et al., 1991). To further map the gene, the 15 kb end fragment was digested with several restriction endonucleases, Southern blotted and hybridized with the retinal cDNA shown in figure 5-3. Two contiguous BamHI genomic fragments of about 3.6 kb and 2.0 kb hybridized to the cDNA and were subsequently subcloned and sequenced by the sequencing core facility of the Canadian Genetic Diseases Network. Sequence analysis of this 5.6 kb revealed a G+C content of 65% and no repetitive elements. An alignment of the cDNA sequence to the genomic sequence showed that the gene is composed of 7 exons spanning approximately 4 kb in the same transcriptional polarity as the PDEB gene (Weber et al., 1991). The 7 exons range in length from 49 bp to 136 bp, and the 6 introns range in length from 189 bp to 875 bp (Table 5-2). All of the predicted intron-exon junctions conform to the GT-AG rule (Table 5-2). In addition to sequencing the introns and exons, 865 bp of the 5’ flanking region and 800 bp of the 3’ flanking region have been sequenced. A computer survey of the 5’ flanking 865 bp failed to reveal the presence of either CAAT or TATA box regulatory elements common to other characterized MLC-2 genes (Nudel et al., 1984; Henderson et al., 1989; Winter and Arnold, 1987). A map of the 15 kb cosmid end fragment encoding the 7 exons and the previously localized CpG island is shown in figure 5-1.

Table 5-2. Exon-Intron Organization

No.   Exon (bp)   Intron (bp)   Sequence of Exon-Intron Junctions
1     108         622           GCgtgaagcagg
2     108         189           cccaccgcagGCCAG...AGGAGgtgagacttt
3     76          875           tggtccccagGCATT...CCTGGgtgaggtacc
4     105         486           tctgtaacagGCAAG...GAGCGgtaggcaccg
5     79          511           tttgtggcagGTACC...GAGTAgtgagtgccg
6     49          642           tcgtcctcagCATCA...AAGAGgtctggcccg
7     136                       cccgccgcagGTGGA

The initiation codon in exon 1 is underlined. Exonic sequences are shown in capital letters, intronic sequences are in lowercase. Intronic junction nucleotides conforming to the GT-AG rule are in bold.

5.3.0 Discussion

5.3.1 Genomic Organization

The complete genomic organization of a new gene encoding an MLC-2 isoprotein has been determined. This gene maps 700 kb from the chromosome 4p telomere, approximately 7 kb proximal to and in the same transcriptional polarity as the PDEB gene (Weber et al., 1991; Collins et al., 1992). Exon 1 encodes only the methionine initiation codon, a characteristic of the family of genes encoding calcium binding proteins which includes the regulatory myosin light chains (Emerson and Bernstein, 1987; Simmen et al., 1985).

A single 519 bp open reading frame encoding a predicted peptide of 173 amino acids was identified in the cDNA sequence. Evidence supporting this being the true reading frame was established by performing an amino acid alignment between the predicted amino acid sequence and those of the human ventricular (MLC-2), rat cardiac (MLC-2) and Drosophila (MLC-2) proteins (Fig. 5-6). The observation that the retinal cDNA sequence is identical to the fetal muscle cDNA sequence suggests that this gene is probably not alternately processed in these tissues and, indeed, members of this gene family have not been found to be alternately processed (Kumar et al., 1989; Nudel et al., 1984; Kumar et al., 1986; Henderson et al., 1988; Henderson et al., 1989; Karess et al., 1991). Further, PCR from both human fetal muscle and adult retinal cDNA libraries using primers p1a and p3 (Fig.
5-3 and Table 5-1) resulted in only single reaction products of the expected size (data not shown). It is likely, therefore, that all of the exons encoded by this gene have been identified in this study.

5.3.2 Amino Acid Alignments

MLCs belong to the superfamily of calcium binding proteins, which evolved from a common origin (Collins, 1991). Homologous proteins within this family share four putative calcium binding motifs named EF-hand domains. EF-hand motifs are composed of an α-helix E, a 12 residue divalent cation binding loop, and an α-helix F (Emerson and Bernstein, 1987; Kretsinger, 1980). It has been shown that of the four MLC-2 EF-hand domains, only the N-terminal motif retains the ability to bind divalent cations (Collins, 1976; Reinach et al., 1986; Bagshaw and Kendrick-Jones, 1979). Figure 5-6 provides evidence that the identified gene is a member of the MLC-2 gene family and that the encoded protein may bind calcium (see below). The predicted peptide shares 63% overall sequence identity with the human ventricular MLC-2, 56% overall sequence identity with the rat cardiac MLC-2 and, if one excludes a 41 residue N-terminal extension peptide unique to Drosophila (Parker et al., 1985; Toffenetti et al., 1987), 40% overall sequence identity with the Drosophila MLC-2 sequence. The region of greatest similarity occurs between residues 20 and 59, where both phosphorylation (Pearson et al., 1984; Michnoff et al., 1986) and calcium binding (Kretsinger, 1980) are predicted to occur.

Residues 33-62, which roughly correspond to EF-hand domain 1, share 92.5% identity with both the human ventricular MLC-2 and the rat cardiac MLC-2 and 53% identity with the Drosophila MLC-2. Significantly, the conserved amino acid residues at positions 1, 3, 5, 9, and 12 of the postulated calcium binding loop are consistent with these being calcium ligands.
Further, the conserved glycine at position 6 is likely to be essential to maintaining the three dimensional structure of the loop (Kretsinger, 1980). This local homology adds additional support to the conclusion that the reported gene encodes an MLC-2 isoprotein which may bind calcium.

5.3.3 Expression

Evidence for the expression of this gene in both human adult retina and human fetal muscle has been obtained through the isolation and sequencing of cDNAs from the corresponding cDNA libraries. Northern analysis also revealed expression in fetal muscle but failed to show expression in adult muscle, suggesting that the expression of this gene may be developmentally regulated. The functional significance of expression of this gene in the retina is unknown, although myosin has recently been implicated in photoreceptor disc morphogenesis (Williams, 1991). We have also detected transcription of this gene in the basal ganglia and cerebellum by screening cDNA libraries by PCR. The transcriptional activity of this gene in adult brain has been quantitatively assayed and determined to be very low (S. Andrew, unpublished results), consistent with Northern blot hybridization data. This low level of expression may reflect either low levels of cellular expression or, alternatively, a moderate to high level of expression in a very small subset of cells within a specific tissue. Recently, two nonmuscle MLC-2 isoforms have been characterized from the rat brain and found to be under both cell specific and developmental regulation (Feinstein et al., 1991).

The observation that the expression of this gene in skeletal muscle appears to differ both quantitatively and temporally between African Green monkeys and humans is intriguing. The coding fraction of the human genome is estimated to share 89-98% sequence identity with other primates (Li and Tanimura, 1987). Therefore modified gene expression might be expected to account for some of the phenotypic distinctions observed within the primate lineage.
In this regard it is noteworthy that Lompre et al. (1981) were able to correlate species differences in cardiac muscle physiology to the species specific expression of different cardiac myosin isoenzymes. The isolation and characterization of the African Green monkey homolog could provide important clues regarding the regulation of expression of this gene. To my knowledge this is the first example of a gene that is expressed differently between humans and other primates (Collins et al., 1992).

5.3.4 Assessment as an HD Candidate Gene

A positional cloning strategy has led to the identification and characterization of a new MLC-2 isoform mapping within a candidate region for the HD gene. Evidence for this gene being a possible candidate for the HD gene was obtained by Andrew et al. (1992), who revealed that the genomic fragment BS-1, encoding this gene, detected a significant, nonrandom allelic association with HD. Moreover, a recombination event in a sporadic HD patient with no family history of HD was found to have occurred within a 50 kb interval including this gene (Weber et al., 1992). In addition, the observations that the MYL5 gene is expressed in brain and that myosin mutations are frequently dominant (Geisterfer-Lowrance et al., 1990; Emerson and Bernstein, 1987) were considered significant.

To assess the MYL5 gene, two BamHI restriction fragments encoding the MYL5 locus were used to screen Southern blots containing DNA digested with EcoRI, BamHI, SstI, and PstI from 200 unrelated HD patients. Comparison of the subsequent hybridization patterns to those of controls revealed no rearrangements specific to HD (Riess et al., in preparation). Each of the 7 exons of the MYL5 gene and 500 bp of 5’ flanking sequence were then subjected to analysis by SSCP. The HD population selected for this study included the sporadic patient and 8 additional patients, each of distinct ethnic origin. No mutations specific to HD were identified (Dr. O. Riess, unpublished data).
Similarly, semiquantitative RT-PCR failed to reveal any difference in the transcription of the MYL5 gene in the brains of HD patients versus controls (Susan Andrew, unpublished data). These data strongly suggest that mutations at the MYL5 locus do not underlie the neuropathology of HD in any of the nine HD patients examined.

5.4 References

ANDREW, S., THEILMANN, J., HEDRICK, A., MAH, D., WEBER, B., AND HAYDEN, M.R. (1992). Nonrandom association between Huntington disease and two loci separated by about 3 Mb on 4p16.3. Genomics 13: 301-311.

BAGSHAW, C.R., AND KENDRICK-JONES, J. (1979). Characterization of homologous divalent metal ion binding sites of vertebrate and molluscan myosins using electron paramagnetic resonance spectroscopy. J. Mol. Biol. 130: 317-336.

BARTON, P.J.R., AND BUCKINGHAM, M.E. (1985). The myosin alkali light chain proteins and their genes. Biochem. J. 231: 249-261.

BIRD, A.P. (1986). CpG islands and the function of DNA methylation. Nature 321: 209-213.

COLLINS, C., HUTCHINSON, G., KOWBEL, D., RIESS, O., WEBER, B., AND HAYDEN, M.R. (1992). The human β-subunit of rod photoreceptor cGMP phosphodiesterase: complete retinal cDNA sequence and evidence for expression in brain. Genomics 13: 698-704.

COLLINS, J.H. (1976). Homology of myosin DTNB light chain with alkali light chains, troponin C and parvalbumin. Nature 259: 699-700.

COLLINS, J.H. (1991). Myosin light chains and troponin C: structural and evolutionary relationships revealed by amino acid sequence comparisons. J. Muscle Res. and Cell Motility 12: 3-25.

EMERSON, C.P., AND BERNSTEIN, S.I. (1987). Molecular genetics of myosin. Annu. Rev. of Biochem. 56: 695-726.

FEINSTEIN, D.L., DURAND, M., AND MILNER, R.J. (1991). Expression of myosin regulatory light chains in rat brain: characterization of a novel isoform. Mol. Brain Res. 10: 97-105.

GEISTERFER-LOWRANCE, A.A.T., KASS, S., TANIGAWA, G., VOSBERG, H-P., MCKENNA, W., SEIDMAN, C.E., AND SEIDMAN, J.G. (1990).
A molecular basis for familial hypertrophic cardiomyopathy: a β cardiac myosin heavy chain gene missense mutation. Cell 62: 999-1006.

HARRINGTON, W.F., AND RODGERS, M.E. (1984). Myosin. Annu. Rev. Biochem. 53: 35-73.

HENDERSON, S.A., SPENCER, M., SEN, A., KUMAR, C., SIDDIQUI, M.A.Q., AND CHIEN, K.R. (1989). Structure, organization, and expression of the rat cardiac myosin light chain-2 gene. J. Biol. Chem. 264: 18142-18148.

HENDERSON, S.A., XU, Y.C., AND CHIEN, K.R. (1988). Nucleotide sequence of full length cDNAs encoding rat cardiac myosin light chain-2. Nucleic Acids Res. 16: 4722.

KARESS, R.E., CHANG, X-J., EDWARDS, K.A., KULKARNI, S., AGUILERA, I., AND KIEHART, D.P. (1991). The regulatory light chain of nonmuscle myosin is encoded by spaghetti-squash, a gene required for cytokinesis in Drosophila. Cell 65: 1177-1189.

KOZAK, M. (1987). An analysis of 5’-noncoding sequences from 699 vertebrate messenger RNAs. Nucleic Acids Res. 15: 8125-8148.

KRETSINGER, R.H. (1980). Structure and evolution of calcium-modulated proteins. CRC Crit. Rev. Biochem. 8: 119-174.

KUMAR, C.C., CRIBBS, L., DELANEY, P., CHIEN, K.R., AND SIDDIQUI, M.A.Q. (1986). Heart myosin light chain 2 gene. J. Biol. Chem. 261: 2866-2872.

KUMAR, C.C., MOHAN, S.R., ZAVODNY, P.J., NARULA, S.K., AND LEIBOWITZ, P.J. (1989). Characterization and differential expression of human vascular smooth muscle myosin light chain 2 isoform in nonmuscle cells. Biochemistry 28: 4027-4035.

LI, W-H., AND TANIMURA, M. (1987). The molecular clock runs more slowly in man than in apes and monkeys. Nature 326: 93-96.

LIBERA, L.D., HOFFMANN, E., FLOROFF, M., AND JACKOWSKI, G. (1989). Isolation and nucleotide sequence of the cDNA encoding human ventricular myosin light chain 2. Nucleic Acids Res. 17: 2360.

LOMPRE, A.M., MERCADIER, J.J., WISNEWSKY, C., BOUVERET, P., PANTALONI, C., D’ALBIS, A., AND SCHWARTZ, K. (1981). Species and age-dependent changes in the relative amount of cardiac myosin isoenzymes in mammals. Dev. Biol.
84: 286-290.

MACERA, M.J., SZABO, P., WADGAONKAR, R., SIDDIQUI, M.A.Q., AND VERMA, R.S. (1992). Localization of the gene coding for ventricular myosin regulatory light chain (MYL2) to human chromosome 12q23-q24.3. Genomics 13: 829-831.

McALPINE, P.J., SHOWS, T.B., BOUCHEIX, C., HUEBNER, M., AND ANDERSON, W.A. (1991). Update to the Tenth International Workshop on Human Gene Mapping. Cytogenet. Cell Genet. 58: 5-102.

MICHNOFF, C.H., KEMP, B.E., AND STULL, J.T. (1986). Phosphorylation of synthetic peptides by skeletal muscle myosin light chain kinases. J. Biol. Chem. 261: 8320-8326.

MILLER, M., BOWER, E., LEVITT, P., LI, D., AND CHANTLER, P.D. (1992). Myosin II distribution in neurons is consistent with a role in growth cone motility but not synaptic vesicle mobilization. Neuron 8: 25-44.

MORALES, M., AND FIFKOVA, E. (1989). In situ localization of myosin and actin in dendritic spines with the immunogold technique. J. Comp. Neurol. 279: 666-674.

NUDEL, U., CALVO, J.M., SHANI, M., AND LEVY, Z. (1984). The nucleotide sequence of a rat myosin light chain 2 gene. Nucleic Acids Res. 12: 7175-7186.

PARKER, V.P., FALKENTHAL, S., AND DAVIDSON, N. (1985). Characterization of the myosin light chain-2 gene of Drosophila melanogaster. Mol. Cell. Biol. 5: 3058-3068.

PASTERNAK, C., SPUDICH, J.A., AND ELSON, E.L. (1989). Capping of surface receptors and concomitant cortical tension are generated by conventional myosin. Nature 341: 549-551.

PEARSON, R.B., JAKES, R., JOHN, M., KENDRICK-JONES, J., AND KEMP, B.E. (1984). Phosphorylation site sequence of smooth muscle myosin light chain (Mr=20000). FEBS Lett. 168: 108-112.

POLLARD, T.D. (1981). Cytoplasmic contractile proteins. J. Cell Biol. 91: 156S-165S.

REINACH, E.G., NAGAI, K., AND KENDRICK-JONES, J. (1986). Site-directed mutagenesis of the regulatory light-chain Ca2+/Mg2+ binding site and its role in hybrid myosins. Nature 322: 80-83.

SIMMEN, R.C.M., TANAKA, T., TS’UI, K.F., PUTKEY, J.A., SCOTT, M.J., LAI, E.G., AND MEANS, A.R. (1985).
The structural organization of the chicken calmodulin gene. J. Biol. Chem. 260: 907-912.

TOFFENETTI, J., MISCHKE, D., AND PARDUE, M.L. (1987). Isolation and characterization of the gene for myosin light chain two of Drosophila melanogaster. J. Cell Biol. 104: 19-28.

WEBER, B., COLLINS, C., KOWBEL, D., RIESS, O., AND HAYDEN, M.R. (1991). Identification of multiple CpG islands and associated conserved sequences in a candidate region for the Huntington disease gene. Genomics 11: 1113-1124.

WEBER, B., RIESS, O., HUTCHINSON, G., COLLINS, C., BIAOYANG, L., KOWBEL, D., ANDREW, S., SCHAPPERT, K., AND HAYDEN, M.R. (1991). Genomic organization and complete sequence of the human gene encoding the β-subunit of the cGMP phosphodiesterase and its localization to 4p16.3. Nucleic Acids Res. 19: 6263-6268.

WEBER, B., RIESS, O., WOLFF, G., ANDREW, S., COLLINS, C., GRAHAM, R., THEILMANN, J., AND HAYDEN, M.R. (1992). Delineation of a 50 kilobase DNA segment containing the recombination site in a sporadic case of Huntington’s disease. Nature (Genetics) 2: 216-222.

WILLIAMS, D.S. (1991). Actin filaments and photoreceptor membrane turnover. BioEssays 13: 171-178.

WINTER, B.B., AND ARNOLD, H.H. (1987). Tissue-specific DNase I hypersensitive sites and hypomethylation in the chicken cardiac myosin light chain gene (L2-A). J. Biol. Chem. 262: 13750-13757.

Chapter 6

Conclusions

6.1.0 Summary of the Major Findings and Conclusions

The studies described in this thesis were designed to test a hypothesis postulated by Weber et al. (1992) that an unequal recombination in HD candidate region II had caused HD in a patient suffering from a new mutation at the HD locus (Wolff et al., 1989). To test this hypothesis, a chromosome walk was initiated (Weber et al., 1991) within a 500 kb interval known to contain the site of recombination (Weber et al., 1992) and numerous polymorphic markers were isolated (Weber et al., 1991).
These efforts culminated in the narrowing of HD candidate region II from 500 kb to 50 kb. Subsequently, Weber et al. (1992) scrutinized the 50 kb interval for rearrangements by Southern blot analysis, but failed to reveal any evidence for the postulated unequal recombination in the patient.

If indeed the development of HD in the sporadic case resulted from a small unequal recombination at the HD locus, then this should have had measurable genetic consequences. This reasoning led to a search for genes in the 50 kb interval which resulted in the identification of three putative CpG islands. Significantly, two of these CpG islands were found to be associated with phylogenetically conserved DNA which, when used to screen cDNA libraries, identified the PDEB and MYL5 genes. Characterization of the third CpG island has revealed that it is contained within a processed pseudogene (Dr. B. Weber, unpublished results).

The isolation and sequencing of cDNAs comprising the full length PDEB cDNA revealed a single 2565 bp open reading frame coding for an 854 amino acid protein (Collins et al., 1992). Northern blot analysis indicates the PDEB gene is expressed as an abundant 3.5 kb message in retina and as a rare 2.9 kb message in brain. Moreover, the isolation of PDEB cDNAs from human brain libraries confirms the brain as a site of expression for this gene (Collins et al., 1992). Southern blot analysis and sequencing allowed 22 PDEB exons spanning 43 kb to be identified and, importantly, localized PDEB exons 4-22 to within the 50 kb interval hypothesized to encode the HD gene (Weber et al., 1991; Weber et al., 1992). However, hybridization of probes from across the 43 kb PDEB locus to Southern blots of DNA from 200 unrelated, ethnically diverse HD patients revealed no structural rearrangements specific to HD (Weber et al., 1993).
Similarly, SSCP analysis of the 22 retinal PDEB exons failed to identify mutations specific to HD in 16 unrelated HD patients (including the sporadic patient) comprising 11 ethnic groups (Riess et al., 1992). This study provides no evidence for an unequal recombination in the sporadic patient’s PDEB gene and further suggests that mutations at the PDEB locus do not underlie the molecular pathology of HD in either the sporadic or other HD patients.

The MYL5 gene is encoded approximately 7 kb proximal to the PDEB gene in the middle of the 50 kb interval originally hypothesized to encode the HD gene. Structurally identical full length cDNAs were isolated from human adult retina and fetal muscle cDNA libraries. A 519 bp open reading frame was identified in the cDNA sequences encoding a predicted protein of 173 residues. Sequence analysis of a 5.6 kb genomic region that encodes these cDNAs revealed the presence of 7 exons which span 4 kb. Expression of this gene has been detected in human adult retina, cerebellum, basal ganglia, and fetal skeletal muscle (Collins et al., 1992). Hybridization of genomic probes spanning the MYL5 gene to Southern blots of DNA from 200 unrelated HD patients revealed no structural rearrangements specific to HD. SSCP analysis of the 7 exons of the MYL5 gene failed to identify any mutations specific to HD in 9 unrelated HD patients comprising 9 ethnic groups (Dr. O. Riess, unpublished data). Finally, the transcriptional activity of the MYL5 gene was found to be identical in HD patients and controls by both Northern analysis and semi-quantitative RT-PCR (Susan Andrew, unpublished data). These findings strongly support the conclusion that mutations in the MYL5 gene also do not result in HD in either the sporadic or other HD patients.

In summary, two genes have been localized to the 50 kb interval of chromosome 4p16.3 that was found to be recombinant in a patient suffering from a new mutation at the HD locus.
Mutations in these genes have been excluded as a cause of HD and, moreover, no detectable duplications or deletions have been identified within the 50 kb interval. Together, these findings suggest that the new HD mutation and the 4p16.3 recombination event in the sporadic case were, in fact, two independent events. This conclusion is supported by the recent cloning of the HD gene (The HD Collaborative Group, 1993) and its localization to HD candidate region I (Fig. 1-7). Significantly, the HD Collaborative Group (1993) have defined the molecular lesion underlying HD to be an unstable trinucleotide repeat, the length of which correlates with the age at onset for HD.

6.2.0 Future Studies

• Determine if the sporadic patient’s clinical phenotype is the consequence of a mutation in the HD gene and, if so, what the nature of that mutation is. It will be important to determine if HD in the patient results from an expansion of the trinucleotide repeat or, alternatively, a mutation elsewhere in the HD gene.

• It will be important to clone and sequence the full length neuronal PDEB cDNA to determine if brain specific exons are encoded at the PDEB locus and why the PDEB transcripts observed in brain appear smaller than those in retina. Cloning the full length PDEB cDNA from brain would also help to determine if CpG island I, in intron 3, represents a brain specific promoter region. It will be important to investigate the PDEB gene’s temporal and spatial expression in the brain using in situ hybridization in order to gain insights into the biological significance of the PDEB gene’s neuronal expression. It will be particularly interesting to investigate whether the pdeb gene’s neuronal expression is altered in rd mice. Further, I think it will be of interest to construct mice with mutations that eliminate alternate splicing of the pdeb gene’s exons 10 and 21 in an effort to determine the function of the alternate transcripts produced by the PDEB gene.
Lastly, it will be valuable to develop antibodies against the PDEB protein lacking the CAAX motif to investigate its cellular localization.

• Apply the candidate gene approach to the investigation of the PDEB gene’s role in human retinopathies.

• Determine if additional genes are encoded at the PDEB locus by performing direct selection of cDNAs, exon trapping and sequence analysis using cosmids cL4 and cQBI.

• I plan to map and sequence the promoter region of the MYL5 gene in human and African Green monkey to determine the mechanism by which its expression has changed during primate evolution. For example, I want to know if a mobile element has inserted into the MYL5 gene’s promoter region or if the interstitial telomeric repeat, identified in this study, represents the site of an ancient chromosomal fusion which altered expression of the MYL5 gene in humans. To investigate this latter question, cosmids flanking the interstitial telomeric repeat might be employed as fluorescent in situ hybridization (FISH) probes to determine if, in African Green monkey, one or two chromosomes are detected.

6.3.0 References

COLLINS, C., HUTCHINSON, G., KOWBEL, D., RIESS, O., WEBER, B., AND HAYDEN, M.R. (1992). The human β-subunit of rod photoreceptor cGMP phosphodiesterase: complete retinal cDNA sequence and evidence for expression in brain. Genomics 13: 698-704.

COLLINS, C., SCHAPPERT, K., AND HAYDEN, M.R. (1992). The genomic organization of a novel regulatory myosin light chain gene (MYL5) that maps to chromosome 4p16.3 and shows different patterns of expression between primates. Hum. Mol. Genet. 1: 727-733.

THE HUNTINGTON’S DISEASE COLLABORATIVE GROUP. (1993). A novel gene containing a trinucleotide repeat that is expanded and unstable on Huntington’s disease chromosomes. Cell 72: 971-983.

HUTCHINSON, G.B., AND HAYDEN, M.R. (1992). The prediction of exons through analysis of spliceable open reading frames. Nucleic Acids Res. 20: 3453-3462.

MORGAN, J.G., DOLGANOV, G.M., ROBBINS, S.E., HINTON, L.M., AND LOVETT, M. (1992). The selective isolation of novel cDNAs encoded by the regions surrounding the human interleukin 4 and 5 genes. Nucleic Acids Res. 20: 5173-5179.

RIESS, O., NOERREMOELLE, A., COLLINS, C., MAH, D., WEBER, B., AND HAYDEN, M.R. (1992). Exclusion of DNA changes in the β-subunit of the c-GMP phosphodiesterase gene as the cause for Huntington’s disease. Nature (Genetics) 1: 104-108.

UBERBACHER, E.G. AND MURAL, R. (1991). Locating protein-coding regions in human DNA sequences by a multiple sensor-neural network approach. Proc. Natl. Acad. Sci. USA 88: 11261-11265.

WEBER, B., COLLINS, C., KOWBEL, D., RIESS, O., AND HAYDEN, M.R. (1991). Identification of multiple CpG islands and associated conserved sequences in a candidate region for the Huntington disease gene. Genomics 11: 1113-1124.

WEBER, B., HEDRICK, A., ANDREW, S., RIESS, O., COLLINS, C., KOWBEL, D., AND HAYDEN, M.R. (1992). Isolation and characterization of new highly polymorphic DNA markers from the Huntington disease region. Am. J. Hum. Genet. 50: 382-393.

WEBER, B., RIESS, O., WOLFF, G., ANDREW, S., COLLINS, C., GRAHAM, R., THEILMANN, J., AND HAYDEN, M.R. (1992). Delineation of a 50 kilobase DNA segment containing the recombination site in a sporadic case of Huntington disease. Nature (Genetics) 2: 216-222.

WEBER, J.L., AND MAY, P.E. (1989). Abundant class of human DNA polymorphisms which can be typed using the polymerase chain reaction. Am. J. Hum. Genet. 44: 388-396.

WOLFF, G., DEUSCHL, G., WIENKER, T.F., HUMMEL, K., BENDER, K., LUCKING, C.H., SCHUMACHER, M., HAMMER, J., AND OEPEN, G. (1989). New mutation to Huntington’s disease. J. Med. Genet. 26: 18-27.

