UBC Theses and Dissertations


Statistical study of human constitutional chromosome rearrangement breakpoint distributions Vásárhelyi, Krisztina 1990


STATISTICAL STUDY OF HUMAN CONSTITUTIONAL CHROMOSOME REARRANGEMENT BREAKPOINT DISTRIBUTIONS

By Krisztina Vasarhelyi
B. Sc. (Biology) University of British Columbia

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE
in
THE FACULTY OF GRADUATE STUDIES
GENETICS

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
September 1990
© Krisztina Vasarhelyi, 1990

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Genetics
The University of British Columbia
2075 Wesbrook Place
Vancouver, Canada V6T 1W5

Date:

Abstract

In this study the question of nonrandomness in the distribution of human constitutional rearrangements was evaluated. The distributions of breakpoints were analysed in three groups of reciprocal translocations and three groups of inversions, subdivided according to the method of ascertainment of cases for study. In addition, one data set of structural aberrations obtained from sperm chromosomes was analysed. The method of statistical analysis, based on the binomial distribution, was developed specifically to allow testing of distributions in chromosome segments as small as chromosome bands. The distribution of breakpoints was analysed in all data sets using this method, in addition to testing for overall nonrandomness using goodness of fit statistics.
Nonrandomness in breakpoint distributions was found in reciprocal translocations (rep) and inversions ascertained through abnormalities and through incidental events. However, a random distribution was observed in incidentally ascertained de novo rearrangements as well as in sperm chromosome aberrations. The nonrandomness in the distribution of rep breakpoints can be largely attributed to a bias in ascertainment of cases based on the phenotypic manifestations of chromosomal imbalance resulting from a rearrangement. A dependence of the probability of producing specific types of balanced or unbalanced progeny on the position of breakpoints is a likely explanation for the nonrandomness produced in breakpoint distributions. However, some bands, including 5q35, 7p22, 9p22, 13q14, and 17q25, were sites of excess breakage in different ascertainment groups, which makes selection bias an unlikely explanation for these bands. They may represent true sites of nonrandom rearrangement due to some factor associated with an underlying DNA sequence or structural characteristic of chromatin that predisposes to rearrangement at specific sites. The nonrandomness observed in the distribution of inversion breakpoints is most likely the product of a founder effect. Many identical inversions have been found in apparently unrelated individuals, suggesting that a few ancestral mutations have become widespread in the population. A large data set of incidentally ascertained de novo inversions is required to distinguish between sites of frequent breakage and nonrandomness produced by the ascertainment of related cases. All evidence considered together, indisputable predisposition to rearrangement at specific sites was not found in this study. Furthermore, an overall random association of constitutional rearrangement breakpoints in bands with known oncogenes and fragile sites was observed.
However, the possibility that oncogenes and fragile sites are involved in constitutional rearrangements in a few isolated cases cannot be excluded. Nonrandomness was found when the distribution of breakpoints in light and dark G bands was compared. An excess of breakpoints in some light G bands was observed even after a conservative correction for a possible pattern recognition bias which may lead to the overascertainment of breakpoints in light G bands.

Table of Contents

Abstract
List of Tables
List of Figures
Acknowledgement
1 Introduction
  1.1 General Introduction
    1.1.1 Human Chromosome Rearrangements
    1.1.2 Meiotic Segregation in Rep and Inversion Heterozygotes
  1.2 Nonrandomness in Constitutional Rearrangements
    1.2.1 Review of Breakpoint Distribution Studies
    1.2.2 Approach to Testing for Nonrandomness in the Present Study
    1.2.3 A Note on "Hot Spots"
2 Data Sources and Methods
  2.1 Overview of Data Analysis
  2.2 Sources of Data
  2.3 Organization of Data in Ascertainment Groups
    2.3.1 Ascertainment of Rearrangements in the Original Studies
    2.3.2 Definition of Ascertainment Groups Used for Classification of Rearrangements in the Present Study
  2.4 Elimination of Duplicate Cases
  2.5 Statistical Analysis
    2.5.1 Mathematical Model of Chromosome Breakage
    2.5.2 Hypotheses of Random Breakage
    2.5.3 Binomial Confidence Limits
    2.5.4 Testing for Nonrandomness Using Binomial Confidence Limits
    2.5.5 Comparison of Results Between Data Sets
    2.5.6 Summary of Statistical Analysis
3 Results
  3.1 Reciprocal Translocations
    3.1.1 Rep Ascertained Through Abnormalities (A1)
    3.1.2 Incidentally Ascertained Balanced Rep (A2)
    3.1.3 Incidentally Ascertained De Novo Rep (A3)
    3.1.4 Comparison of Results: Rep Ascertained Through Abnormalities (A1) and Rep Ascertained Incidentally (A2 & A3)
    3.1.5 Distribution of Rep Breakpoints in Dark and Light Bands
    3.1.6 Association of Rep Breakpoints with Fragile Sites and Oncogenes
  3.2 Inversions
    3.2.1 Inversions Ascertained Through Abnormalities (B1)
    3.2.2 Incidentally Ascertained Balanced Inversions (B2)
    3.2.3 Incidentally Ascertained De Novo Inversions (B3)
    3.2.4 Comparison of Results: Inversions Ascertained Through Abnormalities (B1) and Inversions Ascertained Incidentally (B2 & B3)
    3.2.5 Distribution of Inversion Breakpoints in Dark and Light Bands
    3.2.6 Association of Inversion Breakpoints with Fragile Sites and Oncogenes
  3.3 Sperm Chromosome Aberrations
  3.4 Comparison of Results: Rep (Group A), Inversions (Group B), and Sperm Chromosome Aberrations (Group C)
4 Discussion and Conclusions
  4.1 Distribution of Breakpoints in Constitutional Rearrangements
    4.1.1 The Effects of Ascertainment Bias
    4.1.2 The Effects of Chance Fluctuations
    4.1.3 Candidate Sites of True Nonrandom Involvement
  4.2 Evidence for Random Breakage
  4.3 Distribution of Breakpoints in Light and Dark Bands
  4.4 Coincidence with Fragile Sites and Oncogenes
  4.5 Improvements for Future Analysis
  4.6 Conclusions
Appendices
A Rearrangements Associated With Bands of Frequent Breakage
  A.1 Lists of Rep Associated with Bands of Frequent Breakage
  A.2 Lists of Inversions Associated with Bands of Frequent Breakage
B Band Measurements
  B.1 Band Measurements at 320 Band Resolution
  B.2 Band Measurements for Sperm Chromosomes
C Computer Programs
  C.1 Checking for Invalid Bands
  C.2 Checking for Duplicate Rearrangements
  C.3 Statistical Analysis Using Binomial Confidence Limits
  C.4 Testing for Nonrandomness and Homogeneity
References

List of Tables

1.1 Summary of Previous Studies of Rep
1.2 Summary of Previous Studies of Pooled Structural Rearrangements
2.1 Ascertainment Groups in the Original Studies for Rep Detected Through Abnormalities
2.2 Ascertainment Groups in the Original Studies for Inversions Ascertained Through Abnormalities
2.3 Ascertainment Groups in the Original Studies for Rep Detected Incidentally
2.4 Ascertainment Groups in the Original Studies for Inversions Ascertained Incidentally
2.5 Sources of Data
3.1 Tests for Nonrandomness in Rep Ascertained Through Abnormalities (Group A1)
3.2 Hot Spot Bands in Rep Ascertained Through Abnormalities (Group A1)
3.3 Cold Spots in Data Set a
3.4 Tests for Nonrandomness in Rep Ascertained Incidentally (Group A2)
3.5 Hot Spot Bands in Rep Ascertained Incidentally (Group A2)
3.6 Hot Spot Bands in Pooled Rep Data
3.7 Cold Spots in Rep Ascertained Through Abnormalities (A1)
3.8 Distribution of Rep Breakpoints in Dark and Light G Bands
3.9 Tests for Nonrandomness in Inversions Ascertained Through Abnormalities
3.10 Hot Spot Bands in Inversions Ascertained Through Abnormalities (Group B1)
3.11 Tests for Nonrandomness in Inversions Ascertained Incidentally
3.12 Hot Spot Bands in Inversions Ascertained Incidentally (Group B2)
3.13 Hot Spot Bands in Pooled Inversion Data
3.14 Distribution of Inversion Breakpoints in Dark and Light G Bands
3.15 Hot Spot Bands in Sperm Chromosome Aberrations

List of Figures

3.1 Distribution of Rep Breakpoints Ascertained Through Abnormalities (Group A1)
3.2 Distribution of Rep Breakpoints Ascertained Incidentally (Group A2)
3.3 Distribution of Inversion Breakpoints Ascertained Through Abnormalities (Group B1)
3.4 Distribution of Inversion Breakpoints Ascertained Incidentally (Group B2)

Acknowledgement

I would like to give special thanks to my supervisor, Dr. J. M. Friedman, who helped make the past two years an interesting and rewarding experience by providing all the support and expertise I needed, while always encouraging me to think independently. I would also like to thank the members of my committee for their valuable comments, Dr. Roy Douglas and Dr. Glen Cooper for consultations regarding the statistical analysis, and many members of the Physics Department at ETH (Zurich, Switzerland) for access to their facilities and expert advice on computing problems. I would like to thank my husband, Sandy Rutherford, for writing the programs in Appendix C, and for valuable discussions throughout the project. I am especially grateful for his love and unfailing interest in my work. I would like to express my appreciation to Lynn Bernard for her interest and friendship, and my family and all my friends for their patience and support.

Chapter 1

Introduction

1.1 General Introduction

This thesis describes a statistical study of human constitutional chromosome rearrangement data to determine whether breakpoints in two types of rearrangements, reciprocal translocations (rep) and inversions, are distributed in any specific nonrandom fashion on the chromosomes. As described in the following, rearrangements may lead to imbalances of genetic material in an individual, which can have significant detrimental manifestations, such as mental retardation and other major and minor congenital defects [57].
Definition and subsequent characterization of sites of nonrandom involvement may represent an initial step in the identification of the factors involved in the production of constitutional rearrangements. For this purpose, we attempted to assemble data sets of rearrangements that are as representative of a random sample of constitutional rep or inversions as possible, and tested for nonrandomness in the distribution of breakpoints. Furthermore, we tested for associations of these points of nonrandom involvement with known fragile sites and oncogenes in order to test the hypotheses that fragile sites and oncogenes have a role in the generation of constitutional rearrangements.

The following sections give a general introduction to human constitutional rearrangements and to the aberrant segregation processes at meiosis, which ultimately affect the ascertainment of rearrangements for breakpoint distribution studies. Section 1.2 is dedicated to a more specific introduction to breakpoint distribution analysis as well as a brief review of previous work on the subject.

1.1.1 Human Chromosome Rearrangements

In rare instances one or more human chromosomes undergo structural rearrangement in a germline cell of an individual. These rearrangements may involve breakage of chromosome arms at one or more sites and the subsequent loss or reattachment of the separated fragment in an inappropriate position or orientation. A wide variety of chromosome rearrangements have been observed in humans. In the simplest case, a simple deletion, breakage of a chromosome arm is followed by loss of a chromosome fragment. More than a single break on one chromosome may lead to interstitial deletion, inversion, insertion, or duplication of chromosome material, or the formation of a ring chromosome.
Breakages involving two chromosomes may result in insertion of an interstitial segment of one chromosome at a breakpoint on another chromosome, or in exchanges of chromosome fragments in reciprocal translocations. Occasionally complex rearrangements are observed involving 3 or more chromosomes. In this project reciprocal translocations and inversions were studied. Both involve two breakpoints, located on two different chromosomes in reciprocal translocations and on a single chromosome in inversions.

Reciprocal translocation is the most common form of structural chromosome abnormality in humans. In a study of pooled data from surveys of 59,452 consecutively born infants, 52 cases of balanced rep were found [35]. This translates to a frequency of 0.87 in 1000 live births. Rep are formed subsequent to breakage of two nonhomologous chromosomes followed by the exchange and reunion of end fragments. The two new chromosomes are referred to as derivatives (der) of the original chromosomes depending on the centromeric fragment retained in the exchange [34]. A special type of rep, Robertsonian translocation, involves breakpoints at the centromeres of two of the five pairs of acrocentric chromosomes (13, 14, 15, 21, or 22) followed by an exchange resulting in the union of two long and two short arms. The tiny chromosome formed by the union of short arms is frequently lost in subsequent cell divisions. Robertsonian translocations are known to be important in chromosome evolution and they may have a role in speciation [61]. Since Robertsonian translocations are nonrandomly restricted to centric fusions of the acrocentric chromosomes, they were not included in this study. The issue of nonrandomness in these rearrangements has been examined by other investigators (for example [60] [69]).

Inversions are less common than rep in liveborn infants. The frequency of inversions in newborn survey data was found to be 0.15 in 1000 live births [35].
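The two frequencies quoted above can be checked with a few lines of arithmetic. The sketch below only restates the counts given in the text; the function name `per_thousand` is illustrative and not taken from the thesis.

```python
# Counts quoted in the text from pooled newborn surveys [35].
infants_surveyed = 59_452
balanced_rep_cases = 52

def per_thousand(cases, total):
    """Frequency of `cases` among `total` births, expressed per 1000."""
    return 1000 * cases / total

rep_rate = per_thousand(balanced_rep_cases, infants_surveyed)
inversion_rate = 0.15  # per 1000 live births, as quoted in the text

print(f"balanced rep: {rep_rate:.2f} per 1000 live births")
print(f"rep are roughly {rep_rate / inversion_rate:.1f} times as frequent as inversions")
```

The ratio is a derived figure, not one reported in the source, and it inherits the rounding of both quoted rates.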
Inversions may be produced if an interstitial segment defined by two breakpoints on the same chromosome is inverted 180° and the breakpoints are subsequently repaired in the reversed orientation. In pericentric inversions the breakpoints are on different arms of the chromosome, so that the inverted segment includes the centromere. As a result, the chromosome arm ratio is often altered. The arm ratio is unaffected in paracentric inversions because both breakpoints are located on the same chromosome arm, leaving centromere position unchanged [68].

Rep and inversions in their balanced forms are compatible with a normal phenotype. Rearranged chromosomes may be passed on to an offspring and become constitutional parts of all cells in that individual. The rearrangements are transmitted undisturbed in mitosis. However, meiotic segregation of chromosomes in both balanced rep and inversion heterozygotes can lead to the production of unbalanced gametes with duplications and deletions of specific chromosome segments. The phenotypic consequences of chromosomal imbalance can range from lethality of sperm or ovum to abnormalities in a liveborn offspring. This variability in the influence of chromosomal imbalances on the viability and phenotype of carriers is of direct concern in the selection of a random sample of cases for the study of nonrandomness in breakpoint distributions. Therefore, the principles of how imbalances arise in rep and inversions are considered in the following section, prior to a discussion of ascertainment bias in section 1.2.2.

1.1.2 Meiotic Segregation in Rep and Inversion Heterozygotes

Rearrangement of chromosomes through reciprocal translocation and inversion leads to problems in homologous pairing in meiosis. Abnormal pairing can lead to abnormal segregation resulting in chromosomally unbalanced gametes.
Meiotic Segregation in Balanced Rep Carriers

Meiotic pairing in a balanced rep carrier is possible only through a quadriradial configuration involving both derivative chromosomes and their normal counterparts (see figure 9.7 in [68]). This quadriradial structure of 4 chromosomes may undergo 2:2, 3:1, or 4:0 disjunction to produce various combinations of the 4 chromosomes at the two poles. In alternate segregation, chromosomes positioned diagonally in the quadriradial structure move to opposite poles. This type of segregation results in balanced offspring with either two normal or two derivative chromosomes (see figure 14 in [63]). The other 2:2 segregation types always lead to imbalances of chromosome material. In adjacent-1 segregation neighbouring nonhomologous centromeres move to the same pole, and in adjacent-2 segregation homologous centromeres do. In both cases, one derivative and one normal chromosome segregate together, resulting in combined duplications and deficiencies of specific chromosome segments defined by the rep breakpoint. In 3:1 disjunction, trisomies of specific chromosome segments are produced through either segregation of 2 normal chromosomes together with one derivative (tertiary trisomy), or through segregation of 2 derivative chromosomes with one of the normal chromosomes (interchange trisomy). In liveborn offspring, the most common segregation type is adjacent-1, 3:1 is less frequent, and adjacent-2 is rarely seen [38] [37]. The same order in the frequency of segregation types was observed in sperm [44]. The 4:0 disjunction type has never been seen in live births or abortions, but it was observed in one case in the sperm of a carrier of two balanced rep [6]. In addition to the segregation outcomes described here, further variations are possible if chiasma formation and crossing over take place in the quadriradial structure.
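The segregation types described above can be enumerated mechanically. The following is a minimal sketch, not part of the thesis: the labels `A`, `B`, `derA`, `derB` are hypothetical names for the two normal chromosomes and the two derivatives, and each derivative is assumed to carry the centromere of the chromosome it is named after. Crossing over in the quadriradial is ignored.

```python
from itertools import combinations

# Hypothetical labels: (centromere carried, kind of chromosome).
CHROMOSOMES = {
    "A": ("A", "normal"),
    "B": ("B", "normal"),
    "derA": ("A", "derivative"),
    "derB": ("B", "derivative"),
}

def classify(gamete):
    """Name the segregation type that sends the chromosomes in `gamete` to one pole."""
    n_der = sum(CHROMOSOMES[c][1] == "derivative" for c in gamete)
    if len(gamete) == 2:
        if n_der != 1:
            # Two normals or two derivatives: balanced product.
            return "alternate (balanced)"
        centromeres = {CHROMOSOMES[c][0] for c in gamete}
        # One normal plus one derivative: unbalanced product.
        return "adjacent-1" if len(centromeres) == 2 else "adjacent-2"
    if len(gamete) == 3:
        return "tertiary trisomy" if n_der == 1 else "interchange trisomy"
    if len(gamete) == 1:
        return "3:1 disjunction (pole with one chromosome)"
    return "4:0 disjunction"

for size in (2, 3):
    for gamete in combinations(CHROMOSOMES, size):
        print(gamete, "->", classify(gamete))
```

Of the six possible two-chromosome products, two are alternate, two adjacent-1, and two adjacent-2. This count says nothing about their relative frequencies, which the text notes differ considerably in liveborn offspring and in sperm.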
Imbalances of chromosome material arising through aberrant segregation have a direct influence on the viability potential of the resulting segregation product, and that in turn affects the probability that the rearrangement is detected in a given population, such as abnormal live births or a survey of normal individuals. This has direct relevance to studies of breakpoint distributions, which require that the rearrangements studied represent a random sample of all possible rep that are formed. Therefore, it is important to understand the factors determining viability potentials of unbalanced rep.

There is extensive evidence that chromosomal imbalance leads to abnormalities, but what characteristics of an imbalance determine the extent of abnormality is debated. Intuitively, larger chromosomal imbalance is expected to lead to greater abnormalities and is thought to be less likely to result in live birth. Instead, large imbalances are expected to lead to conceptuses that fail to implant or that are aborted at various stages of gestation. In studies of rep ascertained through live births and abortions, the size of the imbalance was found on average to be smaller in the former than in the latter group [1] [3] [9] [13] [59]. However, this relationship is not a simple correlation. Specific chromosome regions have been observed to have decreased tolerance to imbalances compared to other regions, indicating that genetic content is an important factor [20] [39]. In fact, in one study, the average size of imbalance was found not to be significantly different for rep ascertained through an abnormality and rep ascertained through recurrent abortions [10]. Additional factors that possibly contribute to viability potentials of unbalanced rep, through physically affecting meiotic segregation or through imposing selective forces, include breakpoint and centromere position, comparative sizes of interstitial segments, chiasma frequency and position, formation of chain or ring multivalents, and comparative sizes of the chromosomes involved (see [9] for details). The parental origin of unbalanced segments may also affect viability potentials of segregation products through a genomic imprinting mechanism.

Meiotic Segregation in Balanced Inversion Carriers

In contrast to rep, unbalanced gametes in inversion carriers are produced only when crossing over occurs in the inverted segment. The probability of producing unbalanced chromosomes is dependent on the likelihood of pairing between an inversion chromosome and its normal homologue [40]. The likelihood of pairing is in turn dependent on the length of the inversion segment. Very small inversions do not pair in the inverted segment and therefore recombination cannot take place [43]. Crossover suppression in the inversion region in the Drosophila male with no effect on fertility is a well known example of a comparable process [65]. Inversions of intermediate size achieve pairing and chiasma formation through a loop configuration of the homologues. In the loop the inversion segment lines up with the homologous segment in the correct orientation. Loop formation does not occur in very large inversions with breakpoints at opposite ends, near the telomeres. In such cases the inversion segment pairs and pairing is absent in the two end segments [40]. An odd number of crossovers in the inverted segment produces unbalanced chromosomes. A single crossover in a paracentric inversion loop results in a dicentric chromosome and an acentric fragment, with duplications and deficiencies of specific segments. The uninvolved chromatids produce balanced chromosomes, one normal and one with the inversion. In pericentric inversions, a single crossover produces two unbalanced and two balanced chromosomes, all monocentric.
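The outcome of a single crossover inside the inverted segment can be traced on a toy marker map. This is a sketch under simplifying assumptions, not a model from the thesis: a chromatid is a list of hypothetical markers (`a` to `e`, with `cen` for the centromere), and the crossover falls between two markers that are adjacent on the normal chromosome and lie within the inversion.

```python
def single_crossover(normal, inverted, x, y):
    """Return the two recombinant chromatids from one crossover between x and y.

    `normal` reads ... x y ... and `inverted` reads ... y x ..., because the
    x-y interval lies inside the inverted segment.
    """
    i = next(k for k in range(len(normal) - 1)
             if normal[k] == x and normal[k + 1] == y)
    j = next(k for k in range(len(inverted) - 1)
             if inverted[k] == y and inverted[k + 1] == x)
    # Join the x end of the normal chromatid to the y end of the inverted one,
    # and the y end of the normal chromatid to the x end of the inverted one.
    rec1 = normal[:i + 1] + inverted[:j + 1][::-1]
    rec2 = normal[i + 1:][::-1] + inverted[j + 1:]
    return rec1, rec2

def centromere_count(chromatid):
    return chromatid.count("cen")

# Paracentric: segment b-c-d inverted, centromere outside the inversion.
para = single_crossover(["a", "cen", "b", "c", "d", "e"],
                        ["a", "cen", "d", "c", "b", "e"], "c", "d")
# Pericentric: segment b-cen-c inverted, centromere inside the inversion.
peri = single_crossover(["a", "b", "cen", "c", "d"],
                        ["a", "c", "cen", "b", "d"], "cen", "c")

for name, products in (("paracentric", para), ("pericentric", peri)):
    for chromatid in products:
        print(name, "-".join(chromatid), "centromeres:", centromere_count(chromatid))
```

In the paracentric case the two recombinants carry two centromeres and none, matching the dicentric chromosome and acentric fragment described above. In the pericentric case both recombinants are monocentric but each duplicates one end marker and lacks the other.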
Additional crossover events may produce additional imbalances, such as 100% unbalanced products in a 4 strand double crossover event [43]. Crossovers outside the inversion do not produce imbalances.

As in the case of rep, the extent of imbalance in recombinant chromosomes determines the risk of a liveborn abnormal offspring to an inversion heterozygote. In general, larger imbalances are less compatible with live birth. Furthermore, unbalanced offspring may be aborted at various stages of gestation, or prior to implantation in some cases. An imbalance may also be disruptive to the meiotic process, preventing the production of gametes [40]. In pericentric inversions, recombination in a small inversion segment produces large imbalances (see Figure 7 in [40]). In paracentric inversions the acentric fragment produced by crossing over in an inversion loop is often lost, leading to deficiencies of the corresponding chromosome segments. The dicentric chromosome may break at anaphase and the resulting imbalance is determined by the breakpoint. In general, paracentric inversions that are large enough to produce pairing loops but involve only a small portion of the chromosome produce large imbalances.

1.2 Nonrandomness in Constitutional Rearrangements

Constitutional rearrangements may arise through random breakage of chromosomes at two or more sites followed by repair of breakpoints joining the wrong fragments or the original fragment in the wrong orientation. Alternatively, the process of chromosome breakage, the union of fragments, or both these processes may be nonrandom. There may be sites on the genome that are predisposed to breakage due to some characteristic of the DNA sequence or of the higher order chromatin structure.
Random breakage together with preferential participation of specific combinations of breakpoints in rearrangements, due perhaps to sequence similarities, would also lead to nonrandomness in the distribution of rearrangement breakpoints.

The development of chromosome banding techniques has sparked an interest in defining specific sites of nonrandom breakage in constitutional rearrangements. Similar efforts in cancer rearrangements resulted in the association of several rearrangements with certain types of malignancies. In constitutional rearrangements, however, the question of nonrandomness has been difficult to resolve. The problems primarily involve the selection of an appropriate study population. In addition, conventional methods of statistical analysis proved inadequate for dealing with the precise localization of sites of nonrandomness allowed by the improvements in banding techniques. In the following, previous studies on the subject are briefly reviewed and the approach employed in the present study is outlined.

1.2.1 Review of Breakpoint Distribution Studies

The need to deal with problems of ascertainment bias in order to assess properly the question of nonrandom breakage was recognized in early studies of breakpoint distribution analysis [36] [51]. Consequently, various approaches have been employed with respect to subdivision of data according to modes of ascertainment, and types of rearrangements included in the data. The most common criterion for subdivision of data was ascertainment through unbalanced carriers, balanced carriers, or recurrent abortions. However, the exact definitions of these groups were not identical in each study, probably contributing to the variation observed in the results. The majority of studies involved reciprocal translocations, as they are the most common form of structural rearrangement [35]. In some studies, all rearrangements involving breakage were analyzed as a single group.
Very few studies on the distribution of inversion breakpoints have been done. The methods and results of previous studies are summarized in table 1.1 for reciprocal translocations and in table 1.2 for all structural rearrangements. Several observations can be made in comparing previous studies: (1) Nonrandomness in the distribution of breakpoints is repeatedly observed. The exception to this are rearrangements detected in surveys of consecutive newborns [49] (not shown in tables)  Chapter  1.  Introduction  9  Rep Ascertained # of Breaks  Reference  Schwartz, et. al., 1986  326  Through Unbalanced Carrier Results Excess Deficit 9p, 14p, 18p |  Rep Ascertained / of Breaks  Reference  Statistics  z test  none  Through Balanced Carrier Results Excess Deficit  Jacobs, et. al., 1974  84  llq  A u r i a s , et. al., 1978  106  4p, 9p, l O q  none  Statistics  by observation  l p , 2p, 6q  X  2  test  21q, 22q Davis, et al., 1985  420  Schwartz, et al., 1986  292  Reference  Rep Asceriainec # of Breaks  C a m p a n a , et. al., 1985 Davis, et al., 1985  312  Schwartz, et al., 1986  256  Reference Stoll, 1980  9, 21, 22  llq  none  plot 95% confidence interval z test  Through Recurrent Abortions Results Excess Deficit 6, 7, 22 none  190  7p, 17p, 22p  Rep Ascertained # of Breaks 770  1, 2, 3, 6 8, 17, 19  12  Statistics  X test plot 95% confidence interval 2  none  none  z test  Through Various Means Results Excess Deficit 4p, 9p, 9q 13q, 18q, 21p, 22p  P a l m e r , 1981  353  9, 18, 21.  M a s e r a t i , et al., 1986  213  I q l l , 2pl3 2p21, 9p22 llq23, 15qll 15ql3, 1 8 q l l 18q23, 2 0 q l l 21q22, 2 2 q l l Xq22  lp, lq, 3q, 5q, 7p, 12p, Yp, Yq, Xq  3p, 6q 16p Xp,  1, 19 not reported  Statistics A n a l y s i s of X  X  2  2  components  test  X test for small numbers (see [62]) 2  Table 1.1: Summary of results and methods from previous studies of breakpoint distributions i n reciprocal translocations.  Chapter  1.  Introduction  10  Structural Reference Y u , et. 
al., 1978 P a l m e r , et al., 1981 Porfirio, et a l , 1987  Table 1.2:  Ascertained # of Breaks  Rearrangements  Through Various Means Results  Statistics  Excess  Deficit  9, 13, 18, 21 22, Y  2, 3, 6, 16 19, 20  615  18, 21, X  6391  2 q l 3 , 3p25 5pl5, 5pl3 5 p l l , 9p24 9p22, 9 p l l 9 q l l , 9ql3 llql3, llq25 13ql4, 1 5 q l l 1 8 p l l , 18q21 2 1 q l l , 21q22 2 2 q l l , 22ql3 Xp22, X p l l Ypll  1, 19 not reported  1134  Regression Analysis X test Regression A n a l y s i s 2  S u m m a r y of results and methods from previous studies of breakpoint  butions i n pooled data on structural  rearrangements.  distri-  Chapter J.  Introduction  11  where random breakage was found. (2) The level of agreement with respect to chromosomes frequently involved in breakage varies according to ascertainment group as well as between studies utilizing similar ascertainment criteria. (3) A number of chromosomes with excess breakpoints, including 9, 18, 21 and 22 are repeatedly observed in different studies, in some cases independent of mode of ascertainment and rearrangement type. (4) Results are reported at various levels of resolution ranging from whole chromosomes to chromosome bands at the 320 band level. This reflects a lack of ability of the statistical methods used, to detect deviations from nonrandomness in small chromosome segments. The variation in results between studies utilizing similar ascertainment criteria may be attributed to the differences in interpretation of specific ascertainment groups. For example, balanced carriers have been considered suitable for breakpoint distribution analyses because such individuals carry the full chromosome complement and do not usually have phenotypic manifestations of the rearrangement. 
However, balanced carriers can be ascertained in various situations including surveys of specific populations [36], in prenatal diagnosis [59], or through referral for cytogenetic study because of recurrent abortions [1] [13], possibly introducing systematic bias in selection of cases for study. The variation observed between studies using different ascertainment criteria is likely, at least in part, to be the product of this difference in case selection.

Although advances in chromosome banding techniques allowed precise localization of breakpoints in chromosome bands, conventional statistical methods were frequently inadequate to test for significant deviations from randomness at that level (see tables 1.1 and 1.2). Sites of frequent breakage were identified by direct observation [36], or by other tests not adequate to detect nonrandomness at the level of chromosome bands, where the expected values are often very small [1] [7] [51] [59]. The χ² test is frequently used in testing for overall deviations from randomness, but its sensitivity deteriorates as the number of classes increases [62], as is the case in testing for nonrandomness in chromosome bands. In addition, identification of specific segments that are nonrandomly involved in breakage can be ambiguous. However, some attempts have been successful in identifying nonrandomness at the chromosome band level using regression analysis [53] and a special χ² test [46] devised to deal with a large number of classes and small expected numbers in the classes [62].

Based on the comparisons of previous studies, it appears necessary to define precise criteria for selection of study cases in such a way as to reduce ascertainment bias maximally, and also to find a method of statistical analysis that unambiguously detects deviations from random breakage at the level of chromosome bands.
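The small-expected-value problem mentioned above is easy to see numerically. The sketch below is illustrative only: it assumes that the expected number of breakpoints in a band is proportional to the band's relative length, and it uses a hypothetical set of 320 equal-length bands together with 326 breakpoints, the size of one data set listed in table 1.1. Real bands differ in length, and the helper `expected_counts` is not from the thesis.

```python
def expected_counts(n_breakpoints, relative_lengths):
    """Expected breakpoints per band if breakage is proportional to band length."""
    total = sum(relative_lengths)
    return [n_breakpoints * length / total for length in relative_lengths]

# Hypothetical input: 320 bands of equal relative length.
band_lengths = [1.0] * 320
expected = expected_counts(326, band_lengths)

# A common rule of thumb for the chi-square test asks for expected counts of
# about 5 or more per class; count the bands that fall short of that.
bands_below_five = sum(e < 5 for e in expected)
print(f"mean expected count per band: {sum(expected) / len(expected):.2f}")
print(f"bands with expected count below 5: {bands_below_five} of {len(expected)}")
```

With about one expected breakpoint per band, every class falls below the rule-of-thumb threshold, which is the situation the text describes as poorly suited to the conventional χ² test.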
As explained in subsequent sections, we have defined ascertainment criteria that are not based strictly on the nature of the chromosome complement in probands. Instead, we divided rearrangements according to their relationship to the method of ascertainment. Accordingly, balanced rearrangements ascertained through reproductive failure, including infertility and recurrent spontaneous abortions, were classified together with unbalanced rearrangements associated with abnormalities in aborted fetuses, stillbirths, and live born infants. This group was compared to rearrangements that were ascertained for reasons unrelated to the presence of the rearrangement. This approach may help to determine whether the similarities in results observed in tables 1.1 and 1.2 are a function of an underlying biological predisposition to breakage. Alternatively, it may help identify the sources of bias leading to these results.

The method of statistical analysis used in our study also differs from previous methods in that it was developed specifically to detect nonrandom breakage in small segments, such as chromosome bands. For this purpose we used the binomial distribution as a model for chromosome breakage in bands, and used binomial confidence intervals to describe the observed distribution of breakpoints on the chromosomes without reference to a hypothetical distribution. Testing a hypothesis of breakage is a separate process. The novelty of this method, in addition to easily detecting nonrandomness in chromosome bands, is that it gives an overall picture of the observed distribution of breakpoints, instead of merely making a statement about the fit of the distribution to an arbitrary hypothesis.
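The idea of describing each band by a binomial confidence interval, and testing a hypothesis separately, can be illustrated with a short Python sketch. This is only a sketch under stated assumptions: the exact (Clopper-Pearson) construction, the function names and the example counts are chosen here for illustration and are not taken from this study, whose own procedure is described in chapter 2.

```python
from math import exp, lgamma, log


def binom_pmf(i, n, p):
    """P(X = i) for X ~ Binomial(n, p), computed in log space for stability."""
    log_coeff = lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
    return exp(log_coeff + i * log(p) + (n - i) * log(1.0 - p))


def binom_cdf(k, n, p):
    """P(X <= k); cheap when k is small, as it is for single bands."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


def _bisect(f, target, increasing, iters=100):
    """Solve f(p) = target for p in (0, 1), where f is monotone in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def exact_binomial_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval (an assumed choice) for k breaks out of n."""
    lower = 0.0
    if k > 0:
        # P(X >= k | p) increases with p; the lower limit gives it mass alpha/2.
        lower = _bisect(lambda p: 1.0 - binom_cdf(k - 1, n, p), alpha / 2, True)
    upper = 1.0
    if k < n:
        # P(X <= k | p) decreases with p; the upper limit gives it mass alpha/2.
        upper = _bisect(lambda p: binom_cdf(k, n, p), alpha / 2, False)
    return lower, upper


def classify_band(k, n, p_expected, alpha=0.05):
    """Compare a separately computed expected probability with the interval."""
    lower, upper = exact_binomial_ci(k, n, alpha)
    if p_expected < lower:
        return "excess"
    if p_expected > upper:
        return "deficit"
    return "consistent"


# Hypothetical band: 12 of 886 breaks observed, expected probability 0.003.
print(exact_binomial_ci(12, 886), classify_band(12, 886, 0.003))
```

The interval is computed from the observed counts alone; the hypothesis enters only in `classify_band`, which mirrors the separation between description and testing outlined above.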
1.2.2  Approach to Testing for Nonrandomness in the Present Study

In order to explore the question of nonrandom breakage and rearrangement, we felt it was necessary to address both the problem of ascertainment bias in breakpoint data and the inadequacy of the statistical methods used to deal with breakpoint data. The following is a discussion of ascertainment bias, which includes an explanation of some forms of bias dealt with previously by others, in addition to some theoretical ideas that await support by experimental demonstration. Some aspects of statistical testing that were considered in the development of our statistical method are also discussed.

Ascertainment Bias

Interest in testing for nonrandom breakage was prompted by the advances in chromosome banding technology in the early 1970's that allowed mapping of breakpoints to specific sites on the chromosome arm. However, it was also recognized that ascertainment bias in human data produces apparent nonrandomness in the distribution of breakpoints that is not due to a biological predisposition to nonrandom participation of specific sites in structural rearrangements [36], [51].

Cytogenetic data from clinical laboratories have been the most available source of information on rearrangements. Patients are usually referred for testing because of some form of physical abnormality, mental retardation, recurrent abortions or infertility. When a rearrangement is found in these cases, either the individual is a carrier of an unbalanced rearrangement, or a balanced carrier is ascertained as a result of pregnancy failure or infertility, likely to be due to unbalanced gametes or conceptuses. In most studies utilizing this source of information, nonrandom breakage has been found. As discussed in section 1.1.2, the extent and nature of the imbalance influence the outcome of a pregnancy.
Therefore, breakpoints of rearrangements ascertained through an unbalanced carrier or reproductive failure are expected to represent a subset of rearrangements more likely to be compatible with a specific phenotypic outcome, such as abnormalities in a live birth, or abortion of unbalanced offspring. For this reason, breakpoints from rearrangements ascertained through unbalanced carriers are considered to be a highly nonrandom sample.

In contrast to unbalanced carriers, balanced carriers of rearrangements have a full complement of the normal chromosome set and they are usually phenotypically normal. Therefore, ascertainment bias relating to variations in viability potential and to other phenotypic manifestations of unbalanced rearrangements may be avoided in an analysis of breakpoints ascertained through normal balanced carriers. However, detection of balanced carriers may be biased as well if selection of cases is nonrandom with respect to the location of breakpoints on the chromosomes. For example, balanced carriers may be ascertained through abnormalities [1], recurrent abortions [13], or population surveys of institutions, including mental or psychiatric hospitals, subfertility clinics, or prisons [36], where patients may be admitted for reasons that are related to their carrier status. Balanced carriers may also be detected in routine prenatal diagnosis screening procedures [59], or in surveys of consecutive newborns [36]. In table 1.1, results in the latter two populations differ from those of the former studies, which utilized data ascertained mainly through phenotypic abnormalities. A possible explanation for this is that the balanced carriers in prenatal diagnosis screens and in surveys of newborns were frequently ascertained by chance, with no a priori suspicion of the presence of the rearrangement.

Clearly, ascertainment through balanced carriers can represent various biased samples of rearrangements, directly related to how a "balanced carrier" is defined. For this reason we did not subdivide rearrangements according to ascertainment through balanced or unbalanced carriers. Instead, we analyzed groups of rearrangements ascertained through abnormalities and rearrangements ascertained incidentally. The former group consisted of both unbalanced cases and balanced carriers if ascertainment was in any way related to phenotype or abnormal pregnancy outcome. Incidentally ascertained cases are all phenotypically normal carriers of balanced rearrangements (see below for further details).

In the early 1970's the most readily available source of incidentally ascertained balanced rearrangements was surveys of consecutive newborns [36] [49]. Unfortunately, data from these studies are insufficient for statistical analysis of breakpoint distributions because the incidence of rearrangements in the population is relatively low. Furthermore, our interest in this study is to define breakpoint distributions at the 320 band level of resolution, which excludes newborn data from surveys that were carried out before chromosome banding techniques were used. Today, the most readily available source of incidentally ascertained information is provided by cytogenetic studies of fetal samples obtained in prenatal diagnosis procedures. Routine prenatal diagnosis screening for trisomies in women of advanced maternal age is now widely available in many countries of Europe and North America. Unexpected balanced rearrangements are occasionally detected in such screens.
Therefore, the sample of balanced carriers detected through a balanced prenatal diagnosis result represents a group of incidentally ascertained breakpoints similar to that in newborn surveys, with the exception that the balanced carrier is detected at about 14 to 16 weeks of gestation in prenatal diagnosis, and at birth in newborn surveys. In this study, incidentally ascertained balanced rearrangements from several large studies involving prenatal diagnosis data were analyzed in addition to some data obtained in newborn surveys.

Based on the above discussion, subdivision of rearrangements according to ascertainment through abnormalities and incidental ascertainment seems superior to criteria frequently used in previous studies. However, incidentally ascertained balanced rearrangements cannot be considered completely free of bias. In particular, inherited rearrangements ascertained incidentally in prenatal diagnosis are in all likelihood affected by ascertainment bias which is indirectly related to the pregnancy histories of balanced carrier parents and thus has to be taken into consideration. Studies of rep in sperm chromosomes suggest that the known disjunction and segregation types occur with frequencies unique to each rearrangement [44], which is presumably, at least to some extent, a function of breakpoint position [63]. For inversions it is less clear what factors are involved in producing recombinants. It appears that the probability of crossing over in an inversion segment and producing unbalanced gametes is related to the size of the segment, which in turn is defined by the location of breakpoints [40] [43]. In conclusion, the position of breakpoints at least partially determines the probabilities for balanced and unbalanced inversion recombinants and rep segregation products, respectively.
Therefore, a subset of breakpoints from balanced rearrangements that have a higher probability of leading to balanced offspring is likely to be overrepresented in incidentally ascertained carriers. For this reason, we felt that it would be of interest to study samples of incidentally ascertained balanced de novo rep and inversions, and compare these results to similarly ascertained balanced inherited rearrangements.

So far in this introduction, the focus has been on the forms of bias directly or indirectly related to abnormal meiotic segregation of rearranged chromosomes, resulting in nonrandom ascertainment of cases for study. However, elimination of all forms of bias from this source would not ensure a completely random sample of breakpoints. In the discussion of balanced carriers, we assumed that no abnormalities are associated with balanced rearrangements. However, balanced rearrangements are thought to be associated with abnormalities in some cases [26]. Since these individuals appear to carry a full chromosome complement, the abnormalities may result from very small submicroscopic damage to genes at the breakpoint site. For example, small deletions or duplications in genes may lead to abnormalities. Detrimental influence of the rearrangement on gene function may also arise from position effects. It is possible that some abnormalities involving breakpoints in essential genes are lethal at a very early stage and are never observed, removing a subset of breakpoints from the total sample of balanced rep breakpoints. Furthermore, there may be other rearrangements that are stable in sperm cells, which do not undergo further mitotic division, but are unstable in mitotic divisions in the zygote, and these are also aborted at a very early stage. Nonrandom effects of this type cannot be dealt with in a study restricted to constitutional rearrangements.
Therefore, studies of sperm cells may be enlightening as to possible sites that are predisposed to breakage, or to both breakage and subsequent rearrangement. We analyzed a small set of data points from sperm chromosome rearrangements.

Due to the various forms of bias affecting virtually all constitutional rearrangement data, we took a comparative analytical approach in evaluating nonrandomness in distributions of constitutional rearrangement breakpoints. We have analyzed and compared data ascertained through some form of abnormality (mainly from unbalanced carriers) with inherited and de novo balanced rearrangements ascertained incidentally. This approach may help identify breakpoints that are involved solely in a specific ascertainment group and therefore could be considered to be the product of the bias in ascertainment. Any bands nonrandomly participating in all ascertainment groups, and especially also in sperm chromosome data, are candidates for hot spots for breakage and rearrangement due to intrinsic properties of the chromatin in a given area.

Development of Statistical Method

As discussed in section 1.2.1, previously used methods of breakpoint distribution analysis were often inadequate to detect nonrandomness at the level of individual chromosome bands. The first comprehensive attempt to test specifically for nonrandomness in chromosome bands, taking the differences in band lengths into account, was made by De Braekeleer et al., in testing for association of fragile sites and cancer rearrangements [18]. This method, involving a Monte Carlo simulation, was subsequently applied to a data set of constitutional rearrangements [16], and was also the basis for the method of analysis developed in the present study. A computer simulation of random breakage was carried out based on the assumption that breakage is random at all sites on the chromosomes.
A random number generator was used to distribute a given number of breakpoints in proportion to the relative lengths of individual bands in the haploid genome. This process was repeated a large number of times to produce a probability distribution of breakpoints in each band. In comparing the generated distribution to an observed distribution with the same total number of breakpoints, those bands found to have fewer than the observed number of breaks in 95% of the simulations were considered to have a significant excess of breakpoints at the 5% level of significance. Analogously, bands with more than the observed number of breaks in 95% of the simulations were considered to have a significant deficit of breakpoints at the 5% level. This method has the advantage that it can be used to test unambiguously the distribution of breakpoints in individual bands in order to detect sites of unusually frequent or infrequent breakage.

One disadvantage is that the simulation makes approximations to expected probabilities of breakage that can otherwise be calculated easily based on a hypothesis of random breakage. With respect to a specific band, there are only two possible outcomes of a breakage event, namely that a break will occur or will not occur in that band, with respective probabilities adding up to 1. This simple observation allows the binomial probability distribution to be used as a model for chromosome breakage [17] [70]. Using the binomial model, tail probabilities for observed breakpoints in chromosome bands have been calculated [17], using a hypothetical probability of random breakage (see chapter 2 for further details).

The approach to statistical testing used in this study also employs the binomial probability distribution model. However, confidence intervals around observed breakpoints in chromosome bands were calculated instead of point estimates for the probability [17].
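The simulation procedure and the exact binomial tail it approximates can be sketched in Python as follows. The three-band "genome", the observed counts, the seed and all function names are hypothetical and serve only to illustrate the logic; they do not reproduce the programs used in the studies cited.

```python
import random
from math import comb


def simulate_counts(rel_lengths, n_breaks, n_sims, seed=1990):
    """Scatter n_breaks over bands in proportion to band length, n_sims times."""
    rng = random.Random(seed)
    band_ids = range(len(rel_lengths))
    sims = []
    for _ in range(n_sims):
        counts = [0] * len(rel_lengths)
        for band in rng.choices(band_ids, weights=rel_lengths, k=n_breaks):
            counts[band] += 1
        sims.append(counts)
    return sims


def flag_bands(observed, sims, level=0.95):
    """Flag a band when at least `level` of the simulations fall on one side."""
    flags = []
    for band, obs in enumerate(observed):
        fewer = sum(s[band] < obs for s in sims) / len(sims)
        more = sum(s[band] > obs for s in sims) / len(sims)
        if fewer >= level:
            flags.append("excess")
        elif more >= level:
            flags.append("deficit")
        else:
            flags.append("none")
    return flags


def upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p), the quantity being simulated."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


rel_lengths = [0.5, 0.3, 0.2]  # hypothetical relative band lengths
observed = [10, 30, 60]        # hypothetical observed breaks, 100 in total
sims = simulate_counts(rel_lengths, n_breaks=100, n_sims=2000)
print(flag_bands(observed, sims))
print(upper_tail(60, 100, 0.2))  # exact tail for the third band
```

With a finite number of repetitions the simulated proportions are only estimates, whereas `upper_tail` gives the corresponding probability directly from the binomial model.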
A point estimate is particularly sensitive to chance fluctuations [41], which are especially pronounced for small samples, characteristic of breakpoint data at the chromosome band level. The binomial confidence interval for each band is calculated without any assumption about the distribution of breakpoints in a band. Therefore, the confidence interval, which encloses the set of values likely to be observed at a given significance level, is a more descriptive parameter and is more revealing about the observed distribution of breakpoints. In the testing of hypotheses, expected values are calculated separately and the test for nonrandomness simply involves observing the location of expected values relative to the confidence interval. A detailed description of the method of binomial confidence intervals in testing for nonrandom breakage is included in chapter 2.

1.2.3  A Note on "Hot Spots"

The terms "hot spot" and "cold spot" have been loosely coined in statistical studies to describe chromosome bands found to have an excess or a deficit of breakpoints, respectively. Although the underlying purpose of breakpoint distribution studies is to try to identify sites on the human genome that are biologically predisposed to breakage and/or rearrangement, the previous discussion clearly indicates that indisputable demonstration that specific sites are true "hot spots" or "cold spots" for breakage is beyond the scope of any statistical study. Bands found to have an excess or deficit of breakpoints in a carefully designed study are potential candidates for true "hot spots" or "cold spots", but this status has to be confirmed by experimental methods. For the sake of convenience, in this study the terms "hot spot" and "cold spot" are used to refer to bands with an excess or deficit of breakpoints in a data set.
However, it must be stressed that this designation does not signify an underlying assumption about the true predisposition of certain bands either to frequent or infrequent breakage.

Chapter 2

Data Sources and Methods

2.1  Overview of Data Analysis

In this study, we analyze data on human constitutional chromosomal rearrangements to determine if points of nonrandom breakage exist. The primary sources of data include large scale studies published in the literature and registries of cytogenetic information. The data sets are described in detail in section 2.2 below. Inversions and rep were considered separately. Within each group of rearrangements, data were analysed in subgroups defined according to mode of ascertainment of cases (see section 2.3). For purposes of analysis, the two breakpoints of a given rearrangement were assumed to be independent.

The distribution of breakpoints in a 320 band haploid karyotype was studied to detect any nonrandomness overall on the genome and in specific bands. The 320 band karyotype was chosen because cytogenetic information is generally reported at that level of chromosome resolution, and because higher resolution data can always be reduced to a lower resolution breakpoint. For tests of nonrandomness, two hypotheses were formulated as described in section 2.5.2. Expected values, calculated based on these hypotheses, were corrected to take account of the differences in the representation of the X and Y chromosomes in males and females. In a haploid karyotype, the probability of observing an X or a Y chromosome is 0.75 and 0.25 respectively, assuming that the numbers of males and females in the study population are equal. Accordingly, the expected probabilities for the X and Y were multiplied by these values.
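As a minimal sketch of this correction, the following Python function derives expected numbers of breaks from band lengths and weights X and Y bands by 0.75 and 0.25. The band lengths, the dictionary layout and the function name are hypothetical. The optional rescaling of the weighted probabilities so that they sum to 1 is an assumption of this sketch, not a step stated in the text.

```python
def expected_breaks(band_lengths, n_breaks, x_weight=0.75, y_weight=0.25,
                    renormalize=True):
    """Expected breaks per band under breakage proportional to band length.

    band_lengths maps (chromosome, band) to a relative length. X and Y bands
    are multiplied by x_weight and y_weight. Rescaling so that the expected
    values sum to n_breaks is an assumption of this sketch (renormalize=True).
    """
    weighted = {}
    for (chrom, band), length in band_lengths.items():
        if chrom == "X":
            factor = x_weight
        elif chrom == "Y":
            factor = y_weight
        else:
            factor = 1.0
        weighted[(chrom, band)] = length * factor
    total = sum(weighted.values()) if renormalize else sum(band_lengths.values())
    return {key: n_breaks * w / total for key, w in weighted.items()}


# Hypothetical relative lengths for three bands only.
toy_lengths = {("1", "p36"): 4.0, ("X", "p22"): 2.0, ("Y", "q11"): 2.0}
for key, value in expected_breaks(toy_lengths, n_breaks=100).items():
    print(key, round(value, 2))
```

In this toy example the X and Y bands have equal length, so their expected values differ by exactly the 3:1 ratio implied by the weights.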
To test for overall nonrandomness in the distribution of breakpoints, two goodness of fit statistics were calculated: Pearson's χ² and the likelihood-ratio G² [23]. To our knowledge, there is no reliable statistical method available for testing goodness of fit in situations where the number of classes is large (320 in this study) and the expected values are small (less than 1 for several data sets). One method developed specifically for breakpoint distribution studies [62] is based on assumptions that were later shown to be incorrect [23]. However, one view is that reasonable agreement between Pearson's χ² and the likelihood-ratio G² indicates a reliable result [23]. For very small data sets, we tested the overall distributions using chromosomes instead of chromosome bands as the smallest class.

The analysis of rearrangements involved pooling of data sets from independent studies. Prior to pooling, data sets were tested for homogeneity using Pearson's χ² and the likelihood-ratio G².

Nonrandomness in a specific chromosome band was tested using the method of binomial confidence intervals [70], considered in detail in section 2.5. For testing breakpoint distributions in chromosome bands, the size of each band in the 320 band karyotype was measured from the standard idiograms of human G banded chromosomes in the ISCN nomenclature [34].

Each band in the 320 band karyotype was designated as a light or a dark band based on the ISCN idiograms. For some bands, several subbands are present in the idiograms. In these cases the band was designated as light or dark based on the total relative sizes of light and dark subbands within that band. The overall distribution of breakpoints in light and dark G bands was tested by calculation of a z statistic for one-sample proportions (see pages 207-210 in [54]).

In addition to inherited rearrangements, breakpoints from aberrations of sperm chromosomes were also studied.
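For reference, the two goodness of fit statistics and the one-sample z statistic named above can be written out in a few lines of Python. The formulas are the standard textbook ones; the function names and the toy counts are hypothetical and are not data from this study.

```python
from math import log, sqrt


def pearson_chi2(observed, expected):
    """Pearson's chi-square: sum over classes of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def likelihood_ratio_g2(observed, expected):
    """Likelihood-ratio G^2: 2 * sum of O * ln(O / E); empty classes add 0."""
    return 2.0 * sum(o * log(o / e) for o, e in zip(observed, expected) if o > 0)


def one_sample_z(successes, n, p0):
    """z statistic for an observed proportion against a hypothesized p0."""
    p_hat = successes / n
    return (p_hat - p0) / sqrt(p0 * (1.0 - p0) / n)


# Toy example with three classes; expected values come from a hypothesis.
observed = [10, 30, 60]
expected = [50.0, 30.0, 20.0]
print(pearson_chi2(observed, expected))
print(likelihood_ratio_g2(observed, expected))
# Hypothetical: 60 of 100 breaks in light bands against p0 = 0.5.
print(one_sample_z(60, 100, 0.5))
```

Both goodness of fit statistics are referred to a χ² distribution with one fewer degree of freedom than the number of classes, which is why their agreement can be checked directly.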
For the assignment of breakpoints, a 329 band karyotype was used to allow inclusion of breaks designated only as centromeric, with no specification of the chromosome band involved. On some chromosomes (1, 4, 9, 19) we defined a single centromeric band as bands p11 and q11 together. Breakpoints on these chromosomes designated either as centromeric or as lying in band p11 or q11 were assigned to this centromeric band. On the other chromosomes the centromeric region was defined originally in ISCN by either subbands of p11 and q11, or by a combination of a subband of one of p11 or q11 together with p11 or q11. Breaks on these chromosomes in the centromere region were assigned to the centromeric band defined by the appropriate band and/or subbands as before. However, breaks specified to be in p11 or q11 were not considered to be centromeric. These breakpoints were assigned to bands p11 and q11, which we defined as the subbands adjacent to the centromeric band.

In the following sections several aspects of the general methods described above are considered in detail. In section 2.2, the sources of data are described. The ascertainment of cases in the original studies, and the ascertainment groups used to organize information in this study, are described in section 2.3. The elimination of duplicate cases is considered in section 2.4. Finally, the method of statistical analysis used to test for nonrandomness in chromosome bands is explained in section 2.5.

2.2  Sources of Data

Data were obtained from two chromosome registries, several large scale studies, and a few smaller studies. Breakpoint information on reciprocal translocations (rep) and inversions (inv) was analyzed separately. Robertsonian translocation breakpoints were not included in the analysis because they are known to be nonrandomly restricted to centric fusions of the five acrocentric chromosomes.
Analysis of breakpoint distributions of other types of constitutional rearrangements was not possible because too few cases were available for sound statistical analysis.

A total of 1,272 cases of rep and 310 cases of inversions were studied. Of these, 1,012 rep and 242 inversions were from large scale prenatal diagnosis studies [11], [22], [32], [71]; 232 rep and 68 inversions were contributed to the Registry of Cytogenetic Abnormalities and Phenylketonuria (ReCAP) [24] and the Interregional Cytogenetic Registry System (ICRS) [51]; and 28 cases of rep were detected in cytogenetic surveys of consecutive newborns [5], [20], [25], [50], [42]. The number of inversion cases detected in newborn surveys was not sufficient for statistical analysis. For several rearrangements, only one breakpoint was included in the data set because the available information on the other breakpoint was incomplete.

The data described above represent three distinct sources of information, including (1) rearrangements from prenatal diagnosis studies, (2) rearrangements contributed to a chromosome registry, primarily from clinical laboratories, and (3) rearrangements detected in a systematic cytogenetic survey of consecutively born babies. In the first group of studies, the primary aim was to determine frequencies of structural abnormalities at prenatal diagnosis [32], [22], or to estimate the risk of unbalanced segregants in balanced carriers of structural rearrangements [11]. Balanced carrier couples were ascertained either through an abnormality or for reasons unrelated to the rearrangement, most frequently advanced maternal age. Rearrangements in the second group were obtained from two cytogenetic registries, established to provide an organized collection of cytogenetic data obtained from a number of actively participating laboratories. Information on all types of chromosome aberrations was included in these registries.
Therefore, the data are varied, including balanced and unbalanced rearrangements from abnormal individuals, spontaneous abortions, infertile individuals and prenatal diagnosis results. The third group of rearrangements, ascertained in newborn surveys, includes both balanced and unbalanced rearrangements detected in liveborn infants.

The methods of case ascertainment and methods of analysis, as well as the ultimate purposes of the three groups of studies, vary widely and this can be expected to affect the nature of the data. However, a useful feature of all studies is that ascertainment information was provided for each rearrangement. Consequently, specific rearrangements could be selected from the original studies on the basis of the original ascertainment criteria and assigned to ascertainment groups defined for the specific purpose of testing for nonrandom breakage in constitutional rearrangements, as described later in this chapter.

In addition to constitutional rep and inversions, aberrations detected in sperm chromosomes of normal men were also analyzed. Although points of breakage in aberrations of sperm chromosomes may not all be equally frequently involved in subsequent rearrangements, such points may represent areas that are especially prone to breakage. An additional advantage of this type of data is that it is free from some of the phenotypic and viability effects that influence the selection of cases of constitutional rearrangements for study. In two studies, 104 metaphases with chromosome aberrations were found, with a total of 109 breakpoints. Only a small number of rep were observed. The majority of aberrations of sperm chromosomes were gaps and breaks of chromatids and chromosomes, in addition to a few deletions and other rearrangements.
2.3  Organization of Data in Ascertainment Groups

The primary objective of this study was to compile and compare data on chromosome rearrangements ascertained in one of two general ways: through abnormalities or through incidental means. Descriptions of the ascertainment groups used for analysis in this study are found in section 2.3.2 below. The ascertainment groups in the original studies that are subsequently assigned to one of the general ascertainment groups in section 2.3.2 are described in detail in section 2.3.1.

2.3.1  Ascertainment of Rearrangements in the Original Studies

Rearrangements from the sources described were selected for analysis in the present study based on the original mode of ascertainment of the cases. Ascertainment criteria in the original studies were defined in terms of study objectives, giving rise to systematic variations between data sets. For example, the number of ascertainment groups and the generality of their definitions differed between studies. Furthermore, the number of cases in each ascertainment group varies substantially, both within and between studies. Ascertainment information for rearrangements from all data sources is presented in tables 2.1 and 2.3 for rep, and in tables 2.2 and 2.4 for inversions.

Rearrangements in tables 2.1 and 2.2 were ascertained through various abnormalities and are, as a result, very heterogeneous with respect to reasons for ascertainment. In the study of Daniel et al. [11], and in the ReCAP registry, the cases were classified into one of a few relatively general groups. In contrast, specific criteria were used to classify cases in one of a number of ascertainment groups in the ICRS data set. Some ascertainment definitions, such as group "a" in the ICRS data set, are somewhat ambiguous, requiring interpretation to allow inclusion in the ascertainment group, defined in section 2.3.2, for purposes of breakpoint distribution analysis.
Rearrangements in tables 2.3 and 2.4 include cases ascertained incidentally through a balanced carrier, with no a priori reason to suspect the presence of a balanced rearrangement. The majority of these rearrangements were ascertained at prenatal diagnosis in women with advanced maternal age. A minor subset includes prenatal diagnosis patients referred for reasons other than advanced maternal age, such as anxiety, that are also unrelated to the balanced rearrangement. For reasons explained in section 2.3.2, balanced carriers detected in newborn surveys were analyzed together with balanced carriers detected at prenatal diagnosis.

A number of laboratories contributed data to more than one study or registry. To ensure that each rearrangement represents an independent mutation event, we tried to eliminate duplicate cases from the data. The procedures used to identify duplicate rearrangements are described in section 2.4.

Data Source           Group  Description                                # of Breaks
Daniel et al., 1989   a      Offspring with Unbalanced Rep              594
                      b      Multiple Spontaneous Abortions             553
                      c      Infertility                                8
                      d      Balanced Proband with Mental Retardation   12
ICRS                  a      Confirm/Rule Out Chromosome Abnormality    183
                      b      Multiple Congenital Anomalies              20
                      c      Multiple Spontaneous Abortions             20
                      d      Suspected Autosomal Abnormality            9
                      e      Down Syndrome Suspected                    9
                      f      Turner Syndrome Suspected                  6
                      g      Chromosome Abnormality Suspected           6
                      h      Dysmorphic Features                        6
                      i      Ambiguous Genitalia                        4
                      j      Neoplasia Study                            2
                      k      Secondary Amenorrhea                       2
                      l      Trisomy 18 Suspected                       2
                      m      Primary Amenorrhea                         1
                      n      Mental Retardation                         1
ReCAP                 a      Abnormal Phenotype                         107
                      b      Multiple Spontaneous Abortions             24
                      c      Infertility                                6

Table 2.1: Rep breakpoints listed according to mode of ascertainment in the original studies. All rearrangements were ascertained through abnormalities and were assigned to group A1, which is defined in section 2.3.2.

Data Source           Group  Description                                     # of Breaks
ICRS                  a      Confirm/Rule Out Chromosome Abnormality         47
                      b      Multiple Congenital Anomalies                   8
                      c      Dysmorphic Features                             2
                      d      Sex Chromosome Abnormality Suspected            2
                      e      Multiple Spontaneous Abortions                  2
                      f      Abortion Material From Spontaneous Abortion     2
                      g      Neoplasia Study                                 2
                      h      Down Syndrome Suspected                         4
                      i      Trisomy 18 Suspected                            2
Daniel et al., 1989   a      Offspring with Unbalanced Rep                   29
                      b      Multiple Spontaneous Abortions                  34
                      c      Balanced Proband with Mental Retardation        2
ReCAP                 a      Abnormal Phenotype                              22
                      b      Multiple Spontaneous Abortions                  6

Table 2.2: Inversion breakpoints listed according to mode of ascertainment in the original studies. All rearrangements were ascertained through an abnormality and were assigned to group B1, which is defined in section 2.3.2.

Data Source                    Group  Description                                                           # of Breaks
Daniel et al., 1989            a      Incidental Ascertainment for Reasons Unrelated to the Rearrangement   575
Hook et al., 1987              a      Amniocentesis for Advanced Maternal Age                               125
                               b      Amniocentesis for Anxiety                                             4
Ferguson-Smith & Yates, 1984   a      Amniocentesis for Advanced Maternal Age                               90
ReCAP                          a      Amniocentesis for Advanced Maternal Age                               22
                               b      Amniocentesis for Anxiety                                             2
ICRS                           a      Amniocentesis for Advanced Maternal Age                               12
                               b      Survey of Normal Children                                             2
Newborn Surveys                a      Newborn Surveys                                                       54

Table 2.3: Rep breakpoints listed according to mode of ascertainment in the original studies. All rearrangements were ascertained through incidental means, and were assigned to groups A2 and A3, which are defined in section 2.3.2.

Data Source                    Group  Description                                                           # of Breaks
Daniel et al., 1989            a      Incidental Ascertainment for Reasons Unrelated to the Rearrangement   282
Hook et al., 1987              a      Amniocentesis for Advanced Maternal Age                               83
Ferguson-Smith & Yates, 1984   a      Amniocentesis for Advanced Maternal Age                               42
ReCAP                          a      Amniocentesis for Advanced Maternal Age                               18
                               b      Amniocentesis for Anxiety                                             2
ICRS                           a      Amniocentesis for Advanced Maternal Age                               14

Table 2.4: Inversion breakpoints listed according to mode of ascertainment in the original studies. All rearrangements were ascertained incidentally and were assigned to groups B2 and B3, which are defined in section 2.3.2.

2.3.2  Definition of Ascertainment Groups Used for Classification of Rearrangements in the Present Study

The distribution of breakpoints was analyzed and compared in different ascertainment groups. Rep, inversions and sperm chromosome rearrangements, referred to as group A, group B and group C, respectively, were each analysed separately. Groups A and B were further subdivided according to the mode of ascertainment of the rearrangement as follows:

Group A.  Reciprocal Translocations

• Group A1.  Rep Ascertained Through Abnormalities
• Group A2.  Rep Ascertained Incidentally
• Group A3.  Incidentally Ascertained De novo Rep

Group B.  Inversions

• Group B1.  Inversions Ascertained Through Abnormalities
• Group B2.  Inversions Ascertained Incidentally
• Group B3.  Incidentally Ascertained De novo Inversions
The numbers of breakpoints from the original studies in each of groups A, B, and C, defined above, are listed in table 2.5. The ascertainment groups defined above are very general and require further clarification. For statistical analysis, all cases ascertained for some form of abnormality were assigned to group A1 for rep and group B1 for inversions. The "abnormality" may be a single or multiple congenital defect, mental retardation or other abnormality in a liveborn individual, or an abnormal stillbirth or aborted fetus. Furthermore, the abnormality may also refer to a healthy and normal balanced carrier with abnormal reproductive outcomes. For example, carriers of balanced rearrangements are occasionally found among individuals with multiple spontaneous abortions. When abortion material is not available for study, the aborted fetus is assumed to be an abnormal unbalanced segregant. Similarly, in cases of infertility when a structural aberration is found, we assume that the segregation of rearranged chromosomes produces very large imbalances that lead to reproductive failure.

Groups A2 and B2 include balanced rep and inversions ascertained incidentally at prenatal diagnosis or newborn surveys. The ascertainment of a balanced carrier at prenatal diagnosis may be a direct or indirect process. The balanced carrier may be (1) a normal fetus with a balanced rearrangement inherited from one of the parents, (2) a normal fetus with a balanced de novo rearrangement, (3) a balanced carrier parent detected incidentally prior to entering prenatal diagnosis study,¹ or (4) a balanced carrier parent with an unbalanced fetus detected at prenatal diagnosis, subsequently followed by cytogenetic studies of the parents. In this study (4) is excluded from the data in order to avoid ascertainment bias resulting from different viability potentials of carriers of unbalanced rearrangements.
¹ This group is found only in the study of Daniel et al. [11]

Group  Reference                            # of Breaks

A1     Daniel et al., 1989 [11]             1167
       ICRS [51]                            279
       ReCAP [24]                           137
       Total                                1583

A2     Daniel et al., 1989 [11]             575
       Ferguson-Smith & Yates, 1984 [22]    90
       Hook & Cross, 1987 [32]              129
       ReCAP [24]                           24
       Evans et al., 1978 [20]              22
       ICRS [51]                            14
       Friedrich & Nielsen, 1974 [25]       12
       Nielsen & Sillesen, 1975 [50]        11
       Buckton et al., 1980 [5]             7
       Lin et al., 1976 [42]                2
       Total                                886

A3     Warburton, 1984 [71]                 49
       Hook & Cross, 1987 [32]              38
       Ferguson-Smith & Yates, 1984 [22]    16
       Hook et al., 1983 [33]               8
       ReCAP [24]                           6
       Buckton et al., 1980 [5]             2
       Friedrich & Nielsen, 1974 [25]       2
       Nielsen & Sillesen, 1975 [50]        1
       Total                                122

B1     ICRS [51]                            81
       Daniel et al., 1989 [11]             65
       ReCAP [24]                           28
       Total                                174

B2     Daniel et al., 1989 [11]             282
       Hook & Cross, 1987 [32]              83
       Ferguson-Smith & Yates, 1984 [22]    42
       ReCAP [24]                           20
       ICRS [51]                            14
       Total                                441

B3     Warburton, 1980 [71]                 10
       Hook & Cross, 1987 [32]              8
       Ferguson-Smith & Yates, 1984 [22]    2
       ReCAP [24]                           2
       Total                                24

C      Brandriff et al., 1985 [4]           62
       Martin et al., 1987 [45]             47
       Total                                109

Table 2.5: Sources of data. Groups A, B, and C consist of breakpoints from rep, inversions and sperm chromosome aberrations respectively. Subdivisions of groups A and B are according to mode of ascertainment as described previously in this section.

Although newborn surveys involve a systematic screening of consecutive liveborn infants, rearrangements detected by this procedure were grouped together with other incidentally ascertained rearrangements (Groups A2, A3, B2, and B3). As with rearrangements ascertained at prenatal diagnosis, there is no a priori reason to suspect a chromosome aberration in balanced carriers from newborn surveys.
A balanced carrier ascertained in newborn surveys may be (1) a normal infant who is a carrier of a balanced inherited rearrangement, (2) a normal infant who is a carrier of a balanced de novo rearrangement, (3) an abnormal liveborn infant who is a carrier of an unbalanced rearrangement inherited from a balanced carrier parent, or (4) an abnormal liveborn infant who is a carrier of an unbalanced de novo rearrangement. Groups (3) and (4) were not included in our analysis.

Since Groups (3) and (4) were excluded, the ascertainment of balanced carriers is comparable through prenatal diagnosis for advanced maternal age and through newborn surveys. There are two important differences between these two methods of ascertainment, however. First, balanced carriers are detected at different stages of development. In prenatal diagnosis, balanced carriers are detected at about 14 to 16 weeks of gestation, while in newborn surveys infants are studied at birth. In this study, we included only balanced rearrangements associated with normal phenotype. Although balanced rearrangements are occasionally associated with abnormalities, these abnormalities are often expected to be compatible with live birth, unless the rearrangement itself disrupts an essential gene. Therefore, we do not expect the sample of balanced carriers to be substantially different at prenatal diagnosis than at birth. Secondly, in newborn surveys all consecutive births are studied while only a specific group of pregnancies is studied at prenatal diagnosis. In both situations, we expect the sampling with regard to balanced carrier status to be random. For these reasons, we felt it was justified to analyse prenatal diagnosis data and newborn survey data in a single group of incidentally ascertained balanced rearrangements.

Groups A3 and B3 consist of incidentally ascertained de novo rep and inversions representing subgroups of A2 and B2 respectively, with additional information from a study of de novo rearrangements [71].

2.4  Elimination of Duplicate Cases

Since data in Groups A and B were compiled from several large published studies, it was necessary to consider the possible duplicate inclusion of some rearrangements. For breakpoint distribution analysis, it is essential to ensure that the same individual, or individuals from the same family, are entered only once in order to avoid overrepresentation of a single set of breakpoints. In some of the original studies duplicate cases were not removed by the authors [11] [32], so it was necessary for us to do so. An additional problem involved overlap between data sets, because several laboratories contributed their data to more than one registry or study. In the ICRS and ReCAP registries individuals from the same family were identifiable by the assigned identification number. This was not possible for the data in other studies because only the names of contributing laboratories were provided. Since complete information on family relationships between subjects was not always available, we employed a conservative approach for removal of duplicate rearrangements.

The search for identical rearrangements within and between data sets was carried out using a C shell script program.² This program searches for rearrangements with identical breakpoints and prints all matches with the appropriate identification information. The program was initially run on each set of data from the original studies separately. Following removal of all but one of the matching records from within data sets, data from the original studies were pooled in one of the ascertainment groups defined in the previous section. The pooled sets were checked for matching records once again and duplicates were individually evaluated to determine if they were likely to represent a single mutation event (see below).

² This and other programs used in these studies were written by A. R. Rutherford. Code for all programs is reproduced in Appendix C.

For rep, the majority of matches were reported by the same laboratory. In a few cases, only the country of origin could be identified. As all cases originate from either Europe or North America, and since information on the origin of rearrangements was incomplete in many cases, matching rep were considered to be identical mutations if both originated in Europe or both in North America. Although this is a conservative approach, it has the advantage of reducing the chance of including identical mutations from related individuals who are unaware of this relationship. In this study we found that identical rep rarely arise in unrelated individuals.

The opposite is true for inversions, as identical rearrangements were frequently reported by different laboratories (see Appendix A). The conservative approach of eliminating all but one of several identical inversions originating from either Europe or from North America would require elimination of most inversion data. For this reason, only cases identified by the same laboratory were removed. If the contributing laboratory was not known, duplicates were removed only if there was a possible overlap of data sources in two studies. It is unlikely that all inversions originating from the same mutation were eliminated by this method of duplicate removal. However, any criterion for eliminating identical inversions is highly arbitrary and is unlikely to aid in clarification of the data.

2.5  Statistical Analysis

2.5.1  Mathematical Model of Chromosome Breakage

Chromosome breakage in specific small regions, such as chromosome bands, can be described by a binomial probability model.
Considering a breakage event in a chromosome band, each band $i$ can be described as having an unknown probability of breakage, $p_i$, determined by various biological factors. The only other possible outcome is no breakage in the specified band, and this event occurs with a probability of $q_i = 1 - p_i$. If each breakage event is assumed to be independent of every other, breakage of chromosomes corresponds to a series of Bernoulli trials [54]. The probability $P$ that $x_i$ breakage events are observed in band $i$ can therefore be described by a binomial distribution as follows:

\[ P(X = x_i) = \binom{N}{x_i} p_i^{x_i} q_i^{N - x_i} \tag{2.1} \]

where $N$ is the total number of breakpoints in the sample, and $X$ is a random variable representing all possible values of $x_i$. The value $P$ calculated in 2.1 is a point estimate of the probability that $x_i$ breakpoints are observed in band $i$, given the inherent probability of breakage $p_i$ in that band.

2.5.2  Hypotheses of Random Breakage

In testing a distribution of breakpoints for nonrandomness, the observed distribution is compared to an expected distribution based on a hypothesis of random breakage. In this study two hypotheses were tested, both assuming random breakage with respect to a particular chromosome band. In hypothesis I, breakage of chromosomes was assumed to occur with equal probability everywhere on the genome. Accordingly, the expected probability of breakage in a specific chromosome segment is calculated as the relative length of that segment in a haploid genome. The expected probability of breakage in a chromosome band $i$ is calculated as:

\[ p_{iE} = \frac{l_i}{L} \tag{2.2} \]

where $p_{iE}$ = the expected probability of breakage in band $i$; $l_i$ = the length of band $i$; and $L$ = the total length of the haploid genome. The lengths of bands were measured from the diagrams of G banded chromosomes in the ISCN Nomenclature [34] (Appendix B).
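Equations 2.1 and 2.2 can be illustrated with a short Python sketch. It is not the program used in this study (the original programs are reproduced in Appendix C); the function names and all numbers below are hypothetical and chosen only to show the arithmetic.

```python
from math import comb


def expected_probability_h1(band_length, genome_length):
    """Equation 2.2: expected breakage probability is the band's relative length."""
    return band_length / genome_length


def binomial_point_probability(x, n, p):
    """Equation 2.1: probability of exactly x breakpoints in a band,
    given n breakpoints in the sample and breakage probability p."""
    return comb(n, x) * p**x * (1.0 - p) ** (n - x)


# Hypothetical values, for illustration only.
band_length = 2.0       # length of band i (arbitrary units)
genome_length = 400.0   # total haploid genome length (same units)
n_breaks = 1000         # total number of breakpoints in the sample

p_expected = expected_probability_h1(band_length, genome_length)
expected_count = n_breaks * p_expected
prob_of_12 = binomial_point_probability(12, n_breaks, p_expected)
```

With these made-up inputs the band is expected to carry 5 of the 1000 breakpoints, and `prob_of_12` is the point probability of observing exactly 12 under hypothesis I.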
For comparison of breakage frequencies between chromosome bands, the expected breakage density in a band is a more useful entity. Expected breakage density ($d_{iE}$) is obtained by the modification of 2.2 as follows:

\[ d_{iE} = \frac{p_{iE}\,N}{l_i} = \frac{N}{L} \tag{2.3} \]

In G banded data sets, breakpoints are more often found in light bands than in dark bands [55]. In hypothesis II, the assumption of random breakage is modified to account for the excess number of breakpoints in light G bands. The underlying assumption in this hypothesis is the extreme possibility that all excess breakage in light bands is due to bias, and probabilities of breakage were corrected accordingly as follows:

Expected Probability of Breakage in a Dark Band ($p_{iE_D}$)

\[ p_{iE_D} = \frac{N_D}{N} \cdot \frac{l_i}{L_D} \tag{2.4} \]

Expected Probability of Breakage in a Light Band ($p_{iE_L}$)

\[ p_{iE_L} = \frac{N_L}{N} \cdot \frac{l_i}{L_L} \tag{2.5} \]

where $p_{iE_D}$ = expected probability of breakage in dark band $i$; $p_{iE_L}$ = expected probability of breakage in light band $i$; $N_D$ = total number of breaks in all dark bands; $N_L$ = total number of breaks in all light bands; $L_L$ = total length of all light bands; $L_D$ = total length of all dark bands. To obtain expected breakage densities in light and dark bands, 2.4 and 2.5 are multiplied by $N$ and divided by $l_i$ to give:

Expected Breakage Density in Dark Bands ($d_{iE_D}$)

\[ d_{iE_D} = \frac{N_D}{L_D} \tag{2.6} \]

Expected Breakage Density in Light Bands ($d_{iE_L}$)

\[ d_{iE_L} = \frac{N_L}{L_L} \tag{2.7} \]

Both hypotheses I and II were tested on all ascertainment groups as well as on all sets of data from the original studies.

2.5.3  Binomial Confidence Limits

Hypotheses I and II may be tested by inserting expected values of probability into equation 2.1 to calculate a point estimate of the probability $P$ that $x_i$ breakpoints are observed. However, such point estimates are unsatisfactory measures of probability because they are quite sensitive to chance variation [41].
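To make the point-estimate calculation concrete, the following Python sketch computes hypothesis II expected probabilities and densities (equations 2.4 to 2.7) for a toy set of four bands and inserts one of them into equation 2.1. The band names, lengths and break counts are hypothetical, and the function names are illustrative rather than taken from the original programs.

```python
from math import comb


def expected_probability_h2(band_length, class_breaks, total_breaks, class_length):
    """Equations 2.4/2.5: share of breaks in the band's G-band class
    times the band's share of the total length of that class."""
    return (class_breaks / total_breaks) * (band_length / class_length)


def expected_density_h2(class_breaks, class_length):
    """Equations 2.6/2.7: expected breaks per unit length within a class."""
    return class_breaks / class_length


def point_probability(x, n, p):
    """Equation 2.1 evaluated at an expected probability p."""
    return comb(n, x) * p**x * (1.0 - p) ** (n - x)


# Hypothetical toy genome: band -> (class, length); "L" = light, "D" = dark.
bands = {"b1": ("L", 120.0), "b2": ("L", 80.0), "b3": ("D", 150.0), "b4": ("D", 50.0)}
breaks = {"L": 70, "D": 30}  # hypothetical observed breaks per class
total_breaks = sum(breaks.values())
class_length = {c: sum(l for cls, l in bands.values() if cls == c) for c in breaks}

expected = {
    name: expected_probability_h2(length, breaks[cls], total_breaks, class_length[cls])
    for name, (cls, length) in bands.items()
}
density = {c: expected_density_h2(breaks[c], class_length[c]) for c in breaks}

# Point probability of seeing exactly 6 breaks in dark band b4 under hypothesis II.
p_b4_six = point_probability(6, total_breaks, expected["b4"])
```

Because each class keeps its observed share of breaks, the expected probabilities still sum to one over all bands, while the density is constant within a class and differs between light and dark bands.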
A more descriptive measure is the confidence interval (ci) around the observed probability of breakage ($p_{iO} = x_i/N$) [41]. A ci at a specific significance level, such as 0.01, defines a range of $x_i$ values likely to be observed in a given proportion (in this case 99%) of samples in repeated samplings. The value of $x_i$ is expected to lie outside the ci by chance 1% of the time. 99% ci were calculated for all bands in this study to allow not only the detection of bands that qualify as hot spots for breakage but also to help identify trends and understand the distribution of breakpoints on the entire genome. Since we have no control over errors introduced at the technical and at the study design level, a relatively conservative nominal confidence level of 99% was chosen to reduce the effects of systematic and random errors on the data.

The exact upper and lower confidence limits on $p_{iO}$ can be derived by setting the value of $P$ to the desired significance level, and calculating the upper and lower values of $p_{iO}$ in 2.1. The calculations involved in this derivation are substantially simplified by making approximations to the discrete binomial distribution using either the continuous normal or Poisson distributions [41]. The appropriate choice for the approximation is determined by the sizes of both $N$ and $x_i$. The decision to use one or the other distribution is rather arbitrary, however. Through graphing a series of binomial probability distributions, we found that sample sizes $N$ of about 400 breakpoints or greater and $x_i$ of 9 or more produce a near normal probability distribution. For smaller $x_i$, the Poisson distribution is more appropriate. Tables of confidence limits are readily available for the Poisson distribution [52].
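For comparison with the approximations, exact limits can also be found numerically. The sketch below is one common construction (the Clopper-Pearson interval, solved by bisection on the binomial tail probabilities); it is offered as an assumption about how "exact" limits might be computed, not as the procedure used in this study, and all names are illustrative.

```python
from math import exp, lgamma, log


def binom_pmf(k, n, p):
    """Binomial probability of k events in n trials, computed via logs for stability.
    Requires 0 < p < 1."""
    log_term = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                + k * log(p) + (n - k) * log(1.0 - p))
    return exp(log_term)


def binom_cdf(x, n, p):
    """P(X <= x) for a binomial(n, p) variable."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))


def exact_limits(x, n, alpha=0.01, iterations=50):
    """Two-sided exact limits on the breakage probability, alpha/2 in each tail."""
    half = alpha / 2.0
    lower, upper = 0.0, 1.0
    if x > 0:
        lo, hi = 0.0, 1.0
        for _ in range(iterations):
            mid = (lo + hi) / 2.0
            # P(X >= x) increases with p; raise p while the tail is below alpha/2.
            if 1.0 - binom_cdf(x - 1, n, mid) < half:
                lo = mid
            else:
                hi = mid
        lower = (lo + hi) / 2.0
    if x < n:
        lo, hi = 0.0, 1.0
        for _ in range(iterations):
            mid = (lo + hi) / 2.0
            # P(X <= x) decreases with p; raise p while the tail is above alpha/2.
            if binom_cdf(x, n, mid) > half:
                lo = mid
            else:
                hi = mid
        upper = (lo + hi) / 2.0
    return lower, upper


# Hypothetical band with 12 breakpoints in a sample of 1000.
lower_99, upper_99 = exact_limits(12, 1000, alpha=0.01)
```

Multiplying either limit by $N$ and dividing by the band length converts it to a breakage density, so the same routine can serve as a check on limits obtained from the normal or Poisson approximation.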
For the normal distribution, we adapt the formulae in [41] to the notation used in this study as follows:

Lower Confidence Limit for $p_{iO}$ ($p_{iO_L}$)

\[ p_{iO_L} = \frac{\dfrac{x_i}{N} + \dfrac{z^2}{2N} - \dfrac{z}{\sqrt{N}}\sqrt{\dfrac{x_i}{N}\left(1 - \dfrac{x_i}{N}\right) + \dfrac{z^2}{4N}}}{1 + \dfrac{z^2}{N}} \tag{2.8} \]

Upper Confidence Limit for $p_{iO}$ ($p_{iO_U}$)

\[ p_{iO_U} = \frac{\dfrac{x_i}{N} + \dfrac{z^2}{2N} + \dfrac{z}{\sqrt{N}}\sqrt{\dfrac{x_i}{N}\left(1 - \dfrac{x_i}{N}\right) + \dfrac{z^2}{4N}}}{1 + \dfrac{z^2}{N}} \tag{2.9} \]

where $z$ is the standard normal deviate for the chosen confidence level. Confidence limits for breakage densities are obtained by modifying equations 2.8 and 2.9 respectively as follows:

\[ d_{iO_L} = \frac{p_{iO_L}\,N}{l_i} \tag{2.10} \]

\[ d_{iO_U} = \frac{p_{iO_U}\,N}{l_i} \tag{2.11} \]

2.5.4  Testing for Nonrandomness Using Binomial Confidence Limits

The central assumption in testing for nonrandom breakage in a set of structural rearrangements is that the observed probability of breakage in a chromosome band ($p_{iO}$) estimates the true unknown probability of breakage ($p_i$). The confidence interval defines the set of values this probability may have at a specific significance level.

As a first step in the analysis, breakpoint distributions were tested for nonrandomness using Pearson's $\chi^2$ test and the likelihood ratio $G^2$, which detect overall deviations from random breakage. To define specific sites of nonrandom breakage the method of binomial confidence limits [70] was used. Confidence intervals were calculated for all bands in the haploid chromosome set for each data set. The test for nonrandomness involves the formulation of a hypothesis of random breakage ($H_0$) and comparison of hypothetical breakage densities to observed breakage densities. If the hypothetical breakage density lies within the confidence interval for the observed breakage density $d_{iO}$, the hypothesis of random breakage is not rejected. Unusually frequent breakage is indicated by values of $d_{iE}$ that lie below the lower confidence limit for $d_{iO}$.
Conversely, areas of unusually infrequent breakage can be detected as well by values of $d_{iE}$ higher than the upper confidence limit for $d_{iO}$.

2.5.5  Comparison of Results Between Data Sets

Chromosome bands with an unusually high number of breakpoints may or may not represent true sites of frequent chromosome breakage. Bias in ascertainment of cases, in addition to random variation, may produce nonrandomness in the results. Lists of bands with excess breakpoints from two different data sets can be statistically compared to evaluate the probability that identical bands are observed by chance in independent data sets. The hypergeometric distribution [41] is used to calculate the probability $P$ that $k$ identical bands in two data sets with $B_1$ and $B_2$ bands of frequent breakage are observed by chance as follows:

\[ P(X = k) = \frac{\dbinom{B_1}{k}\dbinom{B_t - B_1}{B_2 - k}}{\dbinom{B_t}{B_2}} \tag{2.12} \]

where $B_t$ is the total number of bands.

2.5.6  Summary of Statistical Analysis

In the test for nonrandom breakage, the following steps of statistical analysis were carried out on sets of data with duplicate rearrangements removed.

1. $\chi^2$ and $G^2$ statistics were calculated to test for homogeneity in breakpoint distributions between individual data sets prior to pooling of data.
2. Individual data sets, including pooled data, were tested for overall nonrandomness in the distribution of breakpoints by computing $\chi^2$ and $G^2$.
3. Distributions of breakpoints in specific chromosome bands were tested for nonrandomness using the method of binomial confidence limits in individual data sets as well as the pooled data.
4. Comparisons of results between data sets were carried out using the hypergeometric distribution to calculate the probability that identical bands occur in two independent data sets by chance.
5. The hypergeometric distribution was also used to calculate the probability of random association of bands of frequent constitutional breakage with fragile sites and oncogenes.
6. The observed distribution of breakpoints in dark and light G bands was compared to the distribution expected based on the total relative lengths of light and dark G bands. The proportion of breakpoints in light and dark G bands was compared by calculation of a $z$ statistic in each data set, including pooled data.

Chapter 3

Results

In the following sections, the results of breakpoint distribution analysis in reciprocal translocations, inversions and sperm chromosome rearrangements are presented.

3.1  Reciprocal Translocations

Results for inherited rep ascertained through abnormalities (group A1) and incidentally (group A2), and for de novo rep ascertained incidentally (group A3) are presented in sections 3.1.1, 3.1.2, and 3.1.3 respectively. This is followed by a comparative analysis of these ascertainment groups (section 3.1.4), a description of the distribution of rep breakpoints in light and dark G bands (section 3.1.5), and their association with fragile sites and oncogenes (section 3.1.6).

3.1.1  Rep Ascertained Through Abnormalities (A1)

Breakpoints from rep ascertained through abnormalities (A1) were tested for nonrandomness and the results are summarized in tables 3.1 and 3.2. The distribution of breakpoints over the entire chromosome complement was nonrandom in all three independent data sets as measured both by the $\chi^2$ and $G^2$ (table 3.1). In testing distributions specific to bands, nonrandomness was found in all three data sets under the assumptions of both hypotheses I and II (see Chapter 2 for definitions of hypotheses I and II). The bands nonrandomly involved in breakage are listed in table 3.2. The number of hot spot bands in a data set appears
to correlate with the total sample size, with more hot spots found in larger data sets. Some bands are nonrandomly involved in more than one of the independent data sources. Among hypothesis I hot spot bands there were no identical results between ICRS (data set b) and ReCAP (data set c). However, the probability is very low (p = 0.00275) that the 3 hot spot bands observed in both data sets a and b are a chance occurrence. Similarly, the one identical band found in both data sets a and c (18q21) is unlikely to be a chance observation (p = 0.00284). For hypothesis II, the 3 matches of hot spot bands in data sets a and b were also found to be highly significant (p = 0.0019). There were no matches between data sets a and c, or b and c for hypothesis II.

Data Source           a            b            c

Sample Size (N)       1167         279          137
df                    320          320          24
$\chi^2$              1170         631          53.8
P                     < 0.0001     < 0.0001     0.0005
$G^2$                 1042         486          42.5
P                     < 0.0001     < 0.0001     0.014
Interpretation        Nonrandom    Nonrandom    Nonrandom

Table 3.1: Summary of $\chi^2$ and $G^2$ in tests for overall randomness of breakpoint distributions of rep ascertained through abnormalities (group A1). Data sources: a. Daniel et al., 1989; b. ICRS; c. ReCAP. See table 2.5 Group A1 for sample sizes and references for individual data sets.

Of the three data sets, bands of infrequent breakage were detected only in study a with the largest sample size. These are listed in table 3.3. Bands 1p13, 19p13, Xp22, and Xp11 were nonrandomly involved in both hypotheses I and II.

In all data sets, the overall distribution of breakpoints in light and dark bands was significantly different according to calculations of the $z$ statistic. In all cases breakpoints occur more frequently in light G bands. (See section 3.1.5 below.) In prenatal diagnosis data (study a), 30 and 20 bands were detected as hot spots in tests of hypothesis I and
Hypothesis  Hot Spots i n D a t a Set:  Hot Spots a  N  =  lq42 2q33 3q21 3q27 4pll 4q35 5pl5 5 13 5pl2 5q35 6q21 7p22 7q32 P  1167  N  = 279  N  = 137  (L) (L) (L) (L) (D) (L) (L) (L) (D) (L) (L) (L) (L)  9p24 (L)  18q23 21q22 22qll 22ql2  9 p l l (D)  9 p l l (D)  9 q l l (D)  -  -  l l q l l (D)  9 p l l (D)  13ql4 ( L ) 13q22 ( L ) 13q34 ( L )  l l q l l (D)  -  -  17pl3 (L)  1 8 p l l (D)  1 8 p l l (D)  18q22 (D)  -  18q23 ( L ) 21q22 ( L ) 2 2 q l 2 (D)  1 7 p l 3 (L)  -  17q25 ( L ) 1 8 p l l (D) 18q21 (L)  -  7q32 ( L )  (L) (L) (L) (L) (L)  17pl3 (L)  9p24 ( L )  9p22 ( L )  5pl5 (L) 5 p l 2 (D) 7p22 ( L )  10q26 (L) 13ql4 13q22 13q34 14q32 15q22  9p24 ( L )  4q35 ( L )  9p22 ( L ) 9 p l l (D)  = 279  -  3q27 ( L ) 4 p l l (D)  9p24 ( L )  N  -  lq42 (L)  7q34 (L)  Set:  c  b  1167  N =  II  in Data  N = 137  7q34 ( L ) 8q21 (D)  -  --  Xq21  (D)  18q21 (L)  (L) (L) (L) (D)  T a b l e 3.2: B a n d s o f s i g n i f i c a n t excess breakage d e t e c t e d at t h e 0.01 level of s i g n i f i c a n c e i n rep a s c e r t a i n e d t h r o u g h a b n o r m a l i t i e s ( A l ) . D a t a sources are i d e n t i f i e d i n t a b l e 3.1. B o x e d b a n d s are o b s e r v e d i n at least t w o d a t a sets. d a r k a n d light G b a n d respectively.  ( D ) a n d ( L ) are d e s i g n a t i o n s for  Chapter  3.  46  Results  Cold Hypothesis  lp31 (D) Ipl3 (L) 2q24 (D) 8q21 (D) 12q21 (D) 19pl3 (L) X 22 (L) X p l l (L) P  Spots I Hypothesis  11  -  1 13 (L) 2 13 (L) P  P  3 21 (L) P  19pl3 (L) X 22 (L) X p l l (L) P  Table 3.3: Bands with a significant deficit of breakpoints detected at the 0.01 confidence level in data set a [11]. Boxed bands are significant in both hypotheses. (D) and (L) refer to dark and light G bands. II respectively. Hypothesis I hot spot bands include 24 light G bands and 6 dark G bands (ratio 4:1), and hypothesis II hot spot bands include 13 light G bands and 7 dark G bands (ratio 1.9:1). 
According to expectation, the correction for excess breakpoints in light G bands produced a decreased ratio of light to dark G band hot spots in data source a. In ICRS and ReCAP, the distribution of breakpoints with respect to light and dark bands does not show obvious trends in analyses of breakpoint distributions in chromosome bands, due to the small number of bands with excess breakpoints in these data sets. Both tests of hypothesis I and II detected the same set of 4 hot spot bands in the ICRS data (study b), with one additional dark G hot spot (18p11) observed in hypothesis II. In ReCAP, 2 and 3 hot spots were found by tests of hypothesis I and II respectively. However, only band 7q34, a light G band, was seen in both tests.

3.1.2  Incidentally Ascertained Balanced Rep (A2)

Data Source         a            b         c         d         e         f

Sample Size (N)     575          129       90        14        24        54
df                  320          24        24        24        24        24
$\chi^2$            573          30.3      23.7      19.7      15.1      25.9
P                   < 0.0001     0.1761    0.4803    0.7108    0.9183    0.3573
$G^2$               531          33.6      29.5      21.8      19.7      27.7
P                   < 0.0001     0.0900    0.2018    0.5904    0.7149    0.2708
Interpretation      Nonrandom    Random    Random    Random    Random    Random

Table 3.4: Summary of $\chi^2$ and $G^2$ in tests for overall randomness of breakpoint distributions of rep ascertained incidentally (group A2). Data sources: a. Daniel et al., 1989; b. Hook et al., 1987; c. Ferguson-Smith & Yates, 1984; d. ICRS; e. ReCAP; f. Newborn Surveys. See also table 2.5 Group A2 for sample sizes and references for individual data sets.

Incidentally ascertained rearrangements that qualify for group A2 (see chapter 2) were tested for nonrandomness and the results are summarized in tables 3.4 and 3.5. Overall nonrandomness in breakage was found only in data set a. Distribution of breakpoints in all other data sets was random. This result was confirmed for data sets b, e, and f in testing for nonrandomness in specific chromosome bands (table 3.5).
Two dark bands (12q11 and Xq11) were frequently involved in breakage according to hypothesis II in data set d, but random breakage was observed in the test of hypothesis I. Nonrandom involvement was greater in the larger data set a than in data set c, and randomness was generally found in the smaller data sets with the exception of b. In addition to random breakage, the only identical result observed across data sets is the frequent involvement of dark G band 1p11 in two independent prenatal diagnosis studies (a and c). This result is significant at the 95% level as the chance occurrence of a single identical hot spot band in the two independent prenatal diagnosis studies has a low probability (p = 0.019 for hypothesis I, and p = 0.025 for hypothesis II).

Hypothesis I.  Hot Spots in Data Set:

a (N = 575)    b (N = 129)    c (N = 90)    d (N = 14)    e (N = 24)    f (N = 54)

1p11 (D)       -              1p11 (D)      -             -             -
3p13 (L)                      11q21 (L)
16p13 (L)

Hypothesis II.  Hot Spots in Data Set:

a (N = 575)    b (N = 129)    c (N = 90)    d (N = 14)    e (N = 24)    f (N = 54)

1p11 (D)       -              1p11 (D)      12q11 (D)     -             -
4q22 (D)                      11q21 (L)     Xq11 (D)
11q11 (D)
19q11 (D)

Table 3.5: Bands of significant excess breakage detected at the 0.01 level of significance in rep ascertained incidentally (A2). Data sources are identified in table 3.4. Boxed bands are observed in at least two data sets. (D) and (L) are designations for dark and light G band respectively.

The overall distribution of breakpoints in dark and light bands follows a trend similar to what was observed for rep ascertained through abnormalities (A1). The number of breakpoints is significantly greater in light G bands in all data sets (see section 3.1.5 below for further details).
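The chance-overlap probabilities quoted in this section follow from equation 2.12. The Python sketch below is illustrative only: it assumes a total of 320 bands (a value inferred here from the 320 degrees of freedom in table 3.4, not stated explicitly in the text) and takes the hot spot counts for data sets a and c from table 3.5 (3 and 2 under hypothesis I, 4 and 2 under hypothesis II). The function and variable names are hypothetical.

```python
from math import comb


def chance_overlap_probability(k, b1, b2, total_bands):
    """Equation 2.12: probability that exactly k bands are shared by chance
    between two hot spot lists of sizes b1 and b2 drawn from total_bands bands."""
    return comb(b1, k) * comb(total_bands - b1, b2 - k) / comb(total_bands, b2)


TOTAL_BANDS = 320  # assumption inferred from the degrees of freedom, see lead-in

# One shared band (1p11) between data sets a and c.
p_hypothesis_1 = chance_overlap_probability(1, 3, 2, TOTAL_BANDS)
p_hypothesis_2 = chance_overlap_probability(1, 4, 2, TOTAL_BANDS)
```

Under these assumptions the two values round to 0.019 and 0.025, in agreement with the probabilities quoted for hypotheses I and II.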
For data set a, an increase in the number of dark G band hot spots and elimination of light G band hot spots is produced by the correction in hypothesis II for excess breakpoints in light G bands. The number of bands with excess breaks was too few in data sets c and d to make a similar comparison.

3.1.3  Incidentally Ascertained De Novo Rep (A3)

A set of 122 de novo rep, from the sources listed in table 2.5, was tested for nonrandomness. All breakpoints in this data set were incidentally ascertained and represent only those cases where both parents were investigated and found not to be carriers of the rep discovered incidentally in the proband. Overall distribution of breakpoints was found to be random ($p(\chi^2 > 33.6) = 0.0927$; $p(G^2 > 28.2) = 0.2522$). Distribution in chromosome bands was also random by the test of hypothesis I. The test of hypothesis II reveals one dark G band, 4p15, with a greater number of breakpoints than expected at the 0.01 significance level. This band is not frequently involved in breakage in any of the other data sets. Distribution of breakpoints in dark and light G bands was nonrandom in favor of light G band breakpoints (see section 3.1.5).

3.1.4  Comparison of Results: Rep Ascertained Through Abnormalities (A1) and Rep Ascertained Incidentally (A2 & A3)

Data from individual studies were tested for homogeneity prior to pooling using Pearson's $\chi^2$ and the $G^2$ statistic. Individual data sets of incidentally ascertained rep (A2) were found to be homogeneous ($p(\chi^2 > 1194) = 0.997$; $p(G^2 > 849) = 1.0000$). However, individual data sets of rep ascertained through abnormalities (A1) were not homogeneous ($p(\chi^2 > 669) = 0.0061$; $p(G^2 > 659) = 0.0128$), as indicated also by the heterogeneous methods of case ascertainment (table 2.1).
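As an illustration of the homogeneity test, the Python sketch below computes Pearson's $\chi^2$ and the likelihood ratio $G^2$ for a table of breakpoint counts with one row per data set and one column per category (for example a band or chromosome arm). It is a generic contingency-table calculation under that assumed layout, not the program used in this study; the counts are hypothetical and p-values are left to chi-square tables.

```python
from math import log


def homogeneity_statistics(table):
    """Return (chi2, g2, df) for a rows-by-columns table of counts.
    Expected counts are row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    g2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            if expected == 0:
                continue  # category empty in every data set
            chi2 += (observed - expected) ** 2 / expected
            if observed > 0:
                g2 += 2.0 * observed * log(observed / expected)
    used_cols = sum(1 for c in col_totals if c > 0)
    used_rows = sum(1 for r in row_totals if r > 0)
    df = (used_rows - 1) * (used_cols - 1)
    return chi2, g2, df


# Hypothetical counts: two data sets, two categories.
chi2, g2, df = homogeneity_statistics([[10, 20], [20, 10]])

# Perfectly proportional data sets give zero for both statistics.
chi2_same, g2_same, df_same = homogeneity_statistics([[10, 20, 30], [20, 40, 60]])
```

Large values of either statistic relative to the chi-square distribution with `df` degrees of freedom indicate that the data sets differ in their breakpoint distributions and should not be pooled without comment.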
Data pooled into the groups of rep ascertained through abnormalities (A1) or incidentally ascertained rep (A2) were tested for overall nonrandomness and for nonrandom breakage in specific bands. The overall distribution in both ascertainment groups was found to be highly nonrandom ($p(\chi^2 > 1421) < 0.0001$; $p(G^2 > 1196) < 0.0001$ for rep ascertained through abnormalities (A1), and $p(\chi^2 > 754) < 0.0001$; $p(G^2 > 664) < 0.0001$ for rep ascertained incidentally (A2)). Bands with a significant excess of breakpoints at the 0.01 confidence level are listed for both ascertainment groups and both hypotheses in table 3.6. The entire distribution of breakpoints on the chromosomes with the 99% confidence intervals and expected values for hypotheses I and II are shown in figures 3.1.4 and 3.1.4. The rep involving hot spot bands in table 3.6 are listed in Appendix A for both ascertainment groups. It is apparent in these lists that for the most part each rep involves a unique set of breakpoints. Identical rep are rarely observed between unrelated individuals.

Five light G bands, 5q35, 7p22, 9p22, 13q14, and 17q25, were observed with increased frequency in both rep ascertained through abnormalities (A1) and rep ascertained incidentally (A2), when hypothesis I was tested. The probability that the coincidence of these bands is a chance event is low (p = 0.0075). In the test of hypothesis II only 2 bands were found in both data sets (9p22, 17q25) and this result does not reach significance at the 95% level (p = 0.061). The one band in incidentally ascertained balanced de novo rep data (A3), 4p15, that had a significant excess of breakpoints after the hypothesis II correction was never seen in rep ascertained through abnormalities (A1) or in incidentally ascertained balanced rep (A2).

Some bands in rep ascertained through abnormalities (A1) had a significant deficit of breakpoints and these are listed in table 3.7. In incidentally ascertained rep (A2), only
51  Results Hypothesis I  Ascertained Through Abnormalities N = 1583  lq42 (L) 3q21 ( L ) 3q27 ( L ) 4 p l l (D) 4q35 ( L ) 5pl5 (L) 5pl3 (L) 5pl2 (D)  Hypothesis II  Ascertained Incidentally N = 886  CFS  lq42 (L)  2q33 ( L )  CFS  4pll (D)  CFS  3q27 ( L )  4q35 ( L ) 5pl5 (L) 5pl2 (D) 7q32 ( L ) 9p24 ( L )  CFS  CFS  9p22 ( L ) 9 p l l (D) 9 q l l (D) 10q26 ( L ) l l q l l (D)  5q35 ( L )  CFS 7p22 ( L )  9p22 ( L )  CFS llq21 (L)  CFS CFS  9p22 ( L )  13ql4 13q22 15q22 17pl3  9p22 ( L )  10q22 ( L ) 10q24 ( L )  CFS  11 15 (L)  ONC  CFS P  l l q l l (D)  (L) (L) (L) (L)  17q25 ( L )  9pl3 (L) 9 p l l (D) 9 q l l (D)  10q26 ( L )  CFS CFS 4q22 ( D )  7p22 ( L ) (L) (L) (L) (L)  Ascertained Incidentally N = 886 l p l l (D)  CFS  5q35 ( L )  7q22 7q32 8p23 9p24  Abnormalities N = 1583  l p l l (D) lq21 (L)  5gl3 (L) 6q21 ( L )  Ascertained Through  CFS ONC  17q25 ( L )  ONC  18pll (D) 18q22 ( D ) 18q23 ( L ) 19qll (D) 21qll (L) 21q22 ( L ) 22ql2 (D)  ONC CFS  llq21 (L) 13ql4 (L) 13q22 ( L ) 13q34 ( L ) 14q32 ( L ) 15q22 ( L ) 17pl3 (L) 17q25 ( L ) 1 8 p l l (D) 18q21 ( L ) 18q23 ( L ) 21qll (L) 21q22 ( L ) 22qll (D) 22ql2 (D)  13ql4 ( L )  ONC CFS ONC  17q25 ( L )  ONC  C/0 ONC CFS  Table 3.6: Bands of significant excess breakage detected at the 0.01 confidence level in pooled rep data. Boxed bands are common to both ascertainment groups. (D) and (L) refer to dark and light G bands respectively. CF5=common fragile site; OiVC=oncogene; C/O=common fragile site and oncogene.  Chapter  3.  52  Results  Cold  Spots  in Rep  Through Hypothesis  lp31 lql2 2pl6 2q24  Ascertained  Abnormalities I  Hypothesis  II  (D) (D) (L) (D)  6pl2 (D) 6q22 (D) 12q21 (D)  3p21 (L)  1 7 q l l (L) 19pl3 (L) Xp21 (D)  19pl3 (L) 19ql3 (L) . X p l l (L) X q l 3 (L)  Table 3.7: Bands with a significant deficit of breakpoints detected at the 0.01 confidence level in rep ascertained through abnormalities (Al) under the assumptions of hypotheses I and II. 
two light bands (Xp22, and Xpll) were found to have a significant deficit of breakpoints under hypothesis II and none had deficits under the hypothesis I assumption. These bands were not among the cold spots in rep ascertained through abnormalities (Al). The overall distribution of breakpoints in light and dark bands follows the trend observed in individual data sets with a significantly higher number of breakpoints in light G bands. This observation is discussed in more detail in the following section. In the group of rep ascertained through abnormalities (Al), the hypothesis I test detected 28 light and 7 dark G band hot spots (ratio 4 to 1) and the hypothesis II test detected 16 light and 8 dark G band hot spots (ratio 2 to 1). The corresponding numbers for incidentally ascertained rep (A2) are 12 light and 1 dark band hotspots (ratio 12 to 1) for hypothesis I and 3 light and 3 dark band hotspots (ratio 1 to 1) for hypothesis II.  1p36 1p35 1p34 1p33 1p32 (c1) 1p31 1p22 1p21 1p13 1p12 1p11 1q11 (d) 1q12 1q21 1q22 1q23 1q24 1q25 1q31 1q32 1q41 (h12) 1q42 1q43 1q44  77.2  6p25 6p24 6p23 6p22 6p21 (d) 6p12 6p11 6q11 6q12 6q13 6q14 6q15 6q16 (hi) 6q21 (c1) 6q22 6q23 6q24 6q25 6q26 6q27  2p25 ssass ***** wm 2p24 2p23 * m I mm. mmmmmi 2p22 • 2p21 I , <c1) 2p16 ss*s~ :iim<mmmmm 2p15 I *: • \' 2p14 * m 2p13 i**;*!*** :*:*• I 2p12 I • I **s*s I 2p11 SxSSSS mmm 2q11 * • *l 2q12 +»* * s m m i 2q13 mm 2q14 !• ***mmmml' 2q21 ;s's*ss ss*ssas| 2q22 • 2q23 (d)2q24 mmmmm 2q31 2q32 SSSSSS m>m\ 2q33 ssasss 2q34 -. • fm mmmmm , ********( 2q35 2q36 m • 85 1 ; *•mm s i 2q37 Sx m S H x*si  4p16 4p15 4p14 4p13 4p12 (h12) 4p11 4q11 4q12 4q13 4q21 4q22 4q23 4q24 4q25 4q26 4q27 4q28 4q31 4q32 4q33 . ' 4q34 (M2) 4q35  I  f  tSSi * f i  =  ,.,..|  ft — - L•!•••••, • 1-. ••••••;•••••;  74.3 84.3  :  (hi) 7p22 7p21 7p15 7p14 7p13 7p12 7p11 7q11 7q21 (hi) 7q22 7q31 (h12) 7q32 7q33 7q34 7q3S 7q36  m  wm ^p73.2  68.7  It  104.4  3P| 99.8 !  
Ijp 74.6  10p15 10p14 10p13 10p12 10p11 10q11 10q21 10q22 10q23 10q24 10q25 (h12) 10q26  3.  I  Results  • 117.9  ::|:  %mW  •  q  MB—^  • 78.4 •I  I  .^:.:.?.:.;.^^^•.,,:.h,.,,;,.;,.;,,,,,,-,^ .,,,,,,.,,,,,,,,,.1  \  mm Imm  m  I  ^li  1  • 92.7  fe—*i  \m-'¥  I  L__^  -T-n  1  g £ Z  m±m km  1  ^  ta*L -.•.-.-V-.-.-.-.v.-.-.-.-.i : • 84.5  fegg  ; L  .•..v.v^.v.v.;.-...-.v..-.^........-„  . H  ^mmmmmmmmmmmm  i p mm  r  sis :|  :  m  mmmmmmmm  '^f  1  ;  -:.Vxx-"":ix:.^^ x xxx.fexxxxxx x « xxx xxxl .  |:::s:v:-::::-:-::^::.::K|:.:.:.:.:::-:.. :  ' i:*: ^ ^ ^ ^  1  1  I  g = * ^ ^  : 1  Jp t xlixXxXWXxXXxXXxl [•:•:::-:-:-:  1  lisCii  I 74.6  (.:•:.¥.;.: :.:.:.v.:.:.x.:.:.:l XvXvx-x-x-x-x-j ;v;-v;v;v;-;vv;-;-J  r  I-::*:-.:-:  mmmmm  '  Bar*——' • •  : :  •  l  ; : : : x  ^ — — i .  , f ••  :  —  r------— •  i- *« -4-  1  mmmmmM±mmi 1 L  s= mtsr  1  hm t  I  Hm  •IS':®  — *****  90.4  IN^  tt-je»  ^  I- > •  ±m  t1  mmmmiyc  • :• :• :• : U.x  ' ;  *l  (h12) 17p13 17p12 17p11 (c2) 17q11 17q12 17q21 17q22 17q23 17q24 (M2) 17q25 (h12) 18p11 18q11 18q12 (hi) 18q21 (h2) 18q22 (h12) 18q23  53  •  1  mm- s*ss*  20p13 20p12 20p11 20q11 20q12 20q13  ™  mmmmi • **'* : i mmsmm mm rtmmsm x|SSS»S :*:x:xx|  s  t  |x:;:xx:x  mmmmm  1  SSI  •  mi  immimmm  sas •• mm IS: mmm sss s:: 1 tsxssssssssi Is***:  sssl  :  \:m |.::x:xSS::«!S:SxS:SS:xx:: S* ;  \ ' -m  KSSSSSSSSl  lm><±  ]82.6 mmimmmm 1 1l i s  (c12) 19p13 ;:(™ss| 19p12 :Sx|::.S:S sssssas sixssl 19p11 19q11 xxxx.xx: mmmmmm,m~ * asssssssssi 19q12 mm mm mmm i S:;|:::x: SXJ. 
(c2) 19q13  22p13 22p12 22p11 (hi) 22q11 (h12) 22q12 22q13  xXxisSSSSSSSSI  :  16p13 16p12 16p11 16q11 16q12 16q13 16q21 16q22 16q23 16q24  ,1  m  m  m  m  m  m  m  mm  21p13 21p12 21p11 (h12) 21q11 21q21 (h12) 21q22  ::::-::: ::::-:-.|  E  (hi) 8p23 8p22 8p21 8p12 8p11 8q11 8q12 8q13 8q21 8q22 8q23 8q24 (h12) 9p24 9p23 (h12) 9p22 9p21 (M) 9p13 9p12 (h12) 9p11 (h12) 9q11 9q12 9q13 9q21 9q22 9q31 9q32 9q33 9q34  Chapter  «~  :  r  3p26 3p25 3p24 3p23 3p22 (c2) 3p21 3p14 3p13 3p12 3p11 3q11 3q12 3q13 (hi) 3q21 3q22 3q23 3q24 3q25 3q26 (h12) 3q27 3q28 3q29  (h12) 5p15 5p14 (hi) 5p13 <h12) 5p12 5p11 5q11 5q12 5q13 Sq14 5q15 5q21 Sq22 5q23 5q31 5q32 5q33 5q34 (hi) 5q35  232.2 1.8  Xp22 (d) Xp21 (c2) Xp11 Xq11 Xq12 <c2) Xq13 Xq21 Xq22 Xq23 Xq24 Xq25 Xq26 Xq27 Xq28  11  74.3  Yp11 Yq11 Yq12  Figure 3.1: Distribution of breakpoints per unit chromosome length for rep ascertained through abnormalities. Horizontal bars represent breakage densities for each band i n a 320 band karyotype. Shaded region within a bar represents the 99% confidence interval. Thin vertical line within a confidence interval represents the observed breakage density. Thick vertical line, spanning all bands in a chromosome, is the expected value for breakage density based on hypothesis I (8.88). Hypothesis n expected values are shown as •; dark band values (4.90) are seen to the left of hypothesis I expected value, and light band values (12.45) are to the right. Hot spots are bands with expected values located outside the lower confidence limit, i n the nonshaded region. Broken bars represent values greater than 65. Bands of frequent and infrequent breakage at the 99% confidence level are designated as h i , h2, or h l 2 for hot spots, and c l , c2, or c l 2 for cold spots, for hypothesis I, hypothesis II, or both hypotheses, respectively.  
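The band-level hot and cold spot calls rest on the binomial distribution. The sketch below shows one plausible form of such a test: exact binomial tail probabilities for a single band, with the band's share of relative length as the success probability. The two-sided split of alpha (0.005 per tail, matching a 99% interval), the function names, and the example band share of 1/320 are assumptions for illustration, not the thesis's documented procedure.

```python
import math


def binom_pmf(k, n, p):
    """Binomial probability of exactly k successes, computed in log space (0 < p < 1)."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)


def band_tail_probs(k, n, p):
    """Return (P(X >= k), P(X <= k)) for X ~ Binomial(n, p)."""
    upper = sum(binom_pmf(i, n, p) for i in range(k, n + 1))
    lower = sum(binom_pmf(i, n, p) for i in range(0, k + 1))
    return min(1.0, upper), min(1.0, lower)


def classify_band(k, n, p, alpha=0.01):
    """Label a band 'hot', 'cold', or 'neither' using alpha/2 in each tail (an assumption)."""
    upper, lower = band_tail_probs(k, n, p)
    if upper < alpha / 2:
        return "hot"
    if lower < alpha / 2:
        return "cold"
    return "neither"


if __name__ == "__main__":
    # Hypothetical band holding 1/320 of the karyotype length, 1583 breakpoints in total.
    print(classify_band(20, 1583, 1 / 320))
```

Working in log space avoids overflow in the binomial coefficients when the number of breakpoints is in the thousands.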
The correction for excess breakpoints in light G bands (hypothesis II) resulted in the reduction of the ratio of light band to dark band hot spots in both data sets, but the effect was more pronounced in the incidentally ascertained data (A2).

Figure 3.2: Distribution of breakpoints per unit chromosome length for rep ascertained incidentally. Horizontal bars represent breakage densities for each band in a 320 band karyotype. Shaded region within a bar represents the 99% confidence interval. Thin vertical line within a confidence interval represents the observed breakage density. Thick vertical line, spanning all bands in a chromosome, is the expected breakage density based on hypothesis I (4.99). Hypothesis II expected values are shown as •; dark band values (2.88) are seen to the left of hypothesis I expected value, and light band values (6.88) are to the right. Hot spots are bands with expected values located outside the lower confidence limit, in the nonshaded region. Broken bars represent values greater than 65. Bands of frequent and infrequent breakage at the 99% confidence level are designated as h1, h2, or h12 for hot spots and c1, c2, or c12 for cold spots, for hypothesis I, hypothesis II, or both hypotheses, respectively.

3.1.5  Distribution of Rep Breakpoints in Dark and Light Bands

The overall distribution of breakpoints was compared in light and dark bands in all individual data sets of rep ascertained through an abnormality (A1) and rep ascertained incidentally (A2). The expected numbers of breaks in light and dark bands were computed as proportional to the total relative lengths of light and dark bands (N_L/L_L and N_D/L_D respectively). The results are shown in table 3.8. In all cases the frequency of breakpoints in light G bands was found to be significantly higher than in dark G bands.

3.1.6  Association of Rep Breakpoints with Fragile Sites and Oncogenes

Overall distributions of rep breakpoints were tested under the assumption that the distribution is random with respect to fragile sites or oncogenes. Under this assumption the frequency of breakpoints in all fragile site bands was expected to be proportional to the sum of the relative lengths of all fragile site bands.
The overall distribution of breakpoints in fragile site bands was found to be nonrandom in rep ascertained through abnormalities (A1), with a significantly greater number of breakpoints in fragile site bands (χ² = 12.2, p < 0.001). In incidentally ascertained rep (A2), the distribution of breakpoints was random with respect to fragile sites (χ² = 1.29, p > 0.1). However, the coincidence of specific rep hot spots and fragile site bands as measured by the hypergeometric distribution is likely to be due to chance in both ascertainment groups and for both hypotheses (p is at least 0.083).

The same method was used to test the distribution of rep breakpoints in bands with oncogenes. The overall distribution was nonrandom for both rep ascertained through abnormalities (A1) and rep ascertained incidentally (A2) (χ² = 12.3, p < 0.001 and χ² = 6.30, p < 0.025 respectively). On the other hand, we were unable to detect significant coincidence with specific hot spot bands (p is at least 0.109).

                                 # of Breakpoints
Data Set                 Light Bands       Dark Bands
                         #      %          #      %        z       p

Rep Ascertained Through Abnormalities
a                        848    (72.7)     319    (27.3)   15.32   < 0.0001
b                        208    (76.2)     65     (23.8)   9.12    < 0.0001
c                        109    (79.6)     28     (20.4)   7.80    < 0.0001
Total                    1165   (73.7)     412    (26.1)   19.16   < 0.0001

Rep Ascertained Incidentally
a                        403    (70.1)     172    (29.9)   9.12    < 0.0001
b                        100    (77.5)     29     (22.5)   6.76    < 0.0001
c                        68     (75.5)     22     (24.5)   5.05    < 0.0001
d                        12     (85.7)     2      (14.3)   3.53    0.0002
e                        19     (79.2)     5      (20.8)   3.20    0.0007
f                        43     (79.6)     11     (20.4)   4.92    < 0.0001
Total                    602    (72.4)     230    (27.6)   12.89   < 0.0001

De novo Rep Ascertained Incidentally
                         103    (84.4)     19     (15.6)   9.67    < 0.0001

Table 3.8: Distribution of rep breakpoints in dark and light G bands (z for p < 0.01 = 2.58).

Data Source        a            b            c
Sample Size (N)    73           65           28
df                 24           24           24
χ²                 59.7         43.5         58.8
p                  < 0.0001     0.0087       0.0001
G²                 70.2         54.1         48.8
p                  < 0.0001     0.0004       0.0020
Interpretation     Nonrandom    Nonrandom    Nonrandom

Table 3.9: Summary of χ² and G² in tests for overall randomness of breakpoint distributions of inversions ascertained through abnormalities (B1). Data sources: a. ICRS; b. Daniel et al., 1989; c. ReCAP. See also table 2.5 Group B1 for references.

3.2  Inversions

The only rearrangement type, besides rep, for which a sufficient number of data points was available for analysis was inversion. Inversion data were analyzed in a manner similar to rep data. Initially, data from individual sources were analyzed separately. Subsequent to testing for homogeneity between individual data sets, pooled data were analyzed. The results are described in the following sections.

3.2.1  Inversions Ascertained Through Abnormalities (B1)

The distribution of breakpoints overall on the chromosomes and in specific bands was tested for inversions ascertained through abnormalities (group B1). The results are summarized in tables 3.9 and 3.10 respectively. The distribution of breakpoints overall on the chromosomes was nonrandom, despite the small sample sizes in all three data sets. Nonrandom breakage was also detected at the chromosome band level.

Hypothesis I. Hot Spots in Data Set: a (N = 81), b (N = 65), c (N = 28)
-  2p11 (L)  2q13 (L)  3p11 (D)  3q12 (L)  2p11 (L)  2q13 (L)  6p25 (L)  6p12 (D)  7p15 (L)  8p23 (L)  8q22 (L)  11q21 (L)

Hypothesis II. Hot Spots in Data Set: a (N = 81), b (N = 65), c (N = 28)
2p11 (L)  -  2q13 (L)  2q13 (L)  3p11 (D)  3q12 (L)  7p15 (L)  8p23 (L)  8q22 (L)  6p25 (L)  6p12 (D)  11q21 (L)

Table 3.10: Bands of significant excess breakage detected at the 0.01 significance level in inversions ascertained through abnormalities (B1). Data sources are identified in table 3.9. Boxed bands are significant in at least two data sets. (D) and (L) are designations for dark and light G bands respectively.
High frequencies of breakpoints were observed in light G bands 2p11 and 2q13 in both the ICRS (data set a) and ReCAP (data set c) data sets. The probability that the two matches are due to chance is low (p = 0.0012). The probability that the match between data sets a and c in hypothesis II is a chance occurrence is 0.0457, a significantly low result at the 95% level. No other matching hot spots were found in the three data sets. A significantly greater number of breakpoints was observed in light G bands than in dark G bands (see section 3.2.5). However, at the chromosome band level, the results of hypotheses I and II were virtually identical in all three data sources, with the exception that band 2p11 in data set c was observed only in the hypothesis I test, indicating that the correction for a possible bias in assigning breakpoints to light G bands has very little influence on the results. Light G bands predominate in all data sets in tests of both hypotheses (see section 3.2.5).

3.2.2  Incidentally Ascertained Balanced Inversions (B2)

Summaries of χ² and G² tests for overall distribution of breakpoints and hot spots for incidentally ascertained inversions (B2) are shown in tables 3.11 and 3.12 respectively. The overall distribution of breakpoints was nonrandom in all data sets, but no specific bands were involved in ICRS (data set e). Four matching bands were observed across data sets a through e: 2p11, 2q13, Yp11, and Yq11. These bands represent only two rearrangements, inv(2)(p11;q13) and inv(Y)(p11;q11) (see Appendix A). By comparing data sets with matching bands in pairs, we found that the coincidence of bands in any two data sets is highly significant (p < 0.001 in all cases). The distribution of breakpoints in light and dark bands was nonrandom with a higher frequency of breakpoints in light G bands. This observation was not significant in data set e at the 95% level, probably due to small sample size (see section 3.2.5). Similarly to inversions ascertained through abnormalities (B1), the hypothesis II correction had little influence on the set of bands found to have excess breakage. Although the ratio of light to dark hot spot bands is slightly lower in hypothesis II for data sets a and c, light hot spots predominate in both hypotheses for all data sets.

Data Source  Sample Size (N)  df  χ²  p  G²  p  Interpretation
a, b, c      282 320 2281  83 24  42 24  57.9 0.0001 68.3 < 0.0001 Nonrandom  320 < 0.0001 101 < 0.0001 Nonrandom  < 0.0001 876 < 0.0001 Nonrandom
d            20  24  66.5  < 0.0001  43.8  0.0081  Nonrandom
e            14  24  90.0  < 0.0001  44.7  0.0063  Nonrandom

Table 3.11: Summary of χ² and G² in test for overall randomness of breakpoint distributions of inversions ascertained incidentally (B2). Data sources: a. Daniel et al., 1989; b. Hook et al., 1987; c. Ferguson-Smith, 1984; d. ReCAP; e. ICRS. See also table 2.5 Group B2 for references.

3.2.3  Incidentally Ascertained De novo Inversions (B3)

The overall distribution of breakpoints was found to be nonrandom (p(χ² > 40.0) = 0.0211; p(G² > 42.1) = 0.0127). At the level of chromosome bands, one hot spot in band 11q21 was detected. The probability of breakage in 11q21 is low because of its small size; therefore, this result was found to be significant in tests of both hypotheses I and II. 11q21 was also observed in inversions ascertained through abnormalities (data set a) as well as in incidentally ascertained inversions (data set c).

Hypothesis I. Hot Spots in Data Set: a (N = 282), b (N = 83), c (N = 42), d (N = 20), e (N = 14)
1p11 (D)  -  1q21 (L)  2p11 (L)  2p11 (L)  2p11 (L)  2p11 (L)  2q13 (L)  2q13 (L)  2q13 (L)  2q13 (L)  7p13 (L)  -  5p13 (L)  5q13 (L)  6q15 (L)  7q11 (L)  -  10p11 (L)  10q21 (D)  11q21 (L)  12p11 (L)  12q15 (L)  Yp11 (L)  Yq11 (L)  -  -  Yp11 (L)  Yq11 (L)  -  -

Hypothesis II. Hot Spots in Data Set: a (N = 282), b (N = 83), c (N = 42), d (N = 20), e (N = 14)
-  1p11 (D)  -  -  2p11 (L)  2p11 (L)  2p11 (L)  2p11 (L)  2q13 (L)  2q13 (L)  -  2q13 (L)  -  1q21 (L)  5p13 (L)  5q13 (L)  6p12 (D)  6p11 (D)  6q15 (L)  10p11 (L)  10q21 (L)  -  -  11q21 (L)  Yp11 (L)  Yq11 (L)  -  Yp11 (L)  Yq11 (L)  -  -

Table 3.12: Bands of significant excess breakage detected at the 0.01 significance level in inversions ascertained incidentally (B2). Data sources are identified in table 3.11. Boxed bands were observed in at least two data sets. (D) and (L) represent dark and light G bands respectively.

3.2.4  Comparison of Results: Inversions Ascertained Through Abnormalities (B1) and Inversions Ascertained Incidentally (B2 & B3)

Data from individual studies were tested for homogeneity using Pearson's χ² and G². The ReCAP data set was not included in this analysis due to its small sample size. The remaining two data sets were homogeneous (p(χ² > 91) = 0.327; p(G² > 124) = 0.420). In testing incidentally ascertained inversions for homogeneity the smallest data sets (ICRS and ReCAP) also had to be excluded from analysis because of small sample size. The three remaining data sets were not homogeneous (p(χ² > 359) = 0.019; p(G² > 339) = 0.031) at the 95% level.
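The homogeneity tests run before pooling can be pictured as a contingency-table comparison, with one row per data source and one column per chromosome region. The sketch below is a generic version of that calculation under this assumption; the layout of the table, the function name `homogeneity`, and the example counts are hypothetical and not taken from the thesis.

```python
import math


def homogeneity(table):
    """Return (chi2, g2, df) for a table of counts: rows = data sets, columns = regions."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    g2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected == 0:          # region with no breakpoints in any data set
                continue
            chi2 += (obs - expected) ** 2 / expected
            if obs > 0:
                g2 += 2.0 * obs * math.log(obs / expected)
    df = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, g2, df


if __name__ == "__main__":
    # Two hypothetical data sets with identical proportions: both statistics are 0.
    print(homogeneity([[10, 20], [20, 40]]))
```

Small data sets give many cells with tiny expected counts, which is one reason a test of this kind becomes unreliable for the smallest sources.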
Data were pooled into groups of either inversions ascertained through abnormalities (B1) or incidentally ascertained inversions (B2), and breakpoint distributions overall and in specific bands were tested against the hypotheses of random breakage. The overall distribution in both ascertainment groups was found to be highly nonrandom (p(χ² > 774) < 0.0001; p(G² > 461) < 0.0001 for inversions ascertained through abnormalities (B1) and p(χ² > 2941) < 0.0001; p(G² > 1099) < 0.0001 for inversions ascertained incidentally (B2)). The results for individual bands are summarized in table 3.13. The observed breakage densities with the 99% confidence intervals and expected breakage densities for hypotheses I and II are shown in figures 3.3 and 3.4. Inversions involving breakpoints in hot spot bands listed in table 3.13 are found in Appendix A. The only bands that are common to both inversions ascertained through abnormalities (B1) and inversions ascertained incidentally (B2) are 2p11 and 2q13, which are also repeatedly detected in the individual data sets. This result is unlikely to be due to chance coincidence (p = 0.017 for hypothesis I and p = 0.012 for hypothesis II).

The number of breakpoints in light G bands was significantly greater in both data sets (see section 3.2.5). The majority of hot spot bands were also in light G bands in both hypotheses I and II. The ratio of light to dark bands is reduced in hypothesis II for both inversions ascertained through an abnormality and inversions ascertained incidentally (5 to 4, and 5 to 3 in the two groups respectively). However, light G bands predominate in both sets of hot spots, similarly to inversions ascertained through abnormalities (B1).

Hypothesis I: Ascertained Through Abnormalities (N = 174); Ascertained Incidentally (N = 441)
Hypothesis II: Ascertained Through Abnormalities (N = 174); Ascertained Incidentally (N = 441)

6p25 (L)  8p23 (L)  2p11 (L)  2q13 (L)  3p25 (L)  3p11 (D)  3q12 (L)  RFS  1p11 (D)  1q21 (L)  CFS  2p11 (L)  2q13 (L)  ONC  5p13 (L)  5q13 (L)
2p11 (L)  2q13 (L)  3p25 (L)  3p11 (D)  3q12 (L)  6p25 (L)  6p12 (D)  CFS  10p11 (L)  10q11 (L)  10q21 (D)  Yp11 (L)  Yq11 (L)  CFS  RFS
1p11 (D)  1q21 (L)  CFS  2p11 (L)  2q13 (L)  RFS  ONC  5p13 (L)  5q13 (L)  6p12 (D)  10p11 (L)  10q21 (D)  Yp11 (L)  Yq11 (L)  CFS  CFS

Table 3.13: Bands of significant excess breakage detected at the 0.01 significance level in pooled inversion data. Boxed bands are common to both ascertainment groups. CFS = common fragile site; RFS = rare fragile site; ONC = oncogene. (D) and (L) represent dark and light G bands respectively.

Figure 3.3: Distribution of breakpoints per unit chromosome length for inversions ascertained through abnormalities. Horizontal bars represent breakage densities for each band in a 320 band karyotype. Shaded region within a bar represents the 99% confidence interval. Thin vertical line within a confidence interval represents the observed breakage density. Thick vertical line, spanning all bands in a chromosome, is the expected breakage density based on hypothesis I (0.93). Hypothesis II expected values are shown as •; dark band values (0.52) are seen to the left of hypothesis I expected value, and light band values (1.30) are to the right. Hot spots are bands with expected values located outside the lower confidence limit in the nonshaded region. Broken bars represent values greater than 40. Bands of frequent and infrequent breakage at the 99% confidence level are designated as h1, h2, or h12 for hot spots and c1, c2, or c12 for cold spots, for hypothesis I, hypothesis II, or both hypotheses, respectively.

Figure 3.4: Distribution of breakpoints per unit chromosome length for inversions ascertained incidentally. Horizontal bars represent breakage densities for each band in a 320 band karyotype. Shaded region within a bar represents the 99% confidence interval. Thin vertical line within a confidence interval represents the observed breakage density. Thick vertical line, spanning all bands in a chromosome, is the expected breakage density based on hypothesis I (2.48). Hypothesis II expected values are shown as •; dark band values (1.28) are seen to the left of hypothesis I expected value, and light band values (3.56) are to the right. Hot spots are bands with expected values located outside the lower confidence limit in the nonshaded region. Broken bars represent values greater than 65. Bands of frequent and infrequent breakage at the 99% confidence level are designated as h1, h2, or h12 for hot spots and c1, c2, or c12 for cold spots, for hypothesis I, hypothesis II, or both hypotheses, respectively.

3.2.5  Distribution of Inversion Breakpoints in Dark and Light Bands

The overall distribution of breakpoints was compared in light and dark bands in all individual data sets of inversions ascertained through abnormalities (B1) and inversions ascertained incidentally (B2), and the results are described in table 3.14. The expected numbers of breaks in light and dark bands were computed as proportional to the total relative lengths of light and dark bands (N_L/L_L and N_D/L_D respectively). In most data sets, the frequency of breakpoints in light G bands was significantly higher than in dark G bands.
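The light versus dark band comparison can be sketched as a normal-approximation test of one proportion, where the null proportion is the share of total relative length made up of light bands. This is only an assumed form of the test: the thesis does not restate its formula in this section, so the z values it reports may be computed differently, and `light_frac` below is a placeholder rather than a measured length fraction.

```python
import math


def light_band_z(n_light, n_dark, light_frac):
    """z statistic for an excess of light band breakpoints.

    n_light, n_dark -- observed breakpoint counts in light and dark G bands
    light_frac      -- assumed fraction of total relative length in light bands
    """
    n = n_light + n_dark
    expected_light = n * light_frac
    sd = math.sqrt(n * light_frac * (1.0 - light_frac))
    return (n_light - expected_light) / sd


if __name__ == "__main__":
    # Hypothetical example: 60 light and 40 dark breakpoints, light bands half the length.
    print(light_band_z(60, 40, 0.5))
```

A z value above 2.58 would then correspond to the p < 0.01 threshold quoted in the table captions.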
                                 # of Breakpoints
Data Set                 Light Bands     Dark Bands
                         #      %        #      %       z         p

Inversions Ascertained Through Abnormality
a                        49     (67)     24     (33)    2.63      0.0043
b                        50     (77)     15     (23)    4.64      < 0.0001
c                        23     (82)     5      (18)    4.07      < 0.0001
Total                    125    (72)     49     (28)    5.62      < 0.0001

Inversions Ascertained Incidentally
a                        206    (73)     76     (27)    7.71      < 0.0001
b                        67     (81)     16     (19)    6.48      < 0.0001
c                        34     (81)     8      (19)    4.72      < 0.0001
d                        16     (80)     4      (20)    3.05      0.0011
e                        10     (71)     4      (29)    * 1.55    0.0606
Total                    333    (76)     108    (24)    9.72      < 0.0001

De novo Inversions Ascertained Incidentally
                         15     (68)     7      (32)    * 1.56    0.0594

Table 3.14: Distribution of inversion breakpoints in dark and light G bands. Difference in data sets designated with * does not reach significance at the 95% level. z for p < 0.01 and p < 0.05 are 2.58 and 1.98 respectively.

3.2.6  Association of Inversion Breakpoints with Fragile Sites and Oncogenes

Overall distribution of inversion breakpoints was random for each ascertainment group with respect to both fragile sites and oncogenes (highest χ² value was 2.95 with associated p = 0.1). In testing the probability of chance coincidence of specific sets of bands of frequent breakage with fragile sites or oncogenes, we found that the 2 fragile sites detected in hypothesis I in inversions ascertained through abnormalities (B1) were unlikely to match only by chance (p = 0.019). However, any other matches of hot spots with either fragile sites or oncogenes observed in the two hypothesis tests and the two ascertainment groups (see table 3.13) were likely to be coincidental (p is at least 0.288).
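The chance-coincidence probabilities quoted for overlapping band sets are attributed to the hypergeometric distribution. A minimal sketch of such a calculation follows: the probability that two independently chosen sets of bands share at least a given number of members. Treating the 320 bands of the karyotype as the population is an assumption based on the figure captions, and the function name and example numbers are illustrative only.

```python
from math import comb


def overlap_pvalue(total_bands, set_a, set_b, shared):
    """P(overlap >= shared) when set_b bands are drawn at random from total_bands,
    of which set_a are already marked (hypergeometric upper tail)."""
    max_overlap = min(set_a, set_b)
    favourable = sum(
        comb(set_a, k) * comb(total_bands - set_a, set_b - k)
        for k in range(shared, max_overlap + 1)
    )
    return favourable / comb(total_bands, set_b)


if __name__ == "__main__":
    # Hypothetical example: 35 and 13 hot spots among 320 bands, 5 in common.
    print(overlap_pvalue(320, 35, 13, 5))
```

The same function applies whether the second set is another list of hot spots or the list of bands carrying fragile sites or oncogenes. It compares two sets at a time, so it does not extend directly to a band recurring in three independent data sets.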
3.3  Sperm Chromosome Aberrations

We analyzed 109 sperm chromosome breakpoints. The distribution of breakpoints over all chromosomes was random (p(χ² > 21.2) = 0.6264; p(G² > 23.4) = 0.4986). At the level of chromosome bands, 3cen, 9p22, and 17q21 were frequently involved in aberrations in data pooled from two studies. This information is summarized in table 3.15. With the exception of two aberrations involving band 17q21 found in one individual (study a, case 1), all aberrations involving hot spots were obtained from different individuals. Distribution of sperm chromosome breakpoints in bands with fragile sites or oncogenes was tested and found to be random for fragile sites (χ² = 1.69, p < 0.25), and nonrandom for oncogenes (χ² = 3.96, p < 0.05), with a significantly lower than expected frequency of breakpoints in bands with oncogenes.

Band     Study   Case   Aberrations
3cen     a       1      ctb 3cen
         a       2      csb 3cen
         a       3      ctb 3cen
9p22     b       1      t(9;14)(p22;q22)
         b       2      del 9p22
17q21    a       1      cte(5;17)(q11;q21)
         a       1      csb(17q21)
         a       2      cteasy(2;17)(?;q21)
         b       1      csg(17q21)

Table 3.15: Bands of significant excess breakage detected at the 0.01 significance level, and the observed aberrations in these bands. Study a) Brandriff et al., 1985 [4]; study b) Martin et al., 1987 [45].

3.4  Comparison of Results: Rep (Group A), Inversions (Group B), and Sperm Chromosome Aberrations (Group C)

There were no bands frequently involved in all three groups of pooled rep, pooled inversions and sperm chromosome aberrations. The light G band 9p22 was found to have excess breakpoints in both incidentally ascertained rep (A2) and rep ascertained through abnormalities (A1), as well as in sperm chromosome aberrations. Bands 1p11 and 1q21 were found in both incidentally ascertained rep (A2) and inversions (B2).
5p13 is a hot spot band in rep ascertained through abnormalities (A1) (hypothesis I only) and in inversions ascertained through abnormalities (B1). 8p23 is a hot spot in both rep ascertained through an abnormality (A1) and incidentally ascertained inversions (B2) (hypothesis I only in both data sets). 11q21 was seen in incidentally ascertained de novo inversions (B3) and incidentally ascertained rep (A2). As measured by the hypergeometric distribution method, chance may account for the coincident occurrence of bands 8p23 and 5p13 in two data sets. Furthermore, the coincidence of 9p22 in sperm chromosome aberrations and either in rep ascertained through abnormalities (A1) or in rep ascertained incidentally (A2) was found not to be significant. The hypergeometric method used to compare results of the tests for nonrandomness is inadequate for evaluating the significance of observing the same hot spot in three independent data sets, as we found for 9p22 in this study. For all other matching bands found in independent studies, chance could not be ruled out as a reasonable possibility (p is at least 0.05).

Chapter 4

Discussion and Conclusions

4.1  Distribution of Breakpoints in Constitutional Rearrangements

In previous studies of breakpoint distributions from constitutional rearrangements, nonrandom breakage was frequently suggested (see tables 1.1 and 1.2). However, the interpretation of these results is difficult due to a likely nonrandom representation of specific groups of breakpoints in the samples studied, resulting from selection procedures that are in some way related to viability potentials and other phenotypic manifestations of the rearrangements. The goal of this project was to determine if there are specific sites on human chromosomes that are preferentially involved in constitutional reciprocal translocations and inversions.
Ideally, the hypothesis of nonrandom breakage would be evaluated in a random sample of all human constitutional rearrangements. As it is virtually impossible to obtain such a sample, we took a comparative approach and analyzed several groups of rearrangements organized according to the method of selection for study, in the hope that we would be able to identify sources of nonrandomness in breakpoint distributions that are due to bias and those that may truly be the result of a biological predisposition to involvement in constitutional rearrangements.

In our analyses we usually found the distribution of breakpoints in the large data sets of reciprocal translocations and inversions to be nonrandom. In the evaluation of these results several possible explanations have to be taken into consideration: (1) The observed clustering of breakpoints in specific chromosome regions may be the product of bias introduced through nonrandom ascertainment of rearrangements with respect to the location of breakpoints. (2) Nonrandomness may be due to chance fluctuations. Studies of breakpoint distribution present special problems for statistical analysis, and these have to be taken into account in the interpretation of the results. (3) Nonrandomness may in fact represent sites that are predisposed to chromosome breakage, to preferential rearrangement involving specific sites on the chromosomes, or to both. Each of these potential explanations for nonrandomness is considered individually in the following sections.

4.1.1  The Effects of Ascertainment Bias

Nonrandomness in a distribution of breakpoints may result from bias if selected rearrangements are ascertained in a manner that is in some way related to the position of breakpoints. Nonrandom selection with respect to breakpoint position may occur if specific sets of breakpoints in rearrangements lead to specific phenotypic outcomes.
If a specific outcome, such as live birth of a phenotypically abnormal child, depends on factors that are secondarily influenced by the position of breakpoints, rearrangements ascertained through other outcomes should represent different sets of breakpoints. This might occur, for example, if zygotes with unbalanced rearrangements at certain breakpoints were more likely to survive to be born alive than zygotes with unbalanced rearrangements at other breakpoints.

In order to test the hypothesis that the distribution of breakpoints observed in cases ascertained through abnormal phenotypes is determined, at least in part, by secondary factors such as differential fetal survival, we compared the distribution of breakpoints associated with rearrangements ascertained through abnormalities to the distribution of breakpoints from rearrangements ascertained through incidental events. Nonrandomness was found in both ascertainment groups. However, the bands frequently involved in one group were generally different from the bands frequently involved in the other group. This result suggests that the nonrandomness observed among cases ascertained through phenotypic abnormality is, at least in part, the product of ascertainment bias.

In order to understand the effects of nonrandom selection of rearrangements on the distribution of breakpoints, it is useful to identify sources of ascertainment bias in reference to the specific groups of rearrangements that are likely to be affected. For this purpose, we consider possible sources of bias that may be of concern at specific stages subsequent to the initial formation of the chromosomal rearrangement. Rearrangements may be detected prior to fertilization of the gamete, shortly after the mutation event, by cytogenetic analysis of the sperm or ovum.
Most often, however, rearrangements are detected at a stage subsequent to the transmission of the mutant chromosomes to the next generation. Detection of rearrangements in gametes is unaffected by forms of bias related to viability of the conceptus. On the other hand, some rearrangements may never be seen in gametes due to selection prior to gametogenesis.

Sources of Bias Prior to or During Transmission of Rearrangement

It is not known at what points structural rearrangements arise. Rearrangements may be formed prior to meiosis in a germ line cell, during meiosis, or subsequent to meiosis in a gamete. One of these times may predominate, or more than one of them may be common.

If an rep is formed in a diploid germ line cell, prior to pairing of homologues in meiosis I, aberrant pairing and segregation of the rearranged chromosomes and their homologues during meiosis may result. Depending on breakpoint position, the rearrangement may completely interfere with pairing and segregation, preventing the production of gametes [8]. Alternatively, segregation of chromosomes may follow successful pairing of rearranged chromosomes and their homologues in a quadriradial structure. The products of segregation may be chromosomally normal, balanced or unbalanced. If balanced products are produced, many are expected to be compatible with normal differentiation and survival of gametes. If unbalanced products are produced, the size and nature of the imbalanced segment determine the viability potential of the segregation product [10] [37] [63]. Severe imbalances may not be compatible with gamete differentiation [8], possibly leading to the nonrandom loss of certain breakpoints at this stage.

Alternatively, a reciprocal translocation may be formed in a haploid cell. Rearrangements in haploid cells do not affect segregation of homologues in meiosis I. Consequently, imbalances due to segregation are not produced.
Many such rearrangements are expected to be chromosomally balanced. However, unbalanced products may arise post-meiotically as well. For example, imbalances may be produced through unequal degradation and/or resynthesis of DNA in a repair process. It is also conceivable that a balanced chromosome complement is not maintained in a rearrangement if fragments separated by breakpoints are lost prior to the completion of the rearrangement process.

Balanced and unbalanced products in inversions may also be produced prior or subsequent to meiotic segregation. Unbalanced recombinants result when a crossover takes place between homologous chromosomes in meiosis I. Alternatively, imbalances may arise as a consequence of the rearrangement process following meiotic segregation, as suggested above for rep.

In summary, selective loss of rep and inversion breakpoints may occur shortly after the rearrangement is formed, but prior to transmission of the rearranged chromosomes to an offspring. Rearrangements formed prior to meiosis may interfere with pairing and segregation, preventing germ cell maturation [8], or severe imbalances may lead to lethality in differentiating gametes. Any rearrangements formed after meiosis would not be affected by bias imposed by the segregation of aberrant chromosomes. However, imbalances may arise as a consequence of the processes of rearrangement or repair. An additional mechanism that may also lead to selective loss of specific rearrangement breakpoints involves the disruption of genes essential for germ cell maturation and gamete differentiation. Breakpoints of rearrangements in these genes may interfere with gamete survival and result in the nonrandom loss of a subgroup of rearrangement breakpoints. The extent to which these mechanisms of nonrandom loss of specific groups of breakpoints are of significance depends on the actual time when many or most rearrangements arise.
Through these mechanisms a biased representation of breakpoints is produced in a sample of rearrangements detected prior to transmission to offspring. At present, the only such sample that is feasible to study involves aberrations from sperm chromosomes. Based on the discussion above, breakpoint distributions obtained from sperm chromosomes should be considered a nonrandom sample of rearrangement breakpoints.

Sources of Bias Subsequent to Transmission of Rearrangement

All balanced and unbalanced rearrangements that are compatible with gamete survival may be transmitted to an offspring. A de novo carrier of a constitutional rearrangement results from the initial transmission followed by survival and successful implantation and development of the zygote. The rearrangement may be inherited in subsequent generations through transmission by fertile carriers.

Based on the previous discussion, there are potentially four types of gametes that may be fertilized after the initial rearrangement event:

Type 1 (balanced post-meiotic) A chromosomally balanced gamete that carries the original rearrangement formed in the haploid cell after meiotic segregation.

Type 2 (balanced pre-meiotic) A chromosomally balanced gamete that is the product of meiotic segregation of a chromosomal rearrangement that occurred in the diploid precursor cell, prior to or during meiosis.

Type 3 (unbalanced post-meiotic) A chromosomally unbalanced gamete that carries the original rearrangement formed in the haploid cell after meiotic segregation.

Type 4 (unbalanced pre-meiotic) A chromosomally unbalanced gamete that is the product of meiotic segregation of a chromosomal rearrangement that occurred in the diploid precursor cell, prior to or during meiosis.

These four types of gametes may very well produce different distributions of rearrangement breakpoints.
It is possible that certain rearrangements are incompatible with normal mitotic division, preventing implantation of the zygote and possibly resulting in an early unrecognized pregnancy loss. Breakpoints that are nonrandomly associated with rearrangements incompatible with preimplantation cleavage or mitotic division are lost at this stage. Random loss with respect to rearrangement breakpoint position may also occur through early post-zygotic lethality due to aberrant metabolic mechanisms or mitotic errors that are unrelated to the rearrangement. These are not expected to affect breakpoint distributions.

Fertilization of a gamete with a balanced rearrangement formed in a haploid cell (type 1) should have a relatively high probability of leading to a normal live birth. The same holds for the second type of gamete, which carries a balanced rearrangement formed prior to meiosis I (type 2). These two gametic types are most likely to be represented in groups of rearrangements classified as incidentally ascertained balanced de novo rep (group A3) and inversion (group B3). De novo rearrangements with no obvious phenotypic manifestations may be detected in the offspring of individuals referred for prenatal diagnosis for advanced maternal age or in surveys of consecutive newborns (see table 2.5 for references). Whether rearrangements from only one or both types of balanced gametes are present in incidentally ascertained de novo rearrangements cannot be determined until the process of chromosome rearrangement is understood in more detail. The relative contribution of the two gametic types may affect the extent to which bias introduced in the segregation of aberrant chromosomes leads to nonrandomness in the distribution of breakpoints in incidentally ascertained balanced de novo rearrangements.
Fertilization of a gamete with an unbalanced chromosome complement formed in a haploid cell (type 3) or in a diploid cell (type 4) may have various outcomes depending on the severity of the imbalance. The imbalance may lead to an unrecognized abortion prior to or soon after implantation, recognized abortion of an abnormal fetus, stillbirth, or live birth of a phenotypically abnormal child. Such unbalanced de novo carriers may rarely be detected by chance in prenatal diagnosis or in population surveys. Abnormalities in a de novo carrier offspring or conceptus may also lead to the detection of the rearrangement. Some apparently balanced rearrangements are also detected in these ways when associated with some form of phenotypic abnormality.

Ascertainment bias associated with direct phenotypic effects of imbalances and the influence of breakpoint position on meiotic segregation are also of concern in inherited rearrangements. In addition, the detection of rearrangements based on previous reproductive history is of relevance, particularly in the detection of balanced carriers. Balanced rearrangements detected through infertility, recurrent abortions, or birth of an abnormal child are likely to be representative of breakpoints that preferentially lead to unbalanced segregation products. In contrast, balanced rearrangements that tend to result in normal balanced offspring are probably overrepresented in incidentally ascertained balanced rearrangements.

In conclusion, comparison of rep and inversions ascertained incidentally and through abnormalities suggests that biased representation of specific sets of breakpoints accounts for much of the nonrandomness in the distributions. A possible mechanism producing this nonrandomness involves the interdependence of the position of breakpoints with various phenotypic manifestations of the chromosome imbalance.
In rep, the source of this interdependence at the level of chromosome segregation may be a tendency to specifically produce balanced or unbalanced products through a preferred segregation mode [44]. Although in inversions the influence of breakpoint position on segregation leading to balanced and unbalanced products is less clear, the position of the breakpoints relative to each other does affect the frequency of recombination in the inverted segment, which leads to chromosomally unbalanced products [40] [43]. Beyond segregation, phenotypic manifestation of a rearrangement is related to breakpoint position through the differential survival potential of segregants with a chromosome rearrangement. The size, genetic content and possibly other characteristics of the unbalanced chromosome segment are directly related to the phenotype of the carrier. That in turn determines the likelihood that the rearrangement is detected in a population ascertained in a specific way.

4.1.2  The Effects of Chance Fluctuations

Some of the nonrandomness in the distribution of breakpoints may be the product of chance fluctuations. In any statistical analysis, conclusions have a component of uncertainty resulting from the probabilistic nature of statistical testing. For this reason, the set of hot spot bands detected in a single sample may include bands that show frequent breakage by chance. Conversely, sites of nonrandomness may also go undetected.

The probability that bands of nonrandom involvement are detected by chance is determined by the level of significance used to test for nonrandomness. The detection of a hot spot in a band with random breakage is equivalent to the inappropriate rejection of a null hypothesis (H0) of random breakage when it is really true. This is known as a type I, or α, error [48]. At the 99% confidence level, hypothetical values of random breakage are expected to lie outside the observed confidence interval in any band 1% of the time.
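The scale of this error over a whole karyotype can be illustrated with a short calculation. The following is only a minimal sketch, assuming a 320-band karyotype, a per-band type I error rate of 0.01, and independence between the per-band tests; the independence assumption and the function names are introduced here for illustration and are not part of the analysis itself.

```python
def expected_false_hot_spots(n_bands: int, alpha: float) -> float:
    """Expected number of bands flagged as hot spots purely by chance."""
    return n_bands * alpha


def prob_at_least_one_false_hot_spot(n_bands: int, alpha: float) -> float:
    """Probability that at least one band is falsely flagged.

    Assumes the per-band tests are independent (an illustrative
    simplification; neighbouring bands need not be independent).
    """
    return 1.0 - (1.0 - alpha) ** n_bands


if __name__ == "__main__":
    n_bands = 320   # band resolution assumed for this sketch
    alpha = 0.01    # per-band type I error rate
    print(expected_false_hot_spots(n_bands, alpha))
    print(prob_at_least_one_false_hot_spot(n_bands, alpha))
```

Under these assumptions the expected count is the product of the number of bands and the per-band error rate, so lowering the error rate or testing fewer, larger segments would both reduce the number of spurious hot spots.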
Therefore, in a simultaneous test of 320 bands, approximately 3 to 4 bands are expected to be falsely designated as hot spots. In some of the data sets, this number represents the majority of hot spots.

One way to distinguish between hot spot bands that are a chance occurrence and bands that are frequently involved in breakage for other reasons is to analyze data from different sources that were compiled independently but using similar procedures and ascertainment criteria. The data sets should represent large samples and have approximately equal sample sizes. Bands that are consistently affected by breakpoints at a high frequency can then be considered to be nonrandomly involved in rearrangements for other reasons.

In this study we attempted to make such comparisons between independent data sets by comparing data from individual studies prior to pooling, to see if the same bands were repeatedly observed in most of them (see tables 3.2, 3.5, 3.10, 3.12). Although some bands were observed in more than one data set, many were not. This difference is not likely to be due to ascertainment bias, as the mode of ascertainment was similar in all data sets in each group.

There are two explanations for our inability to reproduce results in independent data sets of the same general mode of ascertainment. It is possible that the populations sampled in the original studies were not comparable. In the present analysis we had to rely on data collected in various studies with various purposes. Therefore, we had no control over the design of the data collection process. Although incidentally ascertained rearrangements were mostly detected in prenatal diagnosis for advanced maternal age (tables 2.3 and 2.4), the group of rearrangements ascertained through abnormalities is particularly heterogeneous (tables 2.1 through 2.2).
In this study, we were unable to demonstrate homogeneity between data sets of rep ascertained through abnormalities and inversions ascertained incidentally.

An alternative explanation for differences observed across data sets is the large variation in sample sizes. A true site of nonrandom involvement is more likely to be masked by chance fluctuation in a small data set, leading to the acceptance of the null hypothesis when it is actually false [41]. This error is known as a type II, or β, error. The exact value of the type II, or β, error is difficult to calculate for most statistical tests, but in general β errors are reduced by increasing sample size [48]. Accordingly, β errors probably account for the observation in the present study that fewer hot spot bands were detected in the smaller samples among independent data sets employing similar methods of ascertainment.

In this analysis, the sample size in the original studies was probably too small to avoid most β errors. With the exception of rep ascertained incidentally and through abnormalities from [11], all individual data sets averaged fewer than 1 break per band in a 320 band karyotype. A relatively simple empirical method for determining the optimal sample size to reduce β errors would involve comparing the number of hot spots detected between data sets with a range of sample sizes ascertained in the same way. The increase in the number of hot spots detected should level off when the sample sizes become large enough to reduce β errors to acceptable levels.

4.1.3  Candidate Sites of True Nonrandom Involvement

Subsequent to evaluation of the roles of ascertainment bias and chance in the production of nonrandomness in breakpoint distributions, there remains a set of bands with a significant excess of breakpoints in various ascertainment groups.
These bands may represent sites of biologically important predisposition to rearrangement due perhaps to characteristics of the underlying DNA sequence, chromatin structure or nuclear organization.

Five chromosome bands (5q35, 7p22, 9p22, 13q14, 17q25) were found to have a higher than expected number of breakpoints in pooled data sets of both incidentally ascertained rep (A2) and rep ascertained through abnormalities (A1). After correcting for a possible bias in preferentially assigning breakpoints to light G bands (see below), only 9p22 and 17q25 have a significant excess of breakpoints. As discussed above, the influences of various ascertainment biases in the groups of rep ascertained incidentally and through abnormalities would be expected to lead to overrepresentation of different sets of breakpoints. Thus, excess breakage in bands observed in both ascertainment groups may represent sites of biologically important predisposition to rearrangement. Although random chance in the coincidence of bands 9p22 and 17q25 in the two ascertainment groups cannot be excluded, the coincidence of five bands detected according to hypothesis I was found to be highly unlikely. Further support for a predisposition to rearrangement at band 9p22 comes from the observation of excess breakage at this band in sperm chromosome abnormalities.

The only bands with a significant excess of breakpoints in both incidentally ascertained inversions and inversions ascertained through abnormalities were 2p11 and 2q13. Almost all rearrangements involving these bands are the identical inversion inv(2)(p11;q13) detected in apparently unrelated individuals. This situation is a typical example of our observations in inversion breakpoints. Most inversion hot spots result from many carriers of the same rearrangement detected independently. There are two possibilities to account for this observation.
Identical rearrangements may occur in unrelated individuals as the product of frequent mutations because of some DNA sequence or structural predisposition that strongly favors the formation of one specific inversion as opposed to any other at a given breakpoint. Alternatively, a single mutation may have become widespread in the population because of a founder effect (i.e. genetic drift) and the lack of any significant reduction in the reproductive fitness of carriers. To distinguish between these possibilities, analysis of a data set of incidentally ascertained de novo inversions would be informative. In this study, only a very small data set of such rearrangements was available. Although we did not observe inv(2)(p11;q13) among these cases, a much larger set of data would be required to draw reliable conclusions.

There is support for a founder effect related to inversions. A mechanism that maintains the reproductive fitness of inversion carriers through crossover suppression does exist in other organisms [40] [43]. Furthermore, the risk for unbalanced segregation products for both paracentric and pericentric inversion carriers was found to be relatively low [10]. Inversions have been important in evolution [15], indicating that they can become fixed in the population. Therefore, it seems likely that nonrandom involvement in inversions is due to the high frequency of certain inversions, each derived from a single ancestral mutation in the population. As a result, the study of breakpoint distributions to detect sites that are predisposed to forming inversions would be more enlightening using de novo data.

In summary, for rep we found some bands that may represent sites of biologically mediated predisposition to rearrangement. However, we were unable to show this unequivocally in our data on rep and inversions.
Candidate bands for hot spots due to a predisposition to rearrangement would be expected to be seen in most independently conducted studies, in data ascertained through different methods, and in de novo data. Similar hot spots may also occur in sperm data. Bands that are predisposed to rearrangement might also be involved in both inversions and rep. We were unable to detect any bands that showed frequent involvement in data from all these sources. However, some bands may turn out to be frequently involved in breakage and rearrangements in larger and more homogeneous data sets.

4.2  Evidence for Random Breakage

The distribution of breakpoints in incidentally ascertained rearrangements and in rearrangements ascertained through abnormalities was found to be nonrandom. In the previous discussion we considered ascertainment bias and chance variation as possible explanations for the observed nonrandomness. Here we evaluate the evidence for random breakage in terms of the results obtained in de novo rearrangements and sperm chromosome aberrations.

The data set of incidentally ascertained de novo inversions was very small. However, we were able to obtain 122 incidentally ascertained de novo rep breakpoints and found that the overall distribution of breakpoints was random. In addition, there was no frequent involvement in breakage in specific bands that would not otherwise be detected in testing for overall nonrandomness.

The reason for analyzing incidentally ascertained de novo rep was the possibility that some forms of bias affecting incidentally ascertained inherited balanced rearrangements may not be present in de novo rearrangements. Ascertainment bias related to segregation, leading to different frequencies of balanced and unbalanced products for specific rearrangements, is virtually impossible to eliminate in constitutional rep data.
Recognition of such bias is most important in analysing inherited rearrangements, and it may be of special concern in prenatal diagnosis data through the following mechanism. As discussed previously in section 4.1.1, carriers of balanced rep that are not likely to produce unbalanced progeny are less likely to be referred for cytogenetic studies than carriers of rearrangements who tend to produce unbalanced progeny at a higher frequency. In practical terms this means that carriers in the former group would only in rare instances have abnormal children, recurrent abortions or fertility problems related to their rearrangement. Consequently, cytogenetic studies are unlikely to be performed on these carriers. By the time mothers become eligible for prenatal diagnosis for advanced maternal age, a subset of rearrangements that tends frequently to produce imbalance is likely to have been removed from this "normal" population through earlier detection because of abnormal pregnancy outcome.

In terms of the possible difference between incidentally ascertained balanced and inherited rearrangements with respect to a bias of segregation, the observed randomness in breakage in de novo rearrangements is interesting. The question is whether this difference is due to less bias in de novo data or simply to small sample size. As discussed above in section 4.1.2, small samples are affected by β errors to a greater extent, and this may be the reason for the observed randomness in de novo data. In small sets of inherited rearrangements, randomness in the overall distribution of breakpoints was also observed. Nonrandomness was detected only in the larger or pooled data sets. Therefore, it is quite possible that nonrandomness would also be found in a larger data set of de novo rep.
As this type of data potentially represents the best source of constitutional data with respect to ascertainment bias, it is important to study a larger data set of these rearrangements.

We also found random breakage in a set of breakpoints obtained from sperm chromosome aberrations. Although we found 3 sites of nonrandom involvement in testing at the chromosome band level, this could be a chance observation, as discussed in section 4.1.2. Sperm chromosome aberrations are fundamentally different from constitutional rearrangements detected in fetuses, children and adults in several ways. For example, cells of somatic tissues go through a series of mitotic divisions, but division and chromosome segregation do not take place in mature sperm cells. Therefore, sperm chromosomes may exhibit aberrations with lethal or destabilizing effects on dividing somatic cells. Furthermore, chromosome aberrations in sperm cells involve a wider range of abnormalities, from chromatid gaps to reciprocal translocations [45] [44]. In this respect, sperm chromosome aberrations may represent a more complete and unbiased set of breakpoints, but these may or may not be the same sites involved in rearrangements.

Sperm chromosome data to date have been obtained from a small number of individuals, with many different rearrangements observed in each individual. Thus the distribution of breakpoints observed might be influenced by predispositions to breakage unique to certain individuals. We observed 5 cases of two different rearrangements involving the same band in the same individual among 24 donors. An increasing frequency of structural aberrations in sperm has been observed with increasing age of donor [45]. Therefore, sites of nonrandom breakage found in sperm chromosome rearrangements might differ according to the age structure of the population of donors.
The results of our analysis of incidentally ascertained de novo rep and sperm chromosome aberrations provide no indication of nonrandom predisposition to chromosomal rearrangement. Although there are several problems with the data that interfere with firm conclusions, as discussed above, the results obtained in these data sets are in general agreement with our inability to show indisputable evidence of predisposition to rearrangement at specific chromosome bands of inherited chromosomal abnormalities.

4.3  Distribution of Breakpoints in Light and Dark Bands

Breakpoints in light G bands were consistently observed with a higher frequency than breakpoints in dark G bands (see sections 3.1.5 and 3.2.5). The obvious question that follows this observation is whether there is a possible predisposition to rearrangement in light G bands. It has been proposed that this excess is not due to such predisposition, but rather to a bias on the part of cytogeneticists in preferentially assigning breakpoints to light bands because the human eye recognizes such breaks more easily [55].

A test of this hypothesis would be to use both R and G banding on the same set of rearrangements and compare the assignment of breakpoints in light and dark bands between the two banding methods. If pattern recognition bias is the sole explanation for the excess of breakpoints observed in light G bands, then in an R banded data set a similar excess should be observed in light R bands (i.e. dark G bands).

We do not have data available to make such comparisons, with the exception of 45 breakpoints from R banded rep with unknown ascertainment (provided by Dr. C.-L. Richer; data not shown). In this data set there were 25 breakpoints in dark R bands (light G), and 20 breakpoints in light R bands (dark G).
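As a rough check on these R-banded counts, a two-class goodness-of-fit calculation can be sketched as follows. This is an illustrative sketch only: it assumes a chi-square test with one degree of freedom and expected counts of 24 (light G) and 21 (dark G) breakpoints, based on the total relative lengths of light and dark G bands. The function name is hypothetical, and the test actually applied to these data may have differed.

```python
import math


def chi_square_two_classes(observed, expected):
    """Chi-square goodness-of-fit statistic and p value for two classes.

    With two classes there is one degree of freedom, for which the
    upper-tail probability of the chi-square distribution equals
    erfc(sqrt(x / 2)).
    """
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    observed = (25, 20)  # dark R (light G), light R (dark G)
    expected = (24, 21)  # assumed expectation from relative band lengths
    chi2, p_value = chi_square_two_classes(observed, expected)
    print(f"chi-square = {chi2:.3f}, p = {p_value:.3f}")
```

A statistic this small relative to one degree of freedom corresponds to a p value far above 0.05, so a sample of 45 breakpoints cannot separate the two explanations.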
Although this distribution is not significantly different from the expectation of 24 and 21 breakpoints respectively, based on the total relative lengths of light and dark G bands, it is interesting to note that the trend is opposite to that expected from pattern recognition bias. Similar trends in a much larger data set would indicate that pattern recognition bias is not the only explanation for excess breakage in light G bands.

4.4  Coincidence with Fragile Sites and Oncogenes

Fragile sites are expressed under certain culture conditions in vitro as breaks or gaps at specific locations on the chromosomes [66]. Their existence in vivo, functional significance and structure at the molecular level are as yet unknown. Fragile sites are classified into two main groups according to their frequency in the population: rare fragile sites are observed only in a small number of individuals, while common fragile sites are frequently seen [66]. Further subgroups exist, based on the mode of induction. To date 113 fragile sites have been mapped, of which 75 have confirmed status [67].

It has been suggested that fragile sites predispose to chromosome breakage and structural rearrangements [66]. In keeping with this hypothesis, increased frequencies of sister chromatid exchange [27] [21], sperm chromosome aberrations [2], breakpoints of structural rearrangements in cancer [18] [73], and spontaneous chromosome breaks [47] have been reported at various fragile sites. Deletions and reciprocal translocations at two fragile sites have been demonstrated cytologically in somatic cell hybrids [28]. In a study of spontaneous abortions, stillbirths, and newborns [31], and in prenatal diagnosis results [30], overall elevated frequencies of constitutional rearrangement breakpoints in fragile site bands have been claimed.
These results, however, are questionable due to limitations of the data source [30] [31], as well as of the statistical methods used [16]. In another study of 6391 constitutional rearrangement breakpoints, including various rearrangements ascertained in a variety of ways, no overall association of constitutional breakpoints with fragile sites was found [53]. Similarly, in 984 breakpoints from balanced carriers with recurrent spontaneous abortions, no association between fragile site bands and rearrangement breakpoints was observed [12]. These results suggest that there is no overall preferential involvement of constitutional rearrangement breakpoints in fragile sites.

Our own results indicate no overall association between fragile site bands and either rep breakpoints ascertained incidentally (A2) or inversions in either ascertainment group (B1 & B2). Rep breakpoints ascertained through abnormalities (A1) were, however, nonrandomly associated with fragile site bands (see section 3.1.6). In incidentally ascertained rep and sperm chromosome data no such association was detected. Therefore, the association of fragile sites with rep breakpoints ascertained through abnormalities appears to be dependent on the ascertainment bias affecting these rep. Of the fragile site bands observed in other studies, only 18q21 [53] in rep ascertained through abnormalities and 2q13 [30] in inversions of both ascertainment groups were seen in this study as well. 18q21 was found in a study of pooled data on all kinds of structural rearrangements, and 2q13 was seen among cases ascertained through prenatal diagnosis. Apparently, the extent of association and the specific bands involved are heavily influenced by the types of rearrangements in the data set, and the forms of ascertainment bias affecting the data. We conclude that there is no consistent association between fragile sites in general and constitutional chromosome rearrangement breakpoints.
In all of our pooled data sets there were some fragile site bands that were frequently involved in rearrangements. The coincidence of these hot spots with fragile site bands is probably a chance occurrence (see sections 3.1.6 and 3.2.6). Our method of testing for significant overall coincidence using the hypergeometric distribution is not adequate to test for significant correlation at any one specific site. Therefore, the lack of significant correlation demonstrated by this method does not rule out the possibility that certain specific fragile site bands are also predisposed to constitutional chromosomal rearrangement. Consistent association between hot spots for constitutional breakage and bands with fragile sites in independent unbiased data sets would be required as a first step to demonstrate this. Verification would require high resolution banding and molecular techniques to demonstrate such a relationship in isolated cases.

Some bands containing oncogenes were found among the hot spots in rep ascertained both incidentally and through abnormalities. Only one band, 17q25, with oncogene ERBA2L, was observed in both ascertainment groups of rep. This band may be a candidate for studying the oncogene as a possible factor involved in constitutional rearrangement. Overall coincidence of the bands involved, however, was not significant, as only 4 out of 20 bands in rep ascertained incidentally and 2 out of 20 bands in rep ascertained through abnormalities contained oncogenes (see sections 3.1.6 and 3.2.6). The distribution of inversion breakpoints was also random with respect to bands containing oncogenes. In sperm chromosome rearrangements, breakpoints were significantly less frequently involved in bands with oncogenes. Chance may be responsible for producing this effect, given the small sample size.
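The overall coincidence test described above can be illustrated with a hypergeometric upper-tail probability. This is a minimal sketch, not the thesis's own computation. The 320 bands, the 20 hot-spot bands and the 4 hot spots containing oncogenes come from the text. The number of bands in the karyotype that carry a known oncogene is not stated in this section, so `HYPOTHETICAL_ONCOGENE_BANDS` is an assumed placeholder, and the function name is illustrative.

```python
from math import comb


def hypergeom_upper_tail(total_bands, marked_bands, drawn_bands, observed_marked):
    """P(X >= observed_marked) when drawn_bands bands are chosen at random,
    without replacement, from total_bands bands of which marked_bands are marked."""
    denominator = comb(total_bands, drawn_bands)
    highest = min(drawn_bands, marked_bands)
    numerator = sum(
        comb(marked_bands, i) * comb(total_bands - marked_bands, drawn_bands - i)
        for i in range(observed_marked, highest + 1)
    )
    return numerator / denominator


TOTAL_BANDS = 320                 # band resolution used in the study
HOT_SPOT_BANDS = 20               # hot-spot bands in incidentally ascertained rep
OBSERVED_WITH_ONCOGENES = 4       # hot-spot bands containing oncogenes
HYPOTHETICAL_ONCOGENE_BANDS = 40  # assumption for illustration only

p_oncogene = hypergeom_upper_tail(
    TOTAL_BANDS, HYPOTHETICAL_ONCOGENE_BANDS, HOT_SPOT_BANDS, OBSERVED_WITH_ONCOGENES
)
print(f"P(at least {OBSERVED_WITH_ONCOGENES} coincident bands) = {p_oncogene:.3f}")
```

A test of this form pools all marked bands together, which is why it says nothing about whether any single band, such as 17q25, is individually predisposed.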
We conclude that bands with known oncogenes do not appear generally associated with a predisposition to constitutional chromosome rearrangements.

4.5  Improvements for Future Analysis

In this study, we were faced with several problems in the analysis of constitutional rearrangement breakpoints. To allow more definite conclusions regarding the distribution of constitutional rearrangement breakpoints, it is important to address several of these concerns in future studies.

First of all, a major problem with this and other similar studies has been the lack of large enough data sets. Ideally, the number of breakpoints should be much greater than the number of classes studied for reliable statistical conclusions. For the χ² test, 4 to 5 times the number of classes, that is well over 1000 breakpoints (1280 to 1600) for a 320 band karyotype, have been suggested [23]. Only a few of our data sets had this many breakpoints, and most had far fewer. We were able to observe recurring trends in some of our data sets, but most were likely the result of ascertainment bias or founder effect. Such bias can be strong enough to show an effect in very small data sets, as we have seen for inversions. However, nonrandomness due to a biological predisposition to rearrangement is more difficult to detect, unless the tendency is strong enough to be apparent despite any superimposed bias.

Constitutional rearrangement data from incidentally ascertained balanced de novo cases probably represent the best type of data currently available. It would be useful to compile data sets comprising this kind of information to allow more reliable statistical analyses to be carried out. Chromosome aberrations from sperm of healthy individuals are a potentially useful source of information for breakpoint distribution analysis. However, more data of this kind are also required.
Data from a large number of individuals are necessary to understand the nature of the variation between individuals and to determine whether predispositions to breakage at different specific sites exist in different men.

The question of nonrandom breakage with respect to light and dark G bands remains unresolved. The problem of pattern recognition bias may be best evaluated by analysing single data sets that are both G and R banded and interpreted separately. Analysis of separate data sets ascertained in similar ways but banded using the two different methods might also be useful.

The present study was aimed at defining the distribution of breakpoints at the chromosome band level. For similar studies in the future, it is important to obtain precise measurements of chromosome bands at different levels of resolution. Furthermore, the relative amounts of DNA in light and dark bands should be determined in order to allow meaningful comparisons of breakpoint frequencies in the two types of band. The amount of DNA in a band may be of particular relevance if chromosome breakage most often takes place in interphase, when the DNA is uncoiled, rather than in metaphase, when the DNA is tightly coiled. In the former situation, we may expect the length of the DNA in a band to be related to the frequency of breakpoints in that segment.

In the present study and in other studies utilizing methods of analysis aimed at detecting nonrandomness at the level of chromosome bands, direct measurements of chromosome bands from the ISCN diagrams [34] have been made. These diagrams were produced based on estimated relative lengths and are, as a result, inaccurate. Further error is introduced by measuring the diagrams. Precise measurements, with error estimates, of metaphase chromosome bands at various levels of contraction are essential for this kind of study.
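To show why the band measurements matter, the sketch below sets the expected share of breakpoints in a band proportional to its measured length and then evaluates a binomial upper tail. The seven 9p band lengths are the Appendix B values. The breakpoint counts are hypothetical, restricting the calculation to 9p is a simplification for illustration, and the function is not claimed to reproduce the exact test defined in the methods chapters.

```python
from math import comb


def binomial_upper_tail(n, p, k):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Band lengths in cm for the short arm of chromosome 9 (Appendix B, 320 bands).
band_lengths_cm = {
    "9p24": 0.36, "9p23": 0.48, "9p22": 0.11, "9p21": 0.69,
    "9p13": 0.50, "9p12": 0.47, "9p11": 0.08,
}

# Under length-proportional random breakage, the chance that a 9p breakpoint
# falls in 9p22 is that band's share of the total 9p length.
p_band = band_lengths_cm["9p22"] / sum(band_lengths_cm.values())

HYPOTHETICAL_BREAKPOINTS_ON_9P = 30  # assumed count, for illustration only
HYPOTHETICAL_BREAKPOINTS_IN_9P22 = 5  # assumed count, for illustration only

expected_count = HYPOTHETICAL_BREAKPOINTS_ON_9P * p_band
tail = binomial_upper_tail(
    HYPOTHETICAL_BREAKPOINTS_ON_9P, p_band, HYPOTHETICAL_BREAKPOINTS_IN_9P22
)
print(f"expected = {expected_count:.2f}, P(X >= 5) = {tail:.4f}")
```

Because 9p22 measures only 0.11 cm, a small absolute error in its length changes `p_band`, and therefore the tail probability, by a large relative amount. This is the sensitivity to measurement error that the paragraph above describes.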
It has been repeatedly shown that taking band lengths into consideration greatly influences the results [16] [53]. If this is to be a consistent approach for breakpoint distribution analysis, proper measurements of band length and consistent treatment of variable bands are necessary.

Our statistical method has proved useful in detecting nonrandomness in breakpoint distributions at the level of chromosome bands. As discussed in section 4.1.2, chance can produce α and β errors that simulate or mask nonrandomness in breakpoint distributions. Therefore, it is also useful to test for overall nonrandomness to understand the general distribution of breakpoints. At present, we are unaware of a reliable statistical method to do this, given the large number of classes and small samples in each class in available data sets. We used two goodness of fit tests and considered a result acceptable if the same conclusion was reached by both tests [23]. It would be useful to find improved approaches to this problem.

4.6  Conclusions

1. The distribution of breakpoints in inherited reciprocal translocations and inversions ascertained through abnormalities or through incidental events is nonrandom. Much of the nonrandomness can be accounted for as bias in the ascertainment of rep, and a founder effect among inversions.

2. Our least biased data set (incidentally ascertained balanced de novo reciprocal translocations) shows a random distribution of constitutional rep breakpoints with no specific bands preferentially involved. Larger data sets of de novo rep are required, however, because this data set is so small that β errors are quite likely. We cannot conclude anything about inversions, as the data set of de novo inversions was too small. The overall distribution of breakpoints in sperm chromosome rearrangements was also random.
4. There are a few candidate bands for possible biologically mediated predisposition to rearrangement in rep. Bands 5q35, 7p22, 9p22, 13q14, and 17q25 were observed as hot spots both in incidentally ascertained rep and rep ascertained through abnormalities, despite the opposing forces of ascertainment bias affecting these data. 9p22 was also a hot spot in sperm chromosome rearrangements, providing further support for possible nonrandom breakage and/or rearrangement at this site.

5. No conclusions can be drawn about the involvement of specific bands in inversions until the problem of repeated ascertainment of the same rearrangement is satisfactorily addressed.

6. Fragile sites and oncogenes were not found to be frequent factors in predisposing to constitutional rearrangements. However, they may play a role in the generation of constitutional rearrangements in a few specific bands. Possibilities from the present analysis include bands 1p11 and 7p22, which coincide with common fragile sites, and band 17q25, which contains the ERBA2L oncogene.

7. The distribution of breakpoints in dark and light bands appears to be nonrandom in the direction of higher frequencies of breakpoints in light G bands, even after correction for overascertainment.

8. Ascertainment bias is important in producing nonrandomness in the distribution of rearrangement breakpoints, and masks the fundamental, structurally-related distribution of breakpoints produced at the time rearrangements occur. Therefore, it is important to study both de novo constitutional rearrangements ascertained incidentally and de novo rearrangements in sperm to understand the distribution of rearrangement breakpoints. Further studies must be carried out to characterize sites of frequent rearrangement in order to provide clues to the possible mechanisms of constitutional chromosomal rearrangement.
Appendix A  Rearrangements  Associated W i t h Bands of Frequent Breakage  The following are lists of reciprocal translocations and inversions associated with chromosome bands with significantly higher freqency of breakpoints than expected by hypotheses of random breakage. For definitions of hypotheses I and II see chapter 1.  A.l  Lists of Rep Associated with Bands of Frequent Breakage Rep Ascertained T h r o u g h Abnormalities ( G r o u p A l )  Band lq42  Hypothesis  I & II  Reference  Breakpoint 1  Breakpoint 2  [11]  4pl5  H] 11] 51] L241 f24  lq42 lq42 lq42 lq42 lq42 lq42 lq42 lq42 lq42 lq42 lq42 lq42 lq42 lq42  n; ii ii ii if if n n ii  lq31 3q21 3q21 3q21 3q21 3q21 3q21 3q21 3q21  '11] '11] "11] ll] 11] 11] 11]  11] :  :  :  3q21  I  :  : :  92  5pl5 5q23 llql3 12ql2 14ql3 16q24 17pl2 17pl2 21q22 22ql3 21? 13ql4 18ql2 3q21 4pl6 7q36 13ql4 13q22 14q24 14q32 16ql2 16ql3  Appendix  A.  Band  3q27  4pll  4q35  5pl5  Rearrangements  Associated  Hypothesis  I & II  I & II  I & II  I & II  With  Reference  Bands  of Frequent  Breakpoint 1  Breakage  Breakpoint 2  [11] [51] [51] [24]  3q21 3q21 3q21 3q21  17q25 9p22 16pl3 15q26  [11]  2q21  3q27  [11] [11] [11] [11] [11] [11] [11] [11] [51]  3q27 3q27 3q27 3q27 3q27 3q27 3q27 3q27 3q27  4q25 9q22 12q22 14ql3 14q22 17pl3 18q23 21q22 15q22  [11]  1 34  4pll  [11] [11] [11] [11] [11]  4pll 4pll 4pll 4pll 4pll  5qll 5q35 13pll 21qll 22pl3  [11]  lp22  4q35  [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [51] [51] [51] [51] [24]  lq32 2p21 2q33 4q35 4q35 4q35 4q35 4q35 4q35 4q35 4q35 4q35 4q35 4q35 4q35  4q35 4q35 4q35 7q32 7q32 8ql3 9p21 9pl3 10q25 18q22 18qll 10q21 8q22 llq22 22qll  [11]  lq32  5pl5  [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11]  lq42 3p24 3p25 3q22 4q26 5pl5 5pl5 5pl5 5pl5 5pl5 5pl5 5pl5 5 15  5 15 5pl5 5pl5 5pl5 5pl5 8q23 9pl3 9p24 10q24 10q25 llq25 14ql3 14q24  P  P  P  93  Appendix  A.  
Rearrangements  Band  Hypothesis  5pl3 [  I  5pl2 [  I & II  5g35 |  6q21 |  Associated  I  1  ][  With  Bands  of Frequent  Breakage  Reference  Breakpoint 1  Breakpoint 2  [ITj [11] [11] [11] [51] [51] [24] [24]  5pl5 5 15 5pl5 5pl5 5pl5 5pl5 4q27 5pl5  15^22 15q22 18q21 22ql2 8q22 13q22 5pl5 10q26  [11]  5p~13  6q21  [11] [11] [11] [11] [11] [11] [11] [11] [11] [51] [51] [24]  5pl3 5pl3 5pl3 5pl3 5pl3 5pl3 5pl3 5pl3 5pl3 2q37 5pl3 lp_13  10pl5 lOqll 10q22 10q26 llq23 13pl2 13ql4 22ql3 21q22 5pl3 14q32 5pl3  [11]  5pl2  9p24  [11] [11] [11] [11] [11] [11] [11]  5pl2 5pl2 5pl2 5pl2 5pl2 5pl2 5pl2  10pl5 llq23 15pll 15pl2 18pll 19ql3 20pll  [11]  1^21  5q35  [11] [11] [11] [11] [11] [11] [11] [11] [24]  2pl2 4pll 4q22 5q35 5q35 5q35 5q35 Xq21 3q25  5q35 5q35 5q35 8q24 9pl3 13q22 15ql3 5q35 5g35  [11]  lq43  6q21  [11] [11] [11] [11] [11] [11] [11] [11] [11] [11]  2q35 2q37 5pl3 6q21 6q21 6q21 6q21 6q21 6q21 6q21  6q21 6q21 6q21 7q32 8ql3 10q26 17q25 18ql2 21qll 22pl3  P  Appendix  A.  Band  7p22  7q22  7q32  8p23  Rearrangements  Associated  Hypothesis  I  I  I k II  I & II  With  Bands  of Frequent  Breakage  Reference  Breakpoint 1  Breakpoint 2  [51] [24]  6q21 6q21  8q22 20ql3  [11]  lq21  7p22  [11] [11] [11] [11] [11] [11] [11] [11] [11] [24]  2q33 3 14 5q32 7p22 7p22 7p22 7p22 7p22 7p22 7p22  7p22 7p22 7p22 8ql2 13ql4 15q22 15q22 15q24 19ql2 22qll  [11]  Iqll  7q22  [11] [11] [11] [11] [11] [11] [11] [11] [11] [51] [51] [51] [51]  2q33 2q35 7q22 7q22 7q22 7q22 7q22 7q22 7q22 7q22 7q22 6q24 6q25  7q22 7q22 8p21 13ql2 13q32 13q34 17q25 18q23 21q? 20ql3 14q32 7q22 7q22  [11]  4q35  7q32  [11] [11] [11] [11] [11] [11] [11] [11] [11] [51] [51] [24] [24]  4q35 6q21 7q32 7q32 7q32 7q32 7q32 7q32 7q32 7q32 4q26 4pl6 5ql3  7q32 7q32 9p23 9p24 12q24 13ql4 13q22 18q22 21q22 18q23 7q32 7q32 7q32  [11]  lq31  8p23  [11] [11] [11] [11] [11] [11] [11] [11]  2q33 3pl4 4pl6 4pl6 6p21 8p23 8p23 8 23  8p23 8p23 8p23 8p23 8p23 9q32 llpll llql4  P  P  Appendix  A.  
Band  9p24  9p22  9pl3  9pll  Rearrangements  Associated  Hypothesis  I & II  I, II  I  I, II  With  Reference  Bands  of Frequent  Breakpoint 1  Breakage  Breakpoint 2  [51] [51] [24] [24]  8p23 8p23 8p23 8 23  10pl2 14q31 15q22 13ql4  [11]  3q26  9p24  [11] [11] [11] [11] [11] [11] [11] [11] [51] [51] [51] [51] [51] [51] [51] [24] [24]  4q23 5pl2 5pl5 7q32 8p21 9p24 9p24 9p24 9p24 9p24 2p24 2pl2 3q23 9p24 9p24 5q32 9p24  9p24 9p24 9p24 9p24 9p24 10q24 13q22 18q21 18q21 15q22 9p24 9p24 9p24 18ql2 14q24 9p24 12q24  [11]  6q27  9p22  [11] [11] [11] [11] [11] [11] [11] [11] [51] [24]  9p22 9p22 9p22 9p22 9p22 9p22 9p22 9p22 3q21 9p22  14ql3 14q22 14q24 15ql3 16q24 17pll 18qll 22ql3 9p22 lOpll  [11]  4q35  9pl3  [11] [11] [11] [11] [11] [11] [11] [51] [51] [51]  5pl5 5q35 7q35 9pl3 9pl3 9pl3 9pl3 9pl3 9pl3 7qll  9pl3 9pl3 9pl3 llq24 14pl2 18pll 18pll 15q24 13? 9 13  [11]  8q24  9pll  [11] [11] [11] [51] [51]  9pll 9pll 9pll 6q27 9pll  14pll 15qll 18pll 9pll 10pl5  P  P  Appendix  A.  Band  9qll  Rearrangements  Hypothesis  I, II  Associated  With  llqll  13ql4  I, II  I, II  I, II  of Frequent  Breakage  Reference  Breakpoint 1  Breakpoint 2  [51] [24]  9pll Yql2  14? 9pll  [11]  9qll  20pl3  [11] [11]  9qll 9qll 9qll 9qll 9qll  22qll 22ql2 14pll 17q25 17pll  [11]  lq32  10q26  [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [51] [51] [51] [24] [24]  2p23 4q31 4q31 5 13 6p21 6q21 8p21 8p21 9q32 10q26 10q26 10q26 10q26 lq31 10q26 5pl5 10q26  10q26 10q26 10q26 10q26 10q26 10q26 10q26 10q26 10q26 llql3 16pll 12pll 21q21 10q26 21q22 10q26 15pll  [11]  lOpll  llqll  [11] [51] [51] [51]  llqll 2q22 4q22 llqll  22ql3 llqll llqll 15?  
[11]  lq44  13ql4  [11] [11] [11] [11]  2q34 3q21 3q24 3q28 4q23 5pl3 6p21 6q24 7pl5 7pl5 7 22 7q32 9q34 12q24 13ql4 13ql4 13ql4 13ql4  13ql4 13ql4 13ql4 13ql4 13ql4 13ql4 13ql4 13ql4 13ql4 13ql4 13ql4 13ql4 13ql4 13ql4 15q26 15q26 20ql3 17 13  [11] [51] [51] 10q26  Bands  [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [51]  P  P  P  Appendix  A.  Band  13q22  13q34  14q32  15q22  Rearrangements  Associated  Hypothesis  I, II  I  I  I, II  With  Bands  of Frequent  Breakage  Reference  Breakpoint 1  Breakpoint 2  [24] [24] [11]  lq42 8p23 3q29  13ql4 13ql4 13q22  [11]  3q21  13q22  [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [51] [51] [51]  5q35 6p23 6p23 7q32 9p24 12q22 13q22 13q22 13q22 4q27 13q22 13q22 5pl5 10q22  13q22 13q22 13q22 13q22 13q22 13q22 15q21 15q26 18pli 13q22 21q22 15q25 13q22 13q22  [11]  lq31  13q34  [11] [11] [11] [11] [11] [11] [11] [11] [11] [51]  2ql4 2q22 2q33 6ql4 7 13 7q22 8q21 10q21 13q34 lq31  13q34 13q34 13q34 13q34 13q34 13q34 13q34 13q34 14qll 13q34  [11]  lp32  14q32  [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [51] [51] [51] [51] [24]  lq32 3ql3 3q21 3q25 6p21 7qll 8p21 14q32 14q32 14q32 9q21 7q22 8q24 5pl3 8q21  14q32 14q32 14q32 14q32 14q32 14q32 14q32 20qll 22ql2 21qll 14q32 14q32 14q32 14q32 14q32  [11]  lp32  15q22  [11] [11] [11] [11]  2q32 3q28 5pl5 5pl5  15q22 15q22 15q22 15q22  P  .  Appendix  A.  
Rearrangements  Band  17pl3 [  17g25 |  18pll [  Hypothesis  l7n  Associated  With  Reference  of Frequent  Breakpoint  Breakage  Breakpoint  1  [11] [11] [11] [11] [11] [11] [51] [51] [51] [24] [24]  6p25 7p22 7p22 15q22 15q22 15q22 3q27 8q22 9p24 8p23 15g22  [ll]  lqn  [11] [11] [11] [11] [11] [11] [11] [11] [51] [51] [51] [51] [51] [51] [51] [24] [24]  lq41 2pll 3pl2 3pl3 3q27 5q33 10q24 17pl3 17pl3 23p22 15ql5 15q24 10q21 13ql4 16pl3 15q22 15ql3 lp36  17g25  [IT] [11] [11] [11] [11] [11] [11] [11] [51] [51] [51] [24] [24]  3q2l 4pl3 5q33 6q21 7q22 8q24 17q25 17q25 17q25 9qll 16ql2 13ql2 6p21  17q25 17q25 17q25 17q25 17q25 17q25 22qll 22ql2 19ql3 17q25 17q25 17q25 17g25  \U]  lpli  18pll  [11] [11] [11] [11] [11] [11]  2q34 4pl4 4q31 5pl2  18pll 18pll 18pll 18pll . 18pll 18pll  LJI  l7n  Bands  6pll 6ql6  '  15q22 15q22 15q22 20pl3 20ql3 21q22 15q22 15q22 15q22 15q22 17pl3 17pl3 17pl3 17pl3 17pl3 17pl3 17pl3 17pl3 17pl3 22qll 21pll 17pl3 17pl3 17pl3 ' 17pl3 17pl3 17pl3 17pl3 17pl3  2  Appendix  A.  
Rearrangements  Band  18g21 |  18g23 I  Associated  Hypothesis  I  l7n  With  Bands  of Frequent  Breakage  Reference  Breakpoint 1  Breakpoint 2  [Tl] [11] [11] [11] [11] [11] [11] [11] [11] [11] [51] [51] [51] [51] [51] [51] [24] [24] [24]  9pii 9pl3 9pl3 llql4 Hq23 12pll 13q22 18pll 18pll 15ql5 2pl3 2p23 18q23 6p21 18pll 10q24 3q23 18pll 18pll  i8pii 18pll 18pll 18pll 18pll 18pll 18pll 19ql3 20pll 18pll 18pll 18pll 18pll 18pll 18qll 18pll 18pll 21qll 22ql3  [11]  1^32  18q21  [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [51] [24] [24] [24] [24]  lq44 2q37 4pl4 4q27 5pl5 6p23 8q23 9p24 llq21 18q21 18q21 9p24 4q21 7q34 8q24 15qll  18q21 18q21 18q21 18q21 18q21 18q21 18q21 18q21 18q21 22ql3 21q22 18q21 18q21 18q21 18q21 18q21  [Tl]  1^32  18q23  [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11]  2q33 3pl4 3p21 3q27 4pl5 4q24 4q28 4q31 7q22 8q24 9p23 10q25 llql4  18q23 18q23 18q23 18q23 18q23 18q23 18q23 18q23 18q23 18q23 18q23 18q23 18q23  Appendix  A.  Rearrangements  Band  21qll [  21q22 [  22qll |  Hypothesis  Ml  iTlI  I  Associated  With  Reference  Bands  of Frequent  Breakpoint  1  101  Breakage  Breakpoint  [11] [11] [11] [11] [51] [51] [24]  llpl3 12q23 14q24 15q21 7q32 18q23 13q?  
18q2l 18q23 18q23 18q23 18q23 18pll 18q23  [il]  2pTl  21qil  [11] [11] [11] [11] [11] [51] [51] [24] [24]  4pll 6ql5 6q21 12qll 14q32 23q22 19pl3 18pll 4pl5  21qll 21qll 21qll 21qll 21qll 21qll 21qll 21qll 21gll  [ll]  lq24  21q22  [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [51] [51] [51] [24]  lq32 lq42 4q31 5ql3 5q31 6ql2 6q25 7q32 lOpll llpll llq23 12ql5 15q22 16pll 17q23 18qll 5pl3 5q22 15ql3 19ql3 3q27 13q22 18q21 6pll lp31 10q26 2p23  21q22 21q22 21q22 21q22 21q22 21q22 21q22 21q22 21q22 21q22 21q22 21q22 21q22 21q22 21q22 21q22 21q22 21q22 21q22 21q22 21q22 21q22 21q22 21q22 21q22 21q22 21q22  [ll]  2ql3  22qll  [11] [11] [11] [11]  3p26 7pl3 9qll 13q21  22qll 22qll 22qll 22qll  2  Appendix  A.  Rearrangements  Band  22gi21  Hypothesis  iji  Associated  With  Bands  of Frequent  Breakage  Reference  Breakpoint 1  Breakpoint 2  [TT| [11] [11] [11] [11] [11] [51] [24] [24]  14ql3 15q24 17pl3 17q25 4pl6 21pl3 13pll 7p22 4q35  22qll 22qll 22qll 22qll 22qll 22qll 22qll 22qll 22qll  [H]  ^ 7 J  [11] [11] [11] [11] [11] [11] [11] [51]  6p21 9qll 14q32 15q26 16q21 17q25 19ql3 4pl6  22ql2 22ql2 22ql2 22ql2 22ql2 22ql2 22ql2 22ql2  102  Appendix  A.  
Rearrangements  Associated  With  Bands  of Frequent  Breakage  103  Rep Ascertained Incidentally (Group A2) Band lpll  lq21  2q33  4q22  5ql3  Hypothesis I, II  I  I  II  I  Reference  Breakpoint  1  Breakpoint  [11]  lpll  3pll  IHJ [11] [11] [11] [22] [22]  lpll lpll lpll lpll lpll lpll  8p23 llqll 16qll 19pll 19qll 19pll  [11]  lq21  3q23  [11] [11] [11] [11] [22] [32] [32] [25]  lq21 lq21 lq21 lq21 lq21 lq21 lq21 lq21  3q29 16q22 19qll 2p23 2q37 17q23 llql3 17q21  [11]  2q33  5ql3  [11] [11] [11] [11] [11] [32] [32] [24]  2q33 2q33 2q33 2q33 2q33 2q33 2q33 2q33  7p22 13ql2 16q22 19pl3 22ql3 10q22 3q29 5q22  [11]  lq44  4q22  [11] [11] [11] [11] [22]  4q22 4q22 4q22 4q22 4q22  llqll 15q21 16pl3 16q23 5pl4  [11]  2q33  5ql3  [11] [11] [11] [11] [11] [22] [22] [22]  4q31 5ql3 5ql3 5ql3 5ql3 5ql3 5ql3 5ql3  5ql3 llqll 13ql4 16pl3 18pll 6p23 7q22 19ql3  2  Appendix  A.  Band 5q35  7 22 P  9p22  10q22  10q24  Ilpl5  Rearrangements  Associated  Hypothesis  I  I  I, II  I  I  I  With  Bands  of Frequent  Breakage  Reference  Breakpoint 1  Breakpoint 2  [50]  5ql3  15q25  [11]  lq23  5q35  [11] [11] [11] [11] [11] [22] [32] [32] [32] [32]  5q35 5q35 5q35 5q35 5q35 5q35 5q35 2q33 10q22 4 13  6ql3 8q21 15ql5 16pl3 19pl3 14q24 6ql5 10q22 14q24 10q22  [11]  lq32  7p22  [11] [11] [11] [22] [24] [5] [5]  2q33 7p22 7p22 7p22 7p22 7p22 3pl4  7p22 8q24 19ql3 20qll 18qll llq23 7p22  [11]  9p22  10pl5  [11] [11] [22] [20]  9p22 9p22 9p22 9p22  llq25 12ql3 10q24 18q21  [11]  2p25  10q22  [11] [11] [11] [11] [11] [11] [25] [20]  2q37 4q25 7qll 7q32 10q22 2q34 3q28 2q35  10q22 10q22 10q22 10q22 17q25 10q22 10q22 10q22  [11]  1 36  10q24  [11] [11] [11] [22] [32] [51] [20] [42]  3p25 6q25 10q24 9p22 10q24 10q24 10q24 10q24  10q24 10q24 16q22 10q24 14q32 17q21 12ql5 21q22  [11]  lp36  llpl5  [11] [11] [11] [11] [22] [32]  3q24 6p23 8p21 4pl4 1 31 3pl2  llpl5 llpl5 11 15 11 15 llpl5 11 15  P  P  P  P  P  P  Appendix  A.  
Rearrangements  Band  Associated  Hypothesis  With  Reference [32] [32] [32] [20] [5]  iiq2i |  13gl4 |  17q25 |  19qll 1  iTn  I  iTn  I, II  Bands  of Frequent  Breakpoint 5q31 11 15 4ql3 9q22 llp!5 P  1  105  Breakage  Breakpoint 11 15 19pll llpl5 H 15 15gl5 P  P  [TTj  n 2i  i4 32  [11] [22] [22] [51] [24]  llq21 7p21 llq21 4q35 2pl3  20pl3 llq21 15q24 llq21 llq21  [ll]  5^13  13qH  [11] [11] [11] [22] [22] [22] [32] [32] [51] [50]  6ql5 10q25 13ql4 5q31 13ql4 13ql4 9pl3 4q25 12qll 6p21  13ql4 13ql4 17q23 13ql4 15pll 16q22 13ql4 13ql4 13ql4 13ql4  [U]  lpl6  17q25  [11] [11] [11] [11] [11] [22] [22] [32] [25] [25]  lq32 5q33 10q22 15q22 16ql3 7qll 17q25 3q25 2q31 lql2  17q25 17q25 17q25 17q25 17q25 17q25 22ql2 17q25 17q25 17q25  [11]  l 21  19qll  [11] [11] [22]  5qll 16q24 lpll  19qll 19qll 19qll  q  q  q  2  Appendix  A.2  A.  Rearrangements  Associated  With  Bands  of Frequent  Breakage  Lists of Inversions Associated with Bands of Frequent Breakage Inversions Ascertained Through Abnormalities (Group B l ) Band  Hypothesis  2pll  I, II  Reference  Breakpoint 1  Breakpoint 2  [11] [51] [51] [51]  2pll  2ql3  2pll 2pll 2pll 2pll 2pll 2pll  2ql2 2ql3 2ql3 2ql3 2ql3 2ql3  [11] [51] [51] [51] [24] [24]  2pll  2ql3  2pll 2pll 2pll 2pll 2pll  2ql3 2ql3 2ql3 2ql3 2ql3  [11]  3p25  3q21  [11] [51] [51]  3p25 3p25 3p25  3q25 3q21 3ql3  [51]  3pll  3ql2  [51] [51]  3pll 3pll  3ql2 3ql2  [51]  3pll  3ql2  [51] [51]  3pll 3pll  3ql2 3ql2  [51]  6p25  6q23  [51] [24] [24]  6p25 6pl2 6pl2  6ql5 6p25 6p25  [11]  8p23  8q22  [11] [11] [51]  8p23 8p23 8p23  8q22 8q22 8ql3  [51] [24] [24] 2ql3  3p25  3pll  3ql2  6p25  8p23  I, II  I, II  I, II  I, II  I, II  I  106  Appendix  A.  
Rearrangements  Associated  With  Bands  of Frequent  Breakage  Inversions Ascertained Incidentally (Group B2) Band  Hypothesis  Reference  Breakpoint 1  Breakpoint 2  lpll  I, II  [11]  lpll  lq21  [22] [22] [32]  lpll lpll lpll  lql2 lq23 lql2  [11]  lpll  lq21  [11] [11] [11] [11] [11] [11] [32] [32]  lpl2 1 13 lpl3 lpl3 lpl3 lq21 lp36 lpl3  lq21 lq21 lq21 lq21 lq21 lq31 lq21 lq21  [11]  2pll  2ql2  [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [22] [22] [22] [32] [32] [32] [32] [32] [24] [24]  2pll 2pll 2pll 2pll 2pll 2pll 2pll 2pll 2pll 2pll 2pll 2pll 2pll 2pll 2pll 2pll 2pll 2pll 2pll 2pll 2pll 2pll  2ql3 2ql3 2ql3 2ql3 2ql3 2ql3 2ql3 2ql3 2ql3 2ql3 2ql3 2ql3 2ql3 2ql2 2ql3 2ql3 2ql3 2ql3 2ql3 2ql3 2ql3 2ql3  lq21  2pll  2ql3  I, II  I, II  I, II  P  [11]  2pll  2ql3  [11] [22]  2pll Ypll Ypll Ypll Ypll Ypll 2pll 2pll 2pll  2ql3 Yqll Yqll Yqll Yqll Yqll 2ql3 2ql3 2ql3  [22] [22] [51] [24] [11] [11] [11]  107  Appendix  A.  Rearrangements  Band  Hypothesis  r  __^ 6pl2|  lOpll 1  With  Bands  of Frequent  Breakage  Reference  Breakpoint 1  Breakpoint 2  [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [22] [22] [32] [32] [32] [32] [32] [24] [24]  2pll 2pll 2pll 2pll 2pll 2pll 2pll 2pll 2pl2 2pl2 2pll 2pll 2pll 2pll 2pll 2pll 2pll 2pll 2pll 2pll  2ql3 2ql3 2ql3 2ql3 2ql3 2ql3 2ql3 2ql3 2ql3 2ql3 2ql3 2ql3 2ql3 2ql3 2ql3 2ql3 2ql3 2ql3 2ql3 2gl3  \U]  Ipljj  gqU  [11] [11] [11] [11] [11] [24]  5pl3 5pl3 5pl3 5pl3 5pl3 5pl3  5ql3 5ql3 5ql3 5ql3 5ql3 5gl3  iTn  [IT]  ipTI  5 i3  II  [11] [11] [11] [11] [11] [11] [11] [11] [11] [32] [24] \U]  5pl3 5pl3 5pl3 5pl3 5pl3 5pl3 5ql3 5ql3 5pl3 5ql3 5pl3 6pT2  5ql3 5ql3 5ql3 5ql3 5ql3 5ql3 5q35 5q35 5ql3 5q34 5gl3 6ql5  [11] [11] [11] [51]  6pl2 6pl2 6pl2 6p25  6ql5 6p24 6p22 6pl2  5pl3|  5gi31  Associated  U l  .  q  [H]  lOpll  10q21  [11] [11] [11] [11] [11]  lOpll lOpll lOpll lOpll lOpll  10q21 10q21 10q21 10q21 10q21  108  Appendix  A.  
Rearrangements Associated  Band  Hypothesis  lOqll |  I  10q21 |  iTn  • Ypll [  IE  With  Bands  of Frequent  Breakage  Reference  Breakpoint 1  Breakpoint 2  [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [22] [32] [32]  lOpll lOpll lOpll lOpll lOpll lOpll lOpll lOpll lOpll lOpll lOpll lOpll lOpll  10q21 10q21 10q21 10q21 10q21 10q21 10q21 10q21 10q21 10q21 lOqll lOqll 10q22  [il]  10pl3  lOqll  [11] [11] [22] [32] [32]  lOqll lOqll lOpll lOqll lOpll  10q23 10q21 lOqll 10q26 lOqll  [H]  lOpll  10q21  [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [32]  lOpll lOpll lOpll lOpll lOpll lOpll lOpll lOpll lOpll lOpll lOpll lOpll lOpll lOpll 10pl3 lOqll lOpll 10pl2  10q21 10q21 10q21 10q21 10q21 10q21 10q21 10q21 10q21 10q21 10q21 10q21 10q21 10q21 10q21 10q21 10q21 10q21  \U]  Ypll  Yqll  [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11]  Ypll Ypll Ypll Ypll Ypll Ypll Ypll Ypll Ypll Ypll Ypll  Yqll Yqll Yqll Yqll Yqll Yqll Yqll Yqll Yqll Yqll Yqll  109  Appendix  A.  Band  Ypll  Rearrangements  Hypothesis  I, II  Associated  With  Bands  of Frequent  Breakage  Reference  Breakpoint 1  Breakpoint 2  [11] [11] [22] [22] [22] [22] [51] [24]  Ypll Ypll Ypll Ypll Ypll Ypll Ypll Ypll  Yqll Yqll Yqll Yqll Yqll Yqll Yqll Yqll  [11]  Ypll  Yqll  [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [11] [22] [22] [22] [22] [51] [24]  Ypll Ypll Ypll Ypll Ypll Ypll Ypll Ypll Ypll Ypll Ypll Ypll Ypll Ypll Ypll Ypll Ypll Ypll Ypll  Yqll Yqll Yqll Yqll Yqll Yqll Yqll Yqll Yqll Yqll Yqll Yqll Yqll Yqll Yqll Yqll Yqll Yqll Yqll  110  Appendix B Band Measurements  The ISCN idiograms [34] were used for direct measurements of chromosome bands. The values obtained for a 320 band karyotype, and a 329 band karyotype defined for use with sperm chromosome data are listed below. Measurements are given in cm. Each band is designated as a light (L) or a dark (D) G band. The assignment of light and dark bands was based on a 400 band karyotype. 
Therefore, some bands in the 320 band karyotype consist of several subbands. In these cases, a band was designated light or dark based on the total relative lengths of light and dark subbands.

B.1  Band measurements at 320 band resolution

Band    L/D   Length (cm)
1p36    L     1.29
1p35    D     0.30
1p34    L     0.62
1p33    D     0.30
1p32    L     0.58
1p31    D     1.54
1p22    L     0.85
1p21    D     0.81
1p13    L     0.84
1p12    D     0.19
1p11    D     0.12
1q11    D     0.15
1q12    D     1.10
1q21    L     0.70
1q22    D     0.31
1q23    L     0.49
1q24    D     0.40
1q25    L     0.43
1q31    D     1.19
1q32    L     1.09
1q41    D     0.65
1q42    L     0.50
1q43    D     0.38
1q44    L     0.32
2p25    L     0.48
2p24    D     0.45
2p23    L     0.61
2p22    D     0.57
2p21    L     0.51
2p16    D     0.77
2p15    L     0.26
2p14    D     0.29
2p13    L     0.59
2p12    D     0.49
2p11    L     0.60
2q11    L     0.53
2q12    D     0.40
2q13    L     0.35
2q14    D     0.72
2q21    L     0.68
2q22    D     0.70
2q23    L     0.32
2q24    D     0.81
2q31    L     0.65
2q32    D     0.90
2q33    L     0.60
2q34    D     0.43
2q35    L     0.40
2q36    D     0.49
2q37    L     0.79
3p26    D     0.26
3p25    L     0.40
3p24    D     0.69
3p23    L     0.21
3p22    D     0.38
3p21    L     1.35
3p14    D     0.77
3p13    L     0.32
3p12    D     0.99
3p11    D     0.12
3q11    D     0.48
3q12    L     0.15
3q13    D     1.20
3q21    L     0.57
3q22    D     0.30
3q23    L     0.29
3q24    D     0.65
3q25    L     0.46
3q26    D     1.05
3q27    L     0.32
3q28    D     0.30
3q29    L     0.37
4p16    L     0.81
4p15    D     0.97
4p14    L     0.45
4p13    D     0.32
4p12    L     0.31
4p11    D     0.15
4q11    D     0.10
4q12    L     0.45
4q13    D     0.75
4q21    L     0.79
4q22    D     0.52
4q23    L     0.11
4q24    D     0.50
4q25    L     0.23
4q26    D     0.70
4q27    L     0.31
4q28    D     0.82
4q31    L     1.01
4q32    D     0.60
4q33    L     0.27
4q34    D     0.42
4q35    L     0.47
5p15    L     0.99
5p14    D     0.83
5p13    L    0.58
5p12    D    0.25
5p11    D    0.31
5q11    L    0.70
5q12    D    0.42
5q13    L    0.72
5q14    D    0.80
5q15    L    0.26
5q21    D    0.81
5q22    L    0.23
5q23    D    0.90
5q31    L    1.01
5q32    D    0.43
5q33    L    0.40
5q34    D    0.68
5q35    L    0.50
6p25    L    0.40
6p24    D    0.21
6p23    L    0.41
6p22    D    0.69
6p21    L    1.22
6p12    D    0.60
6p11    D    0.29
6q11    D    0.14
6q12    D    0.44
6q13    L    0.29
6q14    D    0.42
6q15    L    0.29
6q16    D    0.68
6q21    L    0.70
6q22    D    1.04
6q23    L    0.45
6q24    D    0.61
6q25    L    0.55
6q26    D    0.29
6q27    L    0.41
7p22    L    0.48
7p21    D    0.80
7p15    L    0.61
7p14    D    0.40
7p13    L    0.38
7p12    D    0.34
7p11    L    0.33
7q11    L    1.25
7q21    D    1.00
7q22    L    0.70
7q31    D    1.20
7q32    L    0.48
7q33    D    0.23
7q34    L    0.25
7q35    D    0.26
7q36    L    0.61
8p23    L    0.60
8p22    D    0.49
8p21    L    0.51
8p12    D    0.58
8p11    L    0.50
8q11    L    0.45
8q12    D    0.49
8q13    L    0.52
8q21    D    1.37
8q22    L    0.80
8q23    D    0.75
8q24    L    1.23
9p24    L    0.36
9p23    D    0.48
9p22    L    0.11
9p21    D    0.69
9p13    L    0.50
9p12    D    0.47
9p11    D    0.08
9q11    D    0.14
9q12    D    0.94
9q13    L    0.18
9q21    D    1.00
9q22    L    1.00
9q31    D    0.50
9q32    L    0.21
9q33    D    0.35
9q34    L    1.00
10p15   L    0.45
10p14   D    0.37
10p13   L    0.43
10p12   D    0.63
10p11   L    0.70
10q11   L    0.62
10q21   D    1.19
10q22   L    0.98
10q23   D    0.68
10q24   L    0.60
10q25   D    0.69
10q26   L    0.72
11p15   L    1.12
11p14   D    0.78
11p13   L    0.31
11p12   D    0.35
11p11   L    0.72
11q11   D    0.12
11q12   D    0.53
11q13   L    0.89
11q14   D    0.73
11q21   L    0.14
11q22   D    0.62
11q23   L    1.00
11q24   D    0.30
11q25   L    0.33
12p13   L    0.75
12p12   D    0.69
12p11   L    0.60
12q11   D    0.10
12q12   D    0.50
12q13   L    0.75
12q14   D    0.50
12q15   L    0.38
12q21   D    0.99
12q22   L    0.44
12q23   D    0.50
12q24   L    1.30
13p13   D    0.28
13p12   L    0.31
13p11   D    0.56
13q11   D    0.13
13q12   L    0.69
13q13   D    0.29
13q14   L    0.76
13q21   D    1.05
13q22   L    0.56
13q31   D    0.63
13q32   L    0.50
13q33   D    0.29
13q34   L    0.45
14p13   D    0.30
14p12   L    0.30
14p11   D    0.60
14q11   L    0.63
14q12   D    0.40
14q13   L    0.40
14q21   D    0.80
14q22   L    0.48
14q23   D    0.25
14q24   L    0.91
14q31   D    0.50
14q32   L    0.70
15p13   D    0.26
15p12   L    0.35
15p11   D    0.58
15q11   L    0.30
15q12   D    0.19
15q13   L    0.21
15q14   D    0.33
15q15   L    0.48
15q21   D    0.80
15q22   L    0.66
15q23   D    0.25
15q24   L    0.59
15q25   D    0.39
15q26   L    0.70
16p13   L    1.00
16p12   D    0.49
16p11   L    0.66
16q11   D    0.65
16q12   L    0.30
16q13   L    0.30
16q21   D    0.37
16q22   L    0.48
16q23   D    0.45
16q24   L    0.49
17p13   L    0.59
17p12   D    0.41
17p11   L    0.58
17q11   L    0.45
17q12   D    0.30
17q21   L    0.80
17q22   D    0.50
17q23   L    0.30
17q24   D    0.46
17q25   L    0.55
18p11   D    1.25
18q11   L    0.59
18q12   D    0.83
18q21   L    0.78
18q22   D    0.78
18q23   L    0.44
19p13   L    1.43
19p12   D    0.30
19p11   D    0.12
19q11   D    0.15
19q12   D    0.32
19q13   L    1.80
20p13   L    0.50
20p12   D    0.50
20p11   L    0.70
20q11   L    0.69
20q12   D    0.35
20q13   L    1.05
21p13   D    0.30
21p12   L    0.33
21p11   D    0.56
21q11   L    0.33
21q21   D    0.68
21q22   L    1.01
22p13   D    0.30
22p12   L    0.34
22p11   D    0.56
22q11   L    0.69
22q12   D    0.39
22q13   L    1.13
Xp22    L    0.85
Xp21    D    0.60
Xp11    L    0.94
Xq11    D    0.10
Xq12    D    0.24
Xq13    L    0.56
Xq21    D    0.85
Xq22    L    0.32
Xq23    D    0.29
Xq24    L    0.24
Xq25    D    0.38
Xq26    L    0.37
Xq27    D    0.37
Xq28    L    0.34
Yp11    L    0.22
Yq11    L    0.33
Yq12    D    0.38

B.2 Band Measurements for Sperm Chromosomes

Band    L/D  Length (cm)
1p36    L    1.29
1p35    D    0.30
1p34    L    0.62
1p33    D    0.30
1p32    L    0.58
1p31    D    1.54
1p22    L    0.85
1p21    D    0.81
1p13    L    0.84
1p12    D    0.19
1cen    D    0.24
1q12    D    1.10
1q21    L    0.70
1q22    D    0.31
1q23    L    0.49
1q24    D    0.40
1q25    L    0.43
1q31    D    1.19
1q32    L    1.09
1q41    D    0.65
1q42    L    0.50
1q43    D    0.38
1q44    L    0.32
2p25    L    0.48
2p24    D    0.45
2p23    L    0.61
2p22    D    0.57
2p21    L    0.51
2p16    D    0.77
2p15    L    0.26
2p14    D    0.29
2p13    L    0.59
2p12    D    0.49
2p11    L    0.48
2cen    D    0.24
2q11    L    0.41
2q12    D    0.40
2q13    L    0.35
2q14    D    0.72
2q21    L    0.68
2q22    D    0.70
2q23    L    0.32
2q24    D    0.81
2q31    L    0.65
2q32    D    0.90
2q33    L    0.60
2q34    D    0.43
2q35    L    0.40
2q36    D    0.49
2q37    L    0.79
3p26    D    0.26
3p25    L    0.40
3p24    D    0.69
3p23    L    0.21
3p22    D    0.38
3p21    L    1.35
3p14    D    0.77
3p13    L    0.32
3p12    D    0.99
3cen    D    0.24
3q11    D    0.36
3q12    L    0.15
3q13    D    1.20
3q21    L    0.57
3q22    D    0.30
3q23    L    0.29
3q24    D    0.65
3q25    L    0.46
3q26    D    1.05
3q27    L    0.32
3q28    D    0.30
3q29    L    0.37
4p16    L    0.81
4p15    D    0.97
4p14    L    0.45
4p13    D    0.32
4p12    L    0.31
4cen    D    0.24
4q12    L    0.45
4q13    D    0.75
4q21    L    0.79
4q22    D    0.52
4q23    L    0.11
4q24    D    0.50
4q25    L    0.23
4q26    D    0.70
4q27    L    0.31
4q28    D    0.82
4q31    L    1.01
4q32    D    0.60
4q33    L    0.27
4q34    D    0.42
4q35    L    0.47
5p15    L    0.99
5p14    D    0.83
5p13    L    0.58
5p12    D    0.25
5cen    D    0.24
5q11    L    0.58
5q12    D    0.42
5q13    L    0.72
5q14    D    0.80
5q15    L    0.26
5q21    D    0.81
5q22    L    0.23
5q23    D    0.90
5q31    L    1.01
5q32    D    0.43
5q33    L    0.40
5q34    D    0.68
5q35    L    0.50
6p25    L    0.40
6p24    D    0.21
6p23    L    0.41
6p22    D    0.69
6p21    L    1.22
6p12    D    0.60
6p11    L    0.17
6cen    D    0.24
6q12    D    0.44
6q13    L    0.29
6q14    D    0.42
6q15    L    0.29
6q16    D    0.68
6q21    L    0.70
6q22    D    1.04
6q23    L    0.45
6q24    D    0.61
6q25    L    0.55
6q26    D    0.29
6q27    L    0.41
7p22    L    0.48
7p21    D    0.80
7p15    L    0.61
7p14    D    0.40
7p13    L    0.38
7p12    D    0.34
7p11    L    0.21
7cen    D    0.24
7q11    L    1.13
7q21    D    1.00
7q22    L    0.70
7q31    D    1.20
7q32    L    0.48
7q33    D    0.23
7q34    L    0.25
7q35    D    0.26
7q36    L    0.61
8p23    L    0.60
8p22    D    0.49
8p21    L    0.51
8p12    D    0.58
8p11    L    0.38
8cen    D    0.24
8q11    L    0.33
8q12    D    0.49
8q13    L    0.52
8q21    D    1.37
8q22    L    0.80
8q23    D    0.75
8q24    L    1.23
9p24    L    0.36
9p23    D    0.48
9p22    L    0.11
9p21    D    0.69
9p13    L    0.50
9p12    D    0.47
9cen    D    0.24
9q12    D    0.94
9q13    L    0.18
9q21    D    1.00
9q22    L    1.00
9q31    D    0.50
9q32    L    0.21
9q33    D    0.35
9q34    L    1.00
10p15   L    0.45
10p14   D    0.37
10p13   L    0.43
10p12   D    0.63
10p11   L    0.58
10cen   D    0.24
10q11   L    0.50
10q21   D    1.19
10q22   L    0.98
10q23   D    0.68
10q24   L    0.60
10q25   D    0.69
10q26   L    0.72
11p15   L    1.12
11p14   D    0.78
11p13   L    0.31
11p12   D    0.35
11p11   L    0.60
11cen   D    0.24
11q12   D    0.53
11q13   L    0.89
11q14   D    0.73
11q21   L    0.14
11q22   D    0.62
11q23   L    1.00
11q24   D    0.30
11q25   L    0.33
12p13   L    0.75
12p12   D    0.69
12p11   L    0.48
12cen   D    0.24
12q12   D    0.50
12q13   L    0.75
12q14   D    0.50
12q15   L    0.38
12q21   D    0.99
12q22   L    0.44
12q23   D    0.50
12q24   L    1.30
13p13   D    0.28
13p12   L    0.31
13p11   D    0.44
13cen   D    0.24
13q12   L    0.69
13q13   D    0.29
13q14   L    0.76
13q21   D    1.05
13q22   L    0.56
13q31   D    0.63
13q32   L    0.50
13q33   D    0.29
13q34   L    0.45
14p13   D    0.30
14p12   L    0.30
14p11   D    0.48
14cen   D    0.24
14q11   L    0.51
14q12   D    0.40
14q13   L    0.40
14q21   D    0.80
14q22   L    0.48
14q23   D    0.25
14q24   L    0.91
14q31   D    0.50
14q32   L    0.70
15p13   D    0.26
15p12   L    0.35
15p11   D    0.46
15cen   D    0.24
15q11   L    0.18
15q12   D    0.19
15q13   L    0.21
15q14   D    0.33
15q15   L    0.48
15q21   D    0.80
15q22   L    0.66
15q23   D    0.25
15q24   L    0.59
15q25   D    0.39
15q26   L    0.70
16p13   L    1.00
16p12   D    0.49
16p11   L    0.54
16cen   D    0.24
16q11   D    0.53
16q12   L    0.30
16q13   L    0.30
16q21   D    0.37
16q22   L    0.48
16q23   D    0.45
16q24   L    0.49
17p13   L    0.59
17p12   D    0.41
17p11   L    0.46
17cen   D    0.24
17q11   L    0.33
17q12   D    0.30
17q21   L    0.80
17q22   D    0.50
17q23   L    0.30
17q24   D    0.46
17q25   L    0.55
18p11   L    1.13
18cen   D    0.24
18q11   L    0.47
18q12   D    0.83
18q21   L    0.78
18q22   D    0.78
18q23   L    0.44
19p13   L    1.43
19p12   D    0.30
19cen   D    0.24
19q12   D    0.32
19q13   L    1.80
20p13   L    0.50
20p12   D    0.50
20p11   L    0.58
20cen   D    0.24
20q11   L    0.57
20q12   D    0.35
20q13   L    1.05
21p13   D    0.30
21p12   L    0.33
21p11   D    0.44
21cen   D    0.24
21q11   L    0.21
21q21   D    0.68
21q22   L    1.01
22p13   D    0.30
22p12   L    0.34
22p11   D    0.44
22cen   D    0.24
22q11   L    0.57
22q12   D    0.39
22q13   L    1.13
Xp22    L    0.85
Xp21    D    0.60
Xp11    L    0.85
Xcen    D    0.18
Xq12    D    0.24
Xq13    L    0.56
Xq21    D    0.85
Xq22    L    0.32
Xq23    D    0.29
Xq24    L    0.24
Xq25    D    0.38
Xq26    L    0.37
Xq27    D    0.37
Xq28    L    0.34
Yp11    L    0.19
Ycen    D    0.06
Yq11    L    0.30
Yq12    D    0.38

Appendix C
Computer Programs

The following computer programs were used for both data management and statistical analysis. All of them were written by A. R. Rutherford and are not available in any other published source. For this reason, they are reproduced in the following sections.

C.1 Checking for Invalid Bands

Some of the bands reported in the published data sources do not exist in the ISCN nomenclature. This is probably the result of typing mistakes in the original papers. All breakpoints in each data set were checked against the list of valid bands defined according to ISCN [34] using the following program, written in C shell script. Invalid bands were removed from the data set.

#
# Program to check for invalid bands.
#
onintr close
foreach i ($argv)
  awk '{print $3}' $i | spellout ~/minus/bb.hash >! temp1.$$
  awk '{print $4}' $i | spellout ~/minus/bb.hash >! temp2.$$
  set first = "`sort -u temp1.$$`"
  set second = "`sort -u temp2.$$`"
  if ($#first == 0) then
    echo 'No bad bands in the third column of "'"$i"'".'
  else
    echo 'Bad bands in the third column of "'"$i"'" are:'
    foreach word ($first)
      sed -n "/ $word /p" $i
    end
  endif
  if ($#second == 0) then
    echo 'No bad bands in the fourth column of "'"$i"'".'
  else
    echo 'Bad bands in the fourth column of "'"$i"'" are:'
    foreach word ($second)
      sed -n "/ $word /p" $i
    end
  endif
  echo ""
end
close:
\rm -f temp1.$$ temp2.$$

C.2 Checking for Duplicate Rearrangements

The following program, written in C shell script, compares breakpoints within a file or across all the data in a list of files and prints every set of rearrangements with identical breakpoints as a group. The program also handles cases where one breakpoint is unknown or only partially defined. It prints all possible matches for such cases as a group, so that the probability of duplication can be judged afterwards from additional identifying information.

#
# C shell script to look for duplicate pairs of breakpoints.
#
onintr close
\rm -f file.$$
foreach file ($argv)
  awk '{printf "%-16s %-1s\n", "'$file':", $0}' $file >>! file.$$
end
echo "DATA FILES: $argv[*]"
awk '{\
  if ($4 ~ /^\?/ && $5 ~ /^\?/) print NR > "'work1.$$'"\
  else if ($4 == "?") print NR, $5 > "'work2.$$'"\
  else if ($5 == "?") print NR, $4 > "'work2.$$'"\
  else if ($4 ~ /^\?/ || $5 ~ /^\?/) print NR, $4, $5\
  else if ($4 < $5) print NR, $4, $5 > "'work4.$$'"\
  else print NR, $5, $4 > "'work4.$$'"\
}' file.$$ | sort | tee work3a.$$ | sed 's/?//g' >! temp.$$
sed 's/\./\\\./' work3a.$$ | sed 's/?/\.\*/' | join - temp.$$ |\
  sort -n >! work3b.$$

#Bad data lines.
if (-e work1.$$) then
  echo 'Data lines ignored:'
  echo ' '
  set lines = `cat work1.$$`
  foreach i ($lines)
    sed -n {$i}p file.$$
  end
  echo ' '
endif
echo 'Duplicate Data Lines:'
echo ' '

#Data with an unknown breakpoint.
if (-e work2.$$) then
  echo 'Data lines with an unknown breakpoint:'
  echo ' '
  set wc = `wc -l work2.$$`
  set count = 1
  while ($count <= $wc[1])
    \rm -f matches.$$
    awk 'NR == '$count'{print $1"\n"$2}' work2.$$ >! b1.$$
    set b1 = "`cat b1.$$`"
    awk 'NR != "'$count'" && $2 == "'$b1[2]'"{print $1}' \
      work2.$$ >>! matches.$$
    if (-e work4.$$) then
      awk '$2 == "'$b1[2]'" || $3 == "'$b1[2]'"{print $1}' work3a.$$ work4.$$ >>! matches.$$
    else
      awk '$2 == "'$b1[2]'" || $3 == "'$b1[2]'"{print $1}' work3a.$$ >>! matches.$$
    endif
    if (! -z matches.$$) then
      set matches = ($b1[1] `cat matches.$$`)
      foreach line ($matches)
        sed -n {$line}p file.$$
      end
      echo ' '
    endif
    @ count++
  end
endif

#Breaks with some missing information.
if (! -z work3b.$$) then
  echo 'Data lines with a partially known breakpoint:'
  echo ' '
  set wc = `wc -l work3b.$$`
  set count = 1
  while ($count <= $wc[1])
    \rm -f matches.$$
    awk 'NR == '$count' {print $1"\n"$2"\n"$3"\n"$4"\n"$5}' work3b.$$ >! b1.$$
    set b1 = "`cat b1.$$`"
    set record = 1
    while ($record <= $wc[1])
      if ($count != $record) then
        awk 'NR == '$record' {print $2"\n"$3"\n"$4"\n"$5}' work3b.$$ >! b2.$$
        set b2 = "`cat b2.$$`"
        awk 'NR == '$record' {\
          if ("'$b1[4]'" ~ /^'"$b2[1]"'$/ || "'$b2[3]'" ~ /^'"$b1[2]"'$/){\
            if ("'$b1[5]'" ~ /^'"$b2[2]"'$/ || "'$b2[4]'" ~ /^'"$b1[3]"'$/){\
              print $1}}\
          if ("'$b1[4]'" ~ /^'"$b2[2]"'$/ || "'$b2[4]'" ~ /^'"$b1[2]"'$/){\
            if ("'$b1[5]'" ~ /^'"$b2[1]"'$/ || "'$b2[3]'" ~ /^'"$b1[3]"'$/){\
              print $1}}}' work3b.$$ >>! matches.$$
      endif
      @ record++
    end
    if (-e work4.$$) then
      awk '{\
        if ($2 ~ /^'"$b1[2]"'$/ && $3 ~ /^'"$b1[3]"'$/) print $1 \
        else if ($3 ~ /^'"$b1[2]"'$/ && $2 ~ /^'"$b1[3]"'$/) print $1 \
      }' work4.$$ >>! matches.$$
    endif
    if (! -z matches.$$) then
      set matches = ($b1[1] `cat matches.$$`)
      foreach line ($matches)
        sed -n {$line}p file.$$
      end
      echo ' '
    endif
    @ count++
  end
endif

#Completely known breakpoints.
if (-e work4.$$) then
  echo 'Data lines with completely known breakpoints:'
  echo ' '
  set wc = `wc -l work4.$$`
  if ($wc[1] >= 2) then
    awk '{\
      if ($2 < $3)\
        {print $1, $2, $3}\
      else\
        {print $1, $3, $2}\
    }' work4.$$ | sort +1 >! temp.$$
    set record = 1
    set eof = 0
    while ($record < $wc[1])
      set b1 = `sed -n {$record}p temp.$$`
      @ next = $record + 1
      set b2 = `sed -n {$next}p temp.$$`
      while ("$b1[2]" == "$b2[2]" && "$b1[3]" == "$b2[3]" && ! $eof)
        @ next++
        if ($next <= $wc[1]) then
          set b2 = `sed -n {$next}p temp.$$`
        else
          @ eof = 1
        endif
      end
      if ($next >= $record + 2) then
        awk 'NR == '$record', NR == '$next' - 1{print $1}'\
          temp.$$ >! matches.$$
        set matches = `sort -n matches.$$`
        foreach line ($matches)
          sed -n {$line}p file.$$
        end
        echo ' '
      endif
      @ record = $next
    end
  endif
endif
close:
\rm -f file.$$ temp.$$ work1.$$ work2.$$ work3a.$$ work3b.$$ work4.$$\
  b1.$$ b2.$$ matches.$$

C.3 Statistical Analysis Using Binomial Confidence Limits

The following program, written in C shell script, counts the number of breakpoints in a list of rearrangements for each band in a 320 band karyotype and calculates confidence intervals, breakage densities, and expected values. It also compares hypothetical and observed values to determine whether any bands are hot spots or cold spots for breakage.

#
# C shell script to look for hot spots.
#
onintr close
if ($#argv < 2) then
  echo 'Need at least 2 arguments.'
  echo 'Have you forgotten the bands file?'
  exit
endif
set bf = $argv[1]
shift
echo "DATA FILES: $argv[*]"
echo "BANDS FILE: $bf"
\rm -f file.$$ breaks.$$ h1hs.temp h1cs.temp h2hs.temp h2cs.temp
sort -b +1 $bf >! bands.$$
foreach file ($argv)
  awk '{printf "%-15s %1s\n", "'$file':", $0 >> "'file.$$'"\
    print $3"\n"$4}' $file >>! breaks.$$
end
sort breaks.$$ | uniq -c | join -a2 -e 0 -j 2 \
  -o 2.1 2.2 2.3 2.4 1.1 - bands.$$ | sort -n | tee temp.$$ |\
awk '{\
  if ($3 == "L"){\
    TL += $5\
    LL += $4}\
  if ($3 == "D"){\
    TD += $5\
    LD += $4}\
  TB += $5\
  LB += $4}\
END { EL = TL / LL\
  ED = TD / LD\
  EB = TB / LB\
  print TL, TD, TB, LL, LD, LB, EL, ED, EB}' |\
cat - temp.$$ | awk -f hs.awk
if (-e h1hs.temp) then
  echo 'Hot spots for hypothesis 1:'
  set spots = "`cat h1hs.temp`"
  foreach break ($spots)
    echo " "
    echo $break':'
    fgrep " $break " file.$$
  end
else
  echo 'No hot spots for hypothesis 1.'
endif
echo " "
if (-e h1cs.temp) then
  echo 'Cold spots for hypothesis 1:'
  set spots = "`cat h1cs.temp`"
  foreach break ($spots)
    echo ""
    echo $break':'
    fgrep " $break " file.$$
  end
else
  echo 'No cold spots for hypothesis 1.'
endif
echo " "
if (-e h2hs.temp) then
  echo 'Hot spots for hypothesis 2:'
  set spots = "`cat h2hs.temp`"
  foreach break ($spots)
    echo ""
    echo $break':'
    fgrep " $break " file.$$
  end
else
  echo 'No hot spots for hypothesis 2.'
endif
echo " "
if (-e h2cs.temp) then
  echo 'Cold spots for hypothesis 2:'
  set spots = "`cat h2cs.temp`"
  foreach break ($spots)
    echo ""
    echo $break':'
    fgrep " $break " file.$$
  end
else
  echo 'No cold spots for hypothesis 2.'
endif
echo " "
close:
\rm -f bands.$$ breaks.$$ file.$$ temp.$$ h1hs.temp \
  h1cs.temp h2hs.temp h2cs.temp

# hs.awk (the awk program read by "awk -f hs.awk" in the script above)
BEGIN {
  print \
"                      num.  upper  lower   num.  u.c.l.  l.c.l."
  print \
"                       of   conf.  conf.   per     per     per   hyp. hyp."
  print \
" band      length  breaks   lim.   lim.  length  length  length   1    2"
  print \
"------------------------------------------------------------------------"
#The Poisson confidence limits are added below.
#These are the 99% confidence intervals
  LCL[0] = 0.00000 ; UCL[0] = 5.30
  LCL[1] = 0.00501 ; UCL[1] = 7.43
  LCL[2] = 0.10300 ; UCL[2] = 9.27
  LCL[3] = 0.33800 ; UCL[3] = 10.98
  LCL[4] = 0.67200 ; UCL[4] = 12.59
  LCL[5] = 1.08000 ; UCL[5] = 14.15
  LCL[6] = 1.54000 ; UCL[6] = 15.66
  LCL[7] = 2.04000 ; UCL[7] = 17.13
  LCL[8] = 2.57000 ; UCL[8] = 18.58
#Z-value for the normal confidence limits.
  Z = 2.576 }
NR == 1 {
  T[1] = $1 ; T[2] = $2 ; T[3] = $3
  T[4] = $4 ; T[5] = $5 ; T[6] = $6
  T[7] = $7 ; T[8] = $8 ; T[9] = $9
  D = 2 * (T[3] + (Z * Z))}
NR > 1 {
  if ($5 <= 8) {
    ULIM = UCL[$5]
    LLIM = LCL[$5]
  } else {
    X = Z * sqrt((4 * $5 * T[3] * (T[3] - $5)) + (Z * Z * T[3] * T[3]))
    ULIM = ((2 * $5 * T[3]) + (Z * Z * T[3]) + X) / D
    LLIM = ((2 * $5 * T[3]) + (Z * Z * T[3]) - X) / D
  }
  CPL = $5 / $4 ; ULIMPL = ULIM / $4 ; LLIMPL = LLIM / $4
  if (LLIMPL > T[9]) {
    H1 = "H"
    print $2 >> "h1hs.temp"
  } else if (ULIMPL < T[9]) {
    H1 = "C"
    print $2 >> "h1cs.temp"
  } else H1 = "-"
  if ($3 == "L") {
    if (LLIMPL > T[7]) {
      H2 = "H"
      print $2 >> "h2hs.temp"
    } else if (ULIMPL < T[7]) {
      H2 = "C"
      print $2 >> "h2cs.temp"
    } else H2 = "-"
  } else if ($3 == "D") {
    if (LLIMPL > T[8]) {
      H2 = "H"
      print $2 >> "h2hs.temp"
    } else if (ULIMPL < T[8]) {
      H2 = "C"
      print $2 >> "h2cs.temp"
    } else H2 = "-"
  } else H2 = "-"
  printf \
"%-7s %1s %5.2f %4d %6.2f %6.2f %6.2f %7.2f %7.2f %1s %1s\n",\
    $2, $3, $4, $5, ULIM, LLIM, CPL, ULIMPL, LLIMPL, H1, H2}
END {
  print \
""
  print "Total number of breaks in light bands = "T[1]
  print "Total number of breaks in dark bands = "T[2]
  print "Total number of breaks in all bands = "T[3]
  print "Total length of light bands = "T[4]
  print "Total length of dark bands = "T[5]
  print "Total length of all bands = "T[6]
  print "Expected number per unit length in light bands (Hyp. 2) = "T[7]
  print "Expected number per unit length in dark bands (Hyp. 2) = "T[8]
  print "Expected number per unit length in all bands (Hyp. 1) = "T[9]
  print \
"" }

C.4 Testing for Nonrandomness and Homogeneity

The following program, written in Pascal, tests the overall nonrandomness of breakpoint distributions and the homogeneity between data sets using Pearson's χ² statistic and the log-lin statistic.

PROGRAM HomTest (input,output);
(**********************************************************
 Program to do homogeneity testing of a set of hot spot data
 files. Written in Vax Pascal. For Turbo Pascal, replace the
 data type varying [ ] of char with string [ ], and modify the
 commands to open files. Possibly, the command sngl to convert
 from double to single precision in the function Chisq may
 also need to be changed.
**********************************************************)
CONST
  maxfiles = 10;
  datalines = 320; (* Number of lines read from data files. *)
VAR
  name, exfilename : varying [30] of char;
  results, datafile, exfile : text;
  degfree, Ze, N, i, j : integer;
  X2, G2 : real;
  x : char;
  filenum : 0 .. maxfiles; (* 0 until the first data file is read *)
  breaks : array [1..maxfiles, 1..datalines] of integer;
  expect : array [1..datalines] of real;
  filename : array [1..maxfiles] of varying [30] of char;
  colsum : array [1..maxfiles] of integer;

FUNCTION Norm ( x : double) : double;
(*************************************************************
 Computes the probability function, P(z > x), for the normal
 distribution. Returns 0 for arguments greater than 5.
 Norm(5) = 2.9E-7. Returns 1 for arguments less than -5.
*************************************************************)
CONST
  sqrttwopi = 2.506628274631000502415765D+0;
VAR
  i : integer;
  sum, term : double;
BEGIN (* Norm *)
  IF x > 5 THEN Norm := 0
  ELSE IF x < -5 THEN Norm := 1
  ELSE
  BEGIN
    sum := x;
    term := x;
    i := 2;
    REPEAT
      term := -term * x * x * (i-1) / (i*(i+1));
      sum := sum + term;
      i := i + 2
    UNTIL (sum + term = sum) or (i > 200);
    IF (i > 200) THEN writeln('Norm failed to converge!');
    Norm := 0.5 - (sum / sqrttwopi)
  END
END; (* Norm *)

FUNCTION Chisq (nu : integer ; x : double) : real;
(*********************************************************************
 Calculates the probability in the tail of the Chi squared
 distribution for nu degrees of freedom. Uses approximation 26.4.14
 from Abramowitz and Stegun, which is valid for nu > 30. In this
 range, it seems to be accurate to about 2E-5.
**********************************************************************)
CONST
  third = 0.33333333333333D+0;
BEGIN (* Chisq *)
  Chisq := sngl(Norm( (9*(x*nu*nu)**third - 9*nu + 2) / (3*sqrt(2*nu))))
END; (* Chisq *)

PROCEDURE Skipcolumn;
(*********************************************************************
 This procedure skips over a column in the data file.
*********************************************************************)
VAR
  i : integer;
  x : char;
BEGIN (* Skipcolumn *)
  read(datafile,x);
  WHILE x = ' ' DO read(datafile,x);
  WHILE not (x = ' ') DO read(datafile,x)
END; (* Skipcolumn *)

(************************************************************)
BEGIN (* main Program *)
  filenum := 0; (* Initialize the file number.
*)
  write('Data file: ');
  readln(name);
  WHILE not (name = '') DO
  BEGIN
    filenum := filenum + 1;
    filename[filenum] := name;
    open(datafile,name,readonly);
    reset(datafile);
    REPEAT readln(datafile,x) UNTIL x = '-';
    colsum[filenum] := 0;
    FOR i := 1 to datalines DO
    BEGIN
      Skipcolumn;
      Skipcolumn;
      Skipcolumn;
      readln(datafile,breaks[filenum,i]);
      colsum[filenum] := colsum[filenum] + breaks[filenum,i];
    END;
    close(datafile);
    write('Data file: ');
    readln(name)
  END;
  N := 0;
  FOR i := 1 to filenum DO N := N + colsum[i];
  write('File of expected values: ');
  readln(exfilename);
  IF exfilename = '' THEN
  BEGIN
    Ze := 0;
    FOR j := 1 to datalines DO
    BEGIN
      expect[j] := 0;
      FOR i := 1 to filenum DO expect[j] := expect[j] + breaks[i,j];
      IF expect[j] = 0 THEN Ze := Ze + 1;
      expect[j] := expect[j] / N
    END
  END
  ELSE
  BEGIN
    open(exfile,exfilename,readonly);
    reset(exfile);
    FOR j := 1 to datalines DO readln(exfile,expect[j]);
    close(exfile)
  END;
  X2 := 0;
  G2 := 0;
  FOR j := 1 to datalines DO
    FOR i := 1 to filenum DO
    BEGIN
      (* Calculating the Pearson chi-square statistic.
         Undefined values are treated as zero. *)
      IF not (expect[j] = 0) THEN
        X2 := X2 + (sqr(breaks[i,j]-colsum[i]*expect[j]) / (colsum[i]*expect[j]));
      (* Calculating the log-lin statistic.
         Undefined values are treated as zero. *)
      IF not (breaks[i,j] = 0) THEN
        G2 := G2 + 2 * breaks[i,j] * ln(breaks[i,j] / (colsum[i]*expect[j]))
    END;
  (* Calculate the number of degrees of freedom. *)
  IF exfilename = '' THEN degfree := (filenum - 1) * (datalines - Ze)
  ELSE degfree := filenum * datalines;
  open(results,'homtest.out');
  rewrite(results);
  IF exfilename = '' THEN
    writeln(results,' TESTING FOR HOMOGENEITY OF PROPORTIONS')
  ELSE
  BEGIN
    writeln(results,' CHI-SQUARE TEST OF EXPECTATION VALUES');
    writeln(results);
    writeln(results,'Expectations taken from file: ',exfilename)
  END;
  writeln(results);
  writeln(results,'Data files:');
  FOR i := 1 to filenum DO write(results,filename[i],' ');
  writeln(results);
  writeln(results);
  writeln(results,'total number of breaks = ',N:0);
  IF (exfilename = '') THEN
    writeln(results,'number of zero marginal sums = ',Ze:0);
  writeln(results,'number of degrees of freedom = ',degfree:0);
  writeln(results);
  writeln(results,'Pearson chi-square statistic:');
  writeln(results,'X2 =',X2);
  writeln(results,'Compare to chi-square distribution with ',
          degfree:0,' degrees of freedom.');
  writeln(results,'P(z > X2) =',Chisq(degfree,X2));
  writeln(results);
  writeln(results,'Log-lin statistic:');
  writeln(results,'G2 =',G2);
  writeln(results,'Compare to chi-square distribution with ',
          degfree:0,' degrees of freedom.');
  IF G2 > 0 THEN writeln(results,'P(z > G2) =',Chisq(degfree,G2))
  ELSE writeln(results,'P(z > G2) = 1');
  writeln(results,' ');
  close(results)
END (* main program *) .

References

[1] Aurias, A., Prieur, M., Dutrillaux, B., and Lejeune, J. (1978) Systematic analysis of 95 reciprocal translocations in autosomes. Hum Genet 45:259-282.
[2] Benet, J., Fuster, C., Genesca, Navarro, J., Miro, R., Egozcue, J., Templado, C. (1989) Expression of fragile sites in human sperm and lymphocyte chromosomes. Hum Genet 81:239-242.
[3] Boue, A., and Gallano, P. (1984) A collaborative study of the segregation of inherited chromosome structural rearrangements in 1356 prenatal diagnoses. Prenat Diagn 4: (Special Issue, Spring 1984) 45-67.
[4] Brandriff, B., Gordon, L., Ashworth, L., Watchmaker, G., Moore, D., Wyrobek, A. J., and Carrano, A. V. (1985) Chromosomes of human sperm: variability among normal individuals. Hum Genet 70:18-24.
[5] Buckton, K. E., O'Riordan, M. L., Slight, J., Mitchell, M., McBeath, S., Keay, A.
J., Barr, D., and Short, M. (1980) A G-band study of chromosomes in liveborn infants. Ann Hum Genet 43:227-239.
[6] Burns, J. P., Koduru, P. R. K., Alonso, M. L., and Chaganti, R. S. K. (1986) Analysis of meiotic segregation in a man heterozygous for two reciprocal translocations using the hamster in vitro penetration system. Am J Hum Genet 38:954-964.
[7] Campana, M., Serra, A., and Neri, G. (1985) Role of chromosome aberrations in recurrent abortions: A study of 259 balanced translocations. Am J Med Genet 24:341-356.
[8] Chandley, A. C. (1989) Meiotic studies and fertility in human translocation carriers. In: Daniel, A. (ed.) The Cytogenetics of Mammalian Autosomal Rearrangements. Vol. 8 in Sandberg, A. A. (series ed.) Progress and Topics in Cytogenetics pp:361-382. New York: Alan R. Liss, Inc.
[9] Daniel, A. (1979) Structural differences in reciprocal translocations: potential for a model of risk in reciprocal translocation. Hum Genet 51:171-182.
[10] Daniel, A., Hook, E. B., and Wulf, G. (1988) Collaborative U.S.A. data on prenatal diagnosis for parental carriers of chromosome rearrangements: risks of unbalanced progeny. In: Daniel, A. (ed.) The Cytogenetics of Mammalian Autosomal Rearrangements. Vol. 8 in Sandberg, A. A. (series ed.) Progress and Topics in Cytogenetics pp:73-162. New York: Alan R. Liss, Inc.
[11] Daniel, A., Hook, E. B., and Wulf, G. (1989) Risks of unbalanced progeny at amniocentesis to carriers of chromosome rearrangements: data from United States and Canadian laboratories. Am J Med Genet 31:14-53.
[12] Davis, J. R., and Hagaman, R. M. (1987) Fragile sites are unrelated to reciprocal translocation breakpoints. Clin Genet 31:308-310.
[13] Davis, J. R., Hagaman, R. M., Thies, A. C., and Veomett, I. C. (1985) Balanced reciprocal translocations: risk factors for aneuploid segregate viability. Clin Genet 27:1-19.
[14] Davis, J. R., Rogers, B. B., and Hagaman, R. M.
(1988) Factors influencing viability in reciprocal translocation segregants in man. In: Daniel, A. (ed.) The Cytogenetics of Mammalian Autosomal Rearrangements. Vol. 8 in Sandberg, A. A. (series ed.) Progress and Topics in Cytogenetics pp:419-451. New York: Alan R. Liss, Inc.
[15] Dutrillaux, B. (1979) Chromosomal evolution in primates: Tentative phylogeny from Microcebus murinus (Prosimian) to man. Hum Genet 48:251-314.
[16] De Braekeleer, M. (1985) Fragile sites and chromosome breakpoints in constitutional rearrangements. Clin Genet 27:523-524.
[17] De Braekeleer, M., and Smith, B. (1988) Two methods for measuring nonrandomness of chromosome abnormalities. Ann Hum Genet 52:63-67.
[18] De Braekeleer, M., Smith, B., and Lin, C. C. (1985) Fragile Sites and Structural Rearrangements in Cancer. Hum Genet 69:112-116.
[19] Evans, H. J. (1973) Molecular Architecture of human chromosomes. Br Med Bulletin 29:196-202.
[20] Evans, J. A., Canning, N., Hunter, A. G. W., Martsolf, J. T., Ray, M., Thompson, D. R., and Hamerton, J. L. (1978) A cytogenetic survey of 14,069 newborn infants III. An analysis of the significance and cytologic behaviour of the Robertsonian and reciprocal translocations. Cytogenet Cell Genet 20:96-123.
[21] Feichtinger, W., and Schmid, M. (1989) Increased frequencies of sister chromatid exchanges at common fragile sites (1)(q42) and (19)(q13). Hum Genet 83:145-147.
[22] Ferguson-Smith, M. A., and Yates, J. R. W. (1984) Maternal age specific rates for chromosome aberrations and factors influencing them: report of a collaborative european study on 52 965 amniocenteses. Prenat Diagn 4:5-44.
[23] Fienberg, S. E. (1980) The Analysis of Cross-Classified Categorical Data. Massachusetts: The MIT Press.
[24] Friedman, J. M., Smith, J. P., Lerner, B. N., Helgeson, J. S., Howard-Peebles, P. N., Mize, C. E., Mize, S. G., Singleton, W. L., and Smith, M. E. (1987) ReCAP: the registry of cytogenetic abnormalities and phenylketonuria. Am J Med Genet 27:325-336.
[25] Friedrich, U., and Nielsen, J. (1974) Autosomal reciprocal translocations in newborn children and their relatives. Humangenetik 21:133-144.
[26] Funderburk, S. J., Spence, A. M., and Sparkes, R. S. (1977) Mental retardation associated with "balanced" chromosome rearrangements. Am J Hum Genet 29:136-141.
[27] Glover, T. W., and Stein, C. K. (1987) Induction of sister chromatid exchanges at common fragile sites. Am J Hum Genet 41:882-890.
[28] Glover, T. W., and Stein, C. K. (1988) Chromosome breakage and recombination at fragile sites. Am J Hum Genet 43:265-273.
[29] Hamerton, J. L., Canning, N., Ray, M., and Smith, S. (1975) A cytogenetic survey of 14,069 newborn infants. Clin Genet 8:223-243.
[30] Hecht, F., and Hecht, B. K. (1984) Fragile sites and chromosome breakpoints in constitutional rearrangements I. Amniocentesis. Clin Genet 26:169-173.
[31] Hecht, F., and Hecht, B. K. (1984) Fragile sites and chromosome breakpoints in constitutional rearrangements II. Spontaneous abortions, stillbirths, and newborns. Clin Genet 26:174-177.
[32] Hook, E. B., and Cross, P. K. (1987) Rates of mutant and inherited structural cytogenetic abnormalities detected at amniocentesis: results on about 63 000 fetuses. Ann Hum Genet 51:27-55.
[33] Hook, E. B., Schreinemachers, D. M., Willey, A. M., and Cross, P. K. (1983) Rates of mutant structural chromosome rearrangements in human fetuses: Data from prenatal cytogenetic studies and associations with maternal age and parental mutagen exposure. Am J Hum Genet 35:96-109.
[34] ISCN (1985) An international system for human cytogenetic nomenclature. Cytogenet Cell Genet 40:50-57.
[35] Jacobs, P. A. (1977) Population surveillance: a cytogenetic approach. In: Morton, N. E., and Chung, C. S. (eds) Genetic Epidemiology. New York: Academic Press.
[36] Jacobs, P. A., Melville, M., Ratcliffe, S., Keay, A. J., and Syme, J. (1974) A cytogenetic survey of 11,680 newborn infants. Ann Hum Genet 37:359-376.
[37] Jalbert, P., Jalbert, H., and Sele, B.
(1988) Types of imbalances in human reciprocal translocations: risks at birth. In: Daniel, A. (ed.) The Cytogenetics of Mammalian Autosomal Rearrangements. Vol. 8 in Sandberg, A. A. (series ed.) Progress and Topics in Cytogenetics pp:267-291. New York: Alan R. Liss, Inc.
[38] Jalbert, P., and Sele, B. (1979) Factors predisposing to adjacent 2 and 3:1 disjunctions: study of 161 human reciprocal translocations. J Med Genet 16:467
[39] Jalbert, P., Sele, B., and Jalbert, H. (1980) Reciprocal translocations: a way to predict the mode of imbalanced segregation by pachytene-diagram drawing. Hum Genet 55:209-222.
[40] Kaiser, P. (1988) Pericentric Inversions: Their problems and clinical significance. In: Daniel, A. (ed.) The Cytogenetics of Mammalian Autosomal Rearrangements. Vol. 8 in Sandberg, A. A. (series ed.) Progress and Topics in Cytogenetics pp:163-247. New York: Alan R. Liss, Inc.
[41] Larsen, R. J., and Marx, M. L. (1981) An Introduction to Mathematical Statistics and Its Applications. Englewood Cliffs, New Jersey: Prentice-Hall.
[42] Lin, C. C., Gedeon, M. M., Griffith, P., Newton, D. R., Wilkie, L., and Sewell, L. M. (1976) Chromosome analysis on 930 consecutive newborn children using quinacrine fluorescent banding technique. Hum Genet 31:315-328.
[43] Madan, K. (1988) Paracentric inversions and their clinical implications. In: Daniel, A. (ed.) The Cytogenetics of Mammalian Autosomal Rearrangements. Vol. 8 in Sandberg, A. A. (series ed.) Progress and Topics in Cytogenetics pp:249-266. New York: Alan R. Liss, Inc.
[44] Martin, R. H. (1988) Abnormal spermatozoa in human translocation and inversion carriers. In: Daniel, A. (ed.) The Cytogenetics of Mammalian Autosomal Rearrangements. Vol. 8 in Sandberg, A. A. (series ed.) Progress and Topics in Cytogenetics pp:397-417. New York: Alan R. Liss, Inc.
[45] Martin, R. H., Rademaker, A. W., Hildebrand, K., Long-Simpson, L., Peterson, D., and Yamamoto, J.
(1987) Variation in the frequency and type of sperm chromosome abnormalities among normal men. Hum Genet 77:108-114. [46] Maserati, E., Pasquali, F., and Peretti, D. (1986) Different break-points in Philadelphia chromosome variant translocations and in constitutional and sporadic translocations. Ann Hum Genet 50:153-162. [47] Mattei, M. G., Souiah, N., and Mattei, J. F. (1979) Distribution of spontaneous chromosome breaks in man. Cytogenet Cell Genet 23:95-102. [48] Mendenhall, W., Schaeffer, R. L., & Wackerly, D. D. (1986) Mathematical with Applications. 3rd edition. Boston: Duxbury Press.  Statistics  [49] Nielsen, J., and Rasmussen, K. (1976) Distribution of breakpoints in reciprocal translocations in children ascertained in population studies. Hereditas 82:73-77. [50] Nielsen, J., and Sillesen, I. (1975) Incidence of chromosome aberrations among 11 148 newborn children. Humangenetik 30:1-12.  References  143  [51] Palmer, C. G. (1981) Are there "hot spots" on the human genome? Evidence from breakpoint analysis in a collaborative study of germinal chromosome rearrangement. In Population and Biological Aspects of Human Mutation (ed E. B. Hook and I. H. Porter), pp. 147-165. New York: Academic Press. [52] Pearson, E. S., and Hartley, H. 0. eds. (1966) Biometrika vol. 1. Cambridge: Cambridge University Press.  Tables  for  Statisticians  [53] Porfirio, B., Dallapiccola, B., and Terrenato, L. (1987) Breakpoint distribution in constitutional chromosome rearrangements with respect to fragile sites. Ann Hum Genet 51:329-336. [54] Runyon, R. P. (1985) Fundamentals of Statistics Health Sciences. Boston: Duxbury Press.  in  the  Biological,  Medical,  and  [55] Savage, J. R. K. (1977) Assignment of aberration breakpoints in banded chromosomes. Nature 270:513-514. [56] Schinzel, A. (1984) Catalogue Berlin:Walter de Gruyter.  of  Unbalanced  Chromosome  Aberrations  in  Man.  [57] Schinzel, A. 
(1988) Phenotype in autosomal chromosome aberrations: distinctiveness, variability, and karyotype correlations. In: Daniel, A. (ed.) The Cytogenetics of Mammalian Autosomal Rearrangements. Vol. 8 in Sandberg, A. A. (series ed.) Progress and Topics in Cytogenetics pp:725-738. New York: Alan R. Liss, Inc. [58] Schwartz, S., Palmer, C. G., and Yu, P.-L. (1982) Evaluation of factors differentiating translocations ascertained in couples with fetal wastage and translocations ascertained through an unbalanced carrier. Am J Hum Genet 34:142A [59] Schwartz, S., Palmer, C. G., Yu, P.-L., Boughman, J. A., and Cohen, M. M. (1986) Analysis of translocations observed in three different populations. I. Reciprocal translocations. Cytogenet Cell Genet 42:42-52. [60] Schwartz, S., Palmer, C. G., Yu, P.-L., Boughman, J. A. and Cohen, M. M. (1986) Analysis of translocations observed in three different populations. II. Robertsonian translocations. Cytogenet Cell Genet 42:53-56. [61] Searle, J. B. (1988) Selection and robertsonian variation in nature: the case of the common shrew. In: Daniel, A. (ed.) The Cytogenetics of Mammalian Autosomal Rearrangements. Vol. 8 in Sandberg, A. A. (series ed.) Progress and Topics in Cytogenetics pp:507-531 New York: Alan R. Liss, Inc. [62] Smith, C. A. B. (1986), Chi-squared test with small numbers. Ann Hum Genet. 50:163-167.  144  References  [63] Stene, J. and Stengel-Rutkowski, S. (1988) Genetic risks of familial reciprocal and Robertsonian translocation carriers. In: Daniel, A. (ed.) The Cytogenetics of Mammalian Autosomal Rearrangements. Vol. 8 in Sandberg, A. A. (series ed.) Progress and Topics in Cytogenetics pp:3-72. New York: Alan R. Liss, Inc. [64] Stoll, C. (1980) Nonrandom distribution of exchange points in patients with reciprocal translocations. Hum Genet 56:89-93. [65] Sturtevant, A. H., and Beadle, G. W. (1936) The relation of inversion in the X chromosome of Drosophila Melanogaster to crossing over and disjunction. 
Genetics 21:554-604. [66] Sutherland, G. R., and Hecht, F. (1985) Fragile Sites on Human Chromosomes. Oxford Monographs on Medical Genetics, No. 13. New York: Oxford University Press. [67] Sutherland, G. R., and Ledbetter, D. H. (1989) Report of the committee on cytogenetic markers. Human Gene Mapping 10. Cytogenet Cell Genet 51:452-458 [68] Sutton, H. E. (1988) Human  Cytogenetics.  Harcourt Brace Jovanich, Inc.  [69] Therman, E., Susman, B., and Denniston, C. (1989) The nonrandom participation of human acrocentric chromosomes in Robertsonian translocations. Ann Hum Genet 53:40-65. [70] Vasarhelyi, K., and Friedman, J. M. (1989) Analysing rearrangement breakpoint distributions by means of binomial confidence intervals. Ann Hum Genet 53:375380. [71] Warburton, D. (1984) Outcome of cases of de novo structural rearrangements diagnosed at amniocentesis. Prenat Diagn 4:69-70. [72] Yu, C. W., Borgaonkar, D. S., Boiling, D. R. (1978) Break points in human chromosomes. Hum Hered 28:210-225. [73] Yunis, J. J., and Soreng, A. L. (1984) Constitutive fragile sites and cancer. Science 226:1199-1204.  
