UBC Theses and Dissertations


Statistical study of human constitutional chromosome rearrangement breakpoint distributions Vásárhelyi, Krisztina 1990

STATISTICAL STUDY OF HUMAN CONSTITUTIONAL CHROMOSOME REARRANGEMENT BREAKPOINT DISTRIBUTIONS

By Krisztina Vasarhelyi
B. Sc. (Biology) University of British Columbia

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES, GENETICS

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
September 1990
© Krisztina Vasarhelyi, 1990

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Genetics
The University of British Columbia
2075 Wesbrook Place
Vancouver, Canada V6T 1W5
Date:

Abstract

In this study the question of nonrandomness in the distribution of human constitutional rearrangements was evaluated. The distribution of breakpoints was analysed in three groups of reciprocal translocations and three groups of inversions, subdivided according to the method of ascertainment of cases for study. In addition, one data set of structural aberrations obtained from sperm chromosomes was analysed. The method of statistical analysis, based on the binomial distribution, was developed specifically to allow testing of distributions in chromosome segments as small as chromosome bands. The distribution of breakpoints was analysed in all data sets using this method, in addition to testing for overall nonrandomness using goodness of fit statistics.

Nonrandomness in breakpoint distributions was found in reciprocal translocations (rep) and inversions ascertained through abnormalities and through incidental events. However, random distribution was observed in incidentally ascertained de novo rearrangements as well as in sperm chromosome aberrations. The nonrandomness in the distribution of rep breakpoints can be largely attributed to a bias in ascertainment of cases based on the phenotypic manifestations of chromosomal imbalance resulting from a rearrangement. A dependence of the probability of producing specific types of balanced or unbalanced progeny on the position of breakpoints is a likely explanation for the nonrandomness produced in breakpoint distributions. However, some bands, including 5q35, 7p22, 9p22, 13q14, and 17q25, showed nonrandom involvement in more than one ascertainment group, which makes selection bias an unlikely explanation for this observation. These bands may represent true sites of nonrandom rearrangement due to some factor associated with an underlying DNA sequence or structural characteristic of chromatin that predisposes to rearrangement at specific sites.

The nonrandomness observed in the distribution of inversion breakpoints is most likely the product of a founder effect. Many identical inversions have been found in apparently unrelated individuals, suggesting that a few ancestral mutations have become widespread in the population. A large data set of incidentally ascertained de novo inversions is required to distinguish between sites of frequent breakage and nonrandomness produced by the ascertainment of related cases.

All evidence considered together, indisputable predisposition to rearrangement at specific sites was not found in this study. Furthermore, an overall random association of constitutional rearrangement breakpoints in bands with known oncogenes and fragile sites was observed.
However, the possibility of oncogenes and fragile sites as factors involved in constitutional rearrangements in a few isolated cases cannot be excluded. Nonrandomness was found when the distribution of breakpoints in light and dark G bands was compared. An excess of breakpoints in some light G bands was observed even after a conservative correction for a possible pattern recognition bias which may lead to the overascertainment of breakpoints in light G bands.

Table of Contents

Abstract
List of Tables
List of Figures
Acknowledgement
1 Introduction
  1.1 General Introduction
    1.1.1 Human Chromosome Rearrangements
    1.1.2 Meiotic Segregation in Rep and Inversion Heterozygotes
  1.2 Nonrandomness in Constitutional Rearrangements
    1.2.1 Review of Breakpoint Distribution Studies
    1.2.2 Approach to Testing for Nonrandomness in the Present Study
    1.2.3 A Note on "Hot Spots"
2 Data Sources and Methods
  2.1 Overview of Data Analysis
  2.2 Sources of Data
  2.3 Organization of Data in Ascertainment Groups
    2.3.1 Ascertainment of Rearrangements in the Original Studies
    2.3.2 Definition of Ascertainment Groups Used for Classification of Rearrangements in the Present Study
  2.4 Elimination of Duplicate Cases
  2.5 Statistical Analysis
    2.5.1 Mathematical Model of Chromosome Breakage
    2.5.2 Hypotheses of Random Breakage
    2.5.3 Binomial Confidence Limits
    2.5.4 Testing for Nonrandomness Using Binomial Confidence Limits
    2.5.5 Comparison of Results Between Data Sets
    2.5.6 Summary of Statistical Analysis
3 Results
  3.1 Reciprocal Translocations
    3.1.1 Rep Ascertained Through Abnormalities (A1)
    3.1.2 Incidentally Ascertained Balanced Rep (A2)
    3.1.3 Incidentally Ascertained De Novo Rep (A3)
    3.1.4 Comparison of Results: Rep Ascertained Through Abnormalities (A1) and Rep Ascertained Incidentally (A2 & A3)
    3.1.5 Distribution of Rep Breakpoints in Dark and Light Bands
    3.1.6 Association of Rep Breakpoints with Fragile Sites and Oncogenes
  3.2 Inversions
    3.2.1 Inversions Ascertained Through Abnormalities (B1)
    3.2.2 Incidentally Ascertained Balanced Inversions (B2)
    3.2.3 Incidentally Ascertained De Novo Inversions (B3)
    3.2.4 Comparison of Results: Inversions Ascertained Through Abnormalities (B1) and Inversions Ascertained Incidentally (B2 & B3)
    3.2.5 Distribution of Inversion Breakpoints in Dark and Light Bands
    3.2.6 Association of Inversion Breakpoints with Fragile Sites and Oncogenes
  3.3 Sperm Chromosome Aberrations
  3.4 Comparison of Results: Rep (Group A), Inversions (Group B), and Sperm Chromosome Aberrations (Group C)
4 Discussion and Conclusions
  4.1 Distribution of Breakpoints in Constitutional Rearrangements
    4.1.1 The Effects of Ascertainment Bias
    4.1.2 The Effects of Chance Fluctuations
    4.1.3 Candidate Sites of True Nonrandom Involvement
  4.2 Evidence for Random Breakage
  4.3 Distribution of Breakpoints in Light and Dark Bands
  4.4 Coincidence with Fragile Sites and Oncogenes
  4.5 Improvements for Future Analysis
  4.6 Conclusions
Appendices
A Rearrangements Associated With Bands of Frequent Breakage
  A.1 Lists of Rep Associated with Bands of Frequent Breakage
  A.2 Lists of Inversions Associated with Bands of Frequent Breakage
B Band Measurements
  B.1 Band Measurements at 320 Band Resolution
  B.2 Band Measurements for Sperm Chromosomes
C Computer Programs
  C.1 Checking for Invalid Bands
  C.2 Checking for Duplicate Rearrangements
  C.3 Statistical Analysis Using Binomial Confidence Limits
  C.4 Testing for Nonrandomness and Homogeneity
References

List of Tables

1.1 Summary of Previous Studies of Rep
1.2 Summary of Previous Studies of Pooled Structural Rearrangements
2.1 Ascertainment Groups in the Original Studies for Rep Detected Through Abnormalities
2.2 Ascertainment Groups in the Original Studies for Inversions Ascertained Through Abnormalities
2.3 Ascertainment Groups in the Original Studies for Rep Detected Incidentally
2.4 Ascertainment Groups in the Original Studies for Inversions Ascertained Incidentally
2.5 Sources of Data
3.1 Tests for Nonrandomness in Rep Ascertained Through Abnormalities (Group A1)
3.2 Hot Spot Bands in Rep Ascertained Through Abnormalities (Group A1)
3.3 Cold Spots in Data Set a
3.4 Tests for Nonrandomness in Rep Ascertained Incidentally (Group A2)
3.5 Hot Spot Bands in Rep Ascertained Incidentally (Group A2)
3.6 Hot Spot Bands in Pooled Rep Data
3.7 Cold Spots in Rep Ascertained Through Abnormalities (A1)
3.8 Distribution of Rep Breakpoints in Dark and Light G Bands
3.9 Tests for Nonrandomness in Inversions Ascertained Through Abnormalities
3.10 Hot Spot Bands in Inversions Ascertained Through Abnormalities (Group B1)
3.11 Tests for Nonrandomness in Inversions Ascertained Incidentally
3.12 Hot Spot Bands in Inversions Ascertained Incidentally (Group B2)
3.13 Hot Spot Bands in Pooled Inversion Data
3.14 Distribution of Inversion Breakpoints in Dark and Light G Bands
3.15 Hot Spot Bands in Sperm Chromosome Aberrations

List of Figures

3.1 Distribution of Rep Breakpoints Ascertained Through Abnormalities (Group A1)
3.2 Distribution of Rep Breakpoints Ascertained Incidentally (Group A2)
3.3 Distribution of Inversion Breakpoints Ascertained Through Abnormalities (Group B1)
3.4 Distribution of Inversion Breakpoints Ascertained Incidentally (Group B2)

Acknowledgement

I would like to give special thanks to my supervisor, Dr. J. M.
Friedman, who helped make the past two years an interesting and rewarding experience by providing all the support and expertise I needed, while always encouraging me to think independently. I would also like to thank the members of my committee for their valuable comments, Dr. Roy Douglas and Dr. Glen Cooper for consultations regarding the statistical analysis, and many members of the Physics Department at ETH (Zurich, Switzerland) for access to their facilities and expert advice on computing problems.

I would like to thank my husband, Sandy Rutherford, for writing the programs in Appendix C, and for valuable discussions throughout the project. I am especially grateful for his love and unfailing interest in my work.

I would like to express my appreciation to Lynn Bernard for her interest and friendship, and my family and all my friends for their patience and support.

Chapter 1

Introduction

1.1 General Introduction

This thesis describes a statistical study of human constitutional chromosome rearrangement data to determine whether breakpoints in two types of rearrangements, reciprocal translocations (rep) and inversions, are distributed in any specific nonrandom fashion on the chromosomes. As described in the following, rearrangements may lead to imbalances of genetic material in an individual, which can have significant detrimental manifestations, such as mental retardation and other major and minor congenital defects [57].

Definition and subsequent characterization of sites of nonrandom involvement may represent an initial step in the identification of the factors involved in the production of constitutional rearrangements. For this purpose, we attempted to assemble data sets of rearrangements which are as representative of a random sample of constitutional rep or inversions as possible, and tested for nonrandomness in the distribution of breakpoints.
Furthermore, we tested for associations of these points of nonrandom involvement with known fragile sites and oncogenes in order to test the hypotheses that fragile sites and oncogenes have a role in the generation of constitutional rearrangements.

The following sections give a general introduction to human constitutional rearrangements and to the aberrant segregation processes at meiosis, which ultimately affect the ascertainment of rearrangements for breakpoint distribution studies. Section 1.2 is dedicated to a more specific introduction to breakpoint distribution analysis as well as a brief review of previous work on the subject.

1.1.1 Human Chromosome Rearrangements

In rare instances one or more human chromosomes undergo structural rearrangement in a germline cell of an individual. These rearrangements may involve breakage of chromosome arms at one or more sites and the subsequent loss of the separated fragment or its reattachment in an inappropriate position or orientation.

A wide variety of chromosome rearrangements have been observed in humans. In the simplest form, a simple deletion, breakage of a chromosome arm is followed by loss of a chromosome fragment. More than a single break on one chromosome may lead to interstitial deletion, inversion, insertion, or duplication of chromosome material, or the formation of a ring chromosome. Breakages involving two chromosomes may result in insertion of an interstitial segment of one chromosome at a breakpoint on another chromosome, or in exchanges of chromosome fragments in reciprocal translocations. Occasionally complex rearrangements are observed involving 3 or more chromosomes. In this project reciprocal translocations and inversions were studied; both involve 2 breakpoints, located on two different chromosomes and on a single chromosome, respectively.

Reciprocal translocation is the most common form of structural chromosome abnormality in humans.
In a study of pooled data from surveys of 59,452 consecutively born infants, 52 cases of balanced rep were found [35]. This translates to a frequency of 0.87 in 1000 live births. Rep are formed subsequent to breakage of two nonhomologous chromosomes followed by the exchange and reunion of end fragments. The two new chromosomes are referred to as derivatives (der) of the original chromosomes, named according to the centromeric fragment retained in the exchange [34]. A special type of rep, Robertsonian translocation, involves breakpoints at the centromeres of two of the five pairs of acrocentric chromosomes (13, 14, 15, 21, or 22) followed by an exchange resulting in the union of two long and two short arms. The tiny chromosome formed by the union of short arms is frequently lost in subsequent cell divisions. Robertsonian translocations are known to be important in chromosome evolution and they may have a role in speciation [61]. Since Robertsonian translocations are nonrandomly restricted to centric fusions of the acrocentric chromosomes, they were not included in this study. The issue of nonrandomness in these rearrangements has been examined by other investigators (for example [60] [69]).

Inversions are less common than rep in liveborn infants. The frequency of inversions in newborn survey data was found to be 0.15 in 1000 live births [35]. Inversions may be produced if an interstitial segment defined by two breakpoints on the same chromosome is inverted 180° and the breakpoints are subsequently repaired in the reversed orientation. In pericentric inversions the breakpoints are on different arms of the chromosome, so that the inverted segment includes the centromere. As a result, the chromosome arm ratio is often altered. The arm ratio is unaffected in paracentric inversions because both breakpoints are located on the same chromosome arm, leaving centromere position unchanged [68].
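
The frequency quoted above for balanced rep follows directly from the survey counts. The short sketch below reproduces that arithmetic and, purely as an illustration, attaches an approximate 95% interval to it. The Wilson score interval is our choice here; nothing in the text indicates that the cited survey [35] used it, and the function names are hypothetical.

```python
import math


def rate_per_thousand(cases, births):
    """Frequency of a rearrangement expressed per 1000 live births."""
    return 1000.0 * cases / births


def wilson_interval(cases, births, z=1.96):
    """Approximate 95% Wilson score interval for a proportion.

    Illustrative assumption: the source reports only the point estimate.
    """
    p = cases / births
    denom = 1.0 + z * z / births
    centre = (p + z * z / (2.0 * births)) / denom
    half = z * math.sqrt(p * (1.0 - p) / births + z * z / (4.0 * births ** 2)) / denom
    return centre - half, centre + half


# Counts quoted in the text: 52 balanced rep among 59,452 consecutive newborns.
rate = rate_per_thousand(52, 59452)
low, high = wilson_interval(52, 59452)
print(f"{rate:.2f} per 1000 live births")
print(f"approximate 95% interval: {1000 * low:.2f} to {1000 * high:.2f} per 1000")
```

The interval is only meant to convey how much sampling uncertainty a count of 52 cases carries; it plays no part in the analyses described in this thesis.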

Rep and inversions in their balanced forms are compatible with a normal phenotype. Rearranged chromosomes may be passed on to an offspring and become constitutional parts of all cells in that individual. The rearrangements are transmitted undisturbed in mitosis. However, meiotic segregation of chromosomes in both balanced rep and inversion heterozygotes can lead to the production of unbalanced gametes with duplications and deletions of specific chromosome segments. The phenotypic consequences of chromosomal imbalance can range from lethality of sperm or ovum to abnormalities in a liveborn offspring. This variability in the influence of chromosomal imbalances on the viability and phenotype of carriers is of direct concern in the selection of a random sample of cases for the study of nonrandomness in breakpoint distributions. Therefore, the principles of how imbalances arise in rep and inversions are considered in the following section, prior to a discussion of ascertainment bias in section 1.2.2.

1.1.2 Meiotic Segregation in Rep and Inversion Heterozygotes

Rearrangement of chromosomes through reciprocal translocation and inversion leads to problems in homologous pairing in meiosis. Abnormal pairing can lead to abnormal segregation resulting in chromosomally unbalanced gametes.

Meiotic Segregation in Balanced Rep Carriers

Meiotic pairing in a balanced rep carrier is possible only through a quadriradial configuration involving both derivative chromosomes and their normal counterparts (see figure 9.7 in [68]). This quadriradial structure of 4 chromosomes may undergo 2:2, 3:1, or 4:0 disjunction to produce various combinations of the 4 chromosomes at the two poles. In alternate segregation, chromosomes positioned diagonally in the quadriradial structure move to opposite poles. This type of segregation results in balanced offspring with either two normal or two derivative chromosomes (see figure 14 in [63]).
The other 2:2 segregation types always lead to imbalances of chromosome material. In adjacent-1 segregation neighbouring nonhomologous centromeres move to the same pole, and in adjacent-2 segregation homologous centromeres move to the same pole. In both cases, one derivative and one normal chromosome segregate together, resulting in combined duplications and deficiencies of specific chromosome segments defined by the rep breakpoints. In 3:1 disjunction, trisomies of specific chromosome segments are produced through either segregation of 2 normal chromosomes together with one derivative (tertiary trisomy), or through segregation of 2 derivative chromosomes with one of the normal chromosomes (interchange trisomy). In liveborn offspring, the most common segregation type is adjacent-1, 3:1 is less frequent, and adjacent-2 is rarely seen [38] [37]. The same order in the frequency of segregation types was observed in sperm [44]. The 4:0 disjunction type has never been seen in live births or abortions, but it was observed in one case in the sperm of a carrier of two balanced rep [6]. In addition to the segregation outcomes described here, further variations are possible if chiasma formation and crossing over take place in the quadriradial structure.

Imbalances of chromosome material arising through aberrant segregation have a direct influence on the viability potential of the resulting segregation product, and that in turn affects the probability that the rearrangement is detected in a given population, such as abnormal live births or a survey of normal individuals. This has direct relevance to studies of breakpoint distributions, which require that the rearrangements studied represent a random sample of all possible rep that are formed. Therefore, it is important to understand the factors determining viability potentials of unbalanced rep.
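
The segregation types described above can be enumerated mechanically. The sketch below is an illustration only: it labels the two normal chromosomes A and B and their derivatives der(A) and der(B) (names chosen here, with each derivative carrying the centromere of the chromosome it is named after), lists every way the four chromosomes of the quadriradial can be divided between two poles, and names each division using the definitions given in the text. Crossing over within the quadriradial is ignored.

```python
from collections import Counter
from itertools import combinations

# (name, centromere origin, is_derivative); the labels are hypothetical.
CHROMOSOMES = [
    ("A", "A", False),
    ("B", "B", False),
    ("der(A)", "A", True),
    ("der(B)", "B", True),
]


def classify(pole):
    """Name the segregation type from the chromosomes that reach one pole."""
    size = len(pole)
    n_der = sum(1 for _, _, is_der in pole if is_der)
    if size in (0, 4):
        return "4:0"
    if size in (1, 3):
        # Count derivatives at the pole that receives three chromosomes.
        der_in_three = n_der if size == 3 else 2 - n_der
        if der_in_three == 1:
            return "3:1 tertiary trisomy"
        return "3:1 interchange trisomy"
    if n_der != 1:
        # Two normal chromosomes, or two derivatives, travel together.
        return "alternate"
    centromeres = {cen for _, cen, _ in pole}
    # One normal plus one derivative: nonhomologous centromeres together is
    # adjacent-1, homologous centromeres together is adjacent-2.
    return "adjacent-1" if len(centromeres) == 2 else "adjacent-2"


def enumerate_segregations():
    """Return {names at one pole: segregation type} for each distinct division."""
    result = {}
    for size in range(0, 3):
        for pole in combinations(CHROMOSOMES, size):
            # A 2:2 division is counted once by requiring chromosome A in the pole.
            if size == 2 and CHROMOSOMES[0] not in pole:
                continue
            result[tuple(name for name, _, _ in pole)] = classify(pole)
    return result


for pole, kind in enumerate_segregations().items():
    print(pole, "->", kind)
print(Counter(enumerate_segregations().values()))
```

Of the eight distinct divisions, only the single alternate one yields balanced gametes, which is why the relative frequency and viability of the remaining types matter for the ascertainment of cases.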

There is extensive evidence that chromosomal imbalance leads to abnormalities, but which characteristics of an imbalance determine the extent of abnormality is debated. Intuitively, larger chromosomal imbalance is expected to lead to greater abnormalities and is thought to be less likely to result in live birth. Instead, large imbalances are expected to lead to conceptuses that fail to implant or that are aborted at various stages of gestation. In studies of rep ascertained through live births and abortions, the size of the imbalance was found on average to be smaller in the former than in the latter group [1] [3] [9] [13] [59]. However, this relationship is not a simple correlation. Specific chromosome regions have been observed to have decreased tolerance to imbalances compared to other regions, indicating that genetic content is an important factor [20] [39]. In fact, in one study, the average size of imbalance was found not to be significantly different for rep ascertained through an abnormality and rep ascertained through recurrent abortions [10].

Additional factors that possibly contribute to viability potentials of unbalanced rep, through physically affecting meiotic segregation or through imposing selective forces, include breakpoint and centromere position, comparative sizes of interstitial segments, chiasma frequency and position, formation of chain or ring multivalents, and comparative sizes of the chromosomes involved (see [9] for details). The parental origin of unbalanced segments may also affect viability potentials of segregation products through a genomic imprinting mechanism.

Meiotic Segregation in Balanced Inversion Carriers

In contrast to rep, unbalanced gametes in inversion carriers are produced only when crossing over occurs in the inverted segment. The probability of producing unbalanced chromosomes is dependent on the likelihood of pairing between an inversion chromosome and its normal homologue [40].
The likelihood of pairing is in turn dependent on the length of the inversion segment. Very small inversions do not pair in the inverted segment and therefore recombination cannot take place [43]. Crossover suppression in the inversion region in the Drosophila male with no effect on fertility is a well known example of a comparable process [65]. Inversions of intermediate size achieve pairing and chiasma formation through a loop configuration of the homologues. In the loop the inversion segment lines up with the homologous segment in the correct orientation. Loop formation does not occur in very large inversions with breakpoints at opposite ends, near the telomeres. In such cases the inversion segment pairs and pairing is absent in the two end segments [40]. An uneven number of crossovers in the inverted segment produces unbalanced chromosomes. A single crossover in a paracentric inversion loop results in a dicentric chromosome and an acentric fragment, with duplications and deficiencies of specific segments. The uninvolved chromatids produce balanced chromosomes, one normal and one with the inversion. In pericentric inversions, a single crossover produces two unbalanced and two balanced chromosomes, all monocentric. Additional crossover events may produce additional imbalances, such as 100% unbalanced products in a 4 strand double crossover event [43]. Crossovers outside the inversion do not produce imbalances.

As in the case of rep, the extent of imbalance in recombinant chromosomes determines the risk of a liveborn abnormal offspring to an inversion heterozygote. In general, larger imbalances are less compatible with live birth. Furthermore, unbalanced offspring may be aborted at various stages of gestation, or prior to implantation in some cases. An imbalance may also be disruptive to the meiotic process, preventing the production of gametes [40].
In pericentric inversions recombination in a small inversion segment produces large imbalances (see Figure 7 in [40]). In paracentric inversions the acentric fragment produced by crossing over in an inversion loop is often lost, leading to deficiencies of the corresponding chromosome segments. The dicentric chromosome may break at anaphase and the resulting imbalance is determined by the breakpoint. In general, paracentric inversions that are large enough to produce pairing loops but involve only a small portion of the chromosome produce large imbalances.

1.2 Nonrandomness in Constitutional Rearrangements

Constitutional rearrangements may arise through random breakage of chromosomes at two or more sites followed by repair of breakpoints joining the wrong fragments or the original fragment in the wrong orientation. Alternatively, the process of chromosome breakage, the union of fragments, or both these processes may be nonrandom. There may be sites on the genome that are predisposed to breakage due to some characteristic of the DNA sequence or of the higher order chromatin structure. Random breakage together with preferential participation of specific combinations of breakpoints in rearrangements, due perhaps to sequence similarities, would also lead to nonrandomness in the distribution of rearrangement breakpoints.

The development of chromosome banding techniques has sparked an interest in defining specific sites of nonrandom breakage in constitutional rearrangements. Similar efforts in cancer rearrangements resulted in the association of several rearrangements with certain types of malignancies. In constitutional rearrangements, however, the question of nonrandomness has been difficult to resolve. The problems primarily involve the selection of an appropriate study population.
In addition, conventional methods of statistical analysis proved inadequate for dealing with the precise localization of sites of nonrandomness allowed by the improvements in banding techniques. In the following, previous studies on the subject are briefly reviewed and the approach employed in the present study is outlined.

1.2.1 Review of Breakpoint Distribution Studies

The need to deal with problems of ascertainment bias in order to assess properly the question of nonrandom breakage has been recognized since early studies of breakpoint distribution analysis [36] [51]. Consequently, various approaches have been employed with respect to subdivision of data according to modes of ascertainment, and types of rearrangements included in the data. The most common criterion for subdivision of data was according to ascertainment through unbalanced carriers, balanced carriers, or recurrent abortions. However, the exact definitions of these groups were not identical in each study, probably contributing to the variation observed in the results. The majority of studies involved reciprocal translocations, as it is the most common form of structural rearrangement [35]. In some studies, all rearrangements involving breakage were analyzed as a single group. Very few studies on the distribution of inversion breakpoints have been done. The methods and results of previous studies are summarized in table 1.1 for reciprocal translocations and in table 1.2 for all structural rearrangements.

Table 1.1: Summary of results and methods from previous studies of breakpoint distributions in reciprocal translocations.

Rep Ascertained Through Unbalanced Carrier
Reference | # of Breaks | Excess | Deficit | Statistics
Schwartz et al., 1986 | 326 | 9p, 14p, 18p | none | z test

Rep Ascertained Through Balanced Carrier
Reference | # of Breaks | Excess | Deficit | Statistics
Jacobs et al., 1974 | 84 | 11q | none | by observation
Aurias et al., 1978 | 106 | 4p, 9p, 10q, 21q, 22q | 1p, 2p, 6q | χ² test
Davis et al., 1985 | 420 | 9, 21, 22 | 1, 2, 3, 6, 8, 17, 19 | plot of 95% confidence interval
Schwartz et al., 1986 | 292 | 11q | none | z test

Rep Ascertained Through Recurrent Abortions
Reference | # of Breaks | Excess | Deficit | Statistics
Campana et al., 1985 | 312 | 6, 7, 22 | 12 | χ² test
Davis et al., 1985 | 190 | none | none | plot of 95% confidence interval
Schwartz et al., 1986 | 256 | 7p, 17p, 22p | none | z test

Rep Ascertained Through Various Means
Reference | # of Breaks | Excess | Deficit | Statistics
Stoll, 1980 | 770 | 4p, 9p, 9q, 13q, 18q, 21p, 22p | 1p, 1q, 3p, 3q, 5q, 6q, 7p, 12p, 16p, Yp, Yq, Xp, Xq | analysis of χ² components
Palmer, 1981 | 353 | 9, 18, 21 | 1, 19 | χ² test
Maserati et al., 1986 | 213 | 1q11, 2p13, 2p21, 9p22, 11q23, 15q11, 15q13, 18q11, 18q23, 20q11, 21q22, 22q11, Xq22 | not reported | χ² test for small numbers (see [62])

Table 1.2: Summary of results and methods from previous studies of breakpoint distributions in pooled data on structural rearrangements.

Structural Rearrangements Ascertained Through Various Means
Reference | # of Breaks | Excess | Deficit | Statistics
Yu et al., 1978 | 1134 | 9, 13, 18, 21, 22, Y | 2, 3, 6, 16, 19, 20 | regression analysis
Palmer et al., 1981 | 615 | 18, 21, X | 1, 19 | χ² test
Porfirio et al., 1987 | 6391 | 2q13, 3p25, 5p15, 5p13, 5p11, 9p24, 9p22, 9p11, 9q11, 9q13, 11q13, 11q25, 13q14, 15q11, 18p11, 18q21, 21q11, 21q22, 22q11, 22q13, Xp22, Xp11, Yp11 | not reported | regression analysis

Several observations can be made in comparing previous studies: (1) Nonrandomness in the distribution of breakpoints is repeatedly observed. The exception is rearrangements detected in surveys of consecutive newborns [49] (not shown in tables), where random breakage was found.
(2) The level of agreement with respect to chromosomes frequently involved in breakage varies according to ascertainment group as well as between studies utilizing similar ascertainment criteria. (3) A number of chromosomes with excess breakpoints, including 9, 18, 21 and 22, are repeatedly observed in different studies, in some cases independent of mode of ascertainment and rearrangement type. (4) Results are reported at various levels of resolution ranging from whole chromosomes to chromosome bands at the 320 band level. This reflects the limited ability of the statistical methods used to detect deviations from randomness in small chromosome segments.

The variation in results between studies utilizing similar ascertainment criteria may be attributed to differences in the interpretation of specific ascertainment groups. For example, balanced carriers have been considered suitable for breakpoint distribution analyses because such individuals carry the full chromosome complement and do not usually have phenotypic manifestations of the rearrangement. However, balanced carriers can be ascertained in various situations, including surveys of specific populations [36], prenatal diagnosis [59], or referral for cytogenetic study because of recurrent abortions [1] [13], possibly introducing systematic bias in the selection of cases for study. The variation observed between studies using different ascertainment criteria is likely, at least in part, to be the product of this difference in case selection.

Although advances in chromosome banding techniques allowed precise localization of breakpoints in chromosome bands, conventional statistical methods were frequently inadequate to test for significant deviations from randomness at that level (see tables 1.1 and 1.2).
Sites of frequent breakage were identified by simple observation [36], or by other tests not adequate to detect nonrandomness at the level of chromosome bands, where the expected values are often very small [1] [7] [51] [59]. The χ² test is frequently used in testing for overall deviations from randomness, but its sensitivity deteriorates as the number of classes increases [62], as is the case in testing for nonrandomness in chromosome bands. In addition, identification of specific segments that are nonrandomly involved in breakage can be ambiguous. However, some attempts have been successful in identifying nonrandomness at the chromosome band level using regression analysis [53] and a special χ² test [46] devised to deal with a large number of classes and small expected numbers in the classes [62].

Based on the comparisons of previous studies, it appears necessary to define precise criteria for selection of study cases in such a way as to reduce ascertainment bias as far as possible, and also to find a method of statistical analysis that unambiguously detects deviations from random breakage at the level of chromosome bands.

As explained in subsequent sections, we have defined ascertainment criteria that are not based strictly on the nature of the chromosome complement in probands. Instead we divided rearrangements based on their relationship to the method of ascertainment. Accordingly, balanced rearrangements ascertained through reproductive failure, including infertility and recurrent spontaneous abortions, were classified together with unbalanced rearrangements associated with abnormalities in aborted fetuses, stillbirths, and liveborn infants. This group was compared to rearrangements that were ascertained for reasons unrelated to the presence of the rearrangement. This approach may help to determine if the similarities in results observed in tables 1.1 and 1.2 are the function of an underlying biological predisposition to breakage.
Alternatively, it may help identify the sources of bias leading to these results.

The method of statistical analysis used in our study also differs from previous methods in that it was developed specifically to detect nonrandom breakage in small segments, such as chromosome bands. For this purpose we used the binomial distribution as a model for chromosome breakage in bands, and used binomial confidence intervals to describe the observed distribution of breakpoints on the chromosomes without reference to a hypothetical distribution. Testing a hypothesis of breakage is a separate process. The novelty of this method, in addition to easily detecting nonrandomness in chromosome bands, is that it gives an overall picture of the observed distribution of breakpoints, instead of merely making a statement about the fit of the distribution to an arbitrary hypothesis.

1.2.2 Approach to Testing for Nonrandomness in the Present Study

In order to explore the question of nonrandom breakage and rearrangement, we felt it was necessary to address both the problem of ascertainment bias in breakpoint data and the inadequacy of the statistical methods used to deal with breakpoint data. The following is a discussion of ascertainment bias, which includes an explanation of some forms of bias dealt with previously by others, in addition to some theoretical ideas that await support by experimental demonstrations. Furthermore, some aspects of statistical testing that were considered in the development of our statistical method are also discussed.

Ascertainment Bias

Interest in testing for nonrandom breakage was prompted by the advances in chromosome banding technology in the early 1970s that allowed mapping of breakpoints to specific sites on the chromosome arm.
However, it was also recognized that ascertainment bias in human data produces apparent nonrandomness in the distribution of breakpoints that is not due to a biological predisposition to nonrandom participation of specific sites in structural rearrangements [36], [51].

Cytogenetic data from clinical laboratories have been the most available source of information on rearrangements. Patients are usually referred for testing because of some form of physical abnormality, mental retardation, recurrent abortions or infertility. When a rearrangement is found in these cases, either the individual is a carrier of an unbalanced rearrangement, or a balanced carrier is ascertained as a result of pregnancy failure or infertility, likely to be due to unbalanced gametes or conceptuses. In most studies utilizing this source of information, nonrandom breakage has been found.

Based on the discussion in section 1.1.2, the extent and nature of the imbalance influences the outcome of a pregnancy. Therefore, breakpoints of rearrangements ascertained through an unbalanced carrier or reproductive failure are expected to represent a subset of rearrangements more likely to be compatible with a specific phenotypic outcome, such as abnormalities in a live birth, or abortion of unbalanced offspring. For this reason, breakpoints from rearrangements ascertained through unbalanced carriers are considered to be a highly nonrandom sample.

In contrast to unbalanced carriers, balanced carriers of rearrangements have a full complement of the normal chromosome set and they are usually phenotypically normal. Therefore, ascertainment bias relating to variations in viability potential and to other phenotypic manifestations of unbalanced rearrangements may be avoided in an analysis of breakpoints ascertained through normal balanced carriers. However, detection of balanced carriers may be biased as well if selection of cases is nonrandom with respect to the location of breakpoints on the chromosomes. For example, balanced carriers may be ascertained through abnormalities [1], recurrent abortions [13], or population surveys of institutions, including mental or psychiatric hospitals, subfertility clinics, or prisons [36], where patients may be admitted for reasons that are related to their carrier status. Balanced carriers may also be detected in routine prenatal diagnosis screening procedures [59], or in surveys of consecutive newborns [36]. In table 1.1, results in the latter two populations vary from those of the former studies utilizing data ascertained mainly through phenotypic abnormalities. A possible explanation for this is that the balanced carriers in prenatal diagnosis screens and in surveys of newborns were frequently ascertained by chance, with no a priori suspicion of the presence of the rearrangement.

Clearly, ascertainment through balanced carriers can represent various biased samples of rearrangements, directly related to how a "balanced carrier" is defined. For this reason we did not subdivide rearrangements according to ascertainment through balanced or unbalanced carriers. Instead, we analyzed groups of rearrangements ascertained through abnormalities and rearrangements ascertained incidentally. The former group consisted of both unbalanced cases and balanced carriers if ascertainment was in any way related to phenotype or abnormal pregnancy outcome. Incidentally ascertained cases are all phenotypically normal carriers of balanced rearrangements (see below for further details).

In the early 1970s the most readily available source of incidentally ascertained balanced rearrangements was surveys of consecutive newborns [36] [49]. Unfortunately, data from these studies are insufficient for statistical analysis of breakpoint distributions because the incidence of rearrangements in the population is relatively low.
Furthermore, our interest in this study is to define breakpoint distributions at the 320 band level of resolution, which excludes newborn data from surveys that were carried out before chromosome banding techniques were used.

Today, the most readily available source of incidentally ascertained information is provided by cytogenetic studies of fetal samples obtained in prenatal diagnosis procedures. Routine prenatal diagnosis screening for trisomies in mothers of advanced maternal age is now widely available in many countries of Europe and North America. Unexpected balanced rearrangements are occasionally detected in such screens. Therefore, the sample of balanced carriers detected through a balanced prenatal diagnosis result represents a group of incidentally ascertained breakpoints similar to that in newborn surveys, with the exception that the balanced carrier is detected at the early stage of about 14 to 16 weeks of gestation in prenatal diagnosis, and later, at birth, in newborn surveys. In this study, incidentally ascertained balanced rearrangements from several large studies involving prenatal diagnosis data were analyzed, in addition to some data obtained in newborn surveys.

Based on the above discussion, subdivision of rearrangements according to ascertainment through abnormalities and incidental ascertainment seems superior to criteria frequently used in previous studies. However, incidentally ascertained balanced rearrangements cannot be considered completely free of bias. In particular, inherited rearrangements ascertained incidentally in prenatal diagnosis are in all likelihood affected by ascertainment bias that is indirectly related to the pregnancy histories of balanced carrier parents, and this bias has to be taken into consideration.
Studies of rep in sperm chromosomes suggest that the known disjunction and segregation types occur with unique frequencies for each rearrangement [44], which is presumably at least to some extent a function of breakpoint position [63]. For inversions it is less clear what factors are involved in producing recombinants. It appears that the probability of crossing over in an inversion segment and producing unbalanced gametes is related to the size of the segment, which in turn is defined by the location of breakpoints [40] [43].

In conclusion, the position of breakpoints at least partially determines the probabilities for balanced and unbalanced inversion recombinants and rep segregation products, respectively. Therefore, a subset of breakpoints from balanced rearrangements that have a higher probability of leading to balanced offspring is likely to be overrepresented in incidentally ascertained carriers. For this reason, we felt that it would be of interest to study samples of incidentally ascertained balanced de novo rep and inversions, and compare these results to similarly ascertained balanced inherited rearrangements.

So far in this introduction, the focus has been on the forms of bias directly or indirectly related to abnormal meiotic segregation of rearranged chromosomes resulting in nonrandom ascertainment of cases for study. However, elimination of all forms of bias from this source would not ensure a completely random sample of breakpoints. In the discussion of balanced carriers, we assumed that no abnormalities are associated with balanced rearrangements. However, balanced rearrangements are thought to be associated with abnormalities in some cases [26]. Since these individuals appear to carry a full chromosome complement, the abnormalities may result from very small, submicroscopic damage to genes at the breakpoint site. For example, small deletions or duplications in genes may lead to abnormalities.
Detrimental influence of the rearrangement on gene function may also arise from position effects. It is possible that some abnormalities involving breakpoints in essential genes are lethal at a very early stage and are never observed, removing a subset of breakpoints from the total sample of balanced rep breakpoints. Furthermore, there may be other rearrangements that are stable in sperm cells, which do not undergo further mitotic division, but are unstable in mitotic divisions in the zygote, and these are also aborted at a very early stage.

Nonrandom effects of this type cannot be dealt with in a study restricted to constitutional rearrangements. Therefore, studies of sperm cells may be enlightening as to possible sites that are predisposed to breakage, or to both breakage and subsequent rearrangement. We analyzed a small set of data points from sperm chromosome rearrangements.

Due to the various forms of bias affecting virtually all constitutional rearrangement data, we took a comparative analytical approach in evaluating nonrandomness in distributions of constitutional rearrangement breakpoints. We have analyzed and compared data ascertained through some form of abnormality (mainly from unbalanced carriers) with inherited and de novo balanced rearrangements ascertained incidentally. This approach may help identify breakpoints that are involved solely in a specific ascertainment group and therefore could be considered to be the product of the bias in ascertainment. Any bands nonrandomly participating in all ascertainment groups, and especially also in sperm chromosome data, are candidates for hot spots for breakage and rearrangement due to intrinsic properties of the chromatin in a given area.

Development of Statistical Method

As discussed in section 1.2.1, previously used methods of breakpoint distribution analysis were often inadequate to detect nonrandomness at the level of individual chromosome bands.
The first comprehensive attempt to test specifically for nonrandomness in chromosome bands, taking the differences in band lengths into account, was made by De Braekeleer et al. in testing for association of fragile sites and cancer rearrangements [18]. This method, involving a Monte Carlo simulation, was subsequently applied to a data set of constitutional rearrangements [16], and was also the basis for the method of analysis developed in the present study. A computer simulation of random breakage was carried out based on the assumption that breakage is random at all sites on the chromosomes. A random number generator was used to distribute a given number of breakpoints in proportion to the relative lengths of individual bands in the haploid genome. This process was repeated a large number of times to produce a probability distribution of breakpoints in each band. In comparing the generated distribution to an observed distribution with the same total number of breakpoints, those bands found to have fewer than the observed number of breaks in 95% of the simulations were considered to have a significant excess of breakpoints at the 5% level of significance. Analogously, bands with more than the observed number of breaks in 95% of the simulations were considered to have a significant deficit of breakpoints at the 5% level.

This method has the advantage that it can be used to test unambiguously the distribution of breakpoints in individual bands in order to detect sites of unusually frequent or infrequent breakage. One disadvantage is that the simulation only approximates expected probabilities of breakage that can otherwise be calculated easily based on a hypothesis of random breakage. The simple observation that there are only two possibilities for the outcome of a breakage event with respect to a specific band, namely that a break will occur and that a break will not occur in that band, with respective probabilities adding up to 1, allows the binomial probability distribution to be used as a model for chromosome breakage [17] [70]. Using the binomial model, tail probabilities for observed breakpoints in chromosome bands have been calculated [17], using a hypothetical probability of random breakage (see chapter 2 for further details).

The approach to statistical testing used in this study also employs the binomial probability distribution model. However, confidence intervals around observed breakpoints in chromosome bands were calculated instead of point estimates for the probability [17]. A point estimate is particularly sensitive to chance fluctuations [41], and this sensitivity is especially pronounced for small samples, which are characteristic of breakpoint data at the chromosome band level. The binomial confidence interval for each band is calculated without any assumption about the distribution of breakpoints in a band. Therefore, the confidence interval, enclosing the set of values likely to be observed at a given significance level, is a more descriptive parameter and is more revealing about the observed distribution of breakpoints. In the testing of hypotheses, expected values are calculated separately, and the test for nonrandomness simply involves observation of the location of expected values relative to the confidence interval. A detailed description of the method of binomial confidence intervals in testing for nonrandom breakage is included in chapter 2.

1.2.3 A Note on "Hot Spots"

The terms "hot spot" and "cold spot" have been loosely coined in statistical studies to describe chromosome bands found to have an excess or a deficit of breakpoints, respectively.
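The simulation-based test described in section 1.2.2 is one way in which such an excess or deficit can be flagged. The following is only a minimal sketch of that idea, not the program used in the cited studies. The function names, the band labels and the band lengths are hypothetical and chosen purely for illustration; real band lengths would have to be measured from the standard idiograms.

```python
import random
from collections import Counter


def simulate_band_counts(band_lengths, n_breaks, n_sims, seed=0):
    """Distribute n_breaks at random, in proportion to band length, n_sims times.

    Returns a dict mapping each band to the list of simulated break counts.
    """
    rng = random.Random(seed)
    bands = list(band_lengths)
    weights = [band_lengths[b] for b in bands]
    sims = {b: [] for b in bands}
    for _ in range(n_sims):
        hits = Counter(rng.choices(bands, weights=weights, k=n_breaks))
        for b in bands:
            sims[b].append(hits[b])
    return sims


def flag_bands(observed, band_lengths, n_sims=2000, level=0.95, seed=0):
    """Label each band 'excess', 'deficit' or 'neither'.

    A band has an excess if the simulated count is below the observed count
    in at least `level` of the simulations, and a deficit if the simulated
    count is above the observed count in at least `level` of them.
    """
    n_breaks = sum(observed.values())
    sims = simulate_band_counts(band_lengths, n_breaks, n_sims, seed)
    flags = {}
    for band, counts in sims.items():
        obs = observed.get(band, 0)
        frac_below = sum(c < obs for c in counts) / n_sims
        frac_above = sum(c > obs for c in counts) / n_sims
        if frac_below >= level:
            flags[band] = "excess"
        elif frac_above >= level:
            flags[band] = "deficit"
        else:
            flags[band] = "neither"
    return flags


# Toy data (hypothetical): relative band lengths and observed break counts.
toy_lengths = {"1p36": 2.0, "1p35": 1.0, "1p34": 1.0}
toy_observed = {"1p36": 20, "1p35": 20, "1p34": 0}
print(flag_bands(toy_observed, toy_lengths))
```

Because every band is examined separately against its own simulated distribution, the same loop works for any number of bands; the cost is that the tail fractions are only approximations whose precision depends on the number of simulations.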
Although the underlying purpose of breakpoint distribution studies is to try to identify sites on the human genome that are biologically predisposed to breakage and/or rearrangement, the previous discussion clearly indicates that indisputable demonstration that specific sites are true "hot spots" or "cold spots" for breakage is beyond the scope of any statistical study. Bands found to have an excess or deficit of breakpoints in a carefully designed study are potential candidates for true "hot spots" or "cold spots", but this status has to be confirmed by experimental methods. For the sake of convenience, in this study the terms "hot spot" and "cold spot" are used to refer to bands with an excess or deficit of breakpoints in a data set. However, it must be stressed that this designation does not signify an underlying assumption about the true predisposition of certain bands to either frequent or infrequent breakage.

Chapter 2

Data Sources and Methods

2.1 Overview of Data Analysis

In this study, we analyze data on human constitutional chromosomal rearrangements to determine if points of nonrandom breakage exist. The primary sources of data include large scale studies published in the literature and registries of cytogenetic information. The data sets are described in detail in section 2.2 below.

Inversions and rep were considered separately. Within each group of rearrangements, data were analysed in subgroups defined according to the mode of ascertainment of cases (see section 2.3). For purposes of analysis, the two breakpoints of a given rearrangement were assumed to be independent. The distribution of breakpoints in a 320 band haploid karyotype was studied to detect any nonrandomness overall on the genome and in specific bands.
The 320 band karyotype was chosen because cytogenetic information is generally reported at that level of chromosome resolution, and because higher resolution data can always be reduced to a lower resolution breakpoint.

For tests of nonrandomness, two hypotheses were formulated as described in section 2.5.2. Expected values, calculated based on these hypotheses, were corrected to take account of the differences in the representation of the X and Y chromosomes in males and females. In a haploid karyotype, the probability of observing an X or a Y chromosome is 0.75 and 0.25 respectively, assuming equal numbers of males and females in the study population. Accordingly, the expected probabilities for the X and Y were multiplied by these values.

To test for overall nonrandomness in the distribution of breakpoints, two goodness of fit statistics were calculated: Pearson's χ² and the likelihood-ratio G² [23]. To our knowledge, there is no reliable statistical method available for testing goodness of fit in situations where the number of classes is large (320 in this study) and the expected values are small (less than 1 for several data sets). One method developed specifically for breakpoint distribution studies [62] is based on assumptions that were later shown to be incorrect [23]. However, one view is that reasonable agreement between Pearson's χ² and the likelihood-ratio G² indicates a reliable result [23]. For very small data sets, we tested the overall distributions using chromosomes instead of chromosome bands as the smallest class.

The analysis of rearrangements involved pooling of data sets from independent studies. Prior to pooling, data sets were tested for homogeneity using Pearson's χ² and the likelihood-ratio G².

Nonrandomness in a specific chromosome band was tested using the method of binomial confidence intervals [70], considered in detail in section 2.5.
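The overall and band-level tests outlined in this overview can be illustrated with a short sketch. It rests on several assumptions that are not taken from the thesis: the band labels and lengths are toy values, the sex-chromosome weights are renormalized so that the expected probabilities sum to 1, and the interval is the exact (Clopper-Pearson) binomial interval, which may differ from the construction given in section 2.5. All function names are hypothetical.

```python
import math


def expected_probabilities(band_lengths, chromosome_of):
    """Breakage probability per band, proportional to band length.

    X bands are weighted by 0.75 and Y bands by 0.25 (equal numbers of males
    and females assumed); renormalizing afterwards is an assumption made here.
    """
    sex_weight = {"X": 0.75, "Y": 0.25}
    weighted = {b: length * sex_weight.get(chromosome_of[b], 1.0)
                for b, length in band_lengths.items()}
    total = sum(weighted.values())
    return {b: w / total for b, w in weighted.items()}


def goodness_of_fit(observed, probs):
    """Return (Pearson chi-square, likelihood-ratio G2) over all bands."""
    n = sum(observed.values())
    chi2, g2 = 0.0, 0.0
    for band, p in probs.items():
        o, e = observed.get(band, 0), n * p
        chi2 += (o - e) ** 2 / e
        if o > 0:  # a class with zero observations contributes 0 to G2
            g2 += 2.0 * o * math.log(o / e)
    return chi2, g2


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve(f, target, increasing):
    """Bisection on [0, 1] for f(p) = target, with f monotone in p."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided binomial confidence interval for k breaks out of n."""
    lower = 0.0 if k == 0 else _solve(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2, True)
    upper = 1.0 if k == n else _solve(
        lambda p: binom_cdf(k, n, p), alpha / 2, False)
    return lower, upper


def classify_band(k, n, p_expected, alpha=0.05):
    """Compare the expected probability with the interval around the observation."""
    lower, upper = clopper_pearson(k, n, alpha)
    if p_expected < lower:
        return "excess"
    if p_expected > upper:
        return "deficit"
    return "neither"


# Toy example (hypothetical bands, lengths and counts).
lengths = {"1p36": 2.0, "Xp22": 2.0, "Yq11": 2.0}
chrom = {"1p36": "1", "Xp22": "X", "Yq11": "Y"}
probs = expected_probabilities(lengths, chrom)
counts = {"1p36": 8, "Xp22": 0, "Yq11": 0}
print(goodness_of_fit(counts, probs))
print({b: classify_band(counts[b], 8, probs[b]) for b in probs})
```

Keeping the interval calculation separate from the expected probabilities mirrors the point made above: the interval describes the observed data on its own, and a hypothesis enters only when its expected value is compared with that interval.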
For testing breakpoint distributions in chromosome bands, the size of each band in the 320 band karyotype was measured from the standard idiograms of human G banded chromosomes in the ISCN nomenclature [34].

Each band in the 320 band karyotype was designated as a light or a dark band based on the ISCN idiograms. For some bands several subbands are present in the diagrams. In these cases the band was designated as light or dark based on the total relative sizes of light and dark subbands within that band. The overall distribution of breakpoints in light and dark G bands was tested by calculation of a z statistic for one sample proportions (see pages 207-210 in [54]).

In addition to inherited rearrangements, breakpoints from aberrations of sperm chromosomes were also studied. For the assignment of breakpoints a 329 band karyotype was used to allow inclusion of breaks designated only as centromeric, with no specification of the chromosome band involved. On some chromosomes (1, 4, 9, 19) we defined a single centromeric band as bands p11 and q11 together. Breakpoints on these chromosomes designated either as centromeric or in bands p11 or q11 were assigned to this centromeric band. On the other chromosomes the centromeric region was defined originally in ISCN by either subbands of p11 and q11, or by a combination of a subband of one of p11 or q11 together with p11 or q11. Breaks on these chromosomes in the centromere region were assigned to the centromeric band defined by the appropriate band and/or subbands as before. However, breaks specified to be in p11 or q11 were not considered to be centromeric. These breakpoints were assigned to bands p11 and q11, which we defined as the subbands adjacent to the centromeric band.

In the following sections several aspects of the general methods described above are considered in detail. In section 2.2, the sources of data are described.
The ascertainment of cases in the original studies, and the ascertainment groups used to organize information in this study, are described in section 2.3. The elimination of duplicate cases is considered in section 2.4. Finally, the method of statistical analysis used to test for nonrandomness in chromosome bands is explained in section 2.5.

2.2 Sources of Data

Data were obtained from two chromosome registries, several large scale studies, and a few smaller studies. Breakpoint information on reciprocal translocations (rep) and inversions (inv) was analyzed separately. Robertsonian translocation breakpoints were not included in the analysis because they are known to be nonrandomly restricted to centric fusions of the five acrocentric chromosomes. Analysis of breakpoint distributions of other types of constitutional rearrangements was not possible due to the lack of sufficient numbers of available cases for sound statistical analysis.

A total of 1,272 cases of rep and 310 cases of inversions were studied. Of these, 1,012 rep and 242 inversions were from large scale prenatal diagnosis studies [11], [22], [32], [71]; 232 rep and 68 inversions were contributed to the Registry of Cytogenetic Abnormalities and Phenylketonuria (ReCAP) [24] and the Interregional Cytogenetic Registry System (ICRS) [51]; and 28 cases of rep were detected in cytogenetic surveys of consecutive newborns [5], [20], [25], [50], [42]. The number of inversion cases detected in newborn surveys was not sufficient for statistical analysis. For several rearrangements, only one breakpoint was included in the data set because the available information on the other breakpoint was incomplete.
The data described above represent three distinct sources of information, including (1) rearrangements from prenatal diagnosis studies, (2) rearrangements contributed to a chromosome registry, primarily from clinical laboratories, and (3) rearrangements detected in a systematic cytogenetic survey of consecutively born babies. In the first group of studies, the primary aim was to determine frequencies of structural abnormalities at prenatal diagnosis [32], [22], or to estimate the risk of unbalanced segregants in balanced carriers of structural rearrangements [11]. Balanced carrier couples were ascertained either through an abnormality or for reasons unrelated to the rearrangement, most frequently advanced maternal age. Rearrangements in the second group were obtained from two cytogenetic registries, established to provide an organized collection of cytogenetic data obtained from a number of actively participating laboratories. Information on all types of chromosome aberrations was included in these registries. Therefore, the data are varied, including balanced and unbalanced rearrangements from abnormal individuals, spontaneous abortions, infertile individuals and prenatal diagnosis results. The third group of rearrangements, ascertained in newborn surveys, includes both balanced and unbalanced rearrangements detected in liveborn infants.

The methods of case ascertainment and methods of analysis, as well as the ultimate purposes of the three groups of studies, vary widely, and this can be expected to affect the nature of the data. However, a useful feature of all studies is that ascertainment information was provided for each rearrangement.
Consequently, analysis of breakpoint distributions using data from these sources could be carried out on groups of specific rearrangements selected from the original studies based on the original ascertainment criteria and assigned to ascertainment groups defined according to the specific purpose of testing for nonrandom breakage in constitutional rearrangements, as described later in this chapter.

In addition to constitutional rep and inversions, aberrations detected in sperm chromosomes of normal men were also analyzed. Although points of breakage in aberrations of sperm chromosomes may not all be equally frequently involved in subsequent rearrangements, such points may represent areas that are especially prone to breakage. An additional advantage of this type of data is that it is free from some of the phenotypic and viability effects that influence the selection of cases of constitutional rearrangements for study. In two studies, 104 metaphases with chromosome aberrations were found, with a total of 109 breakpoints. Only a small number of rep were observed. The majority of aberrations of sperm chromosomes were gaps and breaks of chromatids and chromosomes, in addition to a few deletions and other rearrangements.

2.3 Organization Of Data In Ascertainment Groups

The primary objective of this study was to compile and compare data on chromosome rearrangements ascertained in one of two general ways: through abnormalities or through incidental means. Descriptions of the ascertainment groups used for analysis in this study are found in section 2.3.2 below. The ascertainment groups in the original studies that are subsequently assigned to one of the general ascertainment groups in section 2.3.2 are described in detail in section 2.3.1.
2.3.1 Ascertainment of Rearrangements in the Original Studies

Rearrangements from the sources described were selected for analysis in the present study based on the original mode of ascertainment of the cases. Ascertainment criteria in the original studies were defined in terms of study objectives, giving rise to systematic variations between data sets. For example, the number of ascertainment groups and the generality of their definitions were different between studies. Furthermore, the number of cases in each ascertainment group varies substantially, both within and between studies. Ascertainment information for rearrangements from all data sources is presented in tables 2.1 and 2.3 for rep, and in tables 2.2 and 2.4 for inversions.

Rearrangements in tables 2.1 and 2.2 were ascertained through various abnormalities and are as a result very heterogeneous with respect to reasons for ascertainment. In the study of Daniel et al. [11], and in the ReCAP registry, the cases were classified into one of a few relatively general groups. In contrast, specific criteria were used to classify cases in one of a number of ascertainment groups in the ICRS data set. Some ascertainment definitions, such as group "a" in the ICRS data set, are somewhat ambiguous, requiring interpretation to allow inclusion in the ascertainment group, defined in section 2.3.2, for purposes of breakpoint distribution analysis.

Rearrangements in tables 2.3 and 2.4 include cases ascertained incidentally through a balanced carrier, with no a priori reason to suspect the presence of a balanced rearrangement. The majority of these rearrangements were ascertained at prenatal diagnosis in women with advanced maternal age. A minor subset includes prenatal diagnosis patients referred for reasons other than advanced maternal age, such as anxiety, that are also unrelated to the balanced rearrangement. For reasons explained in section 2.3.2, balanced carriers detected in newborn surveys were analyzed together with balanced carriers detected at prenatal diagnosis.

A number of laboratories contributed data to more than one study or registry. To ensure that each rearrangement represents an independent mutation event, we tried to eliminate duplicate cases from the data. The procedures used to identify duplicate rearrangements are described in section 2.4.

Data Source | Ascertainment Group | Description | # of Breaks
Daniel et al., 1989 | a | Offspring with Unbalanced Rep | 594
Daniel et al., 1989 | b | Multiple Spontaneous Abortions | 553
Daniel et al., 1989 | c | Infertility | 8
Daniel et al., 1989 | d | Balanced Proband with Mental Retardation | 12
ICRS | a | Confirm/Rule Out Chromosome Abnormality | 183
ICRS | b | Multiple Congenital Anomalies | 20
ICRS | c | Multiple Spontaneous Abortions | 20
ICRS | d | Suspected Autosomal Abnormality | 9
ICRS | e | Down Syndrome Suspected | 9
ICRS | f | Turner Syndrome Suspected | 6
ICRS | g | Chromosome Abnormality Suspected | 6
ICRS | h | Dysmorphic Features | 6
ICRS | i | Ambiguous Genitalia | 4
ICRS | j | Neoplasia Study | 2
ICRS | k | Secondary Amenorrhea | 2
ICRS | l | Trisomy 18 Suspected | 2
ICRS | m | Primary Amenorrhea | 1
ICRS | n | Mental Retardation | 1
ReCAP | a | Abnormal Phenotype | 107
ReCAP | b | Multiple Spontaneous Abortions | 24
ReCAP | c | Infertility | 6

Table 2.1: Rep breakpoints listed according to mode of ascertainment in the original studies. All rearrangements were ascertained through abnormalities and were assigned to group A1, which is defined in section 2.3.2.

Data Source | Ascertainment Group | Description | # of Breaks
ICRS | a | Confirm/Rule Out Chromosome Abnormality | 47
ICRS | b | Multiple Congenital Anomalies | 8
ICRS | c | Dysmorphic Features | 2
ICRS | d | Sex Chromosome Abnormality Suspected | 2
ICRS | e | Multiple Spontaneous Abortions | 2
ICRS | f | Abortion Material From Spontaneous Abortion | 2
ICRS | g | Neoplasia Study | 2
ICRS | h | Down Syndrome Suspected | 4
ICRS | i | Trisomy 18 Suspected | 2
Daniel et al., 1989 | a | Offspring with Unbalanced Rep | 29
Daniel et al., 1989 | b | Multiple Spontaneous Abortions | 34
Daniel et al., 1989 | c | Balanced Proband with Mental Retardation | 2
ReCAP | a | Abnormal Phenotype | 22
ReCAP | b | Multiple Spontaneous Abortions | 6

Table 2.2: Inversion breakpoints listed according to mode of ascertainment in the original studies. All rearrangements were ascertained through an abnormality and were assigned to group B1, which is defined in section 2.3.2.

Data Source | Ascertainment Group | Description | # of Breaks
Daniel et al., 1989 | a | Incidental Ascertainment for Reasons Unrelated to the Rearrangement | 575
Hook et al., 1987 | a | Amniocentesis for Advanced Maternal Age | 125
Hook et al., 1987 | b | Amniocentesis for Anxiety | 4
Ferguson-Smith & Yates, 1984 | a | Amniocentesis for Advanced Maternal Age | 90
ReCAP | a | Amniocentesis for Advanced Maternal Age | 22
ReCAP | b | Amniocentesis for Anxiety | 2
ICRS | a | Amniocentesis for Advanced Maternal Age | 12
ICRS | b | Survey of Normal Children | 2
Newborn Surveys | a | Newborn Surveys | 54

Table 2.3: Rep breakpoints listed according to mode of ascertainment in the original studies. All rearrangements were ascertained through incidental means, and were assigned to groups A2 and A3, which are defined in section 2.3.2.

Data Source | Ascertainment Group | Description | # of Breaks
Daniel et al., 1989 | a | Incidental Ascertainment for Reasons Unrelated to the Rearrangement | 282
Hook et al., 1987 | a | Amniocentesis for Advanced Maternal Age | 83
Ferguson-Smith & Yates, 1984 | a | Amniocentesis for Advanced Maternal Age | 42
ReCAP | a | Amniocentesis for Advanced Maternal Age | 18
ReCAP | b | Amniocentesis for Anxiety | 2
ICRS | a | Amniocentesis for Advanced Maternal Age | 14

Table 2.4: Inversion breakpoints listed according to mode of ascertainment in the original studies. All rearrangements were ascertained incidentally and were assigned to groups B2 and B3, which are defined in section 2.3.2.
2.3.2 Definition of Ascertainment Groups Used for Classification of Rearrangements in the Present Study

The distribution of breakpoints was analyzed and compared in different ascertainment groups. Rep, inversions and sperm chromosome rearrangements, referred to as group A, group B and group C, respectively, were each analysed separately. Groups A and B were further subdivided according to the mode of ascertainment of the rearrangement as follows:

Group A. Reciprocal Translocations
• Group A1. Rep Ascertained Through Abnormalities
• Group A2. Rep Ascertained Incidentally
• Group A3. Incidentally Ascertained De novo Rep

Group B. Inversions
• Group B1. Inversions Ascertained Through Abnormalities
• Group B2. Inversions Ascertained Incidentally
• Group B3. Incidentally Ascertained De novo Inversions

The numbers of breakpoints from the original studies in each of groups A, B, and C, defined above, are listed in table 2.5.

The ascertainment groups defined above are very general, requiring further clarification. For statistical analysis, all cases ascertained for some form of abnormality were assigned to group A1 for rep and group B1 for inversions. The "abnormality" may be a single or multiple congenital defect, mental retardation or other abnormality in a liveborn individual, an abnormal stillbirth or aborted fetus. Furthermore, the abnormality may also refer to a healthy and normal balanced carrier with abnormal reproductive outcomes. For example, carriers of balanced rearrangements are occasionally found among individuals with multiple spontaneous abortions. When abortion material is not available for study, the aborted fetus is assumed to be an abnormal unbalanced segregant. Similarly, in cases of infertility when a structural aberration is found, we assume that the segregation of rearranged chromosomes produces very large imbalances that lead to reproductive failure.
Groups A2 and B2 include balanced rep and inversions ascertained incidentally at prenatal diagnosis or newborn surveys. The ascertainment of a balanced carrier at prenatal diagnosis may be a direct or indirect process. The balanced carrier may be (1) a normal fetus with a balanced rearrangement inherited from one of the parents, (2) a normal fetus with a balanced de novo rearrangement, (3) a balanced carrier parent detected incidentally prior to entering prenatal diagnosis study¹, (4) a balanced carrier parent with an unbalanced fetus detected at prenatal diagnosis subsequently followed by cytogenetic studies of the parents. In this study (4) is excluded from data in order to avoid ascertainment bias resulting from different viability potentials of carriers of unbalanced rearrangements.

¹ This group is found only in the study of Daniel et al. [11].

Group   Reference                             # of Breaks
A1      Daniel et al., 1989 [11]              1167
        ReCAP [24]                             137
        ICRS [51]                              279
        Total                                 1583
A2      Daniel et al., 1989 [11]               575
        Ferguson-Smith & Yates, 1984 [22]       90
        Hook & Cross, 1987 [32]                129
        ReCAP [24]                              24
        Evans et al., 1978 [20]                 22
        ICRS [51]                               14
        Friedrich & Nielsen, 1974 [25]          12
        Nielsen & Sillesen, 1975 [50]           11
        Buckton et al., 1980 [5]                 7
        Lin et al., 1976 [42]                    2
        Total                                  886
A3      Warburton, 1984 [71]                    49
        Hook & Cross, 1987 [32]                 38
        Ferguson-Smith & Yates, 1984 [22]       16
        Hook et al., 1983 [33]                   8
        ReCAP [24]                               6
        Buckton et al., 1980 [5]                 2
        Friedrich & Nielsen, 1974 [25]           2
        Nielsen & Sillesen, 1975 [50]            1
        Total                                  122
B1      ICRS [51]                               81
        Daniel et al., 1989 [11]                65
        ReCAP [24]                              28
        Total                                  174
B2      Daniel et al., 1989 [11]               282
        Hook & Cross, 1987 [32]                 83
        Ferguson-Smith & Yates, 1984 [22]       42
        ReCAP [24]                              20
        ICRS [51]                               14
        Total                                  441
B3      Warburton, 1980 [71]                    10
        Hook & Cross, 1987 [32]                  8
        Ferguson-Smith & Yates, 1984 [22]        2
        ReCAP [24]                               2
        Total                                   24
C       Brandriff et al., 1985 [4]              62
        Martin et al., 1987 [45]                47
        Total                                  109

Table 2.5: Sources of data. Groups A, B, and C consist of breakpoints from rep, inversions and sperm chromosome aberrations respectively. Subdivisions of groups A and B are according to mode of ascertainment as described previously in this section.

Although newborn surveys involve a systematic screening of consecutive liveborn infants, rearrangements detected by this procedure were grouped together with other incidentally ascertained rearrangements (Groups A2, A3, B2, and B3). As with rearrangements ascertained at prenatal diagnosis, there is no a priori reason to suspect a chromosome aberration in balanced carriers from newborn surveys. A balanced carrier ascertained in newborn surveys may be (1) a normal infant who is a carrier of a balanced inherited rearrangement, (2) a normal infant who is a carrier of a balanced de novo rearrangement, (3) an abnormal liveborn infant who is a carrier of an unbalanced rearrangement inherited from a balanced carrier parent, or (4) an abnormal liveborn infant who is a carrier of an unbalanced de novo rearrangement. Groups (3) and (4) were not included in our analysis.

Since Groups (3) and (4) were excluded, the ascertainment of balanced carriers is comparable through prenatal diagnosis for advanced maternal age and through newborn surveys. There are two important differences between these two methods of ascertainment, however. First, balanced carriers are detected at different stages of development. In prenatal diagnosis, balanced carriers are detected at about 14 to 16 weeks of gestation, while in newborn surveys infants are studied at birth. In this study, we included only balanced rearrangements associated with normal phenotype.
Although balanced rearrangements are occasionally associated with abnormalities, these abnormalities are expected to be often compatible with live birth, unless the rearrangement itself disrupts an essential gene. Therefore, we do not expect the sample of balanced carriers to be substantially different at prenatal diagnosis than at birth. Secondly, in newborn surveys all consecutive births are studied, while only a specific group of pregnancies is studied at prenatal diagnosis. In both situations, we expect the sampling with regard to balanced carrier status to be random. For these reasons, we felt it was justified to analyse prenatal diagnosis data and newborn survey data in a single group of incidentally ascertained balanced rearrangements.

Groups A3 and B3 consist of incidentally ascertained de novo rep and inversions representing subgroups of A2 and B2 respectively, with additional information from a study of de novo rearrangements [71].

2.4 Elimination of Duplicate Cases

Since data in Groups A and B were compiled from several large published studies, it was necessary to consider the possible duplicate inclusion of some rearrangements. For breakpoint distribution analysis, it is essential to ensure that the same individual, or individuals from the same family, are entered only once in order to avoid overrepresentation of a single set of breakpoints. In some of the original studies duplicate cases were not removed by the authors [11] [32], so it was necessary for us to do so. An additional problem involved overlap between data sets, because several laboratories contributed their data to more than one registry or study. In the ICRS and ReCAP registries individuals from the same family were identifiable by the assigned identification number. This was not possible for the data in other studies because only the names of contributing laboratories were provided.
Since complete information on family relationships between subjects was not always available, we employed a conservative approach for removal of duplicate rearrangements.

The search for identical rearrangements within and between data sets was carried out using a C shell script program². This program searches for rearrangements with identical breakpoints and prints all matches with the appropriate identification information. The program was initially run on each set of data from the original studies separately. Following removal of all but one of the matching records from within data sets, data from the original studies were pooled in one of the ascertainment groups defined in the previous section. The pooled sets were checked for matching records once again and duplicates were individually evaluated to determine if they were likely to represent a single mutation event (see below).

² This and other programs used in these studies were written by A. R. Rutherford. Code for all programs is reproduced in Appendix C.

For rep, the majority of matches were reported by the same laboratory. In a few cases, only the country of origin could be identified. As all cases originate from either Europe or North America, and since information on the origin of rearrangements was incomplete in many cases, matching rep were considered to be identical mutations if both originated in Europe or both in North America. Although this is a conservative approach, it has the advantage of reducing the chance of including identical mutations from related individuals who are unaware of this relationship. In this study we found that identical rep rarely arise in unrelated individuals. The opposite is true for inversions, as identical rearrangements were frequently reported by different laboratories (see Appendix A).
The conservative approach of eliminating all but one of several identical inversions originating from either Europe or from North America would require elimination of most inversion data. For this reason, only cases identified by the same laboratory were removed. If the contributing laboratory was not known, duplicates were removed only if there was a possible overlap of data sources in two studies. It is unlikely that all inversions originating from the same mutation were eliminated by this method of duplicate removal. However, any criterion of eliminating identical inversions is highly arbitrary and is unlikely to aid in clarification of the data.

2.5 Statistical Analysis

2.5.1 Mathematical Model of Chromosome Breakage

Chromosome breakage in specific small regions, such as chromosome bands, can be described by a binomial probability model. Considering a breakage event in a chromosome band, each band $i$ can be described as having an unknown probability of breakage, $p_i$, determined by various biological factors. The only other possible outcome is no breakage in the specified band, and this event occurs with a probability of $q_i = 1 - p_i$. If each breakage event is assumed to be independent of every other, breakage of chromosomes corresponds to a series of Bernoulli trials [54]. The probability $P$ that $x_i$ breakage events are observed in band $i$ can therefore be described by a binomial distribution as follows:

\[ P(X = x_i) = \binom{N}{x_i} \, p_i^{x_i} \, q_i^{N - x_i} \tag{2.1} \]

where $N$ is the total number of breakpoints in the sample, and $X$ is a random variable representing all possible values of $x_i$. The value $P$ calculated in 2.1 is a point estimate of the probability that $x_i$ breakpoints are observed in band $i$, given the inherent probability of breakage $p_i$ in that band.
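As a minimal sketch of equation 2.1, the point probability can be evaluated directly with the Python standard library. The function name `binomial_pmf` and the numbers in the usage lines are hypothetical, chosen only for illustration; they are not taken from the data of this study.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x breaks in a band out of n breakpoints (eq. 2.1).

    p is the probability that any single breakpoint falls in the band.
    """
    q = 1.0 - p
    return comb(n, x) * p**x * q**(n - x)


# Hypothetical example: 100 breakpoints, band breakage probability 0.01.
probs = [binomial_pmf(x, 100, 0.01) for x in range(101)]
print(probs[0], probs[1], probs[2])
```

Because each value is the probability of one exact count, the values over all possible counts sum to one; this is also why a single point probability says little by itself about how unusual an observed count is.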
2.5.2 Hypotheses of Random Breakage

In testing a distribution of breakpoints for nonrandomness, the observed distribution is compared to an expected distribution based on a hypothesis of random breakage. In this study two hypotheses were tested, both assuming random breakage with respect to a particular chromosome band.

In hypothesis I, breakage of chromosomes was assumed to occur at equal probability everywhere on the genome. Accordingly, the expected distribution of breakpoints in a specific chromosome segment is calculated as the relative length of that segment in a haploid genome. The expected probability of breakage in a chromosome band $i$ is calculated as:

\[ p_{iE} = \frac{l_i}{L} \tag{2.2} \]

where $p_{iE}$ = the expected probability of breakage in band $i$; $l_i$ = the length of band $i$; and $L$ = the total length of the haploid genome. The lengths of bands were measured from the diagrams of G banded chromosomes in the ISCN Nomenclature [34] (Appendix B).

For comparison of breakage frequencies between chromosome bands, the expected breakage density in a band is a more useful entity. Expected breakage density ($d_{iE}$) is obtained by the modification of 2.2 as follows:

\[ d_{iE} = \frac{p_{iE} N}{l_i} = \frac{N}{L} \tag{2.3} \]

In G banded data sets, breakpoints are more often found in light bands than in dark bands [55]. In hypothesis II, the assumption of random breakage is modified to account for the excess number of breakpoints in light G bands.
The underlying assumption in this hypothesis is the extreme possibility that all excess breakage in light bands is due to bias, and probabilities of breakage were corrected accordingly as follows:

Expected Probability of Breakage in a Dark Band ($p_{iED}$)

\[ p_{iED} = \frac{N_D}{N} \cdot \frac{l_i}{L_D} \tag{2.4} \]

Expected Probability of Breakage in a Light Band ($p_{iEL}$)

\[ p_{iEL} = \frac{N_L}{N} \cdot \frac{l_i}{L_L} \tag{2.5} \]

where $p_{iED}$ = expected probability of breakage in dark band $i$; $p_{iEL}$ = expected probability of breakage in light band $i$; $N_D$ = total number of breaks in all dark bands; $N_L$ = total number of breaks in all light bands; $L_L$ = total length of all light bands; $L_D$ = total length of all dark bands.

To obtain expected breakage densities in light and dark bands, 2.4 and 2.5 are multiplied by $N$, and divided by $l_i$ to give:

Expected Breakage Density in Dark Bands ($d_{iED}$)

\[ d_{iED} = \frac{N_D}{L_D} \tag{2.6} \]

Expected Breakage Density in Light Bands ($d_{iEL}$)

\[ d_{iEL} = \frac{N_L}{L_L} \tag{2.7} \]

Both hypotheses I and II were tested on all ascertainment groups as well as on all sets of data from the original studies.

2.5.3 Binomial Confidence Limits

Hypotheses I and II may be tested by inserting expected values of probability into equation 2.1 to calculate a point estimate of the probability $P$ that $x_i$ breakpoints are observed. However, such point estimates are unsatisfactory measures of probability because they are quite sensitive to chance variation [41]. A more descriptive measure is the confidence interval (ci) around the observed probability of breakage ($p_{iO} = x_i/N$) [41]. A ci at a specific significance level, such as 0.01, defines a range of $x_i$ values likely to be observed in a given proportion (in this case 99%) of samples in repeated samplings. The value of $x_i$ is expected to lie outside the ci by chance 1% of the time.
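The expected values under hypotheses I and II (equations 2.2 to 2.7), against which such confidence intervals are compared, can be sketched as follows. This is only an illustration: the function names are hypothetical, the hypothesis II probabilities are written in the form implied by the stated step from 2.4 and 2.5 to 2.6 and 2.7 (multiply by N, divide by the band length), and the toy genome in the usage lines is invented rather than measured from the ISCN diagrams.

```python
def expected_prob_h1(band_length, genome_length):
    """Hypothesis I (eq. 2.2): probability proportional to band length."""
    return band_length / genome_length


def expected_prob_h2(band_length, class_breaks, total_breaks, class_length):
    """Hypothesis II (eqs. 2.4 and 2.5).

    class_breaks and class_length are the break count and total length of
    the band's own class (all dark bands or all light bands).
    """
    return (class_breaks / total_breaks) * (band_length / class_length)


def expected_density(prob, total_breaks, band_length):
    """Expected breaks per unit length: probability times N over band length."""
    return prob * total_breaks / band_length


# Invented toy genome: 200 breaks on 1000 length units, of which
# 150 breaks fall in light bands that cover 600 units.
N, L = 200, 1000.0
N_L, L_L = 150, 600.0
N_D, L_D = N - N_L, L - L_L

d1 = expected_density(expected_prob_h1(5.0, L), N, 5.0)
d2_light = expected_density(expected_prob_h2(6.0, N_L, N, L_L), N, 6.0)
d2_dark = expected_density(expected_prob_h2(5.0, N_D, N, L_D), N, 5.0)
print(d1, d2_light, d2_dark)
```

In this sketch the band length cancels out of every density, so hypothesis I gives one expected density for the whole genome and hypothesis II gives one for dark bands and one for light bands.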
99% ci were calculated for all bands in this study to allow not only the detection of bands that qualify as hot spots for breakage but also to help identify trends and understand the distribution of breakpoints on the entire genome. Since we have no control over errors introduced at the technical and at the study design level, a relatively conservative nominal confidence level of 99% was chosen to reduce the effects of systematic and random errors on the data.

The exact upper and lower confidence limits on $p_{iO}$ can be derived by setting the value of $P$ to the desired significance level, and calculating the upper and lower values of $p_{iO}$ in 2.1. The calculations involved in this derivation are substantially simplified by making approximations to the discrete binomial distribution using either the continuous normal or Poisson distributions [41]. The appropriate choice for the approximation is determined by the sizes of both $N$ and $x_i$. The decision to use one or the other distribution is rather arbitrary, however. Through graphing a series of binomial probability distributions, we found that sample sizes $N$ of about 400 breakpoints or greater and $x_i$ of 9 or more produce a near normal probability distribution. For smaller $x_i$, the Poisson distribution is more appropriate. Tables of confidence limits are readily available for the Poisson distribution [52]. For the normal distribution, we adapt the formulae in [41] to the notation used in this study as follows:

Lower Confidence Limit for $p_{iO}$ ($p_{iO_l}$):

\[ p_{iO_l} = \frac{\dfrac{x_i}{N} + \dfrac{z^2}{2N} - \dfrac{z}{\sqrt{N}} \sqrt{\dfrac{x_i}{N}\left(1 - \dfrac{x_i}{N}\right) + \dfrac{z^2}{4N}}}{1 + \dfrac{z^2}{N}} \tag{2.8} \]

Upper Confidence Limit for $p_{iO}$ ($p_{iO_u}$):

\[ p_{iO_u} = \frac{\dfrac{x_i}{N} + \dfrac{z^2}{2N} + \dfrac{z}{\sqrt{N}} \sqrt{\dfrac{x_i}{N}\left(1 - \dfrac{x_i}{N}\right) + \dfrac{z^2}{4N}}}{1 + \dfrac{z^2}{N}} \tag{2.9} \]

where $z$ is the standard normal deviate corresponding to the chosen confidence level.

Confidence limits for breakage densities are obtained by modifying equations 2.8 and 2.9 respectively as follows:

\[ d_{iO_l} = \frac{p_{iO_l} N}{l_i} \tag{2.10} \]

\[ d_{iO_u} = \frac{p_{iO_u} N}{l_i} \tag{2.11} \]

2.5.4 Testing for Nonrandomness Using Binomial Confidence Limits

The central assumption in testing for nonrandom breakage in a set of structural rearrangements is that the observed probability of breakage in a chromosome band ($p_{iO}$) estimates the true unknown probability of breakage ($p_i$). The confidence interval defines the set of values this probability may have at a specific significance level.

As a first step in the analysis, breakpoint distributions were tested for nonrandomness using Pearson's χ² test and the likelihood ratio G², which detect overall deviations from random breakage. To define specific sites of nonrandom breakage the method of binomial confidence limits [70] was used. Confidence intervals were calculated for all bands in the haploid chromosome set for each data set. The test for nonrandomness involves the formulation of a hypothesis of random breakage (H₀) and comparison of hypothetical breakage densities to observed breakage densities. If the hypothetical breakage density lies within the confidence interval for the observed breakage density $d_{iO}$, the hypothesis of random breakage is accepted. Unusually frequent breakage is indicated by values of $d_{iE}$ that lie below the lower confidence limit for $d_{iO}$. Conversely, areas of unusually infrequent breakage can be detected as well by values of $d_{iE}$ higher than the upper confidence limit for $d_{iO}$.

2.5.5 Comparison of Results Between Data Sets

Chromosome bands with an unusually high number of breakpoints may or may not represent true sites of frequent chromosome breakage. Bias in ascertainment of cases, in addition to random variation, may produce nonrandomness in the results. Lists of bands with excess breakpoints from two different data sets can be statistically compared to evaluate the probability that identical bands are observed by chance in independent data sets.
The hypergeometric distribution [41] is used to calculate the probability $P$ that $k$ identical bands in two data sets with $B_1$ and $B_2$ bands of frequent breakage are observed by chance as follows:

\[ P(X = k) = \frac{\dbinom{B_1}{k} \dbinom{T - B_1}{B_2 - k}}{\dbinom{T}{B_2}} \tag{2.12} \]

where $T$ is the total number of chromosome bands considered.

2.5.6 Summary of Statistical Analysis

In the test for nonrandom breakage, the following steps of statistical analysis were carried out on sets of data with duplicate rearrangements removed.

1. χ² and G² statistics were calculated to test for homogeneity in breakpoint distributions between individual data sets prior to pooling of data.

2. Individual data sets, including pooled data, were tested for overall nonrandomness in the distribution of breakpoints by computing χ² and G².

3. Distributions of breakpoints in specific chromosome bands were tested for nonrandomness using the method of binomial confidence limits in individual data sets as well as the pooled data.

4. Comparisons of results between data sets were carried out using the hypergeometric distribution to calculate the probability that identical bands occur in two independent data sets by chance.

5. The hypergeometric distribution was also used to calculate the probability of random association of bands of frequent constitutional breakage with fragile sites and oncogenes.

6. The observed distribution of breakpoints in dark and light G bands was compared to the distribution expected based on the total relative lengths of light and dark G bands. The proportion of breakpoints in light and dark G bands was compared by calculation of a z statistic in each data set, including pooled data.

Chapter 3

Results

In the following sections, the results of breakpoint distribution analysis in reciprocal translocations, inversions and sperm chromosome rearrangements are presented.
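As a reading aid for the results that follow, the two calculations that recur throughout this chapter, the band-level test by confidence limits and the chance overlap of hot spot lists, can be sketched in a few lines. This is an illustration under stated assumptions, not the programs of Appendix C: the function names are hypothetical, the interval is the normal-approximation form only (the Poisson tables used for small counts are not covered), the 99% deviate is hard-coded, and all numbers in the usage lines are invented.

```python
from math import comb, sqrt

Z_99 = 2.5758  # two-sided 99% standard normal deviate (assumed constant)


def confidence_limits(x, n, z=Z_99):
    """Normal-approximation limits on the observed probability x / n."""
    p = x / n
    centre = p + z * z / (2 * n)
    half = (z / sqrt(n)) * sqrt(p * (1 - p) + z * z / (4 * n))
    denom = 1 + z * z / n
    return (centre - half) / denom, (centre + half) / denom


def classify_band(x, n, band_length, expected_density, z=Z_99):
    """Compare the expected density with the limits on the observed density."""
    lo, hi = confidence_limits(x, n, z)
    d_lo, d_hi = lo * n / band_length, hi * n / band_length
    if expected_density < d_lo:
        return "hot"      # more breaks than random breakage predicts
    if expected_density > d_hi:
        return "cold"     # fewer breaks than random breakage predicts
    return "random"


def overlap_probability(k, b1, b2, total_bands):
    """Hypergeometric chance that two hot spot lists share exactly k bands."""
    return comb(b1, k) * comb(total_bands - b1, b2 - k) / comb(total_bands, b2)


# Invented example: 1000 breaks, a band of length 10, expected density 1.0.
print(classify_band(30, 1000, 10.0, 1.0))
print(classify_band(10, 1000, 10.0, 1.0))
# Invented example: lists of 30 and 4 bands drawn from 320 bands.
print(overlap_probability(3, 30, 4, 320))
```

A band is only called nonrandom when the expected density falls outside the whole interval, which is what makes the 99% level conservative for small data sets: with few breakpoints the intervals are wide and few bands can qualify.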
3.1 Reciprocal Translocations

Results for inherited rep ascertained through abnormalities (group A1) and incidentally (group A2), and for de novo rep ascertained incidentally (group A3) are presented in sections 3.1.1, 3.1.2, and 3.1.3 respectively. This is followed by a comparative analysis of these ascertainment groups (section 3.1.4), a description of the distribution of rep breakpoints in light and dark G bands (section 3.1.5), and their association with fragile sites and oncogenes (section 3.1.6).

3.1.1 Rep Ascertained Through Abnormalities (A1)

Breakpoints from rep ascertained through abnormalities (A1) were tested for nonrandomness and the results are summarized in tables 3.1 and 3.2. The distribution of breakpoints over the entire chromosome complement was nonrandom in all three independent data sets as measured both by χ² and G² (table 3.1).

Data Source        a           b           c
Sample Size (N)    1167        273         137
df                 320         320         24
χ²                 1170        631         53.8
P                  < 0.0001    < 0.0001    0.0005
G²                 1042        486         42.5
P                  < 0.0001    < 0.0001    0.014
Interpretation     Nonrandom   Nonrandom   Nonrandom

Table 3.1: Summary of χ² and G² in tests for overall randomness of breakpoint distributions of rep ascertained through abnormalities (group A1). Data sources: a. Daniel et al., 1989; b. ICRS; c. ReCAP. See table 2.5 Group A1 for sample sizes and references for individual data sets.

In testing distributions specific to bands, nonrandomness was found in all three data sets under the assumptions of both hypotheses I and II¹. The bands nonrandomly involved in breakage are listed in table 3.2. The number of hot spot bands in a data set appears to correlate with the total sample size, with more hot spots found in larger data sets.

¹ See Chapter 2 for definitions of hypotheses I and II.

Some bands are nonrandomly involved in more than one of the independent data sources. Among hypothesis I hot spot bands there were no identical results between ICRS (data set b) and ReCAP (data set c).
However, the probability is very low (p = 0.00275) that the 3 hot spot bands observed in both data sets a and b are a chance occurrence. Similarly, the one identical band found in both data sets a and c (18q21) is unlikely to be a chance observation (p = 0.00284). For hypothesis II, the 3 matches of hot spot bands in data sets a and b were also found to be highly significant (p = 0.0019). There were no matches between data sets a and c, or b and c for hypothesis II.

Of the three data sets, bands of infrequent breakage were detected only in study a with the largest sample size. These are listed in table 3.3. Bands 1p13, 19p13, Xp22, and Xp11 were nonrandomly involved in both hypotheses I and II.

Hypothesis I. Hot Spots in Data Set:

a (N = 1167): 1q42 (L), 2q33 (L), 3q21 (L), 3q27 (L), 4p11 (D), 4q35 (L), 5p15 (L), 5p13 (L), 5p12 (D), 5q35 (L), 6q21 (L), 7p22 (L), 7q32 (L), 9p24 (L), 9p22 (L), 9p11 (D), 10q26 (L), 13q14 (L), 13q22 (L), 13q34 (L), 14q32 (L), 15q22 (L), 17p13 (L), 17q25 (L), 18p11 (D), 18q21 (L), 18q23 (L), 21q22 (L), 22q11 (L), 22q12 (D)
b (N = 279): 9p24 (L), 9p11 (D), 11q11 (D), 17p13 (L)
c (N = 137): 7q34 (L), 18q21 (L)

Hypothesis II. Hot Spots in Data Set:

a (N = 1167)    b (N = 279)    c (N = 137)
1q42 (L)        -              -
3q27 (L)        -              -
4p11 (D)        -              -
4q35 (L)        -              -
5p15 (L)        -              -
5p12 (D)        -              -
7p22 (L)        -              -
7q32 (L)        -              -
-               -              7q34 (L)
-               -              8q21 (D)
9p24 (L)        9p24 (L)       -
9p22 (L)        -              -
9p11 (D)        9p11 (D)       -
9q11 (D)        -              -
-               11q11 (D)      -
13q14 (L)       -              -
13q22 (L)       -              -
13q34 (L)       -              -
-               17p13 (L)      -
18p11 (D)       18p11 (D)      -
18q22 (D)       -              -
18q23 (L)       -              -
21q22 (L)       -              -
22q12 (D)       -              -
-               -              Xq21 (D)

Table 3.2: Bands of significant excess breakage detected at the 0.01 level of significance in rep ascertained through abnormalities (A1). Data sources are identified in table 3.1. Boxed bands are observed in at least two data sets. (D) and (L) are designations for dark and light G band respectively.

Cold Spots
Hypothesis I   Hypothesis II
1p31 (D) - 1p13 (L) 1p13 (L) 2q24 (D) 8q21 (D) 12q21 (D) 2p13 (L) 3p21 (L) 19p13 (L) 19p13 (L) Xp22 (L) Xp22 (L) Xp11 (L) Xp11 (L)

Table 3.3: Bands with a significant deficit of breakpoints detected at the 0.01 confidence level in data set a [11]. Boxed bands are significant in both hypotheses. (D) and (L) refer to dark and light G bands.

In all data sets, the overall distribution of breakpoints in light and dark bands was significantly different according to calculations of the z statistic. In all cases breakpoints occur more frequently in light G bands. (See section 3.1.5 below.) In prenatal diagnosis data (study a), 30 and 20 bands were detected as hot spots in tests of hypothesis I and II respectively. Hypothesis I hot spot bands include 24 light G bands and 6 dark G bands (ratio 4:1), and hypothesis II hot spot bands include 13 light G bands and 7 dark G bands (ratio 1.9:1). According to expectation, the correction for excess breakpoints in light G bands produced a decreased ratio of light to dark G band hot spots in data source a.
In ICRS and ReCAP, the distribution of breakpoints with respect to light and dark bands does not show obvious trends in analyses of breakpoint distributions in chromosome bands, due to the small number of bands with excess breakpoints in these data sets. Both tests of hypothesis I and II detected the same set of 4 hot spot bands in the ICRS data (study b), with one additional dark G hot spot (18p11) observed in hypothesis II. In ReCAP, 2 and 3 hot spots were found by tests of hypothesis I and II respectively. However, only band 7q34, a light G band, was seen in both tests.

Data Source        a           b         c         d         e         f
Sample Size (N)    575         129       90        14        24        54
df                 320         24        24        24        24        24
χ²                 573         30.3      23.7      19.7      15.1      25.9
P                  < 0.0001    0.1761    0.4803    0.7108    0.9183    0.3573
G²                 531         33.6      29.5      21.8      19.7      27.7
P                  < 0.0001    0.0900    0.2018    0.5904    0.7149    0.2708
Interpretation     Nonrandom   Random    Random    Random    Random    Random

Table 3.4: Summary of χ² and G² in tests for overall randomness of breakpoint distributions of rep ascertained incidentally (group A2). Data sources: a. Daniel et al., 1989; b. Hook et al., 1987; c. Ferguson-Smith & Yates, 1984; d. ICRS; e. ReCAP; f. Newborn Surveys. See also table 2.5 Group A2 for sample sizes and references for individual data sets.

3.1.2 Incidentally Ascertained Balanced Rep (A2)

Incidentally ascertained rearrangements that qualify for group A2 (see chapter 2) were tested for nonrandomness and the results are summarized in tables 3.4 and 3.5. Overall nonrandomness in breakage was found only in data set a. Distribution of breakpoints in all other data sets was random. This result was confirmed for data sets b, e, and f in testing for nonrandomness in specific chromosome bands (table 3.5). Two dark bands (12q11 and Xq11) were frequently involved in breakage according to hypothesis II in data set d, but random breakage was observed in the test of hypothesis I.
Nonrandom involvement was greater in the larger data set a than in data set c, and randomness was generally found in the smaller data sets with the exception of b. In addition to random breakage, the only identical result observed across data sets is the frequent involvement of dark G band 1p11 in two independent prenatal diagnosis studies (a and c). This result is significant at the 95% level, as the chance occurrence of a single identical hot spot band in the two independent prenatal diagnosis studies has a low probability (p = 0.019 for hypothesis I, and p = 0.025 for hypothesis II).

Hypothesis I. Hot Spots in Data Set:
a (N = 575)   b (N = 129)   c (N = 90)   d (N = 14)   e (N = 24)   f (N = 54)
1p11 (D)      -             1p11 (D)     -            -            -
3p13 (L) 16p13 (L) 11q21 (L)

Hypothesis II. Hot Spots in Data Set:
a (N = 575)   b (N = 129)   c (N = 90)   d (N = 14)   e (N = 24)   f (N = 54)
1p11 (D)      -             1p11 (D)     12q11 (D)    -            -
                                         Xq11 (D)
4q22 (D) 11q11 (D) 19q11 (D) 11q21 (L)

Table 3.5: Bands of significant excess breakage detected at the 0.01 level of significance in rep ascertained incidentally (A2). Data sources are identified in table 3.4. Boxed bands are observed in at least two data sets. (D) and (L) are designations for dark and light G band respectively.

The overall distribution of breakpoints in dark and light bands follows a trend similar to what was observed for rep ascertained through abnormalities (A1). The number of breakpoints is significantly greater in light G bands in all data sets (see section 3.1.5 below for further details). For data set a, an increase in the number of dark G band hot spots and elimination of light G band hot spots is produced by the correction in hypothesis II for excess breakpoints in light G bands. The number of bands with excess breaks was too few in data sets c and d to make a similar comparison.
3.1.3 Incidentally Ascertained De Novo Rep (A3)

A set of 122 de novo rep, from the sources listed in table 2.5, was tested for nonrandomness. All breakpoints in this data set were incidentally ascertained and represent only those cases where both parents were investigated and found not to be carriers of the rep discovered incidentally in the proband. Overall distribution of breakpoints was found to be random (p(χ² > 33.6) = 0.0927; p(G² > 28.2) = 0.2522). Distribution in chromosome bands was also random by the test of hypothesis I. The test of hypothesis II reveals one dark G band, 4p15, with a greater number of breakpoints than expected at the 0.01 significance level. This band is not frequently involved in breakage in any of the other data sets. Distribution of breakpoints in dark and light G bands was nonrandom in favor of light G band breakpoints (see section 3.1.5).

3.1.4 Comparison of Results: Rep Ascertained Through Abnormalities (A1) and Rep Ascertained Incidentally (A2 & A3)

Data from individual studies were tested for homogeneity prior to pooling using Pearson's χ² and the G² statistic. Individual data sets of incidentally ascertained rep (A2) were found to be homogeneous (p(χ² > 1194) = 0.997; p(G² > 849) = 1.0000). However, individual data sets of rep ascertained through abnormalities (A1) were not homogeneous (p(χ² > 669) = 0.0061; p(G² > 659) = 0.0128), as indicated also by the heterogeneous methods of case ascertainment (table 2.1).

Data pooled into either groups of rep ascertained through abnormalities (A1) or incidentally ascertained rep (A2) were tested for overall nonrandomness and for nonrandom breakage in specific bands. The overall distribution in both ascertainment groups was found to be highly nonrandom (p(χ² > 1421) < 0.0001; p(G² > 1196) < 0.0001 for rep ascertained through abnormalities (A1), and p(χ² > 754) < 0.0001; p(G² > 664) < 0.0001 for rep ascertained incidentally (A2)).
Bands with a significant excess of breakpoints at the 0.01 confidence level are listed for both ascertainment groups and both hypotheses in table 3.6. The entire distribution of breakpoints on the chromosomes, with the 99% confidence intervals and expected values for hypotheses I and II, is shown in figures 3.1 and 3.2. The rep involving hot spot bands in table 3.6 are listed in Appendix A for both ascertainment groups. It is apparent in these lists that for the most part each rep involves a unique set of breakpoints. Identical rep are rarely observed between unrelated individuals.

Five light G bands, 5q35, 7p22, 9p22, 13q14, and 17q25, were observed with increased frequency in both rep ascertained through abnormalities (A1) and rep ascertained incidentally (A2) when hypothesis I was tested. The probability that the coincidence of these bands is a chance event is low (p = 0.0075). In the test of hypothesis II only two bands were found in both data sets (9p22 and 17q25), and this result does not reach significance at the 95% level (p = 0.061). The one band in incidentally ascertained balanced de novo rep data (A3) that had a significant excess of breakpoints after the hypothesis II correction, 4p15, was never seen in rep ascertained through abnormalities (A1) or in incidentally ascertained balanced rep (A2).

Some bands in rep ascertained through abnormalities (A1) had a significant deficit of breakpoints, and these are listed in table 3.7. In incidentally ascertained rep (A2), only two light bands (Xp22 and Xp11) were found to have a significant deficit of breakpoints under hypothesis II, and none had deficits under the hypothesis I assumption. These bands were not among the cold spots in rep ascertained through abnormalities (A1).

The overall distribution of breakpoints in light and dark bands follows the trend observed in individual data sets, with a significantly higher number of breakpoints in light G bands. This observation is discussed in more detail in the following section. In the group of rep ascertained through abnormalities (A1), the hypothesis I test detected 28 light and 7 dark G band hot spots (ratio 4 to 1) and the hypothesis II test detected 16 light and 8 dark G band hot spots (ratio 2 to 1). The corresponding numbers for incidentally ascertained rep (A2) are 12 light and 1 dark band hot spots (ratio 12 to 1) for hypothesis I and 3 light and 3 dark band hot spots (ratio 1 to 1) for hypothesis II.

Hypothesis I, ascertained through abnormalities (N = 1583):
  1q42 (L) CFS, 3q21 (L), 3q27 (L) CFS, 4p11 (D), 4q35 (L), 5p15 (L), 5p13 (L) CFS, 5p12 (D), 5q35 (L), 6q21 (L) CFS, 7p22 (L), 7q22 (L) CFS, 7q32 (L) CFS, 8p23 (L), 9p24 (L), 9p22 (L), 9p13 (L), 9p11 (D), 9q11 (D), 10q26 (L) CFS, 11q11 (D), 13q14 (L), 13q22 (L), 13q34 (L), 14q32 (L) ONC, 15q22 (L) CFS, 17p13 (L), 17q25 (L) ONC, 18p11 (D), 18q21 (L) C/O, 18q23 (L), 21q11 (L), 21q22 (L) ONC, 22q11 (D), 22q12 (D) CFS

Hypothesis I, ascertained incidentally (N = 886):
  1p11 (D), 1q21 (L), 2q33 (L), 5q13 (L), 5q35 (L), 7p22 (L), 9p22 (L), 10q22 (L), 10q24 (L), 11p15 (L), 11q21 (L), 13q14 (L), 17q25 (L)
  Annotation column as printed: ONC, CFS, CFS, CFS, ONC

Hypothesis II, ascertained through abnormalities (N = 1583):
  1q42 (L), 3q27 (L), 4p11 (D), 4q35 (L), 5p15 (L), 5p12 (D), 7q32 (L), 9p24 (L), 9p22 (L), 9p11 (D), 9q11 (D), 10q26 (L), 11q11 (D), 13q14 (L), 13q22 (L), 15q22 (L), 17p13 (L), 17q25 (L), 18p11 (D), 18q22 (D), 18q23 (L), 21q11 (L), 21q22 (L), 22q12 (D)
  Annotation column as printed: CFS, CFS, CFS, CFS, CFS, ONC, ONC, CFS

Hypothesis II, ascertained incidentally (N = 886):
  1p11 (D), 4q22 (D), 9p22 (L), 11q21 (L), 17q25 (L) ONC, 19q11 (D)

Table 3.6: Bands of significant excess breakage detected at the 0.01 confidence level in pooled rep data. Boxed bands are common to both ascertainment groups. (D) and (L) refer to dark and light G bands respectively. CFS = common fragile site; ONC = oncogene; C/O = common fragile site and oncogene.

Cold Spots in Rep Ascertained Through Abnormalities
  Hypothesis I:  1p31 (D), 1q12 (D), 2p16 (L), 2q24 (D), 6p12 (D), 6q22 (D), 12q21 (D), 19p13 (L), Xp21 (D)
  Hypothesis II: 3p21 (L), 17q11 (L), 19p13 (L), 19q13 (L), Xp11 (L), Xq13 (L)

Table 3.7: Bands with a significant deficit of breakpoints detected at the 0.01 confidence level in rep ascertained through abnormalities (A1) under the assumptions of hypotheses I and II.
Figure 3.1: Distribution of breakpoints per unit chromosome length for rep ascertained through abnormalities. Horizontal bars represent breakage densities for each band in a 320 band karyotype. Shaded region within a bar represents the 99% confidence interval. Thin vertical line within a confidence interval represents the observed breakage density. Thick vertical line, spanning all bands in a chromosome, is the expected value for breakage density based on hypothesis I (8.88). Hypothesis II expected values are shown as •; dark band values (4.90) are seen to the left of the hypothesis I expected value, and light band values (12.45) are to the right. Hot spots are bands with expected values located outside the lower confidence limit, in the nonshaded region. Broken bars represent values greater than 65. Bands of frequent and infrequent breakage at the 99% confidence level are designated as h1, h2, or h12 for hot spots, and c1, c2, or c12 for cold spots, for hypothesis I, hypothesis II, or both hypotheses, respectively.
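The band-level comparison described in the caption can be illustrated with a binomial tail calculation. The following is a minimal sketch under stated assumptions, not the thesis's exact procedure: it assumes that each of the N breakpoints falls in a given band with probability equal to that band's relative length (the hypothesis I expectation), and it flags a band when a one-sided tail probability falls below 0.01. The function names, the one-sided threshold, and the example numbers are hypothetical.

```python
from math import exp, lgamma, log, log1p

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p), computed in log space to avoid overflow."""
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log1p(-p))
    return exp(log_pmf)

def binom_upper_tail(k, n, p):
    """P(X >= k): chance of seeing at least k breakpoints in the band."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

def binom_lower_tail(k, n, p):
    """P(X <= k): chance of seeing at most k breakpoints in the band."""
    return sum(binom_pmf(i, n, p) for i in range(0, k + 1))

def classify_band(observed, n_breaks, rel_length, alpha=0.01):
    """Label a band 'hot', 'cold', or 'neither' (assumed one-sided rule)."""
    if binom_upper_tail(observed, n_breaks, rel_length) < alpha:
        return "hot"
    if binom_lower_tail(observed, n_breaks, rel_length) < alpha:
        return "cold"
    return "neither"

# Hypothetical band covering 1/320 of the karyotype, 1583 breakpoints in total.
print(classify_band(20, 1583, 1 / 320))
```

Under hypothesis II the same calculation would apply with the band probability rescaled separately for light and dark bands; that rescaling is omitted here because the band lengths are not given in this section.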
Figure 3.2: Distribution of breakpoints per unit chromosome length for rep ascertained incidentally. Horizontal bars represent breakage densities for each band in a 320 band karyotype. Shaded region within a bar represents the 99% confidence interval. Thin vertical line within a confidence interval represents the observed breakage density. Thick vertical line, spanning all bands in a chromosome, is the expected breakage density based on hypothesis I (4.99). Hypothesis II expected values are shown as •; dark band values (2.88) are seen to the left of the hypothesis I expected value, and light band values (6.88) are to the right. Hot spots are bands with expected values located outside the lower confidence limit, in the nonshaded region. Broken bars represent values greater than 65. Bands of frequent and infrequent breakage at the 99% confidence level are designated as h1, h2, or h12 for hot spots and c1, c2, or c12 for cold spots, for hypothesis I, hypothesis II, or both hypotheses, respectively.

The correction for excess breakpoints in light G bands (hypothesis II) resulted in the reduction of the ratio of light band to dark band hot spots in both data sets, but the effect was more pronounced in the incidentally ascertained data (A2).

3.1.5 Distribution of Rep Breakpoints in Dark and Light Bands

The overall distribution of breakpoints was compared in light and dark bands in all individual data sets of rep ascertained through an abnormality (A1) and rep ascertained incidentally (A2). The expected numbers of breaks in light and dark bands were computed as proportional to the total relative lengths of light and dark bands (NL/LL and ND/LD respectively). The results are shown in table 3.8. In all cases the frequency of breakpoints in light G bands was found to be significantly higher than in dark G bands.

                         # of Breakpoints
               Light Bands        Dark Bands
Data Set        #      %           #      %          z        p
Rep Ascertained Through Abnormalities
a              848   (72.7)       319   (27.3)     15.32   < 0.0001
b              208   (76.2)        65   (23.8)      9.12   < 0.0001
c              109   (79.6)        28   (20.4)      7.80   < 0.0001
Total         1165   (73.7)       412   (26.1)     19.16   < 0.0001
Rep Ascertained Incidentally
a              403   (70.1)       172   (29.9)      9.12   < 0.0001
b              100   (77.5)        29   (22.5)      6.76   < 0.0001
c               68   (75.5)        22   (24.5)      5.05   < 0.0001
d               12   (85.7)         2   (14.3)      3.53     0.0002
e               19   (79.2)         5   (20.8)      3.20     0.0007
f               43   (79.6)        11   (20.4)      4.92   < 0.0001
Total          602   (72.4)       230   (27.6)     12.89   < 0.0001
De novo Rep Ascertained Incidentally
               103   (84.4)        19   (15.6)      9.67   < 0.0001

Table 3.8: Distribution of rep breakpoints in dark and light G bands (the critical z for p < 0.01 is 2.58).

3.1.6 Association of Rep Breakpoints with Fragile Sites and Oncogenes

Overall distributions of rep breakpoints were tested under the assumption that the distribution is random with respect to fragile sites or oncogenes. Under this assumption the frequency of breakpoints in all fragile site bands was expected to be proportional to the sum of the relative lengths of all fragile site bands.

The overall distribution of breakpoints in fragile site bands was found to be nonrandom in rep ascertained through abnormalities (A1), with a significantly greater number of breakpoints in fragile site bands (χ² = 12.2, p < 0.001). In incidentally ascertained rep (A2), the distribution of breakpoints was random with respect to fragile sites (χ² = 1.29, p > 0.1). However, the coincidence of specific rep hot spots and fragile site bands, as measured by the hypergeometric distribution, is likely to be due to chance in both ascertainment groups and for both hypotheses (p is at least 0.083).

The same method was used to test the distribution of rep breakpoints in bands with oncogenes. The overall distribution was nonrandom for both rep ascertained through abnormalities (A1) and rep ascertained incidentally (A2) (χ² = 12.3, p < 0.001 and χ² = 6.30, p < 0.025 respectively). On the other hand, we were unable to detect significant coincidence with specific hot spot bands (p is at least 0.109).

Data Source            a             b             c
Sample Size (N)        73            65            28
df                     24            24            24
χ²                     59.7          43.5          58.8
p                    < 0.0001        0.0087        0.0001
G²                     70.2          54.1          48.8
p                    < 0.0001        0.0004        0.0020
Interpretation       Nonrandom     Nonrandom     Nonrandom

Table 3.9: Summary of χ² and G² in tests for overall randomness of breakpoint distributions of inversions ascertained through abnormalities (B1). Data sources: a. ICRS; b. Daniel et al., 1989; c. ReCAP. See also table 2.5 Group B1 for references.
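The two goodness of fit statistics reported for overall randomness, Pearson's χ² and the likelihood ratio G², can be sketched as follows. This is an illustration only: it assumes that the expected count in each category (for example, each chromosome) is proportional to that category's relative length, and the function name and the example counts are hypothetical. The resulting statistics would then be compared with a χ² distribution on the appropriate degrees of freedom.

```python
from math import log

def goodness_of_fit(observed, rel_lengths):
    """Return (Pearson chi-square, likelihood-ratio G2) for observed counts
    against expectations proportional to rel_lengths (assumed all positive)."""
    n = sum(observed)
    total_length = sum(rel_lengths)
    expected = [n * length / total_length for length in rel_lengths]
    x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Categories with zero observed counts contribute nothing to G2.
    g2 = 2 * sum(o * log(o / e) for o, e in zip(observed, expected) if o > 0)
    return x2, g2

# Hypothetical example: two equally long segments with 15 and 5 breakpoints.
x2, g2 = goodness_of_fit([15, 5], [1.0, 1.0])
print(round(x2, 3), round(g2, 3))
```

The two statistics agree closely when expected counts are large and can diverge for small samples, which is one reason for reporting both.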
3.2 Inversions

The only rearrangement type, besides rep, for which a sufficient number of data points was available for analysis was inversion. Inversion data were analyzed in a manner similar to rep data. Initially, data from individual sources were analyzed separately. Subsequent to testing for homogeneity between individual data sets, pooled data were analyzed. The results are described in the following sections.

3.2.1 Inversions Ascertained Through Abnormalities (B1)

The distribution of breakpoints overall on the chromosomes and in specific bands was tested for inversions ascertained through abnormalities (group B1). The results are summarized in tables 3.9 and 3.10 respectively.

Hypothesis I. Hot Spots in Data Set:
  a (N = 81): 2p11 (L), 2q13 (L), 3p11 (D), 3q12 (L), 11q21 (L)
  b (N = 65): 7p15 (L), 8p23 (L), 8q22 (L)
  c (N = 28): 2p11 (L), 2q13 (L), 6p25 (L), 6p12 (D)

Hypothesis II. Hot Spots in Data Set:
  a (N = 81): 2p11 (L), 2q13 (L), 3p11 (D), 3q12 (L), 11q21 (L)
  b (N = 65): 7p15 (L), 8p23 (L), 8q22 (L)
  c (N = 28): 2q13 (L), 6p25 (L), 6p12 (D)

Table 3.10: Bands of significant excess breakage detected at the 0.01 confidence level in inversions ascertained through abnormalities (B1). Data sources are identified in table 3.9. Boxed bands are significant in at least two data sets. (D) and (L) are designations for dark and light G bands respectively.

The distribution of breakpoints overall on the chromosomes was nonrandom, despite the small sample sizes in all three data sets. Nonrandom breakage was also detected at the chromosome band level. High frequencies of breakpoints were observed in light G bands 2p11 and 2q13 in both the ICRS (data set a) and ReCAP (data set c) data sets. The probability that the two matches are due to chance is low (p = 0.0012). The probability that the match between data sets a and c in hypothesis II is a chance occurrence is 0.0457, a significantly low result at the 95% level.
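The chance of two data sets sharing hot spot bands can be sketched with the hypergeometric distribution, which the text names as the method used for such comparisons. The version below is a minimal illustration: it assumes that every band in a 320 band karyotype is equally likely to be a hot spot and that the two sets of hot spots are independent. The function name and the example inputs are hypothetical.

```python
from math import comb

def coincidence_probability(n_bands, hot_a, hot_b, shared):
    """P(at least `shared` bands in common) when hot_b bands are drawn at
    random, without replacement, from n_bands of which hot_a are marked."""
    upper = min(hot_a, hot_b)
    total = comb(n_bands, hot_b)
    favourable = sum(
        comb(hot_a, k) * comb(n_bands - hot_a, hot_b - k)
        for k in range(shared, upper + 1)
    )
    return favourable / total

# Illustrative inputs: 320 bands, 5 and 4 hot spots, 2 of them shared.
print(round(coincidence_probability(320, 5, 4, 2), 4))
```

With these illustrative inputs the sketch gives roughly 0.0012. The equal-probability assumption ignores differences in band length, so the result should be read as an approximation.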
No other matching hot spots were found in the three data sets. A significantly greater number of breakpoints was observed in light G bands than in dark G bands (see section 3.2.5). However, at the chromosome band level, the results of hypotheses I and II were virtually identical in all three data sources, with the exception that band 2p11 in data set c was observed only in the hypothesis I test, indicating that the correction for a possible bias in assigning breakpoints to light G bands has very little influence on the results. Light G bands predominate in all data sets in tests of both hypotheses (see section 3.2.5).

3.2.2 Incidentally Ascertained Balanced Inversions (B2)

Summaries of χ² and G² tests for overall distribution of breakpoints and hot spots for incidentally ascertained inversions (B2) are shown in tables 3.11 and 3.12 respectively. The overall distribution of breakpoints was nonrandom in all data sets, but no specific bands were involved in ICRS (data set e). Four matching bands were observed across data sets a through e: 2p11, 2q13, Yp11, and Yq11. These bands represent only two rearrangements, inv(2)(p11;q13) and inv(Y)(p11;q11) (see Appendix A). By comparing data sets with matching bands in pairs, we found that the coincidence of bands in any two data sets is highly significant (p < 0.001 in all cases).

The distribution of breakpoints in light and dark bands was nonrandom, with a higher frequency of breakpoints in light G bands. This observation was not significant in data set e at the 95% level, probably due to small sample size (see section 3.2.5).

Data Source            a             b             c             d             e
Sample Size (N)        282           83            42            20            14
df                     320           24            24            24            24
χ²                     2281          57.9          320           66.5          90.0
p                    < 0.0001        0.0001      < 0.0001      < 0.0001      < 0.0001
G²                     876           68.3          101           43.8          44.7
p                    < 0.0001      < 0.0001      < 0.0001        0.0081        0.0063
Interpretation       Nonrandom     Nonrandom     Nonrandom     Nonrandom     Nonrandom

Table 3.11: Summary of χ² and G² in tests for overall randomness of breakpoint distributions of inversions ascertained incidentally (B2). Data sources: a. Daniel et al., 1989; b. Hook et al., 1987; c. Ferguson-Smith, 1984; d. ReCAP; e. ICRS. See also table 2.5 Group B2 for references.

Similarly to inversions ascertained through abnormalities (B1), the hypothesis II correction had little influence on the set of bands found to have excess breakage. Although the ratio of light to dark hot spot bands is slightly lower in hypothesis II for data sets a and c, light hot spots predominate in both hypotheses for all data sets.

3.2.3 Incidentally Ascertained De novo Inversions (B3)

The overall distribution of breakpoints was found to be nonrandom (p(χ² ≥ 40.0) = 0.0211; p(G² > 42.1) = 0.0127). At the level of chromosome bands, one hot spot in band 11q21 was detected. The probability of breakage in 11q21 is low because of its small size; therefore, this result was found to be significant in tests of both hypotheses I and II. 11q21 was also observed in inversions ascertained through abnormalities (data set a) as well as in incidentally ascertained inversions (data set c).

Hypothesis I. Hot Spots in Data Set:
  a (N = 282): 1q21 (L), 2p11 (L), 2q13 (L), 5p13 (L), 5q13 (L), 6q15 (L), 10p11 (L), 10q21 (D), Yp11 (L), Yq11 (L)
  b (N = 83):  2p11 (L), 2q13 (L), 12q15 (L), 12p11 (L)
  c (N = 42):  1p11 (D), 2p11 (L), 2q13 (L), 7p13 (L), 7q11 (L), 11q21 (L), Yp11 (L), Yq11 (L)
  d (N = 20):  2p11 (L), 2q13 (L)
  e (N = 14):  none

Hypothesis II. Hot Spots in Data Set:
  a (N = 282): 1q21 (L), 2p11 (L), 2q13 (L), 5p13 (L), 5q13 (L), 6p12 (D), 6p11 (D), 6q15 (L), 10p11 (L), 10q21 (D), Yp11 (L), Yq11 (L)
  b (N = 83):  2p11 (L), 2q13 (L)
  c (N = 42):  1p11 (D), 2p11 (L), 11q21 (L), Yp11 (L), Yq11 (L)
  d (N = 20):  2p11 (L), 2q13 (L)
  e (N = 14):  none

Table 3.12: Bands of significant excess breakage detected at the 0.01 confidence level in inversions ascertained incidentally (B2). Data sources are identified in table 3.11. Boxed bands were observed in at least two data sets. (D) and (L) represent dark and light G bands respectively.

3.2.4 Comparison of Results: Inversions Ascertained Through Abnormalities (B1) and Inversions Ascertained Incidentally (B2 & B3)

Data from individual studies were tested for homogeneity using Pearson's χ² and G². Among inversions ascertained through abnormalities, the ReCAP data set was not included in this analysis due to its small sample size. The remaining two data sets were homogeneous (p(χ² > 91) = 0.327; p(G² > 124) = 0.420). In testing incidentally ascertained inversions for homogeneity, the smallest data sets (ICRS and ReCAP) also had to be excluded from analysis because of small sample size. The three remaining data sets were not homogeneous at the 95% level (p(χ² > 359) = 0.019; p(G² > 339) = 0.031).

Data were pooled into groups of either inversions ascertained through abnormalities (B1) or incidentally ascertained inversions (B2), and breakpoint distributions overall and in specific bands were tested against the hypotheses of random breakage. The overall distribution in both ascertainment groups was found to be highly nonrandom (p(χ² > 774) < 0.0001; p(G² > 461) < 0.0001 for inversions ascertained through abnormalities (B1) and p(χ² > 2941) < 0.0001; p(G² > 1099) < 0.0001 for inversions ascertained incidentally (B2)). The results for individual bands are summarized in table 3.13. The observed breakage densities with the 99% confidence intervals and expected breakage densities for hypotheses I and II are shown in figures 3.3 and 3.4. Inversions involving breakpoints in hot spot bands listed in table 3.13 are found in Appendix A.

The only bands that are common to both inversions ascertained through abnormalities (B1) and inversions ascertained incidentally (B2) are 2p11 and 2q13, which are also repeatedly detected in the individual data sets.
This result is unlikely to be due to chance coincidence (p = 0.017 for hypothesis I and p = 0.012 for hypothesis II).

Hypothesis I, ascertained through abnormalities (N = 174):
  2p11 (L), 2q13 (L) RFS, 3p25 (L), 3p11 (D), 3q12 (L), 6p25 (L), 8p23 (L)
  Annotation column as printed: ONC, CFS

Hypothesis I, ascertained incidentally (N = 441):
  1p11 (D), 1q21 (L) CFS, 2p11 (L), 2q13 (L) RFS, 5p13 (L), 5q13 (L), 10p11 (L), 10q11 (L), 10q21 (D), Yp11 (L), Yq11 (L)
  Annotation column as printed: CFS

Hypothesis II, ascertained through abnormalities (N = 174):
  2p11 (L), 2q13 (L), 3p25 (L), 3p11 (D), 3q12 (L), 6p25 (L), 6p12 (D)
  Annotation column as printed: RFS, ONC

Hypothesis II, ascertained incidentally (N = 441):
  1p11 (D), 1q21 (L), 2p11 (L), 2q13 (L), 5p13 (L), 5q13 (L), 6p12 (D), 10p11 (L), 10q21 (D), Yp11 (L), Yq11 (L)
  Annotation column as printed: CFS, RFS, CFS, CFS

Table 3.13: Bands of significant excess breakage detected at the 0.01 confidence level in pooled inversion data. Boxed bands are common to both ascertainment groups. CFS = common fragile site; RFS = rare fragile site; ONC = oncogene. (D) and (L) represent dark and light G bands respectively.

Figure 3.3: Distribution of breakpoints per unit chromosome length for inversions ascertained through abnormalities. Horizontal bars represent breakage densities for each band in a 320 band karyotype. Shaded region within a bar represents the 99% confidence interval. Thin vertical line within a confidence interval represents the observed breakage density. Thick vertical line, spanning all bands in a chromosome, is the expected breakage density based on hypothesis I (0.93). Hypothesis II expected values are shown as •; dark band values (0.52) are seen to the left of the hypothesis I expected value, and light band values (1.30) are to the right. Hot spots are bands with expected values located outside the lower confidence limit, in the nonshaded region. Broken bars represent values greater than 40. Bands of frequent and infrequent breakage at the 99% confidence level are designated as h1, h2, or h12 for hot spots and c1, c2, or c12 for cold spots, for hypothesis I, hypothesis II, or both hypotheses, respectively.

Figure 3.4: Distribution of breakpoints per unit chromosome length for inversions ascertained incidentally. Horizontal bars represent breakage densities for each band in a 320 band karyotype. Shaded region within a bar represents the 99% confidence interval. Thin vertical line within a confidence interval represents the observed breakage density. Thick vertical line, spanning all bands in a chromosome, is the expected breakage density based on hypothesis I (2.48). Hypothesis II expected values are shown as •; dark band values (1.28) are seen to the left of the hypothesis I expected value, and light band values (3.56) are to the right. Hot spots are bands with expected values located outside the lower confidence limit, in the nonshaded region. Broken bars represent values greater than 65. Bands of frequent and infrequent breakage at the 99% confidence level are designated as h1, h2, or h12 for hot spots and c1, c2, or c12 for cold spots, for hypothesis I, hypothesis II, or both hypotheses, respectively.

The number of breakpoints in light G bands was significantly greater in both data sets (see section 3.2.5). The majority of hot spot bands were also in light G bands in both hypotheses I and II. The ratio of light to dark bands is reduced in hypothesis II for both inversions ascertained through an abnormality and inversions ascertained incidentally (5 to 4, and 5 to 3 in the two groups respectively). However, light G bands predominate in both sets of hot spots, similarly to inversions ascertained through abnormalities (B1).
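The homogeneity testing described in section 3.2.4, which decides whether individual data sets may be pooled, can be sketched as a contingency table test. This is an assumption-laden illustration rather than the thesis's own computation: rows are taken to be data sets, columns to be chromosome regions, and the expected count in each cell is the usual product of row and column totals divided by the grand total. The function name and the example table are hypothetical.

```python
from math import log

def homogeneity_statistics(table):
    """Return (Pearson chi-square, G2, degrees of freedom) for a table of
    breakpoint counts with one row per data set and one column per region."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    x2 = 0.0
    g2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            if expected == 0:
                continue  # empty row or column carries no information
            x2 += (observed - expected) ** 2 / expected
            if observed > 0:
                g2 += 2 * observed * log(observed / expected)
    df = (len(table) - 1) * (len(col_totals) - 1)
    return x2, g2, df

# Hypothetical counts for two data sets over three regions.
print(homogeneity_statistics([[12, 30, 8], [25, 58, 17]]))
```

Small expected counts make both statistics unreliable, which is consistent with the exclusion of the smallest data sets from the homogeneity tests.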
3.2.5 Distribution of Inversion Breakpoints in Dark and Light Bands

The overall distribution of breakpoints was compared in light and dark bands in all individual data sets of inversions ascertained through abnormalities (B1) and inversions ascertained incidentally (B2), and the results are described in table 3.14. The expected numbers of breaks in light and dark bands were computed as proportional to the total relative lengths of light and dark bands (NL/LL and ND/LD respectively). In most data sets, the frequency of breakpoints in light G bands was significantly higher than in dark G bands.

3.2.6 Association of Inversion Breakpoints with Fragile Sites and Oncogenes

The overall distribution of inversion breakpoints was random for each ascertainment group with respect to both fragile sites and oncogenes (the highest χ² value was 2.95, with associated p = 0.1). In testing the probability of chance coincidence of specific sets of bands of frequent breakage with fragile sites or oncogenes, we found that the two fragile sites detected in hypothesis I in inversions ascertained through abnormalities (B1) were unlikely to match only by chance (p = 0.019). However, any other matches of hot spots with either fragile sites or oncogenes observed in the two hypothesis tests and the two ascertainment groups (see table 3.13) were likely to be coincidental (p is at least 0.288).
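The light versus dark band comparison of section 3.2.5 can be illustrated with a one-sample z test on the proportion of breakpoints falling in light bands. This sketch rests on assumptions that are not stated in this section: that the test is a normal approximation to the binomial, that it is one-sided, and that the expected light band proportion equals the fraction of total chromosome length occupied by light bands. The function name and the `light_fraction` value used below are hypothetical.

```python
from math import erfc, sqrt

def light_band_z_test(n_light, n_dark, light_fraction):
    """Return (z, one-sided p) for an excess of breakpoints in light bands.

    light_fraction is the assumed share of total relative length made up of
    light G bands; it is an input here, not a value taken from the thesis.
    """
    n = n_light + n_dark
    expected = n * light_fraction
    sd = sqrt(n * light_fraction * (1 - light_fraction))
    z = (n_light - expected) / sd
    p_one_sided = 0.5 * erfc(z / sqrt(2))  # upper tail of the standard normal
    return z, p_one_sided

# Hypothetical example: 60 light and 40 dark breakpoints, light bands
# assumed to cover half of the total length.
z, p = light_band_z_test(60, 40, 0.5)
print(round(z, 2), round(p, 4))
```

A data set would be called significant at the 0.01 level when z exceeds the critical value quoted with the table for that level.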
                         # of Breakpoints
               Light Bands       Dark Bands
Data Set        #      %          #      %          z        p
Inversions Ascertained Through Abnormality
a               49    (67)        24    (33)       2.63     0.0043
b               50    (77)        15    (23)       4.64   < 0.0001
c               23    (82)         5    (18)       4.07   < 0.0001
Total          125    (72)        49    (28)       5.62   < 0.0001
Inversions Ascertained Incidentally
a              206    (73)        76    (27)       7.71   < 0.0001
b               67    (81)        16    (19)       6.48   < 0.0001
c               34    (81)         8    (19)       4.72   < 0.0001
d               16    (80)         4    (20)       3.05     0.0011
e               10    (71)         4    (29)     * 1.55     0.0606
Total          333    (76)       108    (24)       9.72   < 0.0001
De novo Inversions Ascertained Incidentally
                15    (68)         7    (32)     * 1.56     0.0594

Table 3.14: Distribution of inversion breakpoints in dark and light G bands. The difference in data sets designated with * does not reach significance at the 95% level. The critical z values for p < 0.01 and p < 0.05 are 2.58 and 1.98 respectively.

Band     Study   Case   Aberration
3cen     a       1      ctb 3cen
         a       2      csb 3cen
         a       3      ctb 3cen
9p22     b       1      t(9;14)(p22;q22)
         b       2      del 9p22
17q21    a       1      cte(5;17)(q11;q21)
         a       1      csb(17q21)
         a       2      cteasy(2;17)(?;q21)
         b       1      csg(17q21)

Table 3.15: Bands of significant excess breakage detected at the 0.01 confidence level, and the observed aberrations in these bands. Study a) Brandriff et al., 1985 [4]; study b) Martin et al., 1987 [45].

3.3 Sperm Chromosome Aberrations

We analyzed 109 sperm chromosome breakpoints. The distribution of breakpoints over all chromosomes was random (p(χ² > 21.2) = 0.6264; p(G² > 23.4) = 0.4986). At the level of chromosome bands, 3cen, 9p22, and 17q21 were frequently involved in aberrations in data pooled from two studies. This information is summarized in table 3.15. With the exception of two aberrations involving band 17q21 found in one individual (study a, case 1), all aberrations involving hot spots were obtained from different individuals.
The distribution of sperm chromosome breakpoints in bands with fragile sites or oncogenes was tested and found to be random for fragile sites (χ² = 1.69, p < 0.25) and nonrandom for oncogenes (χ² = 3.96, p < 0.05), with a significantly lower than expected frequency of breakpoints in bands with oncogenes.

3.4 Comparison of Results: Rep (Group A), Inversions (Group B), and Sperm Chromosome Aberrations (Group C)

There were no bands frequently involved in all three groups of pooled rep, pooled inversions, and sperm chromosome aberrations. The light G band 9p22 was found to have excess breakpoints in both incidentally ascertained rep (A2) and rep ascertained through abnormalities (A1), as well as in sperm chromosome aberrations. Bands 1p11 and 1q21 were found in both incidentally ascertained rep (A2) and inversions (B2). 5p13 is a hot spot band in rep ascertained through abnormalities (A1) (hypothesis I only) and inversions ascertained through abnormalities (B1). 8p23 is a hot spot in both rep ascertained through an abnormality (A1) and incidentally ascertained inversions (B2) (hypothesis I only in both data sets). 11q21 was seen in incidentally ascertained de novo inversions (B3) and incidentally ascertained rep (A2).

As measured by the hypergeometric distribution method, chance may account for the coincident occurrence of bands 8p23 and 5p13 in two data sets. Furthermore, the coincidence of 9p22 in sperm chromosome aberrations and in either rep ascertained through abnormalities (A1) or rep ascertained incidentally (A2) was found not to be significant. The hypergeometric method used to compare results of the tests for nonrandomness is inadequate for evaluating the significance of observing the same hot spot in three independent data sets, as we found for 9p22 in this study. For all other matching bands found in independent studies, chance could not be ruled out as a reasonable possibility (p is at least 0.05).
Chapter 4 Discussion and Conclusions 4.1 Distribution of Breakpoints in Constitutional Rearrangements In previous studies of breakpoint distributions from constitutional rearrangements, non-random breakage was frequently suggested (see tables 1.1 and 1.2). However, the inter-pretation of these results is difficult due to a likely nonrandom representation of specific groups of breakpoints i n the samples studied, resulting from selection procedures that are in some way related to viability potentials and other phenotypic manifestations of the rearrangements. The goals of this project were to determine if there are specific sites on human chro-mosomes that are preferentially involved in constitutional reciprocal translocations and inversions. Ideally, the hypothesis of nonrandom breakage would be evaluated in a ran-dom sample of all human constitutional rearrangements. As it is virtually impossible to obtain such a sample, we took a comparative approach and analyzed several groups of rearrangements organized according to the method of selection for study, in the hope that we would be able to identify sources of nonrandomness in breakpoint distributions that are due to bias and those that may be truly the result of a biological predisposition to involvement in constitutional rearrangements. In our analyses we usually found the distribution of breakpoints i n the large data sets of reciprocal translocations and inversions to be nonrandom. In the evaluation of these results several possible explanations have to be taken into consideration: (1) The 70 Chapter 4. Discussion and Conclusions 71 observed clustering of breakpoints in specific chromosome regions may be the product of bias introduced through nonrandom ascertainment of rearrangements with respect to the location of breakpoints; (2) Nonrandomness may be due to chance fluctuations. 
Studies of breakpoint distribution present special problems for statistical analysis, and these have to be taken into account in the interpretation of the results. (3) Nonrandomness may in fact represent sites that are predisposed either to chromosome breakage, to preferential rearrangement involving specific sites on the chromosomes, or to both. Each of these potential explanations for nonrandomness is considered individually in the following sections.

4.1.1 The Effects of Ascertainment Bias

Nonrandomness in a distribution of breakpoints may result from bias if selected rearrangements are ascertained in a manner that is in some way related to the position of breakpoints. Nonrandom selection with respect to breakpoint position may occur if specific sets of breakpoints in rearrangements lead to specific phenotypic outcomes. If a specific outcome, such as live birth of a phenotypically abnormal child, depends on factors that are secondarily influenced by the position of breakpoints, rearrangements ascertained through other outcomes should represent different sets of breakpoints. This might occur, for example, if zygotes with unbalanced rearrangements at certain breakpoints were more likely to survive to be born alive than zygotes with unbalanced rearrangements at other breakpoints.

In order to test the hypothesis that the distribution of breakpoints observed in cases ascertained through abnormal phenotypes is determined, at least in part, by secondary factors, such as differential fetal survival, we compared the distribution of breakpoints associated with rearrangements ascertained through abnormalities to the distribution of breakpoints from rearrangements ascertained through incidental events. Nonrandomness was found in both ascertainment groups. However, the bands frequently involved in one group were generally different from the bands frequently involved in the other group.
This result suggests that the nonrandomness observed among cases ascertained through phenotypic abnormality is, at least in part, the product of ascertainment bias.

In order to understand the effects of nonrandom selection of rearrangements on the distribution of breakpoints, it is useful to identify sources of ascertainment bias, in reference to the specific groups of rearrangements that are likely to be affected. For this purpose, possible sources of bias are considered that may be of concern at specific stages, subsequent to the initial formation of the chromosomal rearrangement.

Rearrangements may be detected prior to fertilization of the gamete, shortly after the mutation event, by cytogenetic analysis of the sperm or ovum. Most often, however, rearrangements are detected at a stage subsequent to the transmission of the mutant chromosomes to the next generation. Detection of rearrangements in gametes is unaffected by forms of bias related to viability of the conceptus. On the other hand, some rearrangements may never be seen in gametes due to selection prior to gametogenesis.

Sources of Bias Prior to or During Transmission of Rearrangement

It is not known at what points structural rearrangements arise. Rearrangements may be formed prior to meiosis in a germ line cell, during meiosis, or subsequent to meiosis in a gamete. One of these times may predominate, or more than one of them may be common.

If an rep is formed in a diploid germ line cell, prior to pairing of homologues in meiosis I, aberrant pairing and segregation of the rearranged chromosomes and their homologues during meiosis may result. Depending on breakpoint position, the rearrangement may completely interfere with pairing and segregation, preventing the production of gametes [8]. Alternatively, segregation of chromosomes may follow successful pairing of rearranged chromosomes and their homologues in a quadriradial structure.
The products of segregation may be chromosomally normal, balanced or unbalanced. If balanced products are produced, many are expected to be compatible with normal differentiation and survival of gametes. If unbalanced products are produced, the size and nature of the imbalanced segment determine the viability potential of the segregation product [10] [37] [63]. Severe imbalances may not be compatible with gamete differentiation [8], possibly leading to the nonrandom loss of certain breakpoints at this stage.

Alternatively, a reciprocal translocation may be formed in a haploid cell. Rearrangements in haploid cells do not affect segregation of homologues in meiosis I. Consequently, imbalances due to segregation are not produced. Many such rearrangements are expected to be chromosomally balanced. However, unbalanced products may arise post-meiotically as well. For example, imbalances may be produced through unequal degradation and/or resynthesis of DNA in a repair process. It is also conceivable that a balanced chromosome complement is not maintained in a rearrangement if fragments separated by breakpoints are lost prior to the completion of the rearrangement process.

Balanced and unbalanced products in inversions may also be produced prior or subsequent to meiotic segregation. Unbalanced recombinants result when a crossover takes place between homologous chromosomes in meiosis I. Alternatively, imbalances may arise as a consequence of the rearrangement process following meiotic segregation, as suggested above for rep.

In summary, selective loss of rep and inversion breakpoints may occur shortly after the rearrangement is formed, but prior to transmission of the rearranged chromosomes to an offspring. Rearrangements formed prior to meiosis may interfere with pairing and segregation, preventing germ cell maturation [8], or severe imbalances may lead to lethality in differentiating gametes.
Any rearrangements formed after meiosis would not be affected by bias imposed by the segregation of aberrant chromosomes. However, imbalances may arise as a consequence of the processes of rearrangement or repair.

An additional mechanism that may also lead to selective loss of specific rearrangement breakpoints involves the disruption of genes essential for germ cell maturation and gamete differentiation. Breakpoints of rearrangements in these genes may interfere with gamete survival and result in the nonrandom loss of a subgroup of rearrangement breakpoints.

The extent to which the mechanisms of nonrandom loss of specific groups of breakpoints may be of significance depends on the actual time when many or most rearrangements arise. Through these mechanisms a biased representation of breakpoints is produced in a sample of rearrangements detected prior to transmission to offspring. At present, the only such sample that is feasible to study involves aberrations from sperm chromosomes. Based on the discussion above, breakpoint distributions obtained from sperm chromosomes should be considered a nonrandom sample of rearrangement breakpoints.

Sources of Bias Subsequent to Transmission of Rearrangement

All balanced and unbalanced rearrangements that are compatible with gamete survival may be transmitted to an offspring. A de novo carrier of a constitutional rearrangement results from the initial transmission followed by survival and successful implantation and development of the zygote. The rearrangement may be inherited in subsequent generations through transmission by fertile carriers.

Based on the previous discussion, there are potentially four types of gametes that may be fertilized after the initial rearrangement event:

Type 1 (balanced post-meiotic): A chromosomally balanced gamete that carries the original rearrangement formed in the haploid cell after meiotic segregation.
Type 2 (balanced pre-meiotic): A chromosomally balanced gamete that is the product of meiotic segregation of a chromosomal rearrangement that occurred in the diploid precursor cell, prior to or during meiosis.

Type 3 (unbalanced post-meiotic): A chromosomally unbalanced gamete that carries the original rearrangement formed in the haploid cell after meiotic segregation.

Type 4 (unbalanced pre-meiotic): A chromosomally unbalanced gamete that is the product of meiotic segregation of a chromosomal rearrangement that occurred in the diploid precursor cell, prior to or during meiosis.

These four types of gametes may very well produce different distributions of rearrangement breakpoints.

It is possible that certain rearrangements are incompatible with normal mitotic division, preventing implantation of the zygote and possibly resulting in an early unrecognized pregnancy loss. Breakpoints that are nonrandomly associated with rearrangements incompatible with preimplantation cleavage or mitotic division are lost at this stage. Random loss with respect to rearrangement breakpoint position may also occur through early post-zygotic lethality due to aberrant metabolic mechanisms or mitotic errors that are unrelated to the rearrangement. These are not expected to affect breakpoint distributions.

Fertilization of a gamete with a balanced rearrangement formed in a haploid cell (type 1) should have a relatively high probability of leading to a normal live birth. The same holds for the second type of gamete, which carries a balanced rearrangement formed prior to meiosis I (type 2). These two gametic types are most likely to be represented in groups of rearrangements classified as incidentally ascertained balanced de novo rep (group A3) and inversion (group B3).
De novo rearrangements with no obvious phenotypic manifestations may be detected in the offspring of individuals referred for prenatal diagnosis for advanced maternal age or in surveys of consecutive newborns (see table 2.5 for references). Whether rearrangements from only one or both types of balanced gametes are present in incidentally ascertained de novo rearrangements cannot be determined until the process of chromosome rearrangement is understood in more detail. The relative contribution by the two gametic types may have an effect on the extent to which bias introduced in the segregation of aberrant chromosomes leads to nonrandomness in the distribution of breakpoints in incidentally ascertained balanced de novo rearrangements.

Fertilization of a gamete with an unbalanced chromosome complement formed in a haploid cell (type 3) or in a diploid cell (type 4) may have various outcomes depending on the severity of the imbalance. The imbalance may lead to an unrecognized abortion prior to or soon after implantation, recognized abortion of an abnormal fetus, stillbirth, or live birth of a phenotypically abnormal child. Such unbalanced de novo carriers may rarely be detected by chance in prenatal diagnosis or in population surveys. Abnormalities in a de novo carrier offspring or conceptus may also lead to the detection of the rearrangement. Some apparently balanced rearrangements are also detected in these ways when associated with some form of phenotypic abnormality.

Ascertainment bias associated with direct phenotypic effects of imbalances and the influence of breakpoint position on meiotic segregation are also of concern in inherited rearrangements. In addition, the detection of rearrangements based on previous reproductive history is of relevance, particularly in the detection of balanced carriers.
Balanced rearrangements detected through infertility, recurrent abortions, or birth of an abnormal child are likely to be representative of breakpoints that preferentially lead to unbalanced segregation products. In contrast, balanced rearrangements that tend to result in normal balanced offspring are probably overrepresented in incidentally ascertained balanced rearrangements.

In conclusion, comparison of rep and inversions ascertained incidentally and through abnormalities suggests that biased representation of specific sets of breakpoints accounts for much of the nonrandomness in the distributions. A possible mechanism of producing this nonrandomness involves the interdependence of the position of breakpoints with various phenotypic manifestations of the chromosome imbalance. In rep, the source of this interdependence at the level of chromosome segregation may be a tendency to specifically produce balanced or unbalanced products through a preferred segregation mode [44]. Although in inversions the influence of breakpoint position on segregation leading to balanced and unbalanced products is less clear, the position of the breakpoints relative to each other does affect the frequency of recombination in the inverted segment, which leads to chromosomally unbalanced products [40] [43]. Beyond segregation, phenotypic manifestation of a rearrangement is related to breakpoint position through the differential survival potential of segregants with a chromosome rearrangement. The size and genetic content, and possibly other characteristics, of the unbalanced chromosome segment are directly related to the phenotype of the carrier. That in turn determines the likelihood that the rearrangement is detected in a population ascertained in a specific way.

4.1.2 The Effects of Chance Fluctuations

Some of the nonrandomness in the distribution of breakpoints may be the product of chance fluctuations.
In any statistical analysis, conclusions have a component of uncertainty resulting from the probabilistic nature of statistical testing. For this reason, the set of hot spot bands detected in a single sample may include bands that show frequent breakage by chance. Alternatively, sites of nonrandomness may also go undetected.

The probability that bands of nonrandom involvement are detected by chance is determined by the level of significance used to test for nonrandomness. The detection of a hot spot in a band with random breakage is equivalent to the inappropriate rejection of a null hypothesis (H0) of random breakage when it is really true. This is known as a type I, or α, error [48]. At the 99% confidence level, hypothetical values of random breakage are expected to be outside the observed confidence interval in any band 1% of the time. Therefore, in a simultaneous test of 320 bands, approximately 3 to 4 bands are expected to be falsely designated as hot spots. In some of the data sets, this number represents the majority of hot spots.

One way to distinguish between hot spot bands that are a chance occurrence and bands that are frequently involved in breakage for other reasons is to analyze data from different sources that were compiled independently but using similar procedures and ascertainment criteria. The data sets should represent large samples and have approximately equal sample sizes. Bands that are consistently affected by breakpoints at a high frequency can then be considered to be nonrandomly involved in rearrangements for other reasons. In this study we attempted to make such comparisons between independent data sets by comparing data from individual studies prior to pooling, to see if the same bands were repeatedly observed in most of them (see tables 3.2, 3.5, 3.10, 3.12). Although some bands were observed in more than one data set, many were not.
This difference is not likely to be due to ascertainment bias, as the mode of ascertainment was similar in all data sets in each group.

There are two explanations for our inability to reproduce results in independent data sets of the same general mode of ascertainment. It is possible that the populations sampled in the original studies were not comparable. In the present analysis we had to rely on data collected in various studies with various purposes. Therefore, we had no control over the design of the data collection process. Although incidentally ascertained rearrangements were mostly detected in prenatal diagnosis for advanced maternal age (tables 2.3 and 2.4), the group of rearrangements ascertained through abnormalities is particularly heterogeneous (tables 2.1 through 2.2). In this study, we were unable to demonstrate homogeneity between data sets of rep ascertained through abnormalities and inversions ascertained incidentally.

An alternative explanation for differences observed across data sets is the large variation in sample sizes. A true site of nonrandom involvement is more likely to be masked by chance fluctuation in a small data set, leading to the acceptance of the null hypothesis when it is actually false [41]. This error is known as a type II, or β, error. The exact value of the type II, or β, error is difficult to calculate for most statistical tests, but in general β errors are reduced by increasing sample size [48]. Accordingly, β errors probably account for the observation in the present study that, among independent data sets employing similar methods of ascertainment, fewer hot spot bands were detected in the smaller samples. In this analysis, the sample size in the original studies was probably too small to avoid most β errors.
With the exception of rep ascertained incidentally and through abnormalities from [11], all individual data sets averaged fewer than 1 break per band in a 320 band karyotype. A relatively simple empiric method for determining the optimal sample size to reduce β errors would involve comparisons of the number of hot spots detected between data sets with a range of sample sizes ascertained the same way. The increase in the number of hot spots detected should level off when the sample sizes become large enough to reduce β errors to acceptable levels.

4.1.3 Candidate Sites of True Nonrandom Involvement

Subsequent to evaluation of the roles of ascertainment bias and chance in the production of nonrandomness in breakpoint distributions, there remains a set of bands with a significant excess of breakpoints in various ascertainment groups. These bands may represent sites of biologically important predisposition to rearrangement due perhaps to characteristics of the underlying DNA sequence, chromatin structure or nuclear organization.

Five chromosome bands (5q35, 7p22, 9p22, 13q14, 17q25) were found to have a higher than expected number of breakpoints in pooled data sets of both incidentally ascertained rep (A2) and rep ascertained through abnormalities (A1). After correcting for a possible bias in preferentially assigning breakpoints to light G bands (see below), only 9p22 and 17q25 have a significant excess of breakpoints. As discussed above, the influences of various ascertainment biases in the groups of rep ascertained incidentally and through abnormalities would be expected to lead to overrepresentation of different sets of breakpoints. Thus, excess breakage in bands observed in both ascertainment groups may represent sites of biologically important predisposition to rearrangement.
Although random chance in the coincidence of bands 9p22 and 17q25 in the two ascertainment groups cannot be excluded, the coincidence of five bands detected according to hypothesis I was found to be highly unlikely. Further support for a predisposition to rearrangement at band 9p22 comes from the observation of excess breakage at this band in sperm chromosome abnormalities.

The only bands with a significant excess of breakpoints in both incidentally ascertained inversions and inversions ascertained through abnormalities were 2p11 and 2q13. Almost all rearrangements involving these bands are the identical inversion inv(2)(p11;q13) detected in apparently unrelated individuals. This situation is a typical example of our observations in inversion breakpoints. Most inversion hot spots result from many carriers of the same rearrangement detected independently. There are two possibilities to account for this observation. Identical rearrangements may occur in unrelated individuals as the product of frequent mutations because of some DNA sequence or structural predisposition that strongly favors the formation of one specific inversion as opposed to any other at a given breakpoint. Alternatively, a single mutation may have become widespread in the population because of a founder effect (i.e. genetic drift) and the lack of any significant reduction in the reproductive fitness of carriers.

To distinguish between these possibilities, analysis of a data set of incidentally ascertained de novo inversions would be informative. In this study, only a very small data set of such rearrangements was available. Although we did not observe inv(2)(p11;q13) among these cases, a much larger set of data would be required to draw reliable conclusions.

There is support for a founder effect related to inversions.
A mechanism that maintains the reproductive fitness of inversion carriers through crossover suppression does exist in other organisms [40] [43]. Furthermore, the risk for unbalanced segregation products for both paracentric and pericentric inversion carriers was found to be relatively low [10]. Inversions have been important in evolution [15], indicating that they can become fixed in the population. Therefore, it seems likely that nonrandom involvement in inversions is due to the high frequency of certain inversions, derived from a single ancestral mutation in the population. As a result, the study of breakpoint distributions to detect sites that are predisposed to forming inversions would be more enlightening using de novo data.

In summary, for rep we found some bands that may represent sites of biologically mediated predisposition to rearrangement. However, we were unable to show this unequivocally in our data on rep and inversions. Candidate bands for hot spots due to a predisposition to rearrangement would be expected to be seen in most independently conducted studies, in data ascertained through different methods, and in de novo data. Similar hot spots may also occur in sperm data. Bands that are predisposed to rearrangement might also be involved in both inversions and rep. We were unable to detect any bands that showed frequent involvement in data from all these sources. However, some bands may turn out to be frequently involved in breakage and rearrangements in larger and more homogeneous data sets.

4.2 Evidence for Random Breakage

The distribution of breakpoints in incidentally ascertained rearrangements and in rearrangements ascertained through abnormalities was found to be nonrandom. In the previous discussion we considered ascertainment bias and chance variation as possible explanations for the observed nonrandomness.
Here we evaluate the evidence for random breakage in terms of the results obtained in de novo rearrangements and sperm chromosome aberrations.

The data set of incidentally ascertained de novo inversions was very small. However, we were able to obtain 122 incidentally ascertained de novo rep breakpoints and found that the overall distribution of breakpoints was random. In addition, there was no frequent involvement in breakage in specific bands that would not otherwise be detected in testing for overall nonrandomness.

The reason for analyzing incidentally ascertained de novo rep was the possibility that some forms of bias affecting incidentally ascertained inherited balanced rearrangements may not be present in de novo rearrangements. Ascertainment bias related to segregation, leading to different frequencies of balanced and unbalanced products for specific rearrangements, is virtually impossible to eliminate in constitutional rep data. Recognition of such bias is most important in analysing inherited rearrangements, and it may be of special concern in prenatal diagnosis data through the following mechanism. As discussed previously in section 4.1.1, carriers of balanced rep that are not likely to produce unbalanced progeny are less likely to be referred for cytogenetic studies than carriers of rearrangements who tend to produce unbalanced progeny at a higher frequency. In practical terms this means that carriers in the former group would only in rare instances have abnormal children, recurrent abortions or fertility problems related to their rearrangement. Consequently, cytogenetic studies are unlikely to be performed on these carriers. By the time mothers become eligible for prenatal diagnosis for advanced maternal age, a subset of rearrangements that tends frequently to produce imbalance is likely to have been removed from this "normal" population through earlier detection because of abnormal pregnancy outcome.
In terms of the possible difference between incidentally ascertained balanced de novo and inherited rearrangements with respect to a bias of segregation, the observed randomness in breakage in de novo rearrangements is interesting. The question is whether this difference is due to less bias in de novo data or simply to small sample size. As discussed above in section 4.1.2, small samples are affected by β errors to a greater extent, and this may be the reason for the observed randomness in de novo data. In small sets of inherited rearrangements randomness in the overall distribution of breakpoints was also observed. Nonrandomness was detected only in the larger or pooled data sets. Therefore, it is quite possible that nonrandomness would also be found in a larger data set of de novo rep. As this type of data potentially represents the best source of constitutional data with respect to ascertainment bias, it is important to study a larger data set of these rearrangements.

We also found random breakage in a set of breakpoints obtained from sperm chromosome aberrations. Although we found 3 sites of nonrandom involvement in testing at the chromosome band level, this could be a chance observation as discussed in section 4.1.2.

Sperm chromosome aberrations are fundamentally different from constitutional rearrangements detected in fetuses, children and adults in several ways. For example, cells of somatic tissues go through a series of mitotic divisions, but division and chromosome segregation do not take place in mature sperm cells. Therefore, sperm chromosomes may exhibit aberrations with lethal or destabilizing effects on dividing somatic cells. Furthermore, chromosome aberrations in sperm cells involve a wider range of abnormalities, from chromatid gaps to reciprocal translocations [45] [44].
In this respect, sperm chromosome aberrations may represent a more complete and unbiased set of breakpoints, but these may or may not be the same sites involved in rearrangements.

Sperm chromosome data to date have been obtained from a small number of individuals, with many different rearrangements observed in each individual. Thus the distribution of breakpoints observed might be influenced by predispositions to breakage unique to certain individuals. We observed 5 cases of two different rearrangements involving the same band in the same individual among 24 donors. An increasing frequency of structural aberrations in sperm has been observed with increasing age of donor [45]. Therefore, sites of nonrandom breakage found in sperm chromosome rearrangements might differ according to the age structure of the population of donors.

The results of our analysis of incidentally ascertained de novo rep and sperm chromosome aberrations provide no indication of nonrandom predisposition to chromosomal rearrangement. Although there are several problems with the data that interfere with firm conclusions, as discussed above, the results obtained in these data sets are in general agreement with our inability to show indisputable evidence of predisposition to rearrangement at specific chromosome bands of inherited chromosomal abnormalities.

4.3 Distribution of Breakpoints in Light and Dark Bands

Breakpoints in light G bands were consistently observed with a higher frequency than breakpoints in dark G bands (see sections 3.1.5 and 3.2.5). The obvious question that follows this observation is whether there is a possible predisposition to rearrangement in light G bands. It has been proposed that this excess is not due to such predisposition, but rather to a bias on the part of cytogeneticists in preferentially assigning breakpoints to light bands because the human eye recognizes such breaks more easily [55].
A test of this hypothesis would be to use both R and G banding on the same set of rearrangements and compare the assignment of breakpoints in light and dark bands between the two banding methods. If pattern recognition bias is the sole explanation for the excess of breakpoints observed in light G bands, then in an R banded data set a similar excess should be observed in light R bands (i.e. dark G bands).

We do not have data available to make such comparisons with the exception of 45 breakpoints from R banded rep with unknown ascertainment (provided by Dr. C.-L. Richer; data not shown). In this data set there were 25 breakpoints in dark R bands (light G), and 20 breakpoints in light R bands (dark G). Although this distribution is not significantly different from the expectation of 24 and 21 breakpoints respectively, based on the total relative lengths of light and dark G bands, it is interesting to note that the trend is opposite to that expected on the basis of pattern recognition bias. Similar trends in a much larger data set would indicate that pattern recognition bias is not the only explanation for excess breakage in light G bands.

4.4 Coincidence with Fragile Sites and Oncogenes

Fragile sites are expressed under certain culture conditions in vitro as breaks or gaps at specific locations on the chromosomes [66]. Their existence in vivo, functional significance and structure at the molecular level are as yet unknown. Fragile sites are classified in two main groups according to their frequency in the population: rare fragile sites are observed only in a small number of individuals, while common fragile sites are frequently seen [66]. Further subgroups based on the mode of induction exist. To date 113 fragile sites have been mapped, of which 75 have confirmed status [67].

It has been suggested that fragile sites predispose to chromosome breakage and structural rearrangements [66].
In keeping with this hypothesis, increased frequencies of sister chromatid exchange [27] [21], sperm chromosome aberrations [2], breakpoints of structural rearrangements in cancer [18] [73], and spontaneous chromosome breaks [47] have been reported at various fragile sites. Deletions and reciprocal translocations at two fragile sites have been demonstrated cytologically in somatic cell hybrids [28].

In a study of spontaneous abortions, stillbirths, and newborns [31], and in prenatal diagnosis results [30], overall elevated frequencies of constitutional rearrangement breakpoints in fragile site bands have been claimed. These results, however, are questionable due to limitations of the data source [30] [31], as well as of the statistical methods used [16]. In another study of 6391 constitutional rearrangement breakpoints, including various rearrangements ascertained in a variety of ways, no overall association of constitutional breakpoints with fragile sites was found [53]. Similarly, in 984 breakpoints from balanced carriers with recurrent spontaneous abortions no association between fragile site bands and rearrangement breakpoints was observed [12]. These results suggest that there is no overall preferential involvement of constitutional rearrangement breakpoints in fragile sites.

Our own results indicate no overall association between fragile site bands and either rep breakpoints ascertained incidentally (A2) or inversions in either ascertainment group (B1 and B2). Rep breakpoints ascertained through abnormalities (A1) were, however, nonrandomly associated with fragile site bands (see section 3.1.6). In incidentally ascertained rep and sperm chromosome data no such association was detected. Therefore, the association of fragile sites with rep breakpoints ascertained through abnormalities appears to be dependent on the ascertainment bias affecting these rep.
Of the fragile site bands observed in other studies, only 18q21 [53] in rep ascertained through abnormalities and 2q13 [30] in inversions of both ascertainment groups were seen in this study as well. 18q21 was found in a study of pooled data on all kinds of structural rearrangements, and 2q13 was seen among cases ascertained through prenatal diagnosis. Apparently, the extent of association and the specific bands involved are heavily influenced by the types of rearrangements in the data set and the forms of ascertainment bias affecting the data. We conclude that there is no consistent association between fragile sites in general and constitutional chromosome rearrangement breakpoints.

In all of our pooled data sets there were some fragile site bands that were frequently involved in rearrangements. The coincidence of these hot spots with fragile site bands is probably a chance occurrence (see sections 3.1.6 and 3.2.6). Our method of testing for significant overall coincidence using the hypergeometric distribution is not adequate to test for significant correlation at any one specific site. Therefore, the lack of significant correlation demonstrated by this method does not rule out the possibility that certain specific fragile site bands are also predisposed to constitutional chromosomal rearrangement. Consistent association between hot spots for constitutional breakage and bands with fragile sites in independent unbiased data sets would be required as a first step to demonstrate this. Verification would require high resolution banding and molecular techniques to demonstrate such a relationship in isolated cases.

Some bands containing oncogenes were found among the hot spots in rep ascertained both incidentally and through abnormalities. Only one band, 17q25, with the oncogene ERBA2L, was observed in both ascertainment groups of rep.
This band may be a candidate for studying the oncogene as a possible factor involved in constitutional rearrangement. Overall coincidence of the bands involved, however, was not significant, as only 4 out of 20 bands in rep ascertained incidentally and 2 out of 20 bands in rep ascertained through abnormalities contained oncogenes (see sections 3.1.6 and 3.2.6). The distribution of inversion breakpoints was also random with respect to bands containing oncogenes. In sperm chromosome rearrangements, breakpoints were significantly less frequently involved in bands with oncogenes. Chance may be responsible for producing this effect, given the small sample size. We conclude that bands with known oncogenes do not appear to be generally associated with a predisposition to constitutional chromosome rearrangements.

4.5 Improvements for Future Analysis

In this study, we were faced with several problems in the analysis of constitutional rearrangement breakpoints. To allow more definite conclusions regarding the distribution of constitutional rearrangement breakpoints, it is important to address several of these concerns in future studies.

First of all, a major problem with this and other similar studies has been the lack of large enough data sets. Ideally, the number of breakpoints should be much greater than the number of classes studied for reliable statistical conclusions. For the χ² test, 4 to 5 times the number of classes, that is, well over 1000 breakpoints for a 320 band karyotype, has been suggested [23]. Only a few of our data sets had this many breakpoints, and most had far fewer. We were able to observe recurring trends in some of our data sets, but most were likely the result of ascertainment bias or founder effect. Such bias can be strong enough to show an effect in very small data sets, as we have seen for inversions.
However, nonrandomness due to a biological predisposition to rearrangement is more difficult to detect, unless the tendency is strong enough to be apparent despite any superimposed bias. Constitutional rearrangement data from incidentally ascertained balanced de novo cases probably represent the best type of data currently available. It would be useful to compile data sets comprising this kind of information to allow more reliable statistical analyses to be carried out.

Chromosome aberrations from sperm of healthy individuals are a potentially useful source of information for breakpoint distribution analysis. However, more data of this kind are also required. Data from a large number of individuals are necessary to understand the nature of the variation between individuals and to determine whether predispositions to breakage at different specific sites exist in different men.

The question of nonrandom breakage with respect to light and dark G bands remains unresolved. The problem of pattern recognition bias may be best evaluated by analysing single data sets that are both G and R banded and interpreted separately. Analysis of separate data sets ascertained in similar ways but banded using the two different methods might also be useful.

The present study was aimed at defining the distribution of breakpoints at the chromosome band level. For similar studies in the future, it is important to obtain precise measurements of chromosome bands at different levels of resolution. Furthermore, the relative amounts of DNA in light and dark bands should be determined in order to allow meaningful comparisons of breakpoint frequencies in the two band types. The amount of DNA in a band may be of particular relevance if chromosome breakage most often takes place in interphase, when the DNA is uncoiled, rather than in metaphase, when the DNA is tightly coiled.
In the former situation, we may expect that the length of the DNA in a band would be related to the frequency of breakpoints in that segment.

In the present study and in other studies utilizing methods of analysis aimed at detecting nonrandomness at the level of chromosome bands, direct measurements of chromosome bands from the ISCN diagrams [34] have been made. These diagrams were produced based on estimated relative lengths and are, as a result, inaccurate. Further error is introduced by measurement of the diagrams. Precise measurements, with error estimates, of metaphase chromosome bands at various levels of contraction are essential for this kind of study. It has been repeatedly shown that taking band lengths into consideration greatly influences the results [16] [53]. If this is to be a consistent approach for breakpoint distribution analysis, proper measurements of band length and consistent treatment of variable bands are necessary.

Our statistical method has proved useful in detecting nonrandomness in breakpoint distributions at the level of chromosome bands. As discussed in section 4.1.2, chance can produce α and β errors that simulate or mask nonrandomness in breakpoint distributions. Therefore, it is useful also to test for overall nonrandomness to understand the general distribution of breakpoints. At present, we are unaware of a reliable statistical method to do this, given the large number of classes and small samples in each class in available data sets. We used two goodness of fit tests and considered a result acceptable if the same conclusion was reached by both tests [23]. It would be useful to find improved approaches to this problem.

4.6 Conclusions

1. The distribution of breakpoints in inherited reciprocal translocations and inversions ascertained through abnormalities or through incidental events is nonrandom.
Much of the nonrandomness can be accounted for as bias in the ascertainment of rep, and a founder effect among inversions.

2. Our least biased data set (incidentally ascertained balanced de novo reciprocal translocations) shows a random distribution of constitutional rep breakpoints with no specific bands preferentially involved. Larger data sets of de novo rep are required, however, because this data set is so small that β errors are quite likely. We cannot conclude anything about inversions, as the data set of de novo inversions was too small.

3. The overall distribution of breakpoints in sperm chromosome rearrangements was also random.

4. There are a few candidate bands for possible biologically mediated predisposition to rearrangement in rep. Bands 5q35, 7p22, 9p22, 13q14, and 17q25 were observed as hot spots both in incidentally ascertained rep and in rep ascertained through abnormalities, despite the opposing forces of ascertainment bias affecting these data. 9p22 was also a hot spot in sperm chromosome rearrangements, providing further support for possible nonrandom breakage and/or rearrangement at this site.

5. No conclusions can be drawn about the involvement of specific bands in inversions until the problem of repeated ascertainment of the same rearrangement is satisfactorily addressed.

6. Fragile sites and oncogenes were not found to be frequent factors in predisposing to constitutional rearrangements. However, they may play a role in the generation of constitutional rearrangements in a few specific bands. Possibilities from the present analysis include bands 1p11 and 7p22, which coincide with common fragile sites, and band 17q25, which contains the ERBA2L oncogene.

7. The distribution of breakpoints in dark and light bands appears to be nonrandom in the direction of higher frequencies of breakpoints in light G bands, even after correction for overascertainment.

8. 
Ascertainment bias is important in producing nonrandomness in the distribution of rearrangement breakpoints, and masks the fundamental, structurally-related distribution of breakpoints produced at the time rearrangements occur. Therefore it is important to study both de novo constitutional rearrangements ascertained incidentally and de novo rearrangements in sperm to understand the distribution of rearrangement breakpoints. Further studies must be carried out to characterize sites of frequent rearrangement in order to provide clues to the possible mechanisms of constitutional chromosomal rearrangement.

Appendix A

Rearrangements Associated With Bands of Frequent Breakage

The following are lists of reciprocal translocations and inversions associated with chromosome bands with a significantly higher frequency of breakpoints than expected under the hypotheses of random breakage. For definitions of hypotheses I and II see chapter 1.

A.1 Lists of Rep Associated with Bands of Frequent Breakage

Rep Ascertained Through Abnormalities (Group A1)

Band    Hypothesis   Reference   Breakpoint 1   Breakpoint 2
1q42    I & II       [11]        1q42           4p15
                     [11]        1q42           5p15
                     [11]        1q42           5q23
                     [11]        1q42           11q13
                     [11]        1q42           12q12
                     [11]        1q42           14q13
                     [11]        1q42           16q24
                     [11]        1q42           17p12
                     [11]        1q42           17p12
                     [11]        1q42           21q22
                     [11]        1q42           22q13
                     [51]        1q42           21?
                     [24]        1q42           13q14
                     [24]        1q42           18q12
3q21 I n; i i i i : i i i f i f n : n : i i lq31 3q21 3q21 3q21 3q21 3q21 3q21 3q21 3q21 3q21 3q21 4p l 6 7q36 13ql4 13q22 14q24 14q32 16ql2 16ql3
Rearrangements Associated With Bands of Frequent Breakage 93 Band Hypothesis Reference Breakpoint 1 Breakpoint 2 [11] 3q21 17q25 [51] 3q21 9p22 [51] 3q21 16pl3 [24] 3q21 15q26 3q27 I & II [11] 2q21 3q27 [11] 3q27 4q25 [11] 3q27 9q22 [11] 3q27 12q22 [11] 3q27 14ql3 [11] 3q27 14q22 [11] 3q27 17pl3 [11] 3q27 18q23 [11] 3q27 21q22 [51] 3q27 15q22 4 p l l I & II [11] 1 P 34 4 p l l [11] 4 p l l 5 q l l [11] 4 p l l 5q35 [11] 4 p l l 1 3 p l l [11] 4 p l l 2 1 q l l [11] 4 p l l 22p l3 4q35 I & II [11] lp22 4q35 [11] lq32 4q35 [11] 2p21 4q35 [11] 2q33 4q35 [11] 4q35 7q32 [11] 4q35 7q32 [11] 4q35 8q l3 [11] 4q35 9p21 [11] 4q35 9p l 3 [11] 4q35 10q25 [11] 4q35 18q22 [51] 4q35 18q l l [51] 4q35 10q21 [51] 4q35 8q22 [51] 4q35 l l q 2 2 [24] 4q35 2 2 q l l 5p l 5 I & II [11] lq32 5p l 5 [11] lq42 5 P 15 [11] 3p24 5p l5 [11] 3p25 5p l5 [11] 3q22 5p l 5 [11] 4q26 5p l 5 [11] 5p l 5 8q23 [11] 5p l5 9p l 3 [11] 5p l5 9p24 [11] 5p l 5 10q24 [11] 5p l 5 10q25 [11] 5p l 5 l l q 2 5 [11] 5p l5 14ql3 [11] 5 P 15 14q24 Appendix A. 
Rearrangements Associated With Bands of Frequent Breakage Band Hypothesis Reference Breakpoint 1 Breakpoint 2 [ITj 5p l 5 15^22 [11] 5 P 15 15q22 [11] 5p l 5 18q21 [11] 5p l5 22ql2 [51] 5p l5 8q22 [51] 5p l5 13q22 [24] 4q27 5p l5 [24] 5p l 5 10q26 5p l 3 [ I [11] 5p~13 6q21 [11] 5p l 3 10p l5 [11] 5p l 3 l O q l l [11] 5p l 3 10q22 [11] 5p l 3 10q26 [11] 5p l 3 l l q 2 3 [11] 5p l 3 13pl2 [11] 5p l 3 13ql4 [11] 5p l 3 22q l3 [11] 5p l 3 21q22 [51] 2q37 5p l 3 [51] 5p l 3 14q32 [24] lp_13 5p l 3 5p l 2 [ I & II [11] 5p l 2 9p24 [11] 5p l 2 10pl5 [11] 5p l2 l l q 2 3 [11] 5p l2 1 5 p l l [11] 5p l2 15pl2 [11] 5p l2 1 8 p l l [11] 5p l 2 19ql3 [11] 5p l2 2 0 p l l 5g35 | I [11] 1^21 5q35 [11] 2p l 2 5q35 [11] 4 p l l 5q35 [11] 4q22 5q35 [11] 5q35 8q24 [11] 5q35 9p l 3 [11] 5q35 13q22 [11] 5q35 15ql3 [11] Xq21 5q35 [24] 3q25 5g35 6q21 | 1 ][ [11] lq43 6q21 [11] 2q35 6q21 [11] 2q37 6q21 [11] 5p l 3 6q21 [11] 6q21 7q32 [11] 6q21 8q l3 [11] 6q21 10q26 [11] 6q21 17q25 [11] 6q21 18ql2 [11] 6q21 2 1 q l l [11] 6q21 22p l3 Appendix A. Rearrangements Associated With Bands of Frequent Breakage Band Hypothesis Reference Breakpoint 1 Breakpoint 2 [51] 6q21 8q22 [24] 6q21 20q l3 7p22 I [11] lq21 7p22 [11] 2q33 7p22 [11] 3 P 14 7p22 [11] 5q32 7p22 [11] 7p22 8q l2 [11] 7p22 13ql4 [11] 7p22 15q22 [11] 7p22 15q22 [11] 7p22 15q24 [11] 7p22 19ql2 [24] 7p22 2 2 q l l 7q22 I [11] I q l l 7q22 [11] 2q33 7q22 [11] 2q35 7q22 [11] 7q22 8p21 [11] 7q22 13ql2 [11] 7q22 13q32 [11] 7q22 13q34 [11] 7q22 17q25 [11] 7q22 18q23 [11] 7q22 21q? 
[51] 7q22 20ql3 [51] 7q22 14q32 [51] 6q24 7q22 [51] 6q25 7q22 7q32 I k II [11] 4q35 7q32 [11] 4q35 7q32 [11] 6q21 7q32 [11] 7q32 9p23 [11] 7q32 9p24 [11] 7q32 12q24 [11] 7q32 13ql4 [11] 7q32 13q22 [11] 7q32 18q22 [11] 7q32 21q22 [51] 7q32 18q23 [51] 4q26 7q32 [24] 4p l 6 7q32 [24] 5q l3 7q32 8p23 I & II [11] lq31 8p23 [11] 2q33 8p23 [11] 3p l 4 8p23 [11] 4p l 6 8p23 [11] 4p l 6 8p23 [11] 6p21 8p23 [11] 8p23 9q32 [11] 8p23 l l p l l [11] 8 P 23 l l q l 4 Appendix A. Rearrangements Associated With Bands of Frequent Breakage Band Hypothesis Reference Breakpoint 1 Breakpoint 2 [51] 8p23 10pl2 [51] 8p23 14q31 [24] 8p23 15q22 [24] 8 P 23 13ql4 9p24 I & II [11] 3q26 9p24 [11] 4q23 9p24 [11] 5p l2 9p24 [11] 5p l5 9p24 [11] 7q32 9p24 [11] 8p21 9p24 [11] 9p24 10q24 [11] 9p24 13q22 [11] 9p24 18q21 [51] 9p24 18q21 [51] 9p24 15q22 [51] 2p24 9p24 [51] 2p l2 9p24 [51] 3q23 9p24 [51] 9p24 18ql2 [51] 9p24 14q24 [24] 5q32 9p24 [24] 9p24 12q24 9p22 I, II [11] 6q27 9p22 [11] 9p22 14ql3 [11] 9p22 14q22 [11] 9p22 14q24 [11] 9p22 15ql3 [11] 9p22 16q24 [11] 9p22 1 7 p l l [11] 9p22 1 8 q l l [11] 9p22 22ql3 [51] 3q21 9p22 [24] 9p22 l O p l l 9p l 3 I [11] 4q35 9p l 3 [11] 5p l5 9p l 3 [11] 5q35 9p l 3 [11] 7q35 9p l 3 [11] 9p l 3 l l q 2 4 [11] 9p l 3 14pl2 [11] 9p l 3 1 8 p l l [11] 9p l 3 1 8 p l l [51] 9p l3 15q24 [51] 9p l 3 13? [51] 7 q l l 9 P 13 9 p l l I, II [11] 8q24 9 p l l [11] 9 p l l 1 4 p l l [11] 9 p l l 1 5 q l l [11] 9 p l l 1 8 p l l [51] 6q27 9 p l l [51] 9 p l l 10pl5 Appendix A. Rearrangements Associated With Bands of Frequent Breakage Band Hypothesis Reference Breakpoint 1 Breakpoint 2 [51] 9 p l l 14? 
[24] Y q l 2 9 p l l 9 q l l I, II [11] 9 q l l 20p l3 [11] 9 q l l 2 2 q l l [11] 9 q l l 22q l2 [11] 9 q l l 1 4 p l l [51] 9 q l l 17q25 [51] 9 q l l 1 7 p l l 10q26 I, II [11] lq32 10q26 [11] 2p23 10q26 [11] 4q31 10q26 [11] 4q31 10q26 [11] 5 P 13 10q26 [11] 6p21 10q26 [11] 6q21 10q26 [11] 8p21 10q26 [11] 8p21 10q26 [11] 9q32 10q26 [11] 10q26 l l q l 3 [11] 10q26 1 6 p l l [11] 10q26 1 2 p l l [51] 10q26 21q21 [51] lq31 10q26 [51] 10q26 21q22 [24] 5p l5 10q26 [24] 10q26 1 5 p l l l l q l l I, II [11] l O p l l l l q l l [11] l l q l l 22q l3 [51] 2q22 l l q l l [51] 4q22 l l q l l [51] l l q l l 15? 13ql4 I, II [11] lq44 13ql4 [11] 2q34 13ql4 [11] 3q21 13ql4 [11] 3q24 13ql4 [11] 3q28 13ql4 [11] 4q23 13ql4 [11] 5p l 3 13ql4 [11] 6p21 13ql4 [11] 6q24 13ql4 [11] 7p l5 13ql4 [11] 7p l5 13ql4 [11] 7 P 22 13ql4 [11] 7q32 13ql4 [11] 9q34 13ql4 [11] 12q24 13ql4 [11] 13ql4 15q26 [11] 13ql4 15q26 [11] 13ql4 20q l3 [51] 13ql4 17 P 13 Appendix A. Rearrangements Associated With Bands of Frequent Breakage Band Hypothesis Reference Breakpoint 1 Breakpoint 2 [24] lq42 13ql4 [24] 8p23 13ql4 [11] 3q29 13q22 13q22 I, II [11] 3q21 13q22 [11] 5q35 13q22 [11] 6p23 13q22 [11] 6p23 13q22 [11] 7q32 13q22 [11] 9p24 13q22 [11] 12q22 13q22 [11] 13q22 15q21 [11] 13q22 15q26 [11] 13q22 1 8 p l i [11] 4q27 13q22 [11] 13q22 21q22 [51] 13q22 15q25 [51] 5p l5 13q22 [51] 10q22 13q22 13q34 I [11] lq31 13q34 [11] 2q l4 13q34 [11] 2q22 13q34 [11] 2q33 13q34 [11] 6q l4 13q34 [11] 7 P 13 13q34 [11] 7q22 13q34 [11] 8q21 13q34 [11] 10q21 13q34 [11] 13q34 1 4 q l l [51] lq31 13q34 14q32 I [11] lp32 14q32 [11] lq32 14q32 [11] 3q l3 14q32 [11] 3q21 14q32 [11] 3q25 14q32 [11] 6p21 14q32 [11] 7 q l l 14q32 [11] 8p21 14q32 [11] 14q32 2 0 q l l [11] 14q32 22ql2 [11] 14q32 2 1 q l l [51] 9q21 14q32 [51] 7q22 14q32 [51] 8q24 14q32 [51] 5p l 3 14q32 [24] 8q21 14q32 15q22 I, II [11] lp32 15q22 [11] 2q32 15q22 [11] . 3q28 15q22 [11] 5p l5 15q22 [11] 5p l5 15q22 Appendix A. 
Rearrangements Associated With Bands of Frequent Breakage Band Hypothesis Reference Breakpoint 1 Breakpoint 2 [11] 6p25 15q22 [11] 7p22 ' 15q22 [11] 7p22 15q22 [11] 15q22 20p l3 [11] 15q22 20ql3 [11] 15q22 21q22 [51] 3q27 15q22 [51] 8q22 15q22 [51] 9p24 15q22 [24] 8p23 15q22 [24] 15g22 17pl3 17pl3 [ l7n [ l l ] l q n 17pl3 [11] lq41 17pl3 [11] 2 p l l 17pl3 [11] 3p l 2 17pl3 [11] 3p l 3 17pl3 [11] 3q27 17pl3 [11] 5q33 17pl3 [11] 10q24 17pl3 [11] 17pl3 2 2 q l l [51] 17pl3 2 1 p l l [51] 23p22 17pl3 [51] 15ql5 17pl3 -[51] 15q24 17pl3 ' [51] 10q21 17pl3 [51] 13ql4 17pl3 [51] 16pl3 17pl3 [24] 15q22 17pl3 [24] 15ql3 17pl3 17g25 | L J I lp36 17g25 [IT] 3q2 l 17q25 [11] 4 p l 3 17q25 [11] 5q33 17q25 [11] 6q21 17q25 [11] 7q22 17q25 [11] 8q24 17q25 [11] 17q25 2 2 q l l [11] 17q25 22ql2 [51] 17q25 19ql3 [51] 9 q l l 17q25 [51] 16ql2 17q25 [24] 13ql2 17q25 [24] 6p21 17g25 1 8 p l l [ l7n \U] l p l i 1 8 p l l [11] 2q34 1 8 p l l [11] 4 p l 4 1 8 p l l [11] 4q31 1 8 p l l [11] 5p l 2 1 8 p l l . [11] 6pll 1 8 p l l [11] 6q l6 1 8 p l l Appendix A. 
Rearrangements Associated With Bands of Frequent Breakage Band Hypothesis Reference Breakpoint 1 Breakpoint 2 [Tl] 9 p i i i 8 p i i [11] 9p l 3 1 8 p l l [11] 9p l 3 1 8 p l l [11] l l q l 4 1 8 p l l [11] Hq23 1 8 p l l [11] 1 2 p l l 1 8 p l l [11] 13q22 1 8 p l l [11] 1 8 p l l 19ql3 [11] 1 8 p l l 2 0 p l l [11] 15ql5 1 8 p l l [51] 2p l 3 1 8 p l l [51] 2p23 1 8 p l l [51] 18q23 1 8 p l l [51] 6p21 1 8 p l l [51] 1 8 p l l 1 8 q l l [51] 10q24 1 8 p l l [24] 3q23 1 8 p l l [24] 1 8 p l l 2 1 q l l [24] 1 8 p l l 22q l3 18g21 | I [11] 1^32 18q21 [11] lq44 18q21 [11] 2q37 18q21 [11] 4p l 4 18q21 [11] 4q27 18q21 [11] 5p l5 18q21 [11] 6p23 18q21 [11] 8q23 18q21 [11] 9p24 18q21 [11] l l q 2 1 18q21 [11] 18q21 22ql3 [11] 18q21 21q22 [51] 9p24 18q21 [24] 4q21 18q21 [24] 7q34 18q21 [24] 8q24 18q21 [24] 1 5 q l l 18q21 18g23 I l7n [Tl] 1^32 18q23 [11] 2q33 18q23 [11] 3p l 4 18q23 [11] 3p21 18q23 [11] 3q27 18q23 [11] 4p l 5 18q23 [11] 4q24 18q23 [11] 4q28 18q23 [11] 4q31 18q23 [11] 7q22 18q23 [11] 8q24 18q23 [11] 9p23 18q23 [11] 10q25 18q23 [11] l l q l 4 18q23 Appendix A. Rearrangements Associated With Bands of Frequent Breakage 101 Band Hypothesis Reference Breakpoint 1 Breakpoint 2 [11] l l p l 3 18q2l [11] 12q23 18q23 [11] 14q24 18q23 [11] 15q21 18q23 [51] 7q32 18q23 [51] 18q23 1 8 p l l [24] 13q? 
18q23 2 1 q l l [ M l [ i l ] 2pTl 2 1 q i l [11] 4 p l l 2 1 q l l [11] 6q l5 2 1 q l l [11] 6q21 2 1 q l l [11] 1 2 q l l 2 1 q l l [11] 14q32 2 1 q l l [51] 23q22 2 1 q l l [51] 19p l3 2 1 q l l [24] 1 8 p l l 2 1 q l l [24] 4p l 5 2 1 g l l 21q22 [ iTlI [ l l ] lq24 21q22 [11] lq32 21q22 [11] lq42 21q22 [11] 4q31 21q22 [11] 5q l3 21q22 [11] 5q31 21q22 [11] 6q l2 21q22 [11] 6q25 21q22 [11] 7q32 21q22 [11] l O p l l 21q22 [11] l l p l l 21q22 [11] l l q 2 3 21q22 [11] 12ql5 21q22 [11] 15q22 21q22 [11] 1 6 p l l 21q22 [11] 17q23 21q22 [11] 1 8 q l l 21q22 [11] 5p l 3 21q22 [11] 5q22 21q22 [11] 15ql3 21q22 [11] 19ql3 21q22 [11] 3q27 21q22 [11] 13q22 21q22 [11] 18q21 21q22 [51] 6 p l l 21q22 [51] lp31 21q22 [51] 10q26 21q22 [24] 2p23 21q22 2 2 q l l | I [ l l ] 2q l3 2 2 q l l [11] 3p26 2 2 q l l [11] 7p l 3 2 2 q l l [11] 9 q l l 2 2 q l l [11] 13q21 2 2 q l l Appendix A. Rearrangements Associated With Bands of Frequent Breakage 102 Band Hypothesis Reference Breakpoint 1 Breakpoint 2 [TT| 14ql3 2 2 q l l [11] 15q24 2 2 q l l [11] 17p l3 2 2 q l l [11] 17q25 2 2 q l l [11] 4p l 6 2 2 q l l [11] 21p l3 2 2 q l l [51] 1 3 p l l 2 2 q l l [24] 7p22 2 2 q l l [24] 4q35 2 2 q l l 22gi21 iji [H] ^ 7 J [11] 6p21 22ql2 [11] 9 q l l 22q l2 [11] 14q32 22ql2 [11] 15q26 22ql2 [11] 16q21 22ql2 [11] 17q25 22ql2 [11] 19ql3 22ql2 [51] 4p l 6 22ql2 Appendix A. 
Rearrangements Associated With Bands of Frequent Breakage 103 Rep Ascertained Incidentally (Group A2) Band Hypothesis Reference Breakpoint 1 Breakpoint 2 l p l l I, II [11] l p l l 3 p l l IHJ l p l l 8p23 [11] l p l l l l q l l [11] l p l l 1 6 q l l [11] l p l l 1 9 p l l [22] l p l l 1 9 q l l [22] l p l l 1 9 p l l lq21 I [11] lq21 3q23 [11] lq21 3q29 [11] lq21 16q22 [11] lq21 1 9 q l l [11] lq21 2p23 [22] lq21 2q37 [32] lq21 17q23 [32] lq21 l l q l 3 [25] lq21 17q21 2q33 I [11] 2q33 5q l3 [11] 2q33 7p22 [11] 2q33 13ql2 [11] 2q33 16q22 [11] 2q33 19p l3 [11] 2q33 22q l3 [32] 2q33 10q22 [32] 2q33 3q29 [24] 2q33 5q22 4q22 II [11] lq44 4q22 [11] 4q22 l l q l l [11] 4q22 15q21 [11] 4q22 16pl3 [11] 4q22 16q23 [22] 4q22 5p l 4 5q l3 I [11] 2q33 5q l3 [11] 4q31 5q l3 [11] 5q l3 l l q l l [11] 5q l3 13ql4 [11] 5q l3 16pl3 [11] 5q l3 1 8 p l l [22] 5q l3 6p23 [22] 5q l3 7q22 [22] 5q l3 19ql3 Appendix A. Rearrangements Associated With Bands of Frequent Breakage Band Hypothesis Reference Breakpoint 1 Breakpoint 2 [50] 5q l3 15q25 5q35 I [11] lq23 5q35 [11] 5q35 6q l3 [11] 5q35 8q21 [11] 5q35 15ql5 [11] 5q35 16pl3 [11] 5q35 19pl3 [22] 5q35 14q24 [32] 5q35 6q l5 [32] 2q33 10q22 [32] 10q22 14q24 [32] 4 P 13 10q22 7 P 22 I [11] lq32 7p22 [11] 2q33 7p22 [11] 7p22 8q24 [11] 7p22 19ql3 [22] 7p22 2 0 q l l [24] 7p22 1 8 q l l [5] 7p22 l l q 2 3 [5] 3p l4 7p22 9p22 I, II [11] 9p22 10pl5 [11] 9p22 l l q 2 5 [11] 9p22 12ql3 [22] 9p22 10q24 [20] 9p22 18q21 10q22 I [11] 2p25 10q22 [11] 2q37 10q22 [11] 4q25 10q22 [11] 7 q l l 10q22 [11] 7q32 10q22 [11] 10q22 17q25 [11] 2q34 10q22 [25] 3q28 10q22 [20] 2q35 10q22 10q24 I [11] 1 P 36 10q24 [11] 3p25 10q24 [11] 6q25 10q24 [11] 10q24 16q22 [22] 9p22 10q24 [32] 10q24 14q32 [51] 10q24 17q21 [20] 10q24 12ql5 [42] 10q24 21q22 I l p l 5 I [11] lp36 l l p l 5 [11] 3q24 l l p l 5 [11] 6p23 l l p l 5 [11] 8p21 11 P 15 [11] 4p l 4 11 P 15 [22] 1 P 31 l l p l 5 [32] 3p l2 11 P 15 Appendix A. 
Rearrangements Associated With Bands of Frequent Breakage 105 Band Hypothesis Reference Breakpoint 1 Breakpoint 2 [32] 5q31 11 P 15 [32] 11 P 15 1 9 p l l [32] 4q l3 l l p l 5 [20] 9q22 H P 1 5 [5] l l p ! 5 15gl5 i i q 2 i | iTn [TTj n q 2 i i 4 q 3 2 [11] l l q 2 1 20p l3 [22] 7p21 l l q 2 1 [22] l l q 2 1 15q24 [51] 4q35 l l q 2 1 [24] 2p l 3 l l q 2 1 13gl4 | I [ l l ] 5^13 1 3 q H [11] 6q l5 13ql4 [11] 10q25 13ql4 [11] 13ql4 17q23 [22] 5q31 13ql4 [22] 13ql4 1 5 p l l [22] 13ql4 16q22 [32] 9p l3 13ql4 [32] 4q25 13ql4 [51] 12q l l 13ql4 [50] 6p21 13ql4 17q25 | i T n [U] l p l 6 17q25 [11] lq32 17q25 [11] 5q33 17q25 [11] 10q22 17q25 [11] 15q22 17q25 [11] 16ql3 17q25 [22] 7 q l l 17q25 [22] 17q25 22ql2 [32] 3q25 17q25 [25] 2q31 17q25 [25] l q l 2 17q25 1 9 q l l 1 I, II [11] l q 2 1 1 9 q l l [11] 5 q l l 1 9 q l l [11] 16q24 1 9 q l l [22] l p l l 1 9 q l l Appendix A. Rearrangements Associated With Bands of Frequent Breakage 106 A.2 Lists of Inversions Associated with Bands of Frequent Breakage Inversions Ascertained Through Abnormalities (Group B l ) Band Hypothesis Reference Breakpoint 1 Breakpoint 2 2 p l l I, II [11] 2 p l l 2q l3 [51] 2 p l l 2q l2 [51] 2 p l l 2q l3 [51] 2 p l l 2q l3 [51] 2 p l l 2q l3 [24] 2 p l l 2q l3 [24] 2 p l l 2q l3 2q l 3 I, II [11] 2 p l l 2q l3 [51] 2 p l l 2q l3 [51] 2 p l l 2q l3 [51] 2 p l l 2q l3 [24] 2 p l l 2q l3 [24] 2 p l l 2q l3 3p25 I, II [11] 3p25 3q21 [11] 3p25 3q25 [51] 3p25 3q21 [51] 3p25 3q l3 3 p l l I, II [51] 3 p l l 3q l2 [51] 3 p l l 3q l2 [51] 3 p l l 3q l2 3q l2 I, II [51] 3 p l l 3q l2 [51] 3 p l l 3q l2 [51] 3 p l l 3q l2 6p25 I, II [51] 6p25 6q23 [51] 6p25 6q l5 [24] 6p l2 6p25 [24] 6p l2 6p25 8p23 I [11] 8p23 8q22 [11] 8p23 8q22 [11] 8p23 8q22 [51] 8p23 8q l3 Appendix A. 
Rearrangements Associated With Bands of Frequent Breakage 107 Inversions Ascertained Incidentally (Group B2) Band Hypothesis Reference Breakpoint 1 Breakpoint 2 l p l l I, II [11] l p l l lq21 [22] l p l l l q l 2 [22] l p l l lq23 [32] l p l l l q l 2 lq21 I, II [11] l p l l lq21 [11] l p l 2 lq21 [11] 1 P 13 lq21 [11] l p l 3 lq21 [11] l p l 3 lq21 [11] l p l 3 lq21 [11] lq21 lq31 [32] lp36 lq21 [32] l p l 3 lq21 2 p l l I, II [11] 2 p l l 2q l2 [11] 2 p l l 2q l3 [11] 2 p l l 2q l3 [11] 2 p l l 2q l3 [11] 2 p l l 2q l3 [11] 2 p l l 2q l3 [11] 2 p l l 2q l3 [11] 2 p l l 2q l3 [11] 2 p l l 2q l3 [11] 2 p l l 2q l3 [11] 2 p l l 2q l3 [11] 2 p l l 2q l3 [11] 2 p l l 2q l 3 [22] 2 p l l 2q l3 [22] 2 p l l 2q l2 [22] 2 p l l 2q l3 [32] 2 p l l 2q l3 [32] 2 p l l 2q l3 [32] 2 p l l 2q l3 [32] 2 p l l 2q l3 [32] 2 p l l 2q l3 [24] 2 p l l 2q l3 [24] 2 p l l 2q l3 2q l3 I, II [11] 2 p l l 2q l3 [11] 2 p l l 2q l 3 [22] Y p l l Y q l l [22] Y p l l Y q l l [22] Y p l l Y q l l [51] Y p l l Y q l l [24] Y p l l Y q l l [11] 2 p l l 2q l3 [11] 2 p l l 2q l3 [11] 2 p l l 2q l3 Appendix A. 
Rearrangements Associated With Bands of Frequent Breakage 108 Band Hypothesis Reference Breakpoint 1 Breakpoint 2 [11] 2 p l l 2q l3 [11] 2 p l l 2q l3 [11] 2 p l l 2q l3 [11] 2 p l l 2q l3 [11] 2 p l l 2q l3 [11] 2 p l l 2q l3 [11] 2 p l l 2q l3 [11] 2 p l l 2q l3 [11] 2p l2 2q l3 [11] 2p l2 2q l3 [11] 2 p l l 2q l3 [22] 2 p l l 2q l3 [22] 2 p l l 2q l3 [32] 2 p l l 2q l3 [32] 2 p l l 2q l3 [32] 2 p l l 2q l3 [32] 2 p l l 2q l3 [32] 2 p l l 2q l3 [24] 2 p l l 2q l3 [24] 2 p l l 2g l3 5 p l 3 | \U] Ipljj gqU [11] 5p l 3 5q l3 [11] 5p l 3 5q l3 [11] 5p l 3 5q l3 [11] 5p l 3 5q l3 [11] 5p l 3 5q l3 [24] 5p l 3 5g l3 5gi31 iTn [IT] ipTI 5 q i 3 [11] 5p l 3 5q l3 [11] 5p l 3 5q l3 [11] 5p l 3 5q l3 [11] 5p l 3 5q l3 [11] 5p l 3 5q l3 [11] 5p l 3 5q l3 [11] 5q l3 5q35 [11] 5q l3 5q35 [11] 5p l 3 5q l3 [32] 5q l3 5q34 r _ _ ^ [24] 5p l 3 5g l3 6pl2| II \U] 6pT2 6ql5 [11] 6p l2 6q l5 [11] 6p l2 6p24 [11] 6p l 2 6p22 [51] . 6p25 6p l 2 l O p l l 1 U l [H] l O p l l 10q21 [11] l O p l l 10q21 [11] l O p l l 10q21 [11] l O p l l 10q21 [11] l O p l l 10q21 [11] l O p l l 10q21 Appendix A. 
Rearrangements Associated With Bands of Frequent Breakage 109 Band Hypothesis Reference Breakpoint 1 Breakpoint 2 [11] l O p l l 10q21 [11] l O p l l 10q21 [11] l O p l l 10q21 [11] l O p l l 10q21 [11] l O p l l 10q21 [11] l O p l l 10q21 [11] l O p l l 10q21 [11] l O p l l 10q21 [11] l O p l l 10q21 [11] l O p l l 10q21 [22] l O p l l l O q l l [32] l O p l l l O q l l [32] l O p l l 10q22 l O q l l | I [ i l ] 10pl3 l O q l l [11] l O q l l 10q23 [11] l O q l l 10q21 [22] l O p l l l O q l l [32] l O q l l 10q26 [32] l O p l l l O q l l 10q21 | iTn [H] l O p l l 10q21 [11] l O p l l 10q21 [11] l O p l l 10q21 [11] l O p l l 10q21 [11] l O p l l 10q21 [11] l O p l l 10q21 [11] l O p l l 10q21 [11] l O p l l 10q21 [11] l O p l l 10q21 [11] l O p l l 10q21 [11] l O p l l 10q21 [11] l O p l l 10q21 [11] l O p l l 10q21 [11] l O p l l 10q21 [11] l O p l l 10q21 [11] 10pl3 10q21 [11] l O q l l 10q21 [11] l O p l l 10q21 • [32] 10pl2 10q21 Y p l l [ I E \U] Y p l l Y q l l [11] Y p l l Y q l l [11] Y p l l Y q l l [11] Y p l l Y q l l [11] Y p l l Y q l l [11] Y p l l Y q l l [11] Y p l l Y q l l [11] Y p l l Y q l l [11] Y p l l Y q l l [11] Y p l l Y q l l [11] Y p l l Y q l l [11] Y p l l Y q l l Appendix A. 
Rearrangements Associated With Bands of Frequent Breakage 110 Band Hypothesis Reference Breakpoint 1 Breakpoint 2 [11] Y p l l Y q l l [11] Y p l l Y q l l [22] Y p l l Y q l l [22] Y p l l Y q l l [22] Y p l l Y q l l [22] Y p l l Y q l l [51] Y p l l Y q l l [24] Y p l l Y q l l Y p l l I, II [11] Y p l l Y q l l [11] Y p l l Y q l l [11] Y p l l Y q l l [11] Y p l l Y q l l [11] Y p l l Y q l l [11] Y p l l Y q l l [11] Y p l l Y q l l [11] Y p l l Y q l l [11] Y p l l Y q l l [11] Y p l l Y q l l [11] Y p l l Y q l l [11] Y p l l Y q l l [11] Y p l l Y q l l [11] Y p l l Y q l l [22] Y p l l Y q l l [22] Y p l l Y q l l [22] Y p l l Y q l l [22] Y p l l Y q l l [51] Y p l l Y q l l [24] Y p l l Y q l l Appendix B Band Measurements The ISCN idiograms [34] were used for direct measurements of chromosome bands. The values obtained for a 320 band karyotype, and a 329 band karyotype defined for use with sperm chromosome data are listed below. Measurements are given in cm. Each band is designated as a light (L) or a dark (D) G band. The assignment of light and dark bands was based on a 400 band karyotype. Therefore, some bands in the 320 band karyotype consist of several subbands. In these cases, a band was designated light or dark based on the total relative lengths of light and dark subbands 111 Appendix B. 
Band Measurements B.l Band measurements at 320 band resolution Band L/D Length (cm) lp36 L 1.29 1 P 35 D 0.30 1 P 34 L 0.62 lp33 D 0.30 lp32 L 0.58 lp31 D 1.54 lp22 L 0.85 lp21 D 0.81 l p l 3 L 0.84 l p l 2 D 0.19 l p l l D 0.12 I q l l D 0.15 l q l 2 D 1.10 lq21 L 0.70 lq22 D 0.31 lq23 L 0.49 lq24 D 0.40 lq25 L 0.43 lq31 D 1.19 lq32 L 1.09 lq41 D 0.65 lq42 L 0.50 lq43 D 0.38 lq44 L 0.32 2p25 L 0.48 2p24 D 0.45 2p23 L 0.61 2p22 D 0.57 2p21 L 0.51 2p l6 D 0.77 2p l5 L 0.26 2p l 4 D 0.29 2p l 3 L 0.59 2p l 2 D 0.49 2 p l l L 0.60 2 q l l L 0.53 2q l2 D 0.40 2q l3 L 0.35 2q l4 D 0.72 2q21 L 0.68 2q22 D 0.70 2q23 L 0.32 2q24 D 0.81 2q31 L 0.65 2q32 D 0.90 2q33 L 0.60 2q34 D 0.43 2q35 L 0.40 Appendix B. Band Measurements Sand L/D Length (cm) 2q36 D 0.49 2q37 L 0.79 3p26 D 0.26 3p25 L 0.40 3 P 24 D 0.69 3p23 L 0.21 3p22 D 0.38 3p21 L 1.35 3p l 4 D 0.77 3 P 13 L 0.32 3p l 2 D 0.99 3 p l l D 0.12 3 q l l D 0.48 3q l2 L 0.15 3q l3 D 1.20 3q21 L 0.57 3q22 D 0.30 3q23 L 0.29 3q24 D 0.65 3q25 L 0.46 3q26 D 1.05 3q27 L 0.32 3q28 D 0.30 3q29 L 0.37 4p l 6 L 0.81 4p l 5 D 0.97 4p l 4 L 0.45 4p l 3 D 0.32 4p l 2 L 0.31 4 p l l D 0.15 4 q l l D 0.10 4q l2 L 0.45 4q l3 D 0.75 4q21 L 0.79 4q22 D 0.52 4q23 L 0.11 4q24 D 0.50 4q25 L 0.23 4q26 D 0.70 4q27 L 0.31 4q28 D 0.82 4q31 L 1.01 4q32 D 0.60 4q33 L 0.27 4q34 D 0.42 4q35 L 0.47 5p l 5 L 0.99 5p l 4 D 0.83 Appendix B. 
B a n d Measurements Band L/D Length (cm) 5 P 13 L 0.58 5p l2 D 0.25 5 p l l D 0.31 5 q l l L 0.70 5q l2 D 0.42 5q l3 L 0.72 5q l4 D 0.80 5q l5 L 0.26 5q21 D 0.81 5q22 L 0.23 5q23 D 0.90 5q31 L 1.01 5q32 D 0.43 5q33 L 0.40 5q34 D 0.68 5q35 L 0.50 6p25 L 0.40 6p24 D 0.21 6p23 L 0.41 6p22 D 0.69 6p21 L 1.22 6p l2 D 0.60 6 p l l D 0.29 6 q l l D 0.14 6q l2 D 0.44 6q l3 L 0.29 6q l4 D 0.42 6q l5 L 0.29 6q l6 D 0.68 6q21 L 0.70 6q22 D 1.04 6q23 L 0.45 6q24 D 0.61 6q25 L 0.55 6q26 D 0.29 6q27 L 0.41 7p22 L 0.48 7p21 D 0.80 7p l5 L 0.61 7p l4 D 0.40 7p l3 L 0.38 7p l2 D 0.34 7 p l l L 0.33 7 q l l L 1.25 7q21 D 1.00 7q22 L 0.70 7q31 D 1.20 7q32 L 0.48 Appendix B. Band Measurements Band L/D Length (cm) 7q33 D 0.23 7q34 L 0.25 7q35 D 0.26 7q36 L 0.61 8p23 L 0.60 8p22 D 0.49 8p21 L 0.51 8p l 2 D 0.58 8 p l l L 0.50 8 q l l L 0.45 8q l2 D 0.49 8q l3 L 0.52 8q21 D 1.37 8q22 L 0.80 8q23 D 0.75 8q24 L 1.23 9p24 L 0.36 9p23 D 0.48 9p22 L 0.11 9p21 D 0.69 9p l 3 L 0.50 9p l 2 D 0.47 9 p l l D 0.08 9 q l l D 0.14 9q l2 D 0.94 9q l3 L 0.18 9q21 D 1.00 9q22 L 1.00 9q31 D 0.50 9q32 L 0.21 9q33 D 0.35 9q34 L 1.00 10pl5 L 0.45 10p l4 D 0.37 10p l3 L 0.43 10pl2 D 0.63 l O p l l L 0.70 l O q l l L 0.62 10q21 D 1.19 10q22 L 0.98 10q23 D 0.68 10q24 L 0.60 10q25 D 0.69 10q26 L 0.72 l l p l 5 L 1.12 l l p l 4 D 0.78 11 P 13 L 0.31 l l p l 2 D 0.35 Appendix B. 
11p11  L  0.72
11q11  D  0.12
11q12  D  0.53
11q13  L  0.89
11q14  D  0.73
11q21  L  0.14
11q22  D  0.62
11q23  L  1.00
11q24  D  0.30
11q25  L  0.33
12p13  L  0.75
12p12  D  0.69
12p11  L  0.60
12q11  D  0.10
12q12  D  0.50
12q13  L  0.75
12q14  D  0.50
12q15  L  0.38
12q21  D  0.99
12q22  L  0.44
12q23  D  0.50
12q24  L  1.30
13p13  D  0.28
13p12  L  0.31
13p11  D  0.56
13q11  D  0.13
13q12  L  0.69
13q13  D  0.29
13q14  L  0.76
13q21  D  1.05
13q22  L  0.56
13q31  D  0.63
13q32  L  0.50
13q33  D  0.29
13q34  L  0.45
14p13  D  0.30
14p12  L  0.30
14p11  D  0.60
14q11  L  0.63
14q12  D  0.40
14q13  L  0.40
14q21  D  0.80
14q22  L  0.48
14q23  D  0.25
14q24  L  0.91
14q31  D  0.50
14q32  L  0.70
15p13  D  0.26
15p12  L  0.35
15p11  D  0.58
15q11  L  0.30
15q12  D  0.19
15q13  L  0.21
15q14  D  0.33
15q15  L  0.48
15q21  D  0.80
15q22  L  0.66
15q23  D  0.25
15q24  L  0.59
15q25  D  0.39
15q26  L  0.70
16p13  L  1.00
16p12  D  0.49
16p11  L  0.66
16q11  D  0.65
16q12  L  0.30
16q13  L  0.30
16q21  D  0.37
16q22  L  0.48
16q23  D  0.45
16q24  L  0.49
17p13  L  0.59
17p12  D  0.41
17p11  L  0.58
17q11  L  0.45
17q12  D  0.30
17q21  L  0.80
17q22  D  0.50
17q23  L  0.30
17q24  D  0.46
17q25  L  0.55
18p11  D  1.25
18q11  L  0.59
18q12  D  0.83
18q21  L  0.78
18q22  D  0.78
18q23  L  0.44
19p13  L  1.43
19p12  D  0.30
19p11  D  0.12
19q11  D  0.15
19q12  D  0.32
19q13  L  1.80
20p13  L  0.50
20p12  D  0.50
20p11  L  0.70
20q11  L  0.69
20q12  D  0.35
20q13  L  1.05
21p13  D  0.30
21p12  L  0.33
21p11  D  0.56
21q11  L  0.33
21q21  D  0.68
21q22  L  1.01
22p13  D  0.30
22p12  L  0.34
22p11  D  0.56
22q11  L  0.69
22q12  D  0.39
22q13  L  1.13
Xp22  L  0.85
Xp21  D  0.60
Xp11  L  0.94
Xq11  D  0.10
Xq12  D  0.24
Xq13  L  0.56
Xq21  D  0.85
Xq22  L  0.32
Xq23  D  0.29
Xq24  L  0.24
Xq25  D  0.38
Xq26  L  0.37
Xq27  D  0.37
Xq28  L  0.34
Yp11  L  0.22
Yq11  L  0.33
Yq12  D  0.38
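The band lengths of Table B.1 are the denominators used when breakpoint counts are converted to breaks per unit length in program C.3 of Appendix C. As a rough illustration only, the Python sketch below mirrors the normal-approximation branch of that program (used there when a band has more than 8 breaks); algebraically it is a Wilson-type score interval scaled to counts. The function names, the total of 1000 breaks and the total length of 170.0 cm are hypothetical values chosen for the example and are not taken from the thesis data.

```python
import math

Z_99 = 2.576  # normal deviate for 99% limits, the value used in program C.3


def count_limits(k, n, z=Z_99):
    """Lower and upper limits on a band's break count k out of n total breaks.

    Sketch of the normal-approximation branch only (k > 8 in program C.3);
    smaller counts use tabulated Poisson limits there, omitted here.
    """
    d = 2 * (n + z * z)
    x = z * math.sqrt(4 * k * n * (n - k) + z * z * n * n)
    centre = 2 * k * n + z * z * n
    return (centre - x) / d, (centre + x) / d


def classify(k, band_length, n, total_length, z=Z_99):
    """Return 'H' (hot), 'C' (cold) or '-' under a uniform breaks-per-length hypothesis."""
    expected_density = n / total_length
    low, high = count_limits(k, n, z)
    if low / band_length > expected_density:
        return "H"
    if high / band_length < expected_density:
        return "C"
    return "-"


# Hypothetical example: 50 of 1000 breaks fall in band 1p36 (length 1.29 cm),
# with an assumed total karyotype length of 170.0 cm.
low, high = count_limits(50, 1000)
label = classify(50, 1.29, 1000, 170.0)
```

A band is flagged only when its whole interval, expressed per unit length, lies above or below the expected density, so bands with few breaks are rarely classified either way.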
B.2 Band Measurements for Sperm Chromosomes

Band  L/D  Length (cm)
1p36  L  1.29
1p35  D  0.30
1p34  L  0.62
1p33  D  0.30
1p32  L  0.58
1p31  D  1.54
1p22  L  0.85
1p21  D  0.81
1p13  L  0.84
1p12  D  0.19
1cen  D  0.24
1q12  D  1.10
1q21  L  0.70
1q22  D  0.31
1q23  L  0.49
1q24  D  0.40
1q25  L  0.43
1q31  D  1.19
1q32  L  1.09
1q41  D  0.65
1q42  L  0.50
1q43  D  0.38
1q44  L  0.32
2p25  L  0.48
2p24  D  0.45
2p23  L  0.61
2p22  D  0.57
2p21  L  0.51
2p16  D  0.77
2p15  L  0.26
2p14  D  0.29
2p13  L  0.59
2p12  D  0.49
2p11  L  0.48
2cen  D  0.24
2q11  L  0.41
2q12  D  0.40
2q13  L  0.35
2q14  D  0.72
2q21  L  0.68
2q22  D  0.70
2q23  L  0.32
2q24  D  0.81
2q31  L  0.65
2q32  D  0.90
2q33  L  0.60
2q34  D  0.43
2q35  L  0.40
2q36  D  0.49
2q37  L  0.79
3p26  D  0.26
3p25  L  0.40
3p24  D  0.69
3p23  L  0.21
3p22  D  0.38
3p21  L  1.35
3p14  D  0.77
3p13  L  0.32
3p12  D  0.99
3cen  D  0.24
3q11  D  0.36
3q12  L  0.15
3q13  D  1.20
3q21  L  0.57
3q22  D  0.30
3q23  L  0.29
3q24  D  0.65
3q25  L  0.46
3q26  D  1.05
3q27  L  0.32
3q28  D  0.30
3q29  L  0.37
4p16  L  0.81
4p15  D  0.97
4p14  L  0.45
4p13  D  0.32
4p12  L  0.31
4cen  D  0.24
4q12  L  0.45
4q13  D  0.75
4q21  L  0.79
4q22  D  0.52
4q23  L  0.11
4q24  D  0.50
4q25  L  0.23
4q26  D  0.70
4q27  L  0.31
4q28  D  0.82
4q31  L  1.01
4q32  D  0.60
4q33  L  0.27
4q34  D  0.42
4q35  L  0.47
5p15  L  0.99
5p14  D  0.83
5p13  L  0.58
5p12  D  0.25
5cen  D  0.24
5q11  L  0.58
5q12  D  0.42
5q13  L  0.72
5q14  D  0.80
5q15  L  0.26
5q21  D  0.81
5q22  L  0.23
5q23  D  0.90
5q31  L  1.01
5q32  D  0.43
5q33  L  0.40
5q34  D  0.68
5q35  L  0.50
6p25  L  0.40
6p24  D  0.21
6p23  L  0.41
6p22  D  0.69
6p21  L  1.22
6p12  D  0.60
6p11  L  0.17
6cen  D  0.24
6q12  D  0.44
6q13  L  0.29
6q14  D  0.42
6q15  L  0.29
6q16  D  0.68
6q21  L  0.70
6q22  D  1.04
6q23  L  0.45
6q24  D  0.61
6q25  L  0.55
6q26  D  0.29
6q27  L  0.41
7p22  L  0.48
7p21  D  0.80
7p15  L  0.61
7p14  D  0.40
7p13  L  0.38
7p12  D  0.34
7p11  L  0.21
7cen  D  0.24
7q11  L  1.13
7q21  D  1.00
7q22  L  0.70
7q31  D  1.20
7q32  L  0.48
7q33  D  0.23
7q34  L  0.25
7q35  D  0.26
7q36  L  0.61
8p23  L  0.60
8p22  D  0.49
8p21  L  0.51
8p12  D  0.58
8p11  L  0.38
8cen  D  0.24
8q11  L  0.33
8q12  D  0.49
8q13  L  0.52
8q21  D  1.37
8q22  L  0.80
8q23  D  0.75
8q24  L  1.23
9p24  L  0.36
9p23  D  0.48
9p22  L  0.11
9p21  D  0.69
9p13  L  0.50
9p12  D  0.47
9cen  D  0.24
9q12  D  0.94
9q13  L  0.18
9q21  D  1.00
9q22  L  1.00
9q31  D  0.50
9q32  L  0.21
9q33  D  0.35
9q34  L  1.00
10p15  L  0.45
10p14  D  0.37
10p13  L  0.43
10p12  D  0.63
10p11  L  0.58
10cen  D  0.24
10q11  L  0.50
10q21  D  1.19
10q22  L  0.98
10q23  D  0.68
10q24  L  0.60
10q25  D  0.69
10q26  L  0.72
11p15  L  1.12
11p14  D  0.78
11p13  L  0.31
11p12  D  0.35
11p11  L  0.60
11cen  D  0.24
11q12  D  0.53
11q13  L  0.89
11q14  D  0.73
11q21  L  0.14
11q22  D  0.62
11q23  L  1.00
11q24  D  0.30
11q25  L  0.33
12p13  L  0.75
12p12  D  0.69
12p11  L  0.48
12cen  D  0.24
12q12  D  0.50
12q13  L  0.75
12q14  D  0.50
12q15  L  0.38
12q21  D  0.99
12q22  L  0.44
12q23  D  0.50
12q24  L  1.30
13p13  D  0.28
13p12  L  0.31
13p11  D  0.44
13cen  D  0.24
13q12  L  0.69
13q13  D  0.29
13q14  L  0.76
13q21  D  1.05
13q22  L  0.56
13q31  D  0.63
13q32  L  0.50
13q33  D  0.29
13q34  L  0.45
14p13  D  0.30
14p12  L  0.30
14p11  D  0.48
14cen  D  0.24
14q11  L  0.51
14q12  D  0.40
14q13  L  0.40
14q21  D  0.80
14q22  L  0.48
14q23  D  0.25
14q24  L  0.91
14q31  D  0.50
14q32  L  0.70
15p13  D  0.26
15p12  L  0.35
15p11  D  0.46
15cen  D  0.24
15q11  L  0.18
15q12  D  0.19
15q13  L  0.21
15q14  D  0.33
15q15  L  0.48
15q21  D  0.80
15q22  L  0.66
15q23  D  0.25
15q24  L  0.59
15q25  D  0.39
15q26  L  0.70
16p13  L  1.00
16p12  D  0.49
16p11  L  0.54
16cen  D  0.24
16q11  D  0.53
16q12  L  0.30
16q13  L  0.30
16q21  D  0.37
16q22  L  0.48
16q23  D  0.45
16q24  L  0.49
17p13  L  0.59
17p12  D  0.41
17p11  L  0.46
17cen  D  0.24
17q11  L  0.33
17q12  D  0.30
17q21  L  0.80
17q22  D  0.50
17q23  L  0.30
17q24  D  0.46
17q25  L  0.55
18p11  L  1.13
18cen  D  0.24
18q11  L  0.47
18q12  D  0.83
18q21  L  0.78
18q22  D  0.78
18q23  L  0.44
19p13  L  1.43
19p12  D  0.30
19cen  D  0.24
19q12  D  0.32
19q13  L  1.80
20p13  L  0.50
20p12  D  0.50
20p11  L  0.58
20cen  D  0.24
20q11  L  0.57
20q12  D  0.35
20q13  L  1.05
21p13  D  0.30
21p12  L  0.33
21p11  D  0.44
21cen  D  0.24
21q11  L  0.21
21q21  D  0.68
21q22  L  1.01
22p13  D  0.30
22p12  L  0.34
22p11  D  0.44
22cen  D  0.24
22q11  L  0.57
22q12  D  0.39
22q13  L  1.13
Xp22  L  0.85
Xp21  D  0.60
Xp11  L  0.85
Xcen  D  0.18
Xq12  D  0.24
Xq13  L  0.56
Xq21  D  0.85
Xq22  L  0.32
Xq23  D  0.29
Xq24  L  0.24
Xq25  D  0.38
Xq26  L  0.37
Xq27  D  0.37
Xq28  L  0.34
Yp11  L  0.19
Ycen  D  0.06
Yq11  L  0.30
Yq12  D  0.38

Appendix C

Computer Programs

The following computer programs were used for both data management and statistical analysis. They were all written by A. R. Rutherford and are not available in any other published source. For this reason, they are reproduced in the following sections.

C.1 Checking for Invalid Bands

In the data taken from published sources, some of the reported bands do not exist in the ISCN nomenclature. This is probably the result of typing mistakes in the original papers. All breakpoints in each data set were checked against the list of valid bands defined according to ISCN [34] using the following program, written in C shell script. Invalid bands were removed from the data set.

#
# Program to check for invalid bands.
#
onintr close
foreach i ($argv)
    awk '{print $3}' $i | spellout ~/minus/bb.hash >! temp1.$$
    awk '{print $4}' $i | spellout ~/minus/bb.hash >! temp2.$$
    set first = "`sort -u temp1.$$`"
    set second = "`sort -u temp2.$$`"
    if ($#first == 0) then
        echo 'No bad bands in the third column of "'"$i"'".'
    else
        echo 'Bad bands in the third column of "'"$i"'" are:'
        foreach word ($first)
            sed -n "/ $word /p" $i
        end
    endif
    if ($#second == 0) then
        echo 'No bad bands in the fourth column of "'"$i"'".'
    else
        echo 'Bad bands in the fourth column of "'"$i"'" are:'
        foreach word ($second)
            sed -n "/ $word /p" $i
        end
    endif
    echo ""
end
close:
\rm -f temp1.$$ temp2.$$

C.2 Checking for Duplicate Rearrangements

The following program, written in C shell script, compares breakpoints within a single file or across all the data in a list of files, and prints all rearrangements with identical sets of breakpoints as a group. The program also handles cases where one breakpoint is either unknown or only partially defined; it prints all possible matches as a group, so that the probability of duplication can be judged afterwards from additional identifying information.

#
# C shell script to look for duplicate pairs of breakpoints.
#
onintr close
\rm -f file.$$
foreach file ($argv)
    awk '{printf "%-16s %-1s\n", "'$file':", $0}' $file >>! file.$$
end
echo "DATA FILES: $argv[*]"
awk '{\
    if ($4 ~ /\?/ && $5 ~ /\?/) print NR > "'work1.$$'"\
    else if ($4 == "?") print NR, $5 > "'work2.$$'"\
    else if ($5 == "?") print NR, $4 > "'work2.$$'"\
    else if ($4 ~ /\?/ || $5 ~ /\?/) print NR, $4, $5\
    else if ($4 < $5) print NR, $4, $5 > "'work4.$$'"\
    else print NR, $5, $4 > "'work4.$$'"\
}' file.$$ | sort | tee work3a.$$ | sed 's/?//g' >! temp.$$
sed 's/\./\\\./' work3a.$$ | sed 's/?/\.\*/' | join - temp.$$ |\
    sort -n >! work3b.$$

#Bad data lines.
if (-e work1.$$) then
    echo 'Data lines ignored:'
    echo ' '
    set lines = `cat work1.$$`
    foreach i ($lines)
        sed -n {$i}p file.$$
    end
    echo \
    ' '
endif
echo 'Duplicate Data Lines:'
echo ' '

#Data with an unknown breakpoint.
if (-e work2.$$) then
    echo 'Data lines with an unknown breakpoint:'
    echo \
    ' '
    set wc = `wc -l work2.$$`
    set count = 1
    while ($count <= $wc[1])
        \rm -f matches.$$
        awk 'NR == '$count' {print $1"\n"$2}' work2.$$ >! b1.$$
        set b1 = "`cat b1.$$`"
        awk 'NR != "'$count'" && $2 == "'$b1[2]'" {print $1}' \
            work2.$$ >>! matches.$$
        if (-e work4.$$) then
            awk '$2 == "'$b1[2]'" || $3 == "'$b1[2]'" {print $1}' work3a.$$ work4.$$ >>! matches.$$
        else
            awk '$2 == "'$b1[2]'" || $3 == "'$b1[2]'" {print $1}' work3a.$$ >>! matches.$$
        endif
        if (! -z matches.$$) then
            set matches = ($b1[1] `cat matches.$$`)
            foreach line ($matches)
                sed -n {$line}p file.$$
            end
            echo \
            ' '
        endif
        @ count++
    end
endif

#Breaks with some missing information.
if (! -z work3b.$$) then
    echo 'Data lines with a partially known breakpoint:'
    echo \
    ' '
    set wc = `wc -l work3b.$$`
    set count = 1
    while ($count <= $wc[1])
        \rm -f matches.$$
        awk 'NR == '$count' {print $1"\n"$2"\n"$3"\n"$4"\n"$5}' work3b.$$ >! b1.$$
        set b1 = "`cat b1.$$`"
        set record = 1
        while ($record <= $wc[1])
            if ($count != $record) then
                awk 'NR == '$record' {print $2"\n"$3"\n"$4"\n"$5}' work3b.$$ >! b2.$$
                set b2 = "`cat b2.$$`"
                awk 'NR == '$record' {\
                    if ("'$b1[4]'" ~ /^'"$b2[1]"'$/ || "'$b2[3]'" ~ /^'"$b1[2]"'$/){\
                        if ("'$b1[5]'" ~ /^'"$b2[2]"'$/ || "'$b2[4]'" ~ /^'"$b1[3]"'$/){\
                            print $1}}\
                    if ("'$b1[4]'" ~ /^'"$b2[2]"'$/ || "'$b2[4]'" ~ /^'"$b1[2]"'$/){\
                        if ("'$b1[5]'" ~ /^'"$b2[1]"'$/ || "'$b2[3]'" ~ /^'"$b1[3]"'$/){\
                            print $1}}}' work3b.$$ >>! matches.$$
            endif
            @ record++
        end
        if (-e work4.$$) then
            awk '{\
                if ($2 ~ /^'"$b1[2]"'$/ && $3 ~ /^'"$b1[3]"'$/) print $1 \
                else if ($3 ~ /^'"$b1[2]"'$/ && $2 ~ /^'"$b1[3]"'$/) print $1 \
            }' work4.$$ >>! matches.$$
        endif
        if (! -z matches.$$) then
            set matches = ($b1[1] `cat matches.$$`)
            foreach line ($matches)
                sed -n {$line}p file.$$
            end
            echo \
            ' '
        endif
        @ count++
    end
endif

#Completely known breakpoints.
if (-e work4.$$) then
    echo 'Data lines with completely known breakpoints:'
    echo \
    ' '
    set wc = `wc -l work4.$$`
    if ($wc[1] >= 2) then
        awk '{\
            if ($2 < $3)\
                {print $1, $2, $3}\
            else\
                {print $1, $3, $2}\
        }' work4.$$ | sort +1 >! temp.$$
        set record = 1
        set eof = 0
        while ($record < $wc[1])
            set b1 = `sed -n {$record}p temp.$$`
            @ next = $record + 1
            set b2 = `sed -n {$next}p temp.$$`
            while ("$b1[2]" == "$b2[2]" && "$b1[3]" == "$b2[3]" && ! $eof)
                @ next++
                if ($next <= $wc[1]) then
                    set b2 = `sed -n {$next}p temp.$$`
                else
                    @ eof = 1
                endif
            end
            if ($next >= $record + 2) then
                awk 'NR == '$record', NR == '$next' - 1 {print $1}'\
                    temp.$$ >! matches.$$
                set matches = `sort -n matches.$$`
                foreach line ($matches)
                    sed -n {$line}p file.$$
                end
                echo \
                ' '
            endif
            @ record = $next
        end
    endif
endif
close:
\rm -f file.$$ temp.$$ work1.$$ work2.$$ work3a.$$ work3b.$$ work4.$$\
    b1.$$ b2.$$ matches.$$

C.3 Statistical Analysis Using Binomial Confidence Limits

The following program, written in C shell script, counts the number of breakpoints in a list of rearrangements for each band in a 320 band karyotype and calculates confidence intervals, breakage densities, and expected values. It also compares hypothetical and observed values to determine whether some bands are hot spots or cold spots for breakage.

#
# C shell script to look for hot spots.
#
onintr close
if ($#argv < 2) then
    echo 'Need at least 2 arguments.'
    echo 'Have you forgotten the bands file?'
    exit
endif
set bf = $argv[1]
shift
echo "DATA FILES: $argv[*]"
echo "BANDS FILE: $bf"
\rm -f file.$$ breaks.$$ h1hs.temp h1cs.temp h2hs.temp h2cs.temp
sort -b +1 $bf >! bands.$$
foreach file ($argv)
    awk '{printf "%-15s %1s\n", "'$file':", $0 >> "'file.$$'"\
        print $3"\n"$4}' $file >>! breaks.$$
end
sort breaks.$$ | uniq -c | join -a2 -e 0 -j 2 \
    -o 2.1 2.2 2.3 2.4 1.1 - bands.$$ | sort -n | tee temp.$$ |\
awk '{\
    if ($3 == "L"){\
        TL += $5\
        LL += $4}\
    if ($3 == "D"){\
        TD += $5\
        LD += $4}\
    TB += $5\
    LB += $4}\
END { EL = TL / LL\
    ED = TD / LD\
    EB = TB / LB\
    print TL, TD, TB, LL, LD, LB, EL, ED, EB}' |\
cat - temp.$$ | awk -f hs.awk
if (-e h1hs.temp) then
    echo 'Hot spots for hypothesis 1:'
    set spots = "`cat h1hs.temp`"
    foreach break ($spots)
        echo ""
        echo $break':'
        fgrep " $break " file.$$
    end
else
    echo 'No hot spots for hypothesis 1.'
endif
echo \
""
if (-e h1cs.temp) then
    echo 'Cold spots for hypothesis 1:'
    set spots = "`cat h1cs.temp`"
    foreach break ($spots)
        echo ""
        echo $break':'
        fgrep " $break " file.$$
    end
else
    echo 'No cold spots for hypothesis 1.'
endif
echo \
""
if (-e h2hs.temp) then
    echo 'Hot spots for hypothesis 2:'
    set spots = "`cat h2hs.temp`"
    foreach break ($spots)
        echo ""
        echo $break':'
        fgrep " $break " file.$$
    end
else
    echo 'No hot spots for hypothesis 2.'
endif
echo \
""
if (-e h2cs.temp) then
    echo 'Cold spots for hypothesis 2:'
    set spots = "`cat h2cs.temp`"
    foreach break ($spots)
        echo ""
        echo $break':'
        fgrep " $break " file.$$
    end
else
    echo 'No cold spots for hypothesis 2.'
endif
echo \
""
close:
\rm -f bands.$$ breaks.$$ file.$$ temp.$$ h1hs.temp \
    h1cs.temp h2hs.temp h2cs.temp

# Awk program invoked above as hs.awk.
BEGIN {
    print \
"                  num.  upper  lower   num.  u.c.l.  l.c.l."
    print \
"                   of   conf.  conf.   per    per     per   hyp. hyp."
    print \
" band    length breaks  lim.   lim.  length length  length   1    2"
    # Rule of dashes under the column headings; program C.4 skips
    # its input up to the first line that begins with '-'.
    print \
"---------------------------------------------------------------------"
# The Poisson confidence limits are added below.
# These are the 99% confidence intervals.
    LCL[0] = 0.00000
    UCL[0] = 5.30
    LCL[1] = 0.00501
    UCL[1] = 7.43
    LCL[2] = 0.10300
    UCL[2] = 9.27
    LCL[3] = 0.33800
    UCL[3] = 10.98
    LCL[4] = 0.67200
    UCL[4] = 12.59
    LCL[5] = 1.08000
    UCL[5] = 14.15
    LCL[6] = 1.54000
    UCL[6] = 15.66
    LCL[7] = 2.04000
    UCL[7] = 17.13
    LCL[8] = 2.57000
    UCL[8] = 18.58
# Z-value for the normal confidence limits.
    Z = 2.576
}
NR == 1 {
    T[1] = $1 ; T[2] = $2 ; T[3] = $3
    T[4] = $4 ; T[5] = $5 ; T[6] = $6
    T[7] = $7 ; T[8] = $8 ; T[9] = $9
    D = 2 * (T[3] + (Z * Z))}
NR > 1 {
    if ($5 <= 8) {
        ULIM = UCL[$5]
        LLIM = LCL[$5]
    }
    else {
        X = Z * sqrt((4 * $5 * T[3] * (T[3] - $5)) + (Z * Z * T[3] * T[3]))
        ULIM = ((2 * $5 * T[3]) + (Z * Z * T[3]) + X) / D
        LLIM = ((2 * $5 * T[3]) + (Z * Z * T[3]) - X) / D
    }
    CPL = $5 / $4 ; ULIMPL = ULIM / $4 ; LLIMPL = LLIM / $4
    if (LLIMPL > T[9]) {
        H1 = "H"
        print $2 >> "h1hs.temp"}
    else if (ULIMPL < T[9]) {
        H1 = "C"
        print $2 >> "h1cs.temp"}
    else H1 = "-"
    if ($3 == "L") {
        if (LLIMPL > T[7]) {
            H2 = "H"
            print $2 >> "h2hs.temp"}
        else if (ULIMPL < T[7]) {
            H2 = "C"
            print $2 >> "h2cs.temp"}
        else H2 = "-"}
    else if ($3 == "D") {
        if (LLIMPL > T[8]) {
            H2 = "H"
            print $2 >> "h2hs.temp"}
        else if (ULIMPL < T[8]) {
            H2 = "C"
            print $2 >> "h2cs.temp"}
        else H2 = "-"}
    else H2 = "-"
    printf \
    "%-7s %1s %5.2f %4d %6.2f %6.2f %6.2f %7.2f %7.2f %1s %1s\n",\
    $2, $3, $4, $5, ULIM, LLIM, CPL, ULIMPL, LLIMPL, H1, H2}
END {
    print \
    ""
    print "Total number of breaks in light bands = "T[1]
    print "Total number of breaks in dark bands = "T[2]
    print "Total number of breaks in all bands = "T[3]
    print "Total length of light bands = "T[4]
    print "Total length of dark bands = "T[5]
    print "Total length of all bands = "T[6]
    print "Expected number per unit length in light bands (Hyp. 2) = "T[7]
    print "Expected number per unit length in dark bands (Hyp. 2) = "T[8]
    print "Expected number per unit length in all bands (Hyp. 1) = "T[9]
    print \
    ""
}

C.4 Testing for Nonrandomness and Homogeneity

The following program, written in Pascal, tests the overall nonrandomness of breakpoint distributions and the homogeneity between data sets using Pearson's χ² statistic and the log-lin statistic.

PROGRAM HomTest (input,output);
(**********************************************************
  Program to do homogeneity testing of a set of hot spot data files.
  Written in Vax Pascal. For Turbo Pascal, replace the data type
  varying [ ] of char with string[ ], and modify the commands to
  open files. Possibly, the command sngl to convert from double to
  single precision in the function Chisq may also need to be changed.
**********************************************************)
CONST
  maxfiles = 10;
  datalines = 320; (* Number of lines read from data files. *)
VAR
  name, exfilename : varying [30] of char;
  results, datafile, exfile : text;
  degfree, Ze, N, i, j : integer;
  X2, G2 : real;
  x : char;
  filenum : 1 .. maxfiles;
  breaks : array [1..maxfiles,1..datalines] of integer;
  expect : array [1..datalines] of real;
  filename : array [1..maxfiles] of varying [30] of char;
  colsum : array [1..maxfiles] of integer;

FUNCTION Norm ( x : double) : double;
(*************************************************************
  Computes the probability function, P(z > x), for the normal
  distribution. Returns 0 for arguments greater than 5.
  Norm (5) = 2.9E-7. Returns 1 for arguments less than -5.
*************************************************************)
CONST
  sqrttwopi = 2.506628274631000502415765D+0;
VAR
  i : integer;
  sum, term : double;
BEGIN (* Norm *)
  IF x > 5 THEN Norm := 0
  ELSE IF x < -5 THEN Norm := 1
  ELSE BEGIN
    sum := x;
    term := x;
    i := 2;
    REPEAT
      term := -term * x * x * (i-1) / (i*(i+1));
      sum := sum + term;
      i := i + 2
    UNTIL (sum + term = sum) or (i > 200);
    IF (i > 200) THEN writeln('Norm failed to converge!');
    Norm := 0.5 - (sum / sqrttwopi)
  END
END; (* Norm *)

FUNCTION Chisq (nu : integer ; x : double) : real;
(*********************************************************************
  Calculates the probability in the tail of the Chi squared
  distribution for nu degrees of freedom. Uses approximation 26.4.14
  from Abramowitz and Stegun, which is valid for nu > 30. In this
  range, it seems to be accurate to about 2E-5.
**********************************************************************)
CONST
  third = 0.33333333333333D+0;
BEGIN (* Chisq *)
  Chisq := sngl(Norm( (9*(x*nu*nu)**third - 9*nu + 2) / (3*sqrt(2*nu))))
END; (* Chisq *)

PROCEDURE Skipcolumn;
(*********************************************************************
  This procedure skips over a column in the data file.
*********************************************************************)
VAR
  i : integer;
  x : char;
BEGIN (* Skipcolumn *)
  read(datafile,x);
  WHILE x = ' ' DO read(datafile,x);
  WHILE not (x = ' ') DO read(datafile,x)
END; (* Skipcolumn *)

(************************************************************)
BEGIN (* main Program *)
  filenum := 0; (* Initialize the file number. *)
  write('Data file: ');
  readln(name);
  WHILE not (name = '') DO
  BEGIN
    filenum := filenum + 1;
    filename[filenum] := name;
    open(datafile,name,readonly);
    reset(datafile);
    REPEAT readln(datafile,x) UNTIL x = '-';
    colsum[filenum] := 0;
    FOR i := 1 to datalines DO
    BEGIN
      Skipcolumn; Skipcolumn; Skipcolumn;
      readln(datafile,breaks[filenum,i]);
      colsum[filenum] := colsum[filenum] + breaks[filenum,i];
    END;
    close(datafile);
    write('Data file: ');
    readln(name)
  END;
  N := 0;
  FOR i := 1 to filenum DO N := N + colsum[i];
  write('File of expected values: ');
  readln(exfilename);
  IF exfilename = '' THEN
  BEGIN
    Ze := 0;
    FOR j := 1 to datalines DO
    BEGIN
      expect[j] := 0;
      FOR i := 1 to filenum DO expect[j] := expect[j] + breaks[i,j];
      IF expect[j] = 0 THEN Ze := Ze + 1;
      expect[j] := expect[j] / N
    END
  END
  ELSE
  BEGIN
    open(exfile,exfilename,readonly);
    reset(exfile);
    FOR j := 1 to datalines DO readln(exfile,expect[j]);
    close(exfile)
  END;
  X2 := 0;
  G2 := 0;
  FOR j := 1 to datalines DO
    FOR i := 1 to filenum DO
    BEGIN
      (* Calculating the Pearson chi-square statistic.
         Undefined values are treated as zero. *)
      IF not (expect[j] = 0) THEN
        X2 := X2 + (sqr(breaks[i,j]-colsum[i]*expect[j]) / (colsum[i]*expect[j]));
      (* Calculating the log-lin statistic.
         Undefined values are treated as zero. *)
      IF not (breaks[i,j] = 0) THEN
        G2 := G2 + 2 * breaks[i,j] * ln(breaks[i,j] / (colsum[i]*expect[j]))
    END;
  (* Calculate the number of degrees of freedom. *)
  IF exfilename = '' THEN degfree := (filenum - 1) * (datalines - Ze)
  ELSE degfree := filenum * datalines;
  open(results,'homtest.out');
  rewrite(results);
  IF exfilename = '' THEN
    writeln(results,' TESTING FOR HOMOGENEITY OF PROPORTIONS')
  ELSE
  BEGIN
    writeln(results,' CHI-SQUARE TEST OF EXPECTATION VALUES');
    writeln(results);
    writeln(results,'Expectations taken from file: ',exfilename)
  END;
  writeln(results);
  writeln(results,'Data files:');
  FOR i := 1 to filenum DO write(results,filename[i],' ');
  writeln(results);
  writeln(results);
  writeln(results,'total number of breaks = ',N:0);
  IF (exfilename = '') THEN
    writeln(results,'number of zero marginal sums = ',Ze:0);
  writeln(results,'number of degrees of freedom = ',degfree:0);
  writeln(results);
  writeln(results,'Pearson chi-square statistic:');
  writeln(results,'X2 =',X2);
  writeln(results,'Compare to chi-square distribution with ',
          degfree:0,' degrees of freedom.');
  writeln(results,'P(z > X2) =',Chisq(degfree,X2));
  writeln(results);
  writeln(results,'Log-lin statistic:');
  writeln(results,'G2 =',G2);
  writeln(results,'Compare to chi-square distribution with ',
          degfree:0,' degrees of freedom.');
  IF G2 > 0 THEN writeln(results,'P(z > G2) =',Chisq(degfree,G2))
  ELSE writeln(results,'P(z > G2) = 1');
  writeln(results,' ');
  close(results)
END (* main program *) .

References

[1] Aurias, A., Prieur, M., Dutrillaux, B., and Lejeune, J. (1978) Systematic analysis of 95 reciprocal translocations in autosomes. Hum Genet 45:259-282.
[2] Benet, J., Fuster, C., Genesca, A., Navarro, J., Miro, R., Egozcue, J., and Templado, C. (1989) Expression of fragile sites in human sperm and lymphocyte chromosomes. Hum Genet 81:239-242.
[3] Boue, A., and Gallano, P. (1984) A collaborative study of the segregation of inherited chromosome structural rearrangements in 1356 prenatal diagnoses. Prenat Diagn 4: (Special Issue, Spring 1984) 45-67.
[4] Brandriff, B., Gordon, L., Ashworth, L., Watchmaker, G., Moore, D., Wyrobek, A. J., and Carrano, A. V. (1985) Chromosomes of human sperm: variability among normal individuals. Hum Genet 70:18-24.
[5] Buckton, K. E., O'Riordan, M. L., Slight, J., Mitchell, M., McBeath, S., Keay, A. J., Barr, D., and Short, M. (1980) A G-band study of chromosomes in liveborn infants. Ann Hum Genet 43:227-239.
[6] Burns, J. P., Koduru, P. R. K., Alonso, M. L., and Chaganti, R. S. K. (1986) Analysis of meiotic segregation in a man heterozygous for two reciprocal translocations using the hamster in vitro penetration system. Am J Hum Genet 38:954-964.
[7] Campana, M., Serra, A., and Neri, G. (1985) Role of chromosome aberrations in recurrent abortions: A study 259 balanced translocations. Am J Med Genet 24:341-356.
[8] Chandley, A. C. (1989) Meiotic studies and fertility in human translocation carriers. In: Daniel, A. (ed.) The Cytogenetics of Mammalian Autosomal Rearrangements. Vol. 8 in Sandberg, A. A. (series ed.) Progress and Topics in Cytogenetics pp:361-382. New York: Alan R. Liss, Inc.
[9] Daniel, A. (1979) Structural differences in reciprocal translocations: potential for a model of risk in reciprocal translocation. Hum Genet 51:171-182.
[10] Daniel, A., Hook, E. B., and Wulf, G. (1988) Collaborative U.S.A. data on prenatal diagnosis for parental carriers of chromosome rearrangements: risks of unbalanced progeny. In: Daniel, A. (ed.) The Cytogenetics of Mammalian Autosomal Rearrangements. Vol. 8 in Sandberg, A. A. (series ed.) Progress and Topics in Cytogenetics pp:73-162. New York: Alan R. Liss, Inc.
[11] Daniel, A., Hook, E. B., and Wulf, G. (1989) Risks of unbalanced progeny at amniocentesis to carriers of chromosome rearrangements: data from United States and Canadian laboratories. Am J Med Genet 31:14-53.
[12] Davis, J. R., and Hagaman, R. M. (1987) Fragile sites are unrelated to reciprocal translocation breakpoints. Clin Genet 31:308-310.
[13] Davis, J. R., Hagaman, R. M., Thies, A. C., and Veomett, I. C. (1985) Balanced reciprocal translocations: risk factors for aneuploid segregate viability. Clin Genet 27:1-19.
[14] Davis, J. R., Rogers, B. B., and Hagaman, R. M. (1988) Factors influencing viability in reciprocal translocation segregants in man. In: Daniel, A. (ed.) The Cytogenetics of Mammalian Autosomal Rearrangements. Vol. 8 in Sandberg, A. A. (series ed.) Progress and Topics in Cytogenetics pp:419-451. New York: Alan R. Liss, Inc.
[15] Dutrillaux, B. (1979) Chromosomal evolution in primates: Tentative phylogeny from Microcebus murinus (Prosimian) to man. Hum Genet 48:251-314.
[16] De Braekeleer, M. (1985) Fragile sites and chromosome breakpoints in constitutional rearrangements. Clin Genet 27:523-524.
[17] De Braekeleer, M., and Smith, B. (1988) Two methods for measuring nonrandomness of chromosome abnormalities. Ann Hum Genet 52:63-67.
[18] De Braekeleer, M., Smith, B., and Lin, C. C. (1985) Fragile Sites and Structural Rearrangements in Cancer. Hum Genet 69:112-116.
[19] Evans, H. J. (1973) Molecular Architecture of human chromosomes. Br Med Bulletin 29:196-202.
[20] Evans, J. A., Canning, N., Hunter, A. G. W., Martsolf, J. T., Ray, M., Thompson, D. R., and Hamerton, J. L. (1978) A cytogenetic survey of 14,069 newborn infants III. An analysis of the significance and cytologic behaviour of the Robertsonian and reciprocal translocations. Cytogenet Cell Genet 20:96-123.
[21] Feichtinger, W., and Schmid, M. (1989) Increased frequencies of sister chromatid exchanges at common fragile sites (1)(q42) and (19)(q13). Hum Genet 83:145-147.
[22] Ferguson-Smith, M. A., and Yates, J. R. W. (1984) Maternal age specific rates for chromosome aberrations and factors influencing them: report of a collaborative European study on 52 965 amniocenteses. Prenat Diagn 4:5-44.
[23] Fienberg, S. E. (1980) The Analysis of Cross-Classified Categorical Data. Massachusetts: The MIT Press.
[24] Friedman, J. M., Smith, J. P., Lerner, B. N., Helgeson, J. S., Howard-Peebles, P. N., Mize, C. E., Mize, S. G., Singleton, W. L., and Smith, M. E. (1987) ReCAP: the registry of cytogenetic abnormalities and phenylketonuria. Am J Med Genet 27:325-336.
[25] Friedrich, U., and Nielsen, J. (1974) Autosomal reciprocal translocations in newborn children and their relatives. Humangenetik 21:133-144.
[26] Funderburk, S. J., Spence, A. M., and Sparkes, R. S. (1977) Mental retardation associated with "balanced" chromosome rearrangements. Am J Hum Genet 29:136-141.
[27] Glover, T. W., and Stein, C. K. (1987) Induction of sister chromatid exchanges at common fragile sites. Am J Hum Genet 41:882-890.
[28] Glover, T. W., and Stein, C. K. (1988) Chromosome breakage and recombination at fragile sites. Am J Hum Genet 43:265-273.
[29] Hamerton, J. L., Canning, N., Ray, M., and Smith, S. (1975) A cytogenetic survey of 14,069 newborn infants. Clin Genet 8:223-243.
[30] Hecht, F., and Hecht, B. K. (1984) Fragile sites and chromosome breakpoints in constitutional rearrangements I. Amniocentesis. Clin Genet 26:169-173.
[31] Hecht, F., and Hecht, B. K. (1984) Fragile sites and chromosome breakpoints in constitutional rearrangements II. Spontaneous abortions, stillbirths, and newborns. Clin Genet 26:174-177.
[32] Hook, E. B., and Cross, P. K. (1987) Rates of mutant and inherited structural cytogenetic abnormalities detected at amniocentesis: results on about 63 000 fetuses. Ann Hum Genet 51:27-55.
[33] Hook, E. B., Schreinemachers, D. M., Willey, A. M., and Cross, P. K. (1983) Rates of mutant structural chromosome rearrangements in human fetuses: Data from prenatal cytogenetic studies and associations with maternal age and parental mutagen exposure. Am J Hum Genet 35:96-109.
[34] ISCN (1985) An international system for human cytogenetic nomenclature. Cytogenet Cell Genet 40:50-57.
[35] Jacobs, P. A. (1977) Population surveillance: a cytogenetic approach. In: Morton, N. E., and Chung, C. S. (eds) Genetic Epidemiology. New York: Academic Press.
[36] Jacobs, P. A., Melville, M., Ratcliffe, S., Keay, A. J., and Syme, J. (1974) A cytogenetic survey of 11,680 newborn infants. Ann Hum Genet 37:359-376.
[37] Jalbert, P., Jalbert, H., and Sele, B. (1988) Types of imbalances in human reciprocal translocations: risks at birth. In: Daniel, A. (ed.) The Cytogenetics of Mammalian Autosomal Rearrangements. Vol. 8 in Sandberg, A. A. (series ed.) Progress and Topics in Cytogenetics pp:267-291. New York: Alan R. Liss, Inc.
[38] Jalbert, P., and Sele, B. (1979) Factors predisposing to adjacent 2 and 3:1 disjunctions: study of 161 human reciprocal translocations. J Med Genet 16:467
[39] Jalbert, P., Sele, B., and Jalbert, H. (1980) Reciprocal translocations: a way to predict the mode of imbalanced segregation by pachytene-diagram drawing. Hum Genet 55:209-222.
[40] Kaiser, P. (1988) Pericentric Inversions: Their problems and clinical significance. In: Daniel, A. (ed.) The Cytogenetics of Mammalian Autosomal Rearrangements. Vol. 8 in Sandberg, A. A. (series ed.) Progress and Topics in Cytogenetics pp:163-247. New York: Alan R. Liss, Inc.
[41] Larsen, R. J., and Marx, M. L. (1981) An Introduction to Mathematical Statistics and Its Applications. Englewood Cliffs, New Jersey: Prentice-Hall.
[42] Lin, C. C., Gedeon, M. M., Griffith, P., Newton, D. R., Wilkie, L., and Sewell, L. M. (1976) Chromosome analysis on 930 consecutive newborn children using quinacrine fluorescent banding technique. Hum Genet 31:315-328.
[43] Madan, K. (1988) Paracentric inversions and their clinical implications. In: Daniel, A. (ed.) The Cytogenetics of Mammalian Autosomal Rearrangements. Vol. 8 in Sandberg, A. A. (series ed.) Progress and Topics in Cytogenetics pp:249-266. New York: Alan R. Liss, Inc.
[44] Martin, R. H. (1988) Abnormal spermatozoa in human translocation and inversion carriers. In: Daniel, A. (ed.) The Cytogenetics of Mammalian Autosomal Rearrangements. Vol. 8 in Sandberg, A. A. (series ed.) Progress and Topics in Cytogenetics pp:397-417. New York: Alan R. Liss, Inc.
[45] Martin, R. H., Rademaker, A. W., Hildebrand, K., Long-Simpson, L., Peterson, D., and Yamamoto, J. (1987) Variation in the frequency and type of sperm chromosome abnormalities among normal men. Hum Genet 77:108-114.
[46] Maserati, E., Pasquali, F., and Peretti, D. (1986) Different break-points in Philadelphia chromosome variant translocations and in constitutional and sporadic translocations. Ann Hum Genet 50:153-162.
[47] Mattei, M. G., Souiah, N., and Mattei, J. F. (1979) Distribution of spontaneous chromosome breaks in man. Cytogenet Cell Genet 23:95-102.
[48] Mendenhall, W., Schaeffer, R. L., and Wackerly, D. D. (1986) Mathematical Statistics with Applications. 3rd edition. Boston: Duxbury Press.
[49] Nielsen, J., and Rasmussen, K. (1976) Distribution of breakpoints in reciprocal translocations in children ascertained in population studies. Hereditas 82:73-77.
[50] Nielsen, J., and Sillesen, I. (1975) Incidence of chromosome aberrations among 11 148 newborn children. Humangenetik 30:1-12.
[51] Palmer, C. G. (1981) Are there "hot spots" on the human genome? Evidence from breakpoint analysis in a collaborative study of germinal chromosome rearrangement. In: Population and Biological Aspects of Human Mutation (ed. E. B. Hook and I. H. Porter), pp. 147-165. New York: Academic Press.
[52] Pearson, E. S., and Hartley, H. O. eds. (1966) Biometrika Tables for Statisticians vol. 1. Cambridge: Cambridge University Press.
[53] Porfirio, B., Dallapiccola, B., and Terrenato, L. (1987) Breakpoint distribution in constitutional chromosome rearrangements with respect to fragile sites. Ann Hum Genet 51:329-336.
[54] Runyon, R. P. (1985) Fundamentals of Statistics in the Biological, Medical, and Health Sciences. Boston: Duxbury Press.
[55] Savage, J. R. K. (1977) Assignment of aberration breakpoints in banded chromosomes. Nature 270:513-514.
[56] Schinzel, A. (1984) Catalogue of Unbalanced Chromosome Aberrations in Man. Berlin: Walter de Gruyter.
[57] Schinzel, A. (1988) Phenotype in autosomal chromosome aberrations: distinctiveness, variability, and karyotype correlations. In: Daniel, A. (ed.) The Cytogenetics of Mammalian Autosomal Rearrangements. Vol. 8 in Sandberg, A. A. (series ed.) Progress and Topics in Cytogenetics pp:725-738. New York: Alan R. Liss, Inc.
[58] Schwartz, S., Palmer, C. G., and Yu, P.-L. (1982) Evaluation of factors differentiating translocations ascertained in couples with fetal wastage and translocations ascertained through an unbalanced carrier. Am J Hum Genet 34:142A
[59] Schwartz, S., Palmer, C. G., Yu, P.-L., Boughman, J. A., and Cohen, M. M. (1986) Analysis of translocations observed in three different populations. I. Reciprocal translocations. Cytogenet Cell Genet 42:42-52.
[60] Schwartz, S., Palmer, C. G., Yu, P.-L., Boughman, J. A., and Cohen, M. M. (1986) Analysis of translocations observed in three different populations. II. Robertsonian translocations. Cytogenet Cell Genet 42:53-56.
[61] Searle, J. B. (1988) Selection and Robertsonian variation in nature: the case of the common shrew. In: Daniel, A. (ed.) The Cytogenetics of Mammalian Autosomal Rearrangements. Vol. 8 in Sandberg, A. A. (series ed.) Progress and Topics in Cytogenetics pp:507-531. New York: Alan R. Liss, Inc.
[62] Smith, C. A. B. (1986) Chi-squared test with small numbers. Ann Hum Genet 50:163-167.
[63] Stene, J., and Stengel-Rutkowski, S. (1988) Genetic risks of familial reciprocal and Robertsonian translocation carriers. In: Daniel, A. (ed.) The Cytogenetics of Mammalian Autosomal Rearrangements. Vol. 8 in Sandberg, A. A. (series ed.) Progress and Topics in Cytogenetics pp:3-72. New York: Alan R. Liss, Inc.
[64] Stoll, C. (1980) Nonrandom distribution of exchange points in patients with reciprocal translocations. Hum Genet 56:89-93.
[65] Sturtevant, A. H., and Beadle, G. W. (1936) The relation of inversion in the X chromosome of Drosophila melanogaster to crossing over and disjunction. Genetics 21:554-604.
[66] Sutherland, G. R., and Hecht, F. (1985) Fragile Sites on Human Chromosomes. Oxford Monographs on Medical Genetics, No. 13. New York: Oxford University Press.
[67] Sutherland, G. R., and Ledbetter, D. H. (1989) Report of the committee on cytogenetic markers. Human Gene Mapping 10. Cytogenet Cell Genet 51:452-458
[68] Sutton, H. E. (1988) Human Cytogenetics. Harcourt Brace Jovanovich, Inc.
[69] Therman, E., Susman, B., and Denniston, C. (1989) The nonrandom participation of human acrocentric chromosomes in Robertsonian translocations. Ann Hum Genet 53:40-65.
[70] Vasarhelyi, K., and Friedman, J. M. (1989) Analysing rearrangement breakpoint distributions by means of binomial confidence intervals. Ann Hum Genet 53:375-380.
[71] Warburton, D. (1984) Outcome of cases of de novo structural rearrangements diagnosed at amniocentesis. Prenat Diagn 4:69-70.
[72] Yu, C. W., Borgaonkar, D. S., and Bolling, D. R. (1978) Break points in human chromosomes. Hum Hered 28:210-225.
[73] Yunis, J. J., and Soreng, A. L. (1984) Constitutive fragile sites and cancer. Science 226:1199-1204.
