UBC Theses and Dissertations

Global approaches to identifying aberrations in early staged disease in cancers affecting women Shadeo, Ashleen 2009

GLOBAL APPROACHES TO IDENTIFYING ABERRATIONS IN EARLY STAGED DISEASE IN CANCERS AFFECTING WOMEN

by

ASHLEEN SHADEO

B.Sc., The University of British Columbia, 2001

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

in

THE FACULTY OF GRADUATE STUDIES
(Pathology and Laboratory Medicine)

THE UNIVERSITY OF BRITISH COLUMBIA
(Vancouver)

December 2009

© Ashleen Shadeo, 2009

Abstract

INTRODUCTION: Breast and cervical cancer are the most common cancers in women worldwide. Widely implemented screening programs have allowed for the detection of precancerous lesions in the breast and cervix and have provided a valuable opportunity to study the earliest events in disease development, with the goal of distinguishing cases that are likely to progress from those that are self-limited or would spontaneously regress.

OBJECTIVE: The overall objective of this thesis is to identify genes in biological pathways or networks altered in precancerous lesions of the breast and cervix using global analysis tools.

HYPOTHESIS: I hypothesize that global transcriptome and high resolution genome analysis will identify altered genes that are shared in pre-cancer lesions of both the cervix and breast.

METHODS: A comprehensive Serial Analysis of Gene Expression method was used for the unbiased analysis of well characterized, frozen samples of normal cervix, CIN I, CIN II and CIN III. Tiling resolution whole genome array comparative genomic hybridization was used for the detailed investigation of lobular carcinoma in situ (LCIS) and atypical lobular hyperplasia (ALH). The efficacy of this tool was first confirmed in commonly used breast cancer cell lines and then in archival breast cancer tissue.

RESULTS: This work has led to the identification of aberrations in a chromatin remodelling gene network not previously implicated in cervical intraepithelial neoplasia.
Genomic copy number analysis revealed novel features not previously described, including aberrations to multiple components of a single biological pathway (epidermal growth factor receptor), the delineation of alteration boundaries and the identification of a novel amplicon of prognostic significance in breast cancer. We identified novel copy number changes in several genes in LCIS and ALH (including HOXB cluster genes) that were used to elucidate a genomic signature. Aberrations in HoxB7 were identified in pre-cancer lesions of both breast and cervix (LCIS and CIN III) tissue.

CONCLUSION: Collectively, this work demonstrates that a whole genome approach to assessing genomic copy number and expression can identify novel genes and gene networks that are altered during the development of pre-cancer lesions.

Table of Contents

Abstract
Table of Contents
List of Tables
List of Figures
List of Abbreviations
Acknowledgements
Co-authorship Statement
Dedication

1. INTRODUCTION
  1.1. Cervical Cancer
    1.1.1. Incidence and Mortality
    1.1.2. Screening and Staging
    1.1.3. Etiologic Origins
  1.2. Breast Cancer
    1.2.1. Incidence and Mortality
    1.2.2. Disease Progression
    1.2.3. Classification Approaches in a Heterogeneous Disease
  1.3. Global Gene Expression Profiling in Cervical Cancer
    1.3.1. Gene Expression Analysis Techniques
    1.3.2. Gene Expression Profiling in Cervical Intraepithelial Neoplasia
  1.4. Genomic Profiling and Breast Cancer
    1.4.1. Genomic Profiling Technologies
    1.4.2. Genomic Profiling in Breast Cancer
    1.4.3. Genomic Profiling of Pre Cancer Lesions of the Breast
  1.5. Rationale
    1.5.1. Objectives and Hypothesis
    1.5.2. Specific Aims and Thesis Outline
  1.6. References

2. CHARACTERIZATION OF THE NORMAL CERVIX TRANSCRIPTOME
  2.1. Introduction
  2.2. Materials and Methods
    2.2.1. Sample Selection
    2.2.2. L-SAGE Library Construction and Sequence Tag Analysis
    2.2.3. Data Analysis
    2.2.4. Reverse Transcriptase PCR
    2.2.5. HPV Tag-to-Gene Mapping
  2.3. Results
    2.3.1. Genes Highly Expressed in Normal Cervical Epithelium
    2.3.2. Disrupted Gene Expression in CIN III
    2.3.3. Viral (HPV 16) Tags in L-SAGE libraries
  2.4. Discussion
    2.4.1. Assessing Highly Expressed Tags by Functional Group
    2.4.2. Cervical Tissue Gene Signature
    2.4.3. Genes Altered in Expression in CIN III
    2.4.4. Human L-SAGE Tags Map to the HPV Genome
  2.5. Conclusions
  2.6. References

3. TRANSCRIPTOMIC ABERRATIONS IN CHROMATIN REMODELLING PATHWAY IN CERVICAL INTRAEPITHELIAL NEOPLASIA
  3.1. Introduction
  3.2. Methods
    3.2.1. Sample Selection and Collection
    3.2.2. L-SAGE
    3.2.3. Analysis
    3.2.4. cDNA Synthesis
    3.2.5. Real Time PCR Analysis
  3.3. Results
    3.3.1. L-SAGE Libraries from Cervical Tissue
    3.3.2. Early Stage Changes
    3.3.3. Late Stage Changes
    3.3.4. Pathway Analysis of Overall Changes
  3.4. Discussion
    3.4.1. Early and Late Events
    3.4.2. Biological Characteristics of Differentially Expressed Tags
    3.4.3. Network Analysis
    3.4.4. Network A
    3.4.5. Network B
  3.5. Conclusion
  3.6. References

4. COMPREHENSIVE COPY NUMBER PROFILES OF BREAST CANCER MODEL GENOMES
  4.1. Introduction
  4.2. Materials and Methods
    4.2.1. Cell Line DNA
    4.2.2. Array CGH
    4.2.3. Imaging and Analysis
    4.2.4. Fluorescence in situ Hybridization
  4.3. Results and Discussion
    4.3.1. Whole Genome Tiling Path Analysis of Segmental Alterations
    4.3.2. Novel Features of the Genome of Model Cell Lines
    4.3.3. MCF-7 Genome
    4.3.4. BT-474 Genome
    4.3.5. ZR-75-30 Genome
    4.3.6. UACC 893 Genome
    4.3.7. SK-BR-3 Genome
    4.3.8. MDA-MB-231 Genome
    4.3.9. T-47D Genome
    4.3.10. Common Regions of Copy Number Alteration
    4.3.11. EGFR (ERBB1) and Associated Pathways
  4.4. Conclusion
  4.5. References

5. NRG1 GENE REARRANGEMENTS IN CLINICAL BREAST CANCER: IDENTIFICATION OF AN ADJACENT NOVEL AMPLICON ASSOCIATED WITH POOR PROGNOSIS
  5.1. Introduction
  5.2. Materials and Methods
    5.2.1. Breast Cancer Case Series and TMA Construction
    5.2.2. RNA Isolation
    5.2.3. Fluorescent in situ Hybridization
    5.2.4. Immunohistochemical Analysis of HRG, HER3 and HER4
    5.2.5. BAC array CGH
    5.2.6. BAC array CGH Imaging and Analysis
    5.2.7. Quantitative RT-PCR
    5.2.8. Statistical Analysis
  5.3. Results
    5.3.1. Three Distinct Types of NRG1 Aberrations
    5.3.2. HRG Correlates Significantly with Tumor Grade, p53 and HER1
    5.3.3. BAC array CGH Confirms NRG1 Aberrations and Identifies an Amplicon Centromeric to NRG1
    5.3.4. The Novel Amplicon is Present in 24% of Clinical Breast Cancer Cases and Shows a Positive Correlation with Poor Survival
    5.3.5. SPFH2 Most Significantly Correlates Over Expression with Amplification as Determined by Real-time PCR
  5.4. Discussion
  5.5. References

6. GENOMIC ALTERATIONS IN LOBULAR NEOPLASIA: A MICROARRAY COMPARATIVE GENOMIC HYBRIDIZATION SIGNATURE FOR EARLY NEOPLASTIC PROLIFERATION IN THE BREAST
  6.1. Introduction
  6.2. Materials and Methods
    6.2.1. Lobular Neoplastic Cases
    6.2.2. Microarray Comparative Genomic Hybridization
    6.2.3. Imaging and Data Visualization
    6.2.4. Microarray-CGH Data Pre-processing and Normalization
    6.2.5. Statistical Tabulation of Gains and Losses
    6.2.6. ALH and LCIS Class Discrimination
    6.2.7. Real-time PCR
    6.2.8. Fluorescence in situ Hybridization (FISH)
  6.3. Results
    6.3.1. Regions of Gain and Loss in ALH and LCIS
    6.3.2. Validation of Microarray-CGH Alterations
  6.4. Discussion
  6.5. References

7. CONCLUSION
  7.1. Discussion
    7.1.1. Global Gene Expression Analysis by SAGE in Cervical Intraepithelial Neoplasia
    7.1.2. Efficacy of Whole Genome tiling aCGH in Breast Cancer
    7.1.3. Whole Genome Tiling Resolution aCGH Analysis in Precancerous Lesions
    7.1.4. Alteration of HoxB7 Shared in Precancerous Lesions of Cervix and Breast
  7.2. Conclusions and Significance Findings
  7.3. Future Directions
    7.3.1. Histological Variants of Precursor Lesions of Interest
    7.3.2. Integrative Studies
    7.3.3. HPV Screening
    7.3.4. HoxB7
  7.4. References

List of Tables

Table 2.1. Tags expressed in normal cervical libraries
Table 2.2. Highly expressed genes with altered expression in CIN III
Table 2.3. Tags mapping to HPV16 genome
Table 3.1. Genes differentially expressed in Networks A and B
Table 3.2. Candidate genes in Networks A and B
Table 4.1. High level alterations detected by array CGH
Table 4.2. Components of the EGFR pathway affected by copy number change
Table 5.1. Correlations between 8p abnormalities and clinical/pathological variables
Table 5.2. Correlations between expression and amplification status of genes from chromosome 8p
Table 6.1. Pathological classification of lobular neoplastic cases evaluated by microarray-CGH
Table 6.2. Genomic signature for lobular neoplasia
Table 6.3. Comparison of alterations found by microarray-CGH and subsequently validated by Real-Time PCR

List of Figures

Figure 2.1. Flow diagram of SAGE analysis and tag-to-gene mapping
Figure 2.2. Validation of tissue specificity of gene expression
Figure 2.3. Summary of test panel quantitative PCR results of genes with altered expression in CIN III L-SAGE libraries
Figure 2.4. Functional groupings of tags highly expressed (>500 tpm) in normal libraries
Figure 3.3.1. Summary of L-SAGE library analysis
Figure 3.3.2. Functional groupings of tags differentially expressed in SAGE libraries
Figure 3.3.3. Biological functions targeted by gene expression changes between NC and CIN III
Figure 3.3.4. Gene Network A
Figure 3.3.5. Gene Network B
Figure 3.3.6. Summary of validation panel quantitative PCR results of genes with altered expression in CIN III L-SAGE libraries
Figure 3.4. Histopathology correlation with texture score
Figure 4.4.1. Comprehensive SMRT aCGH profile of cell line UACC-893
Figure 4.4.2. Magnified SMRT aCGH profile of 1p21.1-p11.1 region in MCF-7
Figure 4.4.3. Fluorescence in situ hybridization (FISH) analysis in SK-BR-3 and BT-474 cells
Figure 4.4.4. 17q SMRT aCGH profile of five cell lines sharing multiple sites of minimum altered regions (MAR)
Figure 5.5.1
  (a) Summary of NRG1 aberrations as determined by FISH
  (b) Amplification of the 5'-end of NRG1
Figure 5.5.2
  (a) A schematic representation of the region of interest on the 8p arm
  (b) BAC array CGH results
  (c) A graphical representation of amplification ratios obtained from FISH for all BACs
Figure 5.5.3. Minimum common region of amplification (MAR) at 8p12
Figure 5.5.4
  (a) Kaplan-Meier survival curve demonstrating the prognostic significance of the novel amplicon
  (b) Summary of univariate survival analysis for NRG1 aberrations and the novel amplicon
Figure 5.5.5. Schematic representation of the BFB mechanism
Figure 6.1. Region of copy number gain at chromosome 10p15.2-p15.1
Figure 6.2. Region of copy number loss at chromosome 16q21-q23.1
Figure 6.3. Heatmap with accompanying dendrograms illustrating the clustering of lobular neoplastic lesions using the identified 33 region genomic signature for lobular neoplasia
Figure 6.4. Region of copy number loss at chromosome 14q32.33 validated using fluorescence in situ hybridization (FISH)

List of Abbreviations

Abbreviation  Definition
aCGH  array Comparative Genomic Hybridization
ADH  Atypical Ductal Hyperplasia
ALH  Atypical Lobular Hyperplasia
BAC  Bacterial Artificial Chromosome
BFB  Break-fusion-bridge
bp  base pairs
cDNA  complementary DNA
CGH  Comparative Genomic Hybridization
CIN  Cervical Intraepithelial Neoplasia
CIS  Carcinoma in situ
Cy3  Cyanine 3
Cy5  Cyanine 5
DCIS  Ductal Carcinoma in situ
dCTP  Deoxycytidine Triphosphate
DNA  Deoxyribonucleic acid
dNTP  Deoxynucleotide Triphosphate
EST  Expressed Sequence Tag
FFPE  Formalin-fixed, Paraffin Embedded
FIGO  International Federation of Gynecology and Obstetrics
FISH  Fluorescence in situ Hybridization
HER  Human Epidermal Growth Factor Receptor
HPV  Human Papilloma Virus
HSIL  High Grade Squamous Intraepithelial Lesion
IDC  Invasive Ductal Carcinoma
IHC  Immunohistochemistry
ILC  Invasive Lobular Carcinoma
kbp  kilo-base pairs
LCIS  Lobular Carcinoma in situ
LOH  Loss of Heterozygosity
LSIL  Low Grade Squamous Intraepithelial Lesion
MAR  Minimum Amplification Region
Mbp  Mega-base pairs
M-FISH  Multiple Fluorescence in situ Hybridization
mM  millimolar
mRNA  messenger RNA
Pap  Papanicolaou
PCR  Polymerase Chain Reaction
RNA  Ribonucleic acid
ROMA  Representative oligonucleotide microarray analysis
RT PCR  Reverse Transcriptase Polymerase Chain Reaction
SAGE  Serial Analysis of Gene Expression
SCC  Squamous Cell Carcinoma
SKY  Spectral Karyotyping
SNP  Single Nucleotide Polymorphism
SNR  Signal-to-noise Ratio
TDLU  Terminal Ductal Lobular Unit
TMA  Tissue Microarray
TPM  Tags Per Million

Acknowledgements

I am indebted to many people for their support of this work over the years, but first and foremost I am grateful to my supervisor. Dr.
WL Lam has not only been an inspiring guide and advisor throughout my training but has provided challenges when I needed them most and innumerable opportunities to explore avenues of my own interest. I am indebted to this generosity.

I would like to extend my sincere appreciation to the Lam Lab, who have not only provided savvy technical support but have been there to share in the successes, commiserate in the defeats and work late into the night under a common vision. It is a family like no other and I am privileged to be a member.

I would like to thank my committee members, Drs. Sandra Dunn, Torsten Nielsen, Calum MacAulay and David Granville, for their guidance during my time as a graduate student. The breadth of experience you have brought to my training continues to be invaluable. I would like to extend my appreciation to CIHR for funding my work.

I am grateful to my family and friends for their unwavering confidence in me. Without these expectations I would surely fall flat. I am thankful to my parents for setting an inspiring example of determination and success. I extend my deepest gratitude to my husband Gordon for his constant support and belief in me. Thank you.

Dedication

For Gordon

Co-authorship Statement

Chapters 2 through 6 were originally co-authored as research manuscripts for publication. The entries below represent the complete citations for each of these works.

Chapter 2: Shadeo A, Chari R, Vatcher G, Campbell J, Lonergan KM, Matisic J, van Niekerk D, Ehlen T, Miller D, Follen M, Lam WL, Macaulay C. Comprehensive serial analysis of gene expression of the cervical transcriptome. BMC Genomics. Jun 1;8(1):142.
Contribution: I am the primary and corresponding author on this paper. Under the guidance of the principal investigators, I was responsible for project design and the execution of this study. I created a portion of the SAGE libraries investigated. In collaboration with Raj Chari, I performed data analysis.
I was also responsible for writing the manuscript. All authors read and approved the final draft of this manuscript.

Chapter 3: Shadeo A, Chari R, Lonergan KM, Pusic A, Miller D, Ehlen T, van Niekerk D, Matisic J, Richards-Kortum R, Follen M, Guillaud M, Lam WL, Macaulay C. Up regulation in gene expression of chromatin remodelling factors in cervical intraepithelial neoplasia. BMC Genomics. 2008 Feb 4;9(1):64.
Contribution: I am the primary and corresponding author on this paper. Under the guidance of the principal investigators, I was responsible for project design and the execution of this study. I created a portion of the SAGE libraries investigated. In collaboration with Raj Chari, I performed data analysis. I was also responsible for writing the manuscript. All authors read and approved the final draft of this manuscript.

Chapter 4: Shadeo A and Lam WL. Comprehensive Copy Number Profiles of Breast Cancer Cell Model Genomes. Breast Cancer Res. 2006;8(1):R9 (1-14). Epub Jan 3.
Contribution: I am the primary and corresponding author on this paper. I performed the array CGH experiments and data analysis and drafted the manuscript. WLL is the Principal Investigator. Both authors participated in the development of concepts and framework for the manuscript, the generation of figures, multiple rounds of text editing, and fact checking. Both authors read and approved the final manuscript.

Chapter 5: Prentice LM, Shadeo A, Lestou VS, Miller MA, deLeeuw RJ, Makretsov N, Turbin D, Brown LA, Macpherson N, Yorida E, Cheang MC, Bentley J, Chia S, Nielsen TO, Gilks CB, Lam W, Huntsman DG. (2005) NRG1 gene rearrangements in clinical breast cancer: identification of an adjacent novel amplicon associated with poor prognosis. Oncogene. Nov 10;24(49):7281-9.
Contribution: I am the second author on this paper. I was responsible for performing all the aCGH experiments/analysis.
The key discovery of a novel genetic rearrangement (rather than the anticipated NRG1 alteration) was based on my analysis of the CGH data. In addition, I constructed Figures 2b and 3 and edited the manuscript.

Chapter 6: Mastracci T L, Shadeo A, Colby S M, Tuck A B, O'Malley P, Bull S B, Lam W L, Andrulis I L. (2006) Genomic Alterations in Neoplastic Lesions of the Breast. Genes Chromosomes and Cancer. Nov;45(11):1007-17.
Contribution: I am the second author on this paper. I was responsible for performing all the aCGH experiments and aCGH data analysis, and was consulted on statistical analysis of the ALH and LCIS group comparison. I constructed Figures 1b, 2b and 3a and edited the manuscript.

1. INTRODUCTION

Breast and cervical cancer are the most prevalent cancers in women worldwide, with over 1.6 million cases diagnosed each year [1, 2]. Breast cancer incidence is increasing in the majority of countries, whereas invasive cervical cancer incidence has declined due to the implementation of screening programs designed to identify precancerous and early staged lesions [2, 3]. Similarly, breast cancer mortality has declined due to advances in early detection and treatment [2]. These widely implemented screening programs in breast and cervical cancer have afforded researchers an important opportunity to study the key molecular events in pre-cancer and cancer development that may be masked in later stages of disease. Advances in technologies for whole genome copy number and expression evaluation provide valuable tools not only for investigating individual key molecular events but also for examining how several events, and ultimately gene networks, may collaborate to alter cell biology toward a tumourigenic phenotype.

1.1. Cervical Cancer

1.1.1. Incidence and Mortality

Cervical cancer is the third leading cause of cancer associated deaths and the second most common cancer in women worldwide [1]. In 2000, it was estimated that over 471,000 new cases were diagnosed throughout the world and 288,000 women died of cervical cancer. The vast majority of deaths (80%) were shouldered by developing countries, with the highest mortality rates found in the Caribbean and Latin America. According to the National Cancer Institute, it is estimated that 11,070 new cases and 3,870 deaths will occur in the US. Although incidence and mortality rates are decreasing in Canada, 1,300 new cases were expected to be diagnosed in 2008 and 380 women were expected to die from this disease [4].

1.1.2. Screening and Staging

Widely implemented screening programs have been responsible for the drastic reduction of cervical cancer experienced in the developed world over the past fifty years [5]. This has primarily been achieved through the detection of precancerous lesions, which allows suspicious lesions to be simultaneously biopsied and treated by colposcopic excisional cone biopsy or Loop Electrosurgical Excision Procedure (LEEP), thus preventing progression to invasive cervical cancer and metastasis to organs such as the lungs, liver, vagina, bladder and rectum. The Papanicolaou (Pap) test is the most commonly used screen. Briefly, it is a cytomorphological assessment of exfoliated cells collected from the cervix and smeared onto a slide. The cells are categorized as normal or atypical. Atypical cells are generally followed up with a biopsy of the affected region, which is then classified by pathology as dysplasia, carcinoma in situ or invasive carcinoma. Alternatively, mildly atypical cells may first be followed closely through additional Pap tests. Invasive carcinoma is further classified into several stages according to the staging classification established by the International Federation of Gynecology and Obstetrics [6].
Dysplasia, also termed cervical intraepithelial neoplasia (CIN), is a precursor lesion to cervical cancer and can be further subdivided into CIN I, CIN II and CIN III (mild, moderate and severe dysplasia, respectively). CIN I lesions are likely to spontaneously regress to normal; however, CIN III lesions are very likely to progress to invasive cervical cancer if left untreated [7]. The distinction between CIN III and carcinoma in situ is not well defined; therefore, CIN III lesions may also be described as carcinoma in situ. Pre-cancer lesions can also be cytologically characterized using the Bethesda classification system as high grade squamous intraepithelial lesions (HSIL) and low grade squamous intraepithelial lesions (LSIL) [3]. Typically, CIN I is analogous to LSIL, while CIN II and CIN III are analogous to HSIL [6]. Further, several histological subtypes of cervical cancer exist, including squamous cell carcinoma (SCC), adenocarcinoma and adenosquamous carcinoma, with the majority of cases diagnosed as squamous cell carcinoma (>80%) followed by adenocarcinoma (15%) [6]. Repeat testing is common with the Pap screen as there are high false positive and false negative rates, thus motivating the development of additional markers [3]. An alternative screening option is the liquid based cytology Pap test, which is an adaptation of the traditional Pap test. Briefly, cells are collected by cervical brush and transferred into a liquid preservative, then a representative aliquot of the dispersed cells is applied to a slide in a monolayer for assessment. Marginally higher sensitivity for CIN lesions and adenocarcinoma in situ is reported in several studies using this technology; however, the increase in cost and the unclear benefits of this minute increase in sensitivity do not clearly support replacement of the traditional Pap test [3]. According to the British Columbia Cancer Agency, cure rates for CIS/CIN III and stage 1 invasive disease are 100% and 80-90%, respectively.

1.1.3. Etiologic Origins

Nearly 18% of cancers worldwide in 2002 were attributed to infection by bacteria or viruses. The human papilloma virus (HPV) alone accounts for over 5% of cancers worldwide [8]. This family of viruses is comprised of small (55 nm diameter) non-enveloped DNA viruses that are capable of infecting a spectrum of animals from birds to mammals [9]. The double stranded circular DNA genome encodes 10 proteins (E1, E2, E3, E4, E5, E6, E7, E8, L1 and L2) that are expressed in either the early or late phase of the virus infection lifecycle within the nucleus of the host cell. The virus infects keratinocytes in the basal layers of stratified epithelium. Keratinocyte differentiation regulates the expression of viral genes. E1, E2, E4, E6 and E7 are expressed in undifferentiated cells, while the viral capsid proteins L1 and L2, and E4, are expressed in terminally differentiating keratinocytes. New HPV virions are released at this stage. The infection is usually cleared spontaneously. Oncogenic transformation occurs with persistent infection that is characteristically accompanied by the integration of the viral episome into the host genome in a manner that disrupts the expression of E2, the transcriptional repressor of E6 and E7, resulting in the overexpression of the viral oncogenes E6 and E7. Specifically, E6 inactivates the p53 tumour suppressor through ubiquitin mediated degradation, resulting in the deregulation of protective cellular processes governed by p53, including apoptosis, cell cycle arrest and senescence in response to stressors such as DNA damage [10, 11]. E6 further contributes to the development of a malignant phenotype through its transcriptional activation of telomerase, leading to immortalization of the affected epithelial cells [12]. E7 inactivates the pRb tumour suppressor by binding and destabilizing the protein, thus preventing normal pRb regulation of the E2F family of transcription factors that guide the cell cycle [13].
E6 and E7 genes of high risk strains are more efficient in targeting host tumour suppressor genes and activities.

HPV is a sexually transmitted virus and a causal agent in 90% of anal cancers, 40% of vulvar and vaginal cancers, 12% of oropharyngeal cancers and 3% of oral cancers [14]. The contribution of HPV to cervical cancer was first proposed by Harald zur Hausen in 1983 when he discovered that a new strain of HPV (HPV 16) was present in over 60% of the cases he investigated [15]. Soon after, a second new strain, HPV 18, was also identified and found to be present in over 20% of cervical cancers [16]. To date, over 100 strains of HPV have been identified and a number of them are associated with cervical cancer, including HPV 16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, 66 and 68 [17]. One or more strains of HPV are detected in 99% of SCC, 94% of pre-malignant lesions and 46% of normal cervical epithelium [7]. Although HPV 16 and HPV 18 are considered highly virulent strains and play a role in the development of 80% of cervical cancers, infection alone is not sufficient to cause disease [18]. Over 70% of women will be infected with a strain of HPV in their lifetime and nearly all will clear the infection spontaneously; therefore, additional genetic events must occur for the development of cervical cancer [19]. Persistent infection in a fraction of women affected (<10%) will lead to a 30-50% chance of developing cervical cancer in their lifetime if left untreated [20]. Progression from infection to cancer is estimated to take 10-15 years [21].

Recently, two first generation HPV vaccines have become available that are comprised of the L1 capsid protein. A divalent vaccine from GlaxoSmithKline (Cervarix®) establishes immunity to HPV 16 and 18, and a quadrivalent vaccine (Gardasil®) by Merck immunizes not only against HPV 16 and HPV 18 but also against the genital wart causing strains HPV 6 and HPV 11 [5, 22].
Gardasil has been approved for use in several countries including Australia, Mexico, the European Union, New Zealand and Canada and is recommended for girls between the ages of 9 and 26. Vaccinating men and women is estimated to result in a 44% reduction in prevalence of HPV, while vaccinating only women will result in a 30% reduction [23]. The three dose vaccination of girls in grades 6 and 9 was implemented in British Columbia beginning in the 2008-2009 academic year.

Screening programs such as those described are difficult to implement and establish in the developing world where they are most needed. Pap screening, for example, not only requires consistent and thorough assessment of samples collected but also a commitment from patients to submit to regular screening and, if necessary, follow up procedures. Vaccination would eventually reduce the cases of cervical cancer by 70%; however, the vaccine is costly (~$500 USD per vaccine schedule) and must be administered in three doses, which may be challenging [24]. If these hurdles are overcome and widespread vaccination of women in developing countries were to occur, a reduction in cervical cancer rates would not be observed for one or two decades due to the long latency period between HPV infection and disease development. The female population would need to continue to be screened using conventional methods in order to identify those cases initiated by HPV strains not encompassed by the vaccine or by previous infection with vaccinated strains. Frontline monitoring will therefore continue to play an important role in cervical cancer prevention and treatment. The high false positive results of current screening tests lead to costly retesting and motivate the development of improved methods and markers for detection.
Genetic events in addition to HPV infection are required for the progression to cervical cancer, and markers rooted in those events would aid in distinguishing cases that are likely to progress from those that would spontaneously regress.

Detection of HPV DNA is currently at the forefront of new technologies in cervical cancer screening. The high sensitivity and reproducibility of DNA detection tests enable the recognition of HPV infection prior to any observable cytological abnormalities, offering key advantages over the Pap screen [25, 26]. A limitation is the reduced specificity in women under the age of 30 with respect to the detection of cases that are likely to progress, as transient infections are common [27]. Several assays are currently available for the detection of HPV, including the Hybrid Capture 2 (Hc2) and GP5+/6+ PCR enzyme assay that detect 13 oncogenic strains of HPV (16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59 and 68) [28, 29]. Hc2 has been approved by the US Food and Drug Administration and is used as an adjunct to the Pap test in women over 30 in the US. This assay involves the hybridization of specimen HPV DNA to RNA probes for the 13 oncogenic strains. The resulting DNA-RNA hybrid is recognized by an immobilized antibody against RNA-DNA hybrids. The bound DNA-RNA-antibody complex is recognized by an alkaline phosphatase-conjugated monoclonal antibody which cleaves an added chemiluminescent substrate, resulting in the emission of light. Luminescence is proportional to the quantity of viral DNA and is measured in relative light units (RLU) against a positive control of HPV 16 DNA.

The effectiveness of HPV DNA testing as a primary screen is currently under investigation. The proposed design is that women who test negative for the presence of HPV DNA would be re-screened in 2-4 years, whereas a positive test would be followed up more closely with further cytological evaluation.
This scenario aims to reduce the frequency of screening while enabling earlier, unambiguous detection of precancerous lesions. In order to achieve these goals, testing must be optimized to detect >CIN II, thus eliminating unnecessary follow up procedures, as the vast majority of low grade cases will not progress [30]. A combination of screening methods may be optimal in reducing high false positive rates and detecting advanced lesions. Preliminary findings indicate that 30-35% more high grade lesions are detected in women 32-38 years of age when HPV DNA testing and cytology assessment are used in combination as opposed to cytology screening alone [31]. The addition to this scheme of assays detecting molecular markers that correlate with HPV persistence and high grade CIN will further specifically identify cases that are likely to progress to invasive disease if left untreated.

1.2. Breast Cancer

1.2.1. Incidence and Mortality

In Canada, it is estimated that 22,400 women were diagnosed with breast cancer (with the 50-59 age group having the highest occurrence rates) and 5,300 succumbed to this disease in 2008 [4]. Breast cancer remains the most prevalent cancer in women; however, mortality rates have been on the decline for over twenty years (five year survival is 89% in British Columbia according to the British Columbia Cancer Agency) [1, 32]. This decline is primarily due to improved detection and management of early stage disease [32-34].

1.2.2. Disease Progression

The branching structure of the normal human mammary gland consists of an epithelial bilayer of luminal and myoepithelial cells. Acini comprise terminal duct lobular units (TDLU), which coalesce into subsegmental ducts, then segmental ducts and finally the collecting duct [35]. Observations in histology, pathology and genomic analyses have allowed for comment on the evolution of breast cancer.
Based on these techniques, it has been postulated that well differentiated ductal carcinoma in situ (DCIS) and lobular carcinoma in situ (LCIS) progress to low grade invasive ductal carcinoma (IDC) or invasive lobular carcinoma (ILC), whereas poorly differentiated DCIS progresses to high grade IDC [36-38]. More recently, LCIS has been suggested to be an indicator of increased risk (7-18%) for both IDC and ILC in both the ipsilateral and contralateral breast, irrespective of where the LCIS was first identified [39].

Genetic instability, ultimately resulting in the disruption (tumour suppressor gene inactivation) or increased action (oncogene activation) of genes involved in cellular processes, is a well known characteristic of malignant cells, and this trait has further contributed to ideas in breast cancer progression [40]. The grade of invasive breast cancer appears to relate to the extent of genomic aberrations, in that high grade disease harbours a greater number of complex alterations than low grade disease. For example, 16q loss is a common event in low grade breast cancers but is uncommon among the complex aberrations observed in high grade disease, which frequently possesses gains at 5p, 8q and 17q, losses at 8p, 11q, 13q and 14q, and amplifications at 6q22, 8q22, 11q13, 17q12, 17q22-q24 and 20q13 [36]. This suggests that progression from low grade disease to high grade disease is unlikely and that these lesions may arise through different mechanisms.

The relationship between the accumulation of genetic alterations and colon cancer progression, as pivotally described by Vogelstein et al., is also a widely described hypothesis in the progression of breast cancer [36, 37, 41-43].
For example, poorly differentiated ductal carcinoma in situ frequently harbours alterations similar to those found in high grade invasive ductal carcinoma, such as an increased number and complexity of alterations; specifically, gains at 1q, 5p, 8q and 17q and losses at 8p, 11q, 13q and 14q, in addition to amplifications at 6q22, 8q22, 11q13, 17q12 and 17q22-q24 [36]. In contrast, high level amplifications are rare in both highly differentiated ductal carcinoma in situ and low grade invasive disease, while loss of 16q and gain of 1q are the most frequent shared events [44, 45]. Similar observations have been made for genetic changes in lobular carcinoma in situ and invasive lobular carcinoma. For example, the most common events include gains at 1q and 6q and losses at 8p, 16p, 16q, 17p, 17q and 22q in LCIS and gains at 4p and 5p and losses at 16q, 17q, 18q and 22q in lobular carcinoma in situ [46-50]. It is noteworthy that 16q loss and 1q gain overlap not only in lobular carcinoma in situ and invasive lobular carcinoma but also in low grade ductal carcinoma in situ and invasive ductal carcinoma. Together, this illustrates how molecular pathology supports the hypothesis of a clonal genetic relationship in the progression of poorly differentiated ductal carcinoma in situ to high grade invasive disease and of highly differentiated ductal carcinoma in situ to low grade invasive ductal carcinoma, which may share a mechanistic process with LCIS progression to invasive disease.

1.2.3. Classification Approaches in a Heterogeneous Disease

Breast cancer is a heterogeneous disease and there are several approaches to its classification, including molecular pathology, histology, and more recently genomic copy number and gene expression analysis. Different subtypes vary in disease behaviour and may carry different prognoses.
The widely used molecular classification schemes for current treatment purposes rely primarily on immunohistochemical (IHC) staining for the presence or absence of estrogen receptor (ER) and human epidermal growth factor receptor 2 (Her2), due to the availability of drugs that target these receptors (tamoxifen, lapatinib and trastuzumab) [42, 51]. Further, staining for the cell proliferation marker Ki67 may be used to describe the growth rate of the tumour; however, the optimal cut-off between positive and negative IHC staining is not universally established [42].

Recent discoveries of various clinically relevant breast cancer gene expression signatures have led to the advent of risk assessment assays. One of the first is the RT-PCR based assay of 16 genes used to give a recurrence score in node negative, ER positive breast cancers (Oncotype DX) [52]. The Amsterdam signature (Mammaprint) evaluates the expression of 70 different genes, allowing for the prediction of the potential for distant metastasis in both node negative and node positive cases [53]. Similarly, the Rotterdam signature uses 76 genes and ER status as determined by IHC to predict metastatic potential in node negative breast cancer. Interestingly, only three genes overlap between the latter two assays [54]. Expression analysis is also responsible for the recent biological classification of breast cancer into five subgroups (luminal A, luminal B, basal, normal-like, Her2), thus offering further confirmation that this is not a single disease but several diseases which require tailored treatment methods [55-57].

1.3. Global Gene Expression Profiling in Cervical Cancer

1.3.1. Gene Expression Analysis Techniques

During the course of this work, gene expression assessment rapidly evolved from single gene expression techniques such as Northern blot analysis and RT-PCR to high throughput expression profiling techniques such as Serial Analysis of Gene Expression (SAGE), microarray analysis and, most recently, transcriptome sequencing, which allow for a landscape view of global expression. SAGE allows for rapid and detailed analysis of thousands of transcripts, as first described in 1995 by Velculescu et al. [58]. Briefly, this method involves the isolation of unique cDNA sequence tags from individual mRNAs, followed by the concatenation of the tags into long DNA molecules for sequencing. The frequency of each tag in the cloned multimers directly reflects transcript abundance. The key advantages of this technique are the relatively small amount of starting material required and the quantitative analysis of a large number of transcripts without prior knowledge of the genes. Sequencing cost, a lengthy and complicated protocol, and some ambiguity in tag-to-gene mapping are shortcomings of this technique.

Expression microarrays utilize competitive hybridization principles, as first described by Schena et al. [59]. Differentially fluorescently labelled reference material and a sample of interest are hybridized to a known set of cDNA transcript targets printed on a glass slide. The ratios between the two fluorophores denote expression level differences between the two samples. This technology has advanced to the simultaneous assessment of several thousand cDNAs, which, when analyzed, show complex patterns in gene expression that have subsequently been used to subclassify breast cancers as previously discussed [55, 56]. Oligonucleotide microarrays (oligo arrays) comprise several 25-70mer probes per gene for thousands of genes.
While the exact number of genes encompassed by an oligonucleotide array and the size of each probe vary by manufacturer, this technology, unlike cDNA arrays, can allow for the detection of single nucleotide polymorphisms, mutations, splice forms and novel transcripts [60]. Generally, due to signal saturation and high background, array based analyses have a limited dynamic range in comparison to techniques such as SAGE and deep transcriptome sequencing [61].

Transcriptome sequencing (e.g. RNA-Seq) is currently at the forefront of transcriptome research and has been made possible through the development of next generation sequencing technologies such as Illumina’s fluorescently labelled reversible terminator-based sequencing chemistry and Roche’s pyrosequencing-based 454 Life Sciences technique [62-65]. A major advantage of these developing technologies over traditional capillary electrophoresis based sequencing (Sanger sequencing) is the drastic increase in throughput. Specifically, this novel technology enables simultaneous sequencing of millions of sequences or “reads”, as opposed to capillary array electrophoresis, which has traditionally been limited to 96 reads per experiment [66]. Significantly, unlike cDNA array and SAGE techniques, these novel techniques allow not only for the identification of single nucleotide polymorphisms and mutations but also for the discovery of splice junctions, splice forms and fusion genes. Apart from cost, which is projected to be less than that of capillary electrophoresis but still remains high at approximately $200,000 per human genome sequence (3 Gb), the major disadvantage is the short read length. Second generation sequencers (GS-FLX) from 454 Life Sciences allow reads of approximately 250 bp with 99% accuracy, whereas Illumina reads are shorter at approximately 50 bp on average with similar accuracy [66].
Shorter sequences are bioinformatically challenging to assemble into contigs, especially within regions of repetitive sequence, and this issue may be addressed with the future development of these technologies.

1.3.2. Gene Expression Profiling in Cervical Intraepithelial Neoplasia

A number of global expression studies investigating changes in cervical cancer have been published; however, very few investigate the gene expression changes in pre cancerous CIN lesions [67-70]. Chen et al. identified 62 genes over expressed in CIN III and cervical cancer using a 30,000 cDNA microarray and 34 specimens (12 tumour, 12 normal, 5 HSIL and 5 LSIL) [71]. Many of the identified genes are known to be involved in cellular proliferation or are associated with the extracellular matrix. Kendrick et al. compared expression in 5 pairs of patient matched normal and CIN III lesions by oligonucleotide microarray analysis and found 24 genes to be aberrantly expressed [72]. Predictably, a significant portion of the fourteen over expressed genes identified contributed to cell cycle or immune function. In 2005, Perez-Plasencia et al. published the first analysis of the normal cervical epithelium by SAGE [73]. Briefly, one SAGE library was constructed from one normal cervical epithelium sample, from which 30,418 tags were sequenced and mapped to genes to describe the cervical epithelium transcriptome; the most highly expressed genes contributed to epithelial growth and differentiation networks. Although an important first study, this work did not include analysis of CIN or invasive cervical cancer, nor did it offer a deeper sequencing analysis for the discovery of low level transcripts. A comprehensive consideration of global gene expression patterns in the progression of pre cancer lesions from normal cervical epithelium to severe dysplasia would significantly contribute to our understanding of key initial events in disease development.

1.4. Genomic Profiling and Breast Cancer

As previously discussed, aberrations in genomic copy number have long been associated with cancer progression [40]. The accumulation of aberrations during cancer development and progression, first described in the now established colon cancer progression model, was important in prompting parallel investigations in breast cancer [41]. Technologies for assessing DNA copy number have quickly evolved from these early beginnings to the high throughput options available today, which are being used to elucidate the molecular complexities of breast cancer.

1.4.1. Genomic Profiling Technologies

Molecular cytogenetic techniques such as G-banding and spectral karyotyping (SKY) allow for an overview of genome copy number. Specifically, metaphase spreads are assessed for chromosome band rearrangements, gains and losses by G-banding, while SKY assesses similar genomic alterations through the use of a 24 colour probe set which virtually creates a karyogram where each chromosome is painted a different colour. Alternatively, fluorescence in situ hybridization (FISH) has allowed for the assessment of copy number at a specific locus through the use of fluorescently labelled DNA probes complementary to the locus of interest. A variation on this technique, M-FISH, enables the assessment of multiple loci in one experiment through the use of multiple differentially labelled probes.

Comparative genomic hybridization (CGH) was a significant next leap in copy number detection technology and utilized the same basic principles of hybridization as SKY and FISH, although with a marked improvement: competitive cohybridization. Briefly, the CGH technique involves differentially labelling isolated DNA from a specimen of interest and reference DNA with Cy3 and Cy5 dCTP and cohybridizing them to a metaphase spread of chromosomes.
The two samples compete to hybridize to each locus, and ratios of fluorescence between the two probes translate into relative copy number similarities and differences between the two samples (gains/losses) at a resolution of approximately 20 Mbp [74].

The advent of array CGH (aCGH) has further improved on resolution through the cohybridization of the differentially labelled sample and reference to segments of the genome tethered onto a slide (targets) [74]. The targets vary between aCGH platforms not only in derivation but also in the genome coverage offered. The earliest genome wide arrays were comprised of approximately 5,000 cDNA targets of 0.5-2 kb (cDNA microarray) as identified through an expressed sequence tag (EST) database [75]. A cDNA based aCGH platform is advantageous in that direct correlation of copy number and expression aberrations can be achieved using the same platform. Bacterial artificial chromosome derived platforms (BAC arrays) were initially limited to portions of the genome such as individual chromosomes and allowed for sampling at intervals of approximately 3 Mbp, thus leaving gaps in which information was inferred from the adjacent BAC clones [76]. This technique was later broadened to encompass the entire genome; however, gaps of Mbp intervals in the data remained until the advent of tiling coverage [77-79]. High density oligonucleotide arrays for DNA copy number assessment, such as ROMA (representational oligonucleotide microarray analysis), are analogous in principle to those used for expression analysis as previously described. Alternatively, arrays that utilize shorter oligonucleotide sequences allow for the detection of allelic imbalances by single nucleotide polymorphism (SNP) microarray analysis in addition to changes in genomic copy number [80-82].
The primary advantage afforded by high density or tiling resolution BAC arrays and oligonucleotide arrays is the valuable opportunity to examine regions of the genome that had previously gone uninvestigated, such as regulatory sites, unannotated novel genes and introns. Further, the challenges introduced by tissue heterogeneity and formalin fixed and paraffin embedded (FFPE) specimens, such as lower yield and poor quality of DNA, are overcome by submegabase resolution tiling BAC microarrays, as high signal to noise ratios provide increased sensitivity while the redundancy offered by tiling coverage provides confirmation of copy number status at each locus (precision) [83, 84]. This is of particular importance when microdissection of the sample is not reasonably feasible and surrounding normal tissue cannot be removed prior to the experiment. In this case, tiling resolution BAC array has been shown to tolerate up to 70% normal cell contamination and still detect single copy changes [83].

1.4.2. Genomic Profiling in Breast Cancer

With the evolution of genome copy number profiling technologies, the genomic profiling of breast cancer remains at the forefront of cancer research, as the availability of specimens and linked clinical information is greater than for other tissues of interest, thus prompting the use of such samples in preliminary proof of principle studies and first comprehensive analyses. This is not to suggest that breast cancer research is merely a means for technology development; rather, the relationship has been exceptionally mutually beneficial, as our advancement in the field of breast cancer genomics has contributed vastly to our understanding of disease development within a short period of time.
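
The tolerance to normal cell contamination noted in Section 1.4.1 can be made concrete with a simple mixture calculation. The following is a minimal sketch only, not the analysis method of the cited studies: it assumes a diploid reference, a tumour cell population with a uniform copy number at the locus, and an ideal hybridization signal proportional to mean copy number. The function name and its parameters are hypothetical and introduced purely for illustration.

```python
import math


def expected_log2_ratio(tumour_copies: int, normal_fraction: float, ploidy: int = 2) -> float:
    """Idealized log2(sample/reference) ratio at one locus for a mixed sample.

    tumour_copies   -- copy number of the locus in the tumour cells (assumed uniform)
    normal_fraction -- fraction of cells in the sample that are normal (0.0 to 1.0)
    ploidy          -- copy number in normal cells and in the reference (assumed 2)
    """
    if not 0.0 <= normal_fraction <= 1.0:
        raise ValueError("normal_fraction must lie between 0 and 1")
    # Mean copy number across the mixed cell population.
    mean_copies = (1.0 - normal_fraction) * tumour_copies + normal_fraction * ploidy
    if mean_copies <= 0:
        raise ValueError("log2 ratio is undefined when no copies remain in the sample")
    return math.log2(mean_copies / ploidy)


if __name__ == "__main__":
    # Compare a pure tumour sample with one containing 70% normal cells.
    for normal_fraction in (0.0, 0.7):
        gain = expected_log2_ratio(3, normal_fraction)  # single copy gain
        loss = expected_log2_ratio(1, normal_fraction)  # single copy loss
        print(f"normal fraction {normal_fraction:.1f}: gain {gain:+.3f}, loss {loss:+.3f}")
```

Under these assumptions, a single copy gain falls from a log2 ratio of about +0.58 in a pure tumour sample to about +0.20 at 70% normal cell content, and a single copy loss from -1.00 to about -0.23. This shrinking signal is why a high signal to noise ratio and the redundancy of overlapping clones matter for detecting single copy changes in samples that have not been microdissected.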
The two most frequent known genomic aberration events in breast cancer are the amplification of the neuroblastoma/glioblastoma derived oncogene homolog (HER2), located at 17q11-q12, and amplification of a homologue of the avian myelocytomatosis viral oncogene (c-myc) at 8q24 [85]. An increase in HER2 copy number is observed in 15-30% of breast cancers, while c-myc amplification is present in 15-25% of breast cancers [86, 87]. C-myc is a transcriptional regulator of cellular functions critical in tumour development, including proliferation, differentiation and apoptosis. It is frequently implicated in leukemias and lymphomas in addition to solid tumours; however, the amplification of HER2 is the most significant example of a chromosomal aberration utilized in both clinical evaluation and targeted therapy [88].

HER2 was first cloned and found to be amplified in a breast cancer cell line, MAC117. It was then discovered to be amplified 2-20 fold in 30% of the 189 breast tumours initially investigated by Southern blot analysis [89, 90]. HER2 was later correlated to worse tumour behaviour and has been extensively investigated since these original studies [91]. Briefly, HER2 is a complex and potent oncogene belonging to the human epidermal growth factor receptor (HER) family, which is comprised of four type 1 transmembrane growth factor receptors (EGFR/HER1, HER2, HER3 and HER4). Characteristically, HER family members possess an extracellular ligand-binding domain, a transmembrane domain and an intracellular tyrosine kinase domain. The binding of ligand induces a conformational change that promotes heterodimerization or homodimerization, leading to the transphosphorylation of intracellular domains and subsequent activation of downstream pathways involved in cell cycle regulation, cell polarity, proliferation and invasion, including the PI3K/Akt pathway activated via HER2-HER3 dimerization [92].
HER2 is unique in that it is constitutively in the active conformation and possesses the strongest catalytic kinase activity within this family of genes. Significantly, HER2 has been realized as an important molecular target for therapy, which has led to the development and clinical use of a monoclonal antibody specific to HER2, trastuzumab [93]. The implementation of trastuzumab in therapy has resulted in successful treatment of 30-35% of patients with HER2 amplification; however, the precise mechanism of inhibition remains elusive [94, 95].

The first conventional CGH study in breast cancer described sixteen high-level amplifications, several of which are widely accepted today as frequent and characteristic events in breast cancer, including 8q24, which harbours the well known oncogene c-myc, and 11q13, which encompasses the cell cycle regulator oncogene cyclin D1 (CCND1) and cortactin (CTTN, also known as oncogene EMS1), which is implicated in cellular adhesion [96, 97]. Amplification of 17q22-q24 is also characteristic of breast cancer, and this region contains several candidate oncogenes including FAM33A, DHX40, CLTC, PTRH2, TMEM49, TUBD1, RPS6KB1, ABC1, USP32, APPBP2 and PPM1D [98]. Similarly, the frequently amplified 20q13 locus also contains several candidate oncogenes such as zinc-finger protein 217 (ZNF217), a member of the cytochrome P450 superfamily (CYP24), a serine/threonine kinase (STK6) and a cellular apoptosis susceptibility gene (CAS) [96, 99, 100]. Subsequent studies by CGH both confirmed and added to these initial findings [38, 101-103]. A first investigation of breast cancer by aCGH was achieved by Pinkel et al., who created and utilized a chromosome 20 BAC array to further define the boundaries of both gains and losses [76]. As our understanding of breast cancer as a heterogeneous disease broadened, a number of studies followed which described genomic aberrations by genome wide aCGH in stratified subpopulations of breast cancer [57, 104-108].

1.4.3. Genomic Profiling of Pre cancer Lesions of the Breast

Investigation of pre cancer lesions of the breast by CGH and aCGH remains limited but has contributed significantly to our understanding of disease development. Nearly half of atypical ductal hyperplasias have been shown to contain copy number changes, with 16q and 17p losses being the most recurrent events [109]. Low grade DCIS harbours fewer genetic changes (1q gain and 16q loss most common) when compared to high grade DCIS, which may possess a complex pattern of aberrations including gains on 1q, 5p, 8q and 17q and losses on 8p, 11q, 13q and 14q [110].

Due to the relative rarity of lobular carcinoma in situ lesion samples, copy number analysis studies have been limited in this field. Investigation of atypical lobular hyperplasia and lobular carcinoma in situ by conventional CGH identified genomic alterations at chromosomes 6, 16, 17 and 22 at similar frequencies in both types of lesions. The authors suggested that both lesions may represent the same stage in the molecular evolution of cancer [111]. A second study showed that 88% of lobular carcinoma in situ cases (without concurrent invasive lobular carcinoma or invasive ductal carcinoma) also presented reduced expression of E-cadherin (CDH1, 16q22.1), which is characteristic of invasive lobular carcinoma [112]. Array CGH queries of lobular carcinoma in situ have provided confirmation of the characteristic 16q loss and 1q gain but have further identified novel regions of change, including losses at 11q11-q13, 11q14-11qter, 17p and 17q [113, 114]. More recently, oligonucleotide microarrays have been used to suggest a direct genetic relationship between lobular carcinoma in situ and invasive lobular carcinoma through the identification of shared regions of copy number gain (19p, 6p, 1p, 3p, 16p, 21q, 2q) and loss (16q and 19q) [115].

1.5. Rationale

The study of premalignant lesions will help to identify key causal events in cancer development that may otherwise be masked by the accumulation of aberrations in advanced disease. These key events will not only contribute to our understanding of disease progression but also offer a pivotal point for novel therapeutic intervention and marker based assessment.

1.5.1. Objective and Hypothesis

The overall objective of this thesis is to identify genes in biological pathways or networks altered in pre cancer lesions of the breast and cervix using global analysis tools. This objective is used to test the following hypothesis:

Global transcriptome and high resolution genome analysis will identify altered genes that are shared in pre cancer lesions of both the cervix and the breast.

1.5.2. Specific Aims and Thesis Outline

This body of work is comprised of a series of published manuscripts that individually address each aim as described below and collectively address the objective and hypothesis.

Aim 1. To perform an unbiased global gene expression analysis to identify multiple genes influencing biologically relevant functions not previously implicated in cervical intraepithelial neoplasia (Chapters 2-3).

At the time of this study, gene expression profiling was a novel tool that was beginning to be used to describe and classify cancer; however, the expression profile of pre cancer of the cervix (CIN) had not previously been described. I chose to use a comprehensive Serial Analysis of Gene Expression (SAGE) method for analysis of well characterized, frozen samples of normal cervix, CIN I, CIN II and CIN III, resulting in the largest study and collection of SAGE expression data to date. This work has led to the identification of novel genes and gene networks altered in CIN. Chapter 2 describes the detailed characterization of the normal cervical transcriptome, thus establishing a baseline against which to compare gene expression changes.
Chapter 3 describes the investigation of sequential stages of pre cancer lesions by SAGE and the identification of a gene network not previously implicated in cervical intraepithelial neoplasia.

Aim 2. To perform high resolution global copy number analysis of pre cancer lesions of the breast to identify novel alterations (Chapters 4 and 5).

Expression profiling in CIN lesions brought to light novel aberrations in biologically significant pathways in disease development, although the causal events in disease remained elusive. At the time of this work, megabase arrays were beginning to be used to comment on the copy number status of genomes; however, the large gaps in these arrays left much of the genome uninvestigated, and possible causal genomic events may thus have been overlooked. We developed a novel (at the time) tiling resolution array that allowed for detailed investigation of the entire genome. Chapters 4 and 5 describe two pilot studies to confirm the efficacy of this tool in commonly used breast cancer cell model genomes and archival breast cancer tissue, which resulted in the identification of novel features previously missed, including aberrations to multiple components of a single biological pathway (EGFR pathway) and the amplification of gene clusters (HOXB cluster). This established technology was then used to evaluate archival cases of atypical lobular hyperplasia and lobular carcinoma in situ disease of the breast, resulting in the identification of previously undescribed copy number changes in several genes (including HOXB cluster genes), and was further used to elucidate a genomic signature as described in Chapter 6.

While archival cervical samples were available, the amount and quality of the DNA available from CIN lesion biopsies was considered too limited for use with high resolution global copy number analysis. As a result, these experiments could not be repeated for the CIN lesions.

1.6. References

1. Parkin DM, Bray F, Ferlay J, Pisani P: Global cancer statistics, 2002. CA Cancer J Clin 2005, 55(2):74-108.
2. Pecorelli S, Favalli G, Zigliani L, Odicino F: Cancer in women. International Journal of Gynecology & Obstetrics 2003, 82(3):369-379.
3. Nijhuis ER, Reesink-Peters N, Wisman GB, Nijman HW, van Zanden J, Volders H, Hollema H, Suurmeijer AJ, Schuuring E, van der Zee AG: An overview of innovative techniques to improve cervical cancer screening. Cell Oncol 2006, 28(5-6):233-246.
4. Canadian Cancer Society/National Cancer Institute of Canada: Canadian Cancer Statistics 2008. Toronto, Canada; 2008.
5. Wheeler CM: Advances in primary and secondary interventions for cervical cancer: human papillomavirus prophylactic vaccines and testing. Nat Clin Pract Oncol 2007, 4(4):224-235.
6. Benedet JL, Bender H, Jones H 3rd, Ngan HY, Pecorelli S: FIGO staging classifications and clinical practice guidelines in the management of gynecologic cancers. FIGO Committee on Gynecologic Oncology. Int J Gynaecol Obstet 2000, 70(2):209-262.
7. Scheurer ME, Tortolero-Luna G, Adler-Storthz K: Human papillomavirus infection: biology, epidemiology, and prevention. Int J Gynecol Cancer 2005, 15(5):727-746.
8. Parkin DM: The global health burden of infection-associated cancers in the year 2002. Int J Cancer 2006, 118(12):3030-3044.
9. Zheng ZM, Baker CC: Papillomavirus genome structure, expression, and post-transcriptional regulation. Front Biosci 2006, 11:2286-2302.
10. Scheffner M, Werness BA, Huibregtse JM, Levine AJ, Howley PM: The E6 oncoprotein encoded by human papillomavirus types 16 and 18 promotes the degradation of p53. Cell 1990, 63(6):1129-1136.
11. Whibley C, Pharoah PD, Hollstein M: p53 polymorphisms: cancer implications. Nat Rev Cancer 2009, 9(2):95-107.
12. Veldman T, Horikawa I, Barrett JC, Schlegel R: Transcriptional activation of the telomerase hTERT gene by human papillomavirus type 16 E6 oncoprotein. J Virol 2001, 75(9):4467-4472.
13. Munger K, Baldwin A, Edwards KM, Hayakawa H, Nguyen CL, Owens M, Grace M, Huh K: Mechanisms of Human Papillomavirus-Induced Oncogenesis. J Virol 2004, 78(21):11451-11460.
14. Parkin DM, Bray F: Chapter 2: The burden of HPV-related cancers. Vaccine 2006, 24 Suppl 3:S3/11-25.
15. zur Hausen H: Papillomaviruses in the causation of human cancers - a brief historical account. Virology, In Press, Corrected Proof.
16. Boshart M, Gissmann L, Ikenberg H, Kleinheinz A, Scheurlen W, zur Hausen H: A new type of papillomavirus DNA, its presence in genital cancer biopsies and in cell lines derived from cervical cancer. EMBO J 1984, 3(5):1151-1157.
17. Wentzensen N et al: Grading the severity of cervical neoplasia based on combined histopathology, cytopathology, and HPV genotype distribution among 1,700 women referred to colposcopy in Oklahoma. International Journal of Cancer 2009, 124(4):964-969.
18. Woodman CB, Collins SI, Young LS: The natural history of cervical HPV infection: unresolved issues. Nat Rev Cancer 2007, 7(1):11-22.
19. Baseman JG, Koutsky LA: The epidemiology of human papillomavirus infections. J Clin Virol 2005, 32 Suppl 1:S16-24.
20. McCredie MR, Sharples KJ, Paul C, Baranyai J, Medley G, Jones RW, Skegg DC: Natural history of cervical neoplasia and risk of invasive cancer in women with cervical intraepithelial neoplasia 3: a retrospective cohort study. Lancet Oncol 2008, 9(5):425-434.
21. Mitchell MF, Tortolero-Luna G, Wright T, Sarkar A, Richards-Kortum R, Hong WK, Schottenfeld D: Cervical human papillomavirus infection and intraepithelial neoplasia: a review. J Natl Cancer Inst Monogr 1996(21):17-25.
22. Markowitz LE, Dunne EF, Saraiya M, Lawson HW, Chesson H, Unger ER: Quadrivalent Human Papillomavirus Vaccine: Recommendations of the Advisory Committee on Immunization Practices (ACIP). MMWR Recomm Rep 2007, 56(RR-2):1-24.
23. Hughes JP, Garnett GP, Koutsky L: The theoretical population-level impact of a prophylactic human papilloma virus vaccine. Epidemiology 2002, 13(6):631-639.
24. Katz IT, Wright AA: Preventing Cervical Cancer in the Developing World. N Engl J Med 2006, 354(11):1110.
25. Sherman ME, Lorincz AT, Scott DR, Wacholder S, Castle PE, Glass AG, Mielzynska-Lohnas I, Rush BB, Schiffman M: Baseline cytology, human papillomavirus testing, and risk for cervical neoplasia: a 10-year cohort analysis. J Natl Cancer Inst 2003, 95(1):46-52.
26. Wright TC Jr, Schiffman M, Solomon D, Cox JT, Garcia F, Goldie S, Hatch K, Noller KL, Roach N, Runowicz C et al: Interim guidance for the use of human papillomavirus DNA testing as an adjunct to cervical cytology for screening. Obstet Gynecol 2004, 103(2):304-309.
27. Cuzick J, Clavel C, Petry KU, Meijer CJ, Hoyer H, Ratnam S, Szarewski A, Birembaut P, Kulasingam S, Sasieni P et al: Overview of the European and North American studies on HPV testing in primary cervical cancer screening. Int J Cancer 2006, 119(5):1095-1101.
28. Terry G, Ho L, Londesborough P, Cuzick J, Mielzynska-Lohnas I, Lorincz A: Detection of high-risk HPV types by the hybrid capture 2 test. J Med Virol 2001, 65(1):155-162.
29. van den Brule AJ, Pol R, Fransen-Daalmeijer N, Schouls LM, Meijer CJ, Snijders PJ: GP5+/6+ PCR followed by reverse line blot analysis enables rapid and high-throughput identification of human papillomavirus genotypes. J Clin Microbiol 2002, 40(3):779-787.
30. Meijer CJLM et al: Guidelines for human papillomavirus DNA test requirements for primary cervical cancer screening in women 30 years and older. International Journal of Cancer 2009, 124(3):516-520.
31. Naucler P, Ryd W, Tornberg S, Strand A, Wadell G, Elfgren K, Radberg T, Strander B, Forslund O, Hansson B-G et al: Efficacy of HPV DNA Testing With Cytology Triage and/or Repeat HPV DNA Testing in Primary Cervical Cancer Screening. J Natl Cancer Inst 2009, 101(2):88-99.
32. Nakagawa T, Huang SK, Martinez SR, Tran AN, Elashoff D, Ye X, Turner RR, Giuliano AE, Hoon DS: Proteomic profiling of primary breast cancer predicts axillary lymph node metastasis. Cancer Res 2006, 66(24):11825-11830.
33. Parkin DM, Fernandez LM: Use of statistics to assess the global burden of breast cancer. Breast J 2006, 12 Suppl 1:S70-80.
34. Coleman RE: Clinical features of metastatic bone disease and risk of skeletal morbidity. Clin Cancer Res 2006, 12(20 Pt 2):6243s-6249s.
35. Stingl J, Raouf A, Emerman JT, Eaves CJ: Epithelial progenitors in the normal human mammary gland. J Mammary Gland Biol Neoplasia 2005, 10:49-59.
36. Simpson PT, Reis-Filho JS, Gale T, Lakhani SR: Molecular evolution of breast cancer. J Pathol 2005, 205(2):248-254.
37. Polyak K: Is breast tumor progression really linear? Clin Cancer Res 2008, 14(2):339-341.
38. Roylance R, Gorman P, Harris W, Liebmann R, Barnes D, Hanby A, Sheer D: Comparative genomic hybridization of breast tumors stratified by histological grade reveals new insights into the biological progression of breast cancer. Cancer Res 1999, 59(7):1433-1436.
39. Afonso N, Bouwman D: Lobular carcinoma in situ. Eur J Cancer Prev 2008, 17(4):312-316.
40. Hanahan D, Weinberg RA: The hallmarks of cancer. Cell 2000, 100(1):57-70.
41. Vogelstein B, Fearon ER, Hamilton SR, Kern SE, Preisinger AC, Leppert M, Nakamura Y, White R, Smits AM, Bos JL: Genetic alterations during colorectal-tumor development. N Engl J Med 1988, 319(9):525-532.
42. Sims AR, Howell A, Howell SJ, Clarke RB: Origins of breast cancer subtypes and therapeutic implications. Nat Clin Pract Oncol 2007, 4(9):516-525.
43. Schuetz CS, Bonin M, Clare SE, Nieselt K, Sotlar K, Walter M, Fehm T, Solomayer E, Riess O, Wallwiener D et al: Progression-specific genes identified by expression profiling of matched ductal carcinomas in situ and invasive breast tumors, combining laser capture microdissection and oligonucleotide microarray analysis. Cancer Res 2006, 66(10):5278-5286.
44. Boecker W, Buerger H, Schmitz K, Ellis IA, van Diest PJ, Sinn HP, Geradts J, Diallo R, Poremba C, Herbst H: Ductal epithelial proliferations of the breast: a biological continuum? Comparative genomic hybridization and high-molecular-weight cytokeratin expression patterns. J Pathol 2001, 195(4):415-421.
45. Moore E, Magee H, Coyne J, Gorey T, Dervan PA: Widespread chromosomal abnormalities in high-grade ductal carcinoma in situ of the breast. Comparative genomic hybridization study of pure high-grade DCIS. J Pathol 1999, 187(4):403-409.
46. Buerger H, Simon R, Schafer KL, Diallo R, Littmann R, Poremba C, van Diest PJ, Dockhorn-Dworniczak B, Bocker W: Genetic relation of lobular carcinoma in situ, ductal carcinoma in situ, and associated invasive carcinoma of the breast. Mol Pathol 2000, 53:118-121.
47. Gunther K, Merkelbach-Bruse S, Amo-Takyi BK, Handt S, Schroder W, Tietze L: Differences in genetic alterations between primary lobular and ductal breast cancers detected by comparative genomic hybridization. J Pathol 2001, 193:40-47.
48. Nishizaki T, Chew K, Chu L, Isola J, Kallioniemi A, Weidner N, Waldman FM: Genetic alterations in lobular breast cancer by comparative genomic hybridization. Int J Cancer 1997, 74:513-517.
49. Loveday RL, Greenman J, Simcox DL, Speirs V, Drew PJ, Monson JR, Kerin MJ: Genetic changes in breast cancer detected by comparative genomic hybridisation. Int J Cancer 2000, 86(4):494-500.
50. Richard F, Pacyna-Gengelbach M, Schluns K, Fleige B, Winzer KJ, Szymas J, Dietel M, Petersen I, Schwendel A: Patterns of chromosomal imbalances in invasive breast cancer. Int J Cancer 2000, 89(3):305-310.
51. Crabb SJ, Bajdik CD, Leung S, Speers CH, Kennecke H, Huntsman DG, Gelmon KA: Can clinically relevant prognostic subsets of breast cancer patients with four or more involved axillary lymph nodes be identified through immunohistochemical biomarkers? A tissue microarray feasibility study. Breast Cancer Res 2008, 10(1):R6.
52. Paik S, Shak S, Tang G, Kim C, Baker J, Cronin M, Baehner FL, Walker MG, Watson D, Park T et al: A Multigene Assay to Predict Recurrence of Tamoxifen-Treated, Node-Negative Breast Cancer. N Engl J Med 2004, 351(27):2817-2826.
53. Mook S, Van't Veer LJ, Rutgers EJ, Piccart-Gebhart MJ, Cardoso F: Individualization of therapy using Mammaprint: from development to the MINDACT Trial. Cancer Genomics Proteomics 2007, 4(3):147-155.
54. Sotiriou C, Piccart MJ: Taking gene-expression profiling to the clinic: when will molecular signatures become relevant to patient care? Nat Rev Cancer 2007, 7(7):545-553.
55. Sorlie T, Perou CM, Tibshirani R, Aas T, Geisler S, Johnsen H, Hastie T, Eisen MB, van de Rijn M, Jeffrey SS et al: Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proceedings of the National Academy of Sciences 2001, 98(19):10869-10874.
56. Perou CM, Sorlie T, Eisen MB, van de Rijn M, Jeffrey SS, Rees CA, Pollack JR, Ross DT, Johnsen H, Akslen LA et al: Molecular portraits of human breast tumours. Nature 2000, 406(6797):747-752.
57. Bergamaschi A, Kim YH, Wang P, Sorlie T, Hernandez-Boussard T, Lonning PE, Tibshirani R, Borresen-Dale AL, Pollack JR: Distinct patterns of DNA copy number alteration are associated with different clinicopathological features and gene-expression subtypes of breast cancer. Genes Chromosomes Cancer 2006, 45(11):1033-1040.
58. Velculescu VE, Zhang L, Vogelstein B, Kinzler KW: Serial analysis of gene expression. Science 1995, 270(5235):484-487.
59. Schena M, Shalon D, Davis RW, Brown PO: Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 1995, 270(5235):467-470.
60. 61. 62. 63. 64. 65. 66. 67. 68. 69. 70. 71. 72. 73. 74. 75. 76.
Bertone P, Stolc V, Royce TE, Rozowsky JS, Urban AE, Zhu X, Rinn JL, Tongprasit W, Samanta M, Weissman S et al: Global identification of human transcribed sequences with genome tiling arrays. Science 2004, 306(5705):2242-2246. Okoniewski Mi, Miller Ci: Hybridization interactions between probesets in short oligo microarrays lead to spurious correlations. BMC Bioinformatics 2006, 7:276. Nagalakshmi U, Wang Z, Waern K, Shou C, Raha D, Gerstein M, Snyder M: The transcriptional landscape of the yeast genome defined by RNA sequencing. Science 2008, 320(5881):1344-1349. Wang Z, Gerstein M, Snyder M: RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 2009, 10(1):57-63. Bentley DR, Balasubramanian S, Swerdlow HP, Smith GP, Milton J, Brown CG, Hall KP, Evers DJ, Barnes CL, Bignell HR et at: Accurate whole human genome sequencing using reversible terminator chemistry. Nature 2008, 456(7218):53-59. Margulies M, Egholm M, Altman WE, Attiya 5, Bader JS, Bemben LA, Berka J, Braverman MS, Chen YJ, Chen Z et al: Genome sequencing in microfabricated high-density picolitre reactors. Nature 2005, 437(7057):376-380. Hert DG, Fredlake CP, Barron AE: Advantages and limitations of next-generation sequencing technologies: a comparison of electrophoresis and non-electrophoresis methods. Electrophoresis 2008, 29(23):46 18-4626. Shim C, Zhang W, Rhee CH, Lee JH: Profiling of differentially expressed genes in human primary cervical cancer by complementary DNA expression array. Clin Cancer Res 1998, 4(1 2):3045-3050. Chao A, Wang TH, Lee YS, Hsueh 5, Chao AS, Chang TC, Kung WH, Huang SL, Chao FY, Wei ML et at: Molecular characterization of adenocarcinoma and squamous carcinoma of the uterine cervix using microarray analysis of gene expression. IntJ Cancer 2006, 119(1):91-98. Contag SA, Gostout BS, Clayton AC, Dixon MH, McGovern RM, Calhoun ES: Comparison of gene expression in squamous cell carcinoma and adenocarcinoma of the uterine cervix. Gynecot Oncol 2004, 95(3):610-617. 
Wong YF, Cheung TH, Tsao GS, Lo KW, Yim SF, Wang VW, Heung MM, Chan SC, Chan LK, Ho TW et at: Genome-wide gene expression profiling of cervical cancer in Hong Kong women by oligonucleotide microarray. mt j Cancer 2006, 118(1 0):246 1-2469. Chen Y, Miller C, Mosher R, Zhao X, Deeds J, Morrissey M, Bryant B, Yang D, Meyer R, Cronin F et at: Identification of Cervical Cancer Markers by cDNA and Tissue Microarrays. Cancer Res 2003, 63(8):1927-1935. Kendrick JE, Conner MG, Huh WK: Gene expression profiling of women with varying degrees of cervical intraepithelial neoplasia. JLow Genit Tract Dis 2007, 11(1):25-28. Perez-Plasencia C, Riggins G, Vazquez-Ortiz G, Moreno J, Arreola H, Hidalgo A, Pina-Sanchez P, Salcedo M: Characterization of the global profile of genes expressed in cervical epithelium by Serial Analysis of Gene Expression (SAGE). BMC Genomics 2005, 6:130. Kallioniemi A, Kallioniemi OP, Sudar D, Rutovitz D, Gray JW, Waldman F, Pinkel D: Comparative genomic hybridization for molecular cytogenetic analysis of solid tumors. Science 1992, 258(5083):818-821. Pollack JR, Perou CM, Alizadeh AA, Eisen MB, Pergamenschikov A, Williams CF, Jeffrey SS, Botstein D, Brown P0: Genome-wide analysis of DNA copy-number changes using cDNA microarrays. Nat Genet 1999, 23(1):41-46. Pinkel D, Segraves R, Sudar D, Clark 5, Poole I, Kowbel D, Collins C, Kuo WL, Chen C, Zhai Y et at: High resolution analysis of DNA copy number variation using comparative genomic hybridization to microarrays. Nat Genet 1998, 20(2):207-2 11. 21  77.  78.  79.  80.  81.  82.  83. 84.  85. 86.  87. 88. 89. 90.  91.  92. 93.  94.  95.  Snijders AM, Nowak N, Segraves R, Blackwood 5, Brown N, Conroy J, Hamilton G, Hindle AK, Huey B, Kimura K et al: Assembly of microarrays for genome-wide measurement of DNA copy number. Nat Genet 2001, 29(3):263-264. 
Ishkanian AS, Malloff CA, Watson SK, DeLeeuw RJ, Chi B, Coe BP, Snijders A, Albertson DG, Pinkel D, Marra MA et al: A tiling resolution DNA microarray with complete coverage of the human genome. Nat Genet 2004. Watson SK, deLeeuw RJ, Ishkanian AS, Malloff CA, Lam WL: Methods for high throughput validation of amplified fragment pools of BAC DNA for constructing high resolution CGH arrays. BMC Genomics 2004, 5(l):6. Lucito R, Healy J, Alexander J, Reiner A, Esposito D, Chi M, Rodgers L, Brady A, Sebat J, Troge J et al: Representational oligonucleotide microarray analysis: a high-resolution method to detect genome copy number variation. Genome Res 2003, 13(lO):2291-2305. Bignell GR, Huang J, Greshock J, Watt S, Butler A, West S, Grigorova M, Jones KW, Wei W, Stratton MR et al: High-resolution analysis of DNA copy number using oligonucleotide microarrays. Genome Res 2004, 14(2):287-295. Matsuzaki H, Loi H, Dong 5, Tsai YY, Fang J, Law J, Di X, Liu WM, Yang G, Liu G et al: Parallel genotyping of over 10,000 SNPs using a one-primer assay on a high-density oligonucleotide array. Genome Res 2004, 14(3):414-425. Garnis C, Coe BP, Lam SL, MacAulay C, Lam WL: High-resolution array CGH increases heterogeneity tolerance in the analysis of clinical samples. Genomics 2005, 85(6):790-793. Lockwood WW, Coe BP, Williams AC, MacAulay C, Lam WL: Whole genome tiling path array CGH analysis of segmental copy number alterations in cervical cancer cell lines. mt J Cancer 2007, 1 20(2):43 6-443. Bieche I, Lidereau R: Genetic alterations in breast cancer. Genes Chromosomes Cancer 1995, 14(4):227-251. Funkhouser WK, Kaiser-Rogers K: Review: significance of, and optimal screening for, HER-2 gene amplification and protein overexpression in breast carcinoma. Ann Clin Lab Sd 2001, 31 (4):349-358. Nass SJ, Dickson RB: Defining a role for c-Myc in breast tumorigenesis. Breast Cancer Res Treat 1997, 44(1):1-22. 
Popescu NC, Zimonjic DB: Chromosome-mediated alterations of the MYC gene in human cancer. J Cell Mol Med 2002, 6(2): 151-159. King CR, Kraus MH, Aaronson SA: Amplification of a novel v-erbB-related gene in a human mammary carcinoma. Science 1985, 229(4717): 974-976. Slamon DJ, Clark GM, Wong SG, Levin WJ, Ullrich A, McGuire WE: Human breast cancer: correlation of relapse and survival with amplification of the HER-2/neu oncogene. Science 1987, 235(4785):177-182. Slamon DJ, Godolphin W, Jones LA, Holt JA, Wong SG, Keith DE, Levin WJ, Stuart SG, Udove J, Ullrich A et al: Studies of the HER-2/neu proto-oncogene in human breast and ovarian cancer. Science 1989, 244(4905):707-712. Moasser MM: The oncogene HER2: its signaling and transforming functions and its role in human cancer pathogenesis. Oncogene 2007, 26(45):6469-6487. Carter P, Presta L, Gorman CM, Ridgway JB, Henner D, Wong WL, Rowland AM, Kotts C, Carver ME, Shepard HM: Humanization of an anti-p185HER2 antibody for human cancer therapy. Proc NatlAcadSci USA 1992, 89(10):4285-4289. Vogel CL, Cobleigh MA, Tripathy D, Gutheil JC, Harris EN, Fehrenbacher L, Slamon DJ, Murphy M, Novotny WF, Burchmore M et al: Efficacy and safety of trastuzumab as a single agent in first-line treatment of HER2-overexpressing metastatic breast cancer. J Clin Oncol 2002, 20(3):7 19-726. Mass RD, Press MF, Anderson S, Cobleigh MA, Vogel CL, Dybdal N, Leiberman G, Slamon DJ: Evaluation of clinical outcomes according to HER2 detection by fluorescence in situ 22  96.  97. 98.  99.  hybridization in women with metastatic breast cancer treated with trastuzumab. Clin Breast Cancer 2005, 6(3):240-246. Kallioniemi A, Kallioniemi OP, Piper J, Tanner M, Stokke T, Chen L, Smith HS, Pinkel D, Gray JW, Waidman FM: Detection and mapping of amplified DNA sequences in breast cancer by comparative genomic hybridization. Proc Nati Acad Sci US A 1994, 91 (6):2 156-2160. 
Ormandy CJ, Musgrove EA, Hui R, Daly RJ, Sutherland RL: Cyclin Dl, EMS1 and 1 1q13 amplification in breast cancer. Breast Cancer Res Treat 2003, 78(3):323-335. Parssinen J, Kuukasjarvi T, Karhu R, Kallioniemi A: High-level amplification at 17q23 leads to coordinated overexpression of multiple adjacent genes in breast cancer. Br J Cancer 2007, 96(8): 1258-1264. Quinlan KGR, Verger A, Yaswen P, Crossley M: Amplification of zinc finger gene 217 (ZNF217) and cancer: When good fingers go bad. Biochimica et Biophysica Acta (BBA) Reviews on Cancer 2007, 1775(2):333-340. Hodgson JG, Chin K, Collins C, Gray JW: Genome amplification of chromosome 20 in breast cancer. Breast Cancer Res Treat 2003, 78(3):337-345. Barlund M, Tirkkonen M, Forozan F, Tanner MM, Kallioniemi 0, Kallioniemi A: Increased copy number at 17q22-q24 by CGH in breast cancer is due to high-level amplification of two separate regions. Genes Chromosomes Cancer 1997, 20(4):372-376. Cingoz S, Altungoz 0, Canda T, Saydam S, Aksakoglu G, Sakizli M: DNA copy number changes detected by comparative genomic hybridization and their association with clinicopathologic parameters in breast tumors. Cancer Genet Cytogenet 2003, 145(2):108-114. Rennstam K, Ahlstedt-Soini M, Baldetorp B, Bendahl P0, Borg A, Karhu R, Tanner M, Tirkkonen M, Isola J: Patterns of chromosomal imbalances defines subgroups of breast cancer with distinct clinical features and prognosis. A study of 305 tumors by comparative genomic hybridization. Cancer Res 2003, 63(24):886 1-8868. Pollack JR, Sorlie T, Perou CM, Rees CA, Jeffrey SS, Lonning PE, Tibshirani R, Botstein D, Borresen-Dale AL, Brown P0: Microarray analysis reveals a major direct role of DNA copy number alteration in the transcriptional program of human breast tumors. Proc Nati Acad Sci US A 2002, 99(20):12963-12968. Albertson DG: Profiling breast cancer by array CGH. Breast Cancer Res Treat 2003, 78(3):289298. 
Fridlyand J, Snijders AM, Ylstra B, Li H, Olshen A, Segraves R, Dairkee 5, Tokuyasu T, Ljung BM, Jam AN et al: Breast tumor copy number aberration phenotypes and genomic instability. BMC Cancer 2006, 6:96. Loo LW, Grove DI, Williams EM, Neal CL, Cousens LA, Schubert EL, Holcomb IN, Massa HF, Glogovac J, Li CI: Array comparative genomic hybridization analysis of genomic alterations in breast cancer subtypes. Cancer Res 2004, 64:854 1 8549. Chin K, DeVries S, Fridlyand J, Spellman PT, Roydasgupta R, Kuo WL, Lapuk A, Neve RM, Qian Z, Ryder T et al: Genomic and transcriptional aberrations linked to breast cancer pathophysiologies. Cancer Cell 2006, lO(6):529-541. Gong G, DeVries 5, Chew KL, Cha I, Ljung BM, Waldman FM: Genetic changes in paired atypical and usual ductal hyperplasia of the breast by comparative genomic hybridization. Clin Cancer Res 2001, 7(8):2410-2414. Rescher U, Gerke V: Annexins--unique membrane binding proteins with diverse functions. J Cell Sci 2004, 117(Pt 13):2631-2639. Lu YJ, Osin P, Lakhani SR. Di Palma S, Gusterson BA, Shipley JM: Comparative genomic hybridization analysis of lobular carcinoma in situ and atypical lobular hyperplasia and potential roles for gains and losses of genetic material in breast neoplasia. Cancer Res 1998, 58:472 1 4727. Etzell JE, Devries 5, Chew K, Florendo C, Molinaro A, Ljung BM, Waldman FM: Loss of chromosome 16q in lobular carcinoma in situ. Hum Pathol 2001, 32(3):292-296. -  100. 101.  102.  103.  104.  105. 106.  107.  -  108.  109.  110. 111.  -  112.  23  113.  114.  115.  Hwang ES, Nyante SJ, Yi Chen Y, Moore D, DeVries S, Korkola JE, Esserman U, Waidman FM: Clonality of lobular carcinoma in situ and synchronous invasive lobular carcinoma. Cancer 2004, 100(12):2562-2572. Nyante SJ, Devries 5, Chen YY, Hwang ES: Array-based comparative genomic hybridization of ductal carcinoma in situ and synchronous invasive lobular cancer. Human Pathology 2004, 35(6):759-763. 
2. CHARACTERIZATION OF THE NORMAL CERVIX TRANSCRIPTOME

2.1. Introduction

Approximately 500,000 women are diagnosed with cervical cancer worldwide each year and more than half of them will die from this disease [1]. The highest incidence rates are observed in developing countries, where it is the second most prevalent cancer in women and remains a leading cause of cancer-related death [1]. Widely implemented screening programs have been responsible for the much lower incidence and mortality rates seen in the developed world. Present-day screening methods primarily identify precancerous lesions termed cervical intraepithelial neoplasia (CIN). CIN lesions are classified into three subgroups, CIN I, CIN II and CIN III, corresponding to mild, moderate and severe dysplasia/carcinoma in situ (CIS), respectively. CIN III lesions have a high likelihood of progression to invasive disease if left untreated [2]. Human papillomavirus (HPV) has long been established as a necessary but not sufficient cause of cervical carcinoma development. HPV is detected in 99% of invasive disease, 94% of CIN lesions and 46% of normal cervical epithelium [2]. The high-risk strains HPV 16 and HPV 18 are most prevalent in invasive disease.

A comprehensive characterization of gene expression in normal cervical tissue is critical to establish a baseline for comparison against the transcriptomes of precancer and cancer. A recent report described the global expression of genes in cervical epithelium using a serial analysis of gene expression (SAGE) based method, enumerating 30,418 sequence tags generated from one normal uterine ectocervical tissue [3].
Another study compared cDNA microarray profiles of cervical tissue to exfoliated cervical cells used in cytology-based cancer screening [4]. In this study, we increased the depth of understanding of the normal cervical transcriptome and identified gene expression changes in CIN III. We achieved this (i) by using an unbiased Long SAGE (L-SAGE) approach to improve the accuracy of tag-to-gene mapping [5-7], and (ii) by examining 691,390 L-SAGE tags, thus increasing publicly available cervical SAGE data more than 20-fold.

A version of this chapter has been published. Shadeo A, Chari R, Vatcher G, Campbell J, Lonergan KM, Matisic J, van Niekerk D, Ehlen T, Miller D, Follen M, Lam WL, Macaulay C. Comprehensive serial analysis of gene expression of the cervical transcriptome. BMC Genomics. Jun 1;8(1):142.

2.2. Materials and Methods

2.2.1. Sample Selection

Specimens were collected immediately prior to the loop electrosurgical excision procedure (LEEP) cone biopsy, targeting a small portion of the affected epithelium. These specimens were collected with patient consent at the Vancouver General Hospital Women’s Clinic at Vancouver Hospital & Health Science Centre. Cases were assessed by cervical cancer pathologists at Vancouver Hospital and Health Science Centre and were selected without prior knowledge of HPV status. Specimens N1 and N2 in this study were observed to be normal squamous epithelia, whereas C1 and C2 were identified as high-grade dysplasia or CIN III. Detailed information on specimen pathology based on the LEEP cone specimens can be found in Additional file 4. All samples were placed immediately in RNAlater and stored at -80°C. Three cases each of CIN III (CIN III A, CIN III B and CIN III C) and normal cervical tissue (NA, NB and NC), which were used for target validation through real-time PCR, were also collected, assessed and stored in the same manner.

2.2.2. L-SAGE Library Construction and Sequence Tag Analysis

The biopsies were individually homogenised in Lysis Binding buffer (100 mM Tris-HCl, pH 7.5, 500 mM LiCl, 10 mM EDTA, pH 8.0, 1% LiDS, 5 mM dithiothreitol). Long SAGE libraries were constructed according to the L-SAGE kit manual (Invitrogen, Ontario, Canada). Sequencing was performed at the BC Cancer Agency Michael Smith Genome Sciences Centre. L-SAGE employs 21 base pair sequence tags, reducing the ambiguity in tag-to-gene mapping that is sometimes encountered in classic SAGE libraries, which use 14 base pair sequence tags.

2.2.3. Data Analysis

Tags were mapped using the February 12, 2006 version of SAGE Genie [8], and raw tag counts excluding duplicate ditags were normalized to tags per million (tpm). A Z-test analysis, standard for SAGE data analysis, was performed as previously established by Kal et al. for comparison of one SAGE library to another, using an established cut-off of 1.96 on the absolute Z-score to determine statistically significant differences in expression levels between normal and CIN III [9].

2.2.4. Reverse Transcriptase PCR

For validation of the cervical tissue gene signature, human Multiple Tissue cDNA Panels I and II (Clontech, Mississauga, Ontario) and total RNA from human larynx, skin, stomach and tongue (Stratagene, Cedar Creek, Texas) were used. Five hundred nanograms of total RNA from larynx, skin, stomach and tongue was used to generate cDNA. Final concentrations of PCR reagents for the cervical tissue signature were 0.5 µM primer, 2 mM MgCl2, 0.2 mM dNTP, 1X PCR buffer (Invitrogen), 0.5 U Taq polymerase and 1 µL of cDNA (annealing temperatures: 55°C (ACTB, CEACAM7), 60°C (KRT6A), 65°C (SPRR3 and S100A7)). Total RNA for the panel of six cervical specimens used for validation of genes was isolated using Trizol (Invitrogen) and cDNA was generated using the High Capacity TaqMan Reverse Transcription Reagents, according to the manufacturer’s instructions (Applied Biosystems, Foster City, CA).
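The tag normalization and library comparison described in Section 2.2.3 can be sketched as follows. This is a minimal Python illustration, not the analysis code used in the study: the function names are hypothetical, the tag counts in the usage lines are invented, and the pooled two-proportion form of the Z statistic is an assumption about how the test of Kal et al. [9] is computed. Only the two library sizes (the useful tag totals reported for N1 and C1) and the 1.96 cut-off come from the text.

```python
import math

Z_CUTOFF = 1.96  # cut-off on the absolute Z-score stated in Section 2.2.3


def tags_per_million(count, library_size):
    """Normalize a raw tag count to tags per million (tpm)."""
    return count * 1_000_000 / library_size


def z_score(count_a, total_a, count_b, total_b):
    """Two-proportion Z statistic for one tag in libraries A and B.

    Assumption: the variance is estimated from the pooled proportion p0.
    """
    p_a = count_a / total_a
    p_b = count_b / total_b
    p0 = (count_a + count_b) / (total_a + total_b)
    se = math.sqrt(p0 * (1 - p0) * (1 / total_a + 1 / total_b))
    if se == 0:
        return 0.0  # tag absent from both libraries
    return (p_a - p_b) / se


# Hypothetical tag counts; library sizes are the useful tag totals of N1 and C1.
normal_count, normal_total = 120, 136_276
cin3_count, cin3_total = 60, 154_828

z = z_score(normal_count, normal_total, cin3_count, cin3_total)
is_significant = abs(z) > Z_CUTOFF
```

A positive Z here means the tag is relatively more abundant in the first library passed to `z_score`, so the sign convention depends on argument order.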
Expression of genes selected by data analysis was analyzed by real-time PCR using TaqMan® Gene Expression Assays on the ABI 7500 Real-Time PCR System (Applied Biosystems, Foster City, CA), according to the manufacturer’s instructions. Samples were run in duplicate and normalized against a beta-actin (ACTB) endogenous control (HuACTB, Applied Biosystems). Assay IDs include AQP3, Hs00185020_m1; GAL7, Hs00170104_m1; RPL37, Hs02340038_g1; and GJA-1, Hs00748445_s1. The relative quantification of these target genes in CIN III samples (CIN III A, CIN III B and CIN III C) compared to normal tissue samples (NA, NB and NC) was performed using the established $2^{-\Delta\Delta Ct}$ method (Applied Biosystems, Relative Quantitation Of Gene Expression, ABI PRISM 7700 Sequence Detection System: User Bulletin #2). The cycle threshold (Ct) value of the target gene was normalized to ACTB by subtracting the Ct value for ACTB from that of the target gene ($\Delta Ct_{gene} = Ct_{gene} - Ct_{ACTB}$). This number was then averaged over the three normal cases ($\mathrm{ave}\,\Delta Ct_{normal}$). The mean fold change of each gene between normal and CIN III was calculated using the following equation:

$2^{-(\Delta Ct_{gene} - \mathrm{ave}\,\Delta Ct_{normal})}$

The relative quantification values were then plotted, with a value of one indicating no change with respect to normal cervical tissue.

2.2.5. HPV Tag-to-Gene Mapping

Viral genomic sequence files (viral1.genomic.fna and HPV11.txt) were downloaded from public repositories and processed into 21 bp SAGE tags at every NlaIII site in both orientations using custom Perl scripts [10, 11].

2.3. Results

In this study, we sequenced 691,390 SAGE tags from four libraries. Cervical L-SAGE libraries N1, N2, C1 and C2 were sequenced to 165,624, 181,224, 173,534 and 171,008 tags, respectively. Duplicate ditags were eliminated from analysis, resulting in 136,276, 139,656, 154,828 and 136,386 useful tags, respectively, and a total of 24,058 unique tags (Figure 2.1A).
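The relative quantification described in Section 2.2.4 can be illustrated with a short Python sketch. The function names and all Ct values below are hypothetical and chosen only to show the arithmetic; the calculation assumes the comparative Ct form, in which each sample's normalized Ct is compared with the average normalized Ct of the normal cases.

```python
def delta_ct(ct_target, ct_actb):
    """Normalize a target gene Ct to the ACTB endogenous control."""
    return ct_target - ct_actb


def fold_change(dct_sample, dct_normals):
    """Fold change relative to the mean delta-Ct of the normal cases.

    A value of 1.0 indicates no change with respect to normal tissue.
    """
    ave_dct_normal = sum(dct_normals) / len(dct_normals)
    return 2 ** -(dct_sample - ave_dct_normal)


# Hypothetical Ct values (target gene, ACTB) for three normal cases
# and one CIN III case.
normal_dcts = [delta_ct(24.0, 18.0), delta_ct(25.0, 19.0), delta_ct(23.5, 17.5)]
cin3_dct = delta_ct(27.0, 18.0)

relative_quantity = fold_change(cin3_dct, normal_dcts)
```

Because a higher Ct means less starting template, a sample whose normalized Ct is one cycle above the normal average gives a fold change of 0.5, and one cycle below gives 2.0.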
Of the unique tags, 15,438 mapped to annotated UniGene identifiers. The raw data of the sequence tags have been made publicly available (Gene Expression Omnibus, series accession number GSE6252). We characterized the transcriptome of normal cervical tissue and evaluated the highly expressed genes in terms of tissue specificity, concordant expression among the normal libraries and altered expression in CIN III lesions (Figure 2.1B).

2.3.1. Genes Highly Expressed in Normal Cervical Epithelium

A total of 118 unique tags were found to be highly expressed in the normal cervical epithelium (at >500 tpm in both normal libraries). Of these, 103 tags mapped to UniGene clusters and represent 100 unique genes and hypothetical proteins (Figure 2.1). Manual examination of tags not mapped by SAGE Genie yielded three additional tags. This results in a total of 107 unique tag-to-gene mappings and 103 unique genes. The abundance of the 118 tags and the genes they represent are summarized in Table 2.1.

To determine cervical tissue-specific expression, we first investigated the expression of the 107 genes using expression data available at the National Center for Biotechnology Information (NCBI) UniGene database and the National Cancer Institute (NCI) Cancer Genome Anatomy Project (CGAP) SAGE Anatomical Viewer. Based on CGAP information, only four of the 107 genes were unique to cervical tissue: carcinoembryonic antigen-related cell adhesion molecule 7 (CEACAM7), keratin 6A (KRT6A), small proline-rich protein 3 (SPRR3) and S100 calcium binding protein A7 (S100A7). These genes were further investigated for expression by RT-PCR in 20 different tissue types and three normal cervical specimens (Figure 2.2). CEACAM7 was found to be expressed in colon, larynx, pancreas and two of the three normal cervical specimens. KRT6A expression was detected in placenta, thymus, tongue, prostate, larynx, colon, skin and in all three of the normal cervical specimens.
SPRR3 was found to be strongly expressed in placenta, thymus, colon, tongue, larynx and all three of the normal cervical cases. S100A7 showed expression in placenta, thymus and tongue and in all three of the normal cervical specimens. All four genes were prominently expressed in the cervical epithelium, but this combination of genes was not expressed in any of the other tissues examined (Figure 2.2).

2.3.2. Disrupted Gene Expression in CIN III

All tags were assessed for altered expression in CIN III. Four hundred and seventy-six tags showed a greater than two-fold increase in CIN III and were expressed at greater than 15 tpm (see Additional file 1), while 315 tags were decreased in CIN III (see Additional file 2). We then determined whether the expression of the 107 unique tags that were highly expressed in the normal cervical libraries (>500 tpm) was disrupted in CIN III. Comparison of expression levels in N1 and N2 to the CIN III libraries using the Z-test revealed five differentially expressed genes (Table 2.2). Annexin 2 (ANXA2), galectin 7 (LGALS7) and connexin 43 (GJA1) exhibited decreased expression in CIN III (Z < -1.96), while aquaporin 3 (AQP3) and ribosomal protein L37 (RPL37) increased in expression (Z > 1.96). Real-time PCR was performed for all five of these genes on a panel of six new cervical specimens, three each of normal and CIN III (Figure 2.3). Expression results were normalized to the housekeeping genes ACTB and 18S (Figure 2.3A and 2.3B, respectively). The decrease in expression of ANXA2, LGALS7 and GJA1 in CIN III was confirmed, while the increases in expression of AQP3 and RPL37 were not.

2.3.3. Viral (HPV 16) Tags in L-SAGE Libraries

HPV transcripts were also detected by L-SAGE. Tags from all four libraries were mapped against the genomes of HPV 16 and HPV 18. While no tags mapped to HPV 18, twelve tags from the CIN III libraries mapped to the more prevalent HPV 16 genome (Table 2.3).
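The in silico generation of viral reference tags described in Section 2.2.5 can be sketched as follows. The study used custom Perl scripts that are not reproduced in the text, so this Python version is only an illustration under stated assumptions: the function names are hypothetical, each tag is taken as the NlaIII recognition site (CATG) plus the 17 bases downstream of it to give 21 bp, and sites too close to the end of the sequence to yield a full-length tag are skipped. The toy sequence is invented and is not an HPV sequence.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def virtual_tags(genome, site="CATG", tag_len=21):
    """Collect tag_len-bp tags starting at every site on both strands."""
    genome = genome.upper()
    tags = set()
    for strand in (genome, reverse_complement(genome)):
        start = strand.find(site)
        while start != -1:
            tag = strand[start:start + tag_len]
            if len(tag) == tag_len:  # skip truncated tags at sequence ends
                tags.add(tag)
            start = strand.find(site, start + 1)
    return tags


# Invented toy sequence with a single NlaIII site.
toy_genome = "AA" + "CATG" + "A" * 17 + "GG"
reference_tags = virtual_tags(toy_genome)

# Observed library tags can then be matched by exact lookup.
observed = "CATG" + "A" * 17
is_viral = observed in reference_tags
```

Exact set lookup mirrors the requirement that a library tag match a reference tag over its full length; tolerance for sequencing errors would need a different matching step.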
The highest transcript counts of known genes belonged to E5, at 1,180 and 290 tpm, and E2, at 240 and 20 tpm, in libraries C2 and C1, respectively. When compared by BLAST [12] against the RefSeq Genome collection, none of the twelve tags matched the human genome with 100% identity. All twelve tags were also mapped against human transcript sets (mitochondrial genome, RefSeq, UCSC gene set, UniGene, Ensembl, UCSC mRNA, UCSC EST, SAGEmap and SAGEgenie SAGE tag sets). No tags matched any of the described transcript sets, with the exception of CATGCACGCTTTTTAATTACA and CATGTGTATGTATTAAAAATA, which mapped to human EST BF909200. The full-length EST sequence is 97% identical to the HPV 16 E5 gene and was likely amplified from HPV sequences in the originating uterine tumour lesion.

2.4. Discussion

This study represents the most comprehensive gene expression analysis of cervical tissue reported to date. In total, 691,390 L-SAGE tags were sequenced (Figure 2.1A). The length of the L-SAGE tags (21 bp as compared to 14 bp in conventional SAGE) greatly reduces tag-to-gene mapping ambiguity [6]. Of the 118 highly expressed tags (i.e. >500 tpm), 107 (88%) were mapped to known genes or hypothetical proteins encompassing 103 unique genes (Figure 2.1B).

2.4.1. Assessing Highly Expressed Tags by Functional Group

Of the 107 highly expressed tags (>500 tpm), 47 were expressed at extremely high levels (>10 tpm), including genes frequently used as controls in expression analysis, GAPDH and ACTB. High expression of 20 genes in normal cervical tissue was reported in a previous study [3]. Fifteen of these genes are encompassed by our list of 107 high expressers. The most highly expressed tags, expressed at >10 tpm (GTGGCCACGGCCACAGC and TACCTGCAGAATAATAA), mapped to the genes S100A9 and S100A8, respectively. Both genes belong to the calcium binding protein family. These findings are in agreement with a previous report of high S100A8 and S100A9 levels in cervical tissue [3].
Although the function of these genes is not well understood, genes within this family have been proposed to participate in a variety of cellular processes including cell cycle, wound healing and cell differentiation [13].

Assigning the 103 highly expressed genes to one of eleven broad functional groups allowed for an assessment of the cellular processes represented by the most abundant transcripts. These groups include calcium binding proteins, cell cycle or cell death, cytoskeleton, immune functioning, keratinization, membrane proteins, mitochondrial, protein processing, translation (ribosomal proteins) and translation (non-ribosomal proteins), with a small fraction of tags mapping to other functional groups or to genes with no known function (Figure 2.4A). The 41 ribosomal genes account for the greatest proportion of highly expressed genes, at 28% and 31% (normal and CIN III, respectively). In contrast, only five calcium binding genes account for the second largest functional subgroup of highly expressed tags, at 18% and 19% (normal and CIN III, respectively).

The relative expression levels of the functional groups do not change greatly between the normal and CIN III libraries; however, the keratin and immune-related functional groups are slightly lower in normal tissue (9% and 14%, respectively) than in CIN III (12% and 17%, respectively). All tags expressed at or greater than 15 tpm in the Normal and CIN III libraries (2,814 and 3,279, respectively) were also evaluated according to functional group using Onto-Express (see Additional File 3) [14]. The most represented groups included DNA-dependent transcription regulators and transcription in both the Normal and CIN III libraries.

2.4.2. Cervical Tissue Gene Signature

Four of the 103 unique genes we found to be abundantly expressed in normal cervical tissue were documented to have limited or no expression in other tissues according to the web resources NCBI UniGene [15] and NCI CGAP SAGE Anatomical Viewer [8, 16] (Figure 2.1B). These genes (CEACAM7, SPRR3, S100A7 and KRT6A) are our candidates for an expression signature unique to normal cervical tissue and were further investigated in a panel of 20 different tissue types and three new normal cervical specimens. We found that the four genes were not abundantly expressed simultaneously in any of the 20 tissues examined (Figure 2.2). Placenta, thymus and tongue were found to express a combination of three genes (S100A7, SPRR3 and KRT6A), while colon expressed another combination (CEACAM7, SPRR3 and KRT6A). KRT6A and SPRR3 expression was observed in larynx tissue, with only minimal expression detected for S100A7 and CEACAM7. In contrast, two of the cervical cases strongly expressed all four genes investigated, while the third showed only very low CEACAM7 expression. Significantly, our study is the first to document CEACAM7 expression in cervix. The data suggest that abundant co-expression of CEACAM7 and S100A7 is unique to cervical tissue and has the potential to serve as a useful biomarker in identifying the origin of metastatic disease.

It is interesting to note that decreased expression of three of the genes is linked to abnormal growth and organization of epithelium. For example, CEACAM7 is a member of the carcinoembryonic antigen family of genes, and its expression has been documented in highly differentiated normal colon epithelium and the apical surface of normal ductal pancreas epithelium, while loss of expression has been reported in colon hyperplastic polyps [17]. Another member of this gene family, CEACAM1, has been shown to have no or very low expression in cervical carcinoma [18].
Decreased expression of KRT6A and S100A7 has been associated with breast, lung and ovarian cancer [19-23]. SPRR3 belongs to the class of small proline-rich genes, which are expressed in differentiated keratinocytes, and has previously been shown to be highly expressed in normal cervical tissue [24].

2.4.3. Genes Altered in Expression in CIN III

The 107 highly expressed unique tags in the normal libraries were assessed for expression changes in CIN III. Two genes showed increased expression (AQP3 and RPL37) while three genes declined in transcript counts in the CIN III libraries (ANXA2, GJA1 and LGALS7). All five were evaluated by real-time PCR in a new cervical tissue panel. Results for ANXA2, LGALS7 and GJA1 confirmed the L-SAGE findings.

Panel results for GJA1 agree with those reported by King et al., in that GJA1 expression was detected in normal cervical epithelium while reduced expression was observed in the CIN III lesions investigated [25]. It has been suggested that this pattern may be a consequence of epithelial disorganization and not causative in dysplasia development [26, 27]. The high expression of LGALS7 in normal cervical epithelium, contrasted with the low expression seen in the CIN III lesions in the tissue panel we report here, is similar to expression patterns seen in studies of other normal tissue types compared to their respective carcinomas, including cornea and larynx [28]. LGALS7 expression has been hypothesized to occur in all stratified epithelial tissue types and has been experimentally detected in human cornea, heart, larynx, tongue, skin, thymus and thyroid [29-31]. Though LGALS7 has been investigated in the context of cell line models, it is interesting to note that expression of this gene in cervical epithelium has not been previously reported [29-31].
LGALS7 is one of fifteen members of the β-galactoside binding lectin family, some of which have been shown to influence cell growth, cell cycle, apoptosis and cell migration via their predicted role in homeostasis; however, the role of LGALS7 in cancer is unclear [30, 32]. Support for the pro-apoptotic function of LGALS7 was reported by Kuwabara et al. and Bernerd et al., who found that epithelium-derived cells were more sensitive to apoptosis when LGALS7 expression was high [31, 33, 34]. In contrast, Demers et al. showed an increase in LGALS7 expression in lymphoma cells and suggested a positive role in cell growth and dispersal through induced matrix metalloproteinase 9 (MMP9) expression [35]. We did not observe a statistically significant change in MMP9 in the four libraries investigated. This variance in expression suggests multiple roles for LGALS7 that may be tissue-type dependent.

The third gene investigated, ANXA2, is known to be highly expressed in epithelial cells and is localized to the plasma membrane and endosome. It has been suggested to function in linking membrane to membrane and membrane to cytoskeleton [36]. The decrease in expression we observe suggests that the loss of ANXA2 may be a causative factor in the disorganisation of the epithelial architecture that is characteristic of cervical neoplastic lesions. ANXA2 is also known to bind S100A10 and participates in transport channel function across the plasma membrane [36]. Interestingly, the ANXA2 binding site on S100A10 also binds NS3, a viral protein from the bluetongue virus, which therefore competes directly with ANXA2. The S100A10 protein has also been shown to inhibit Hepatitis B virus polymerase (HBV pol) activity [37]. It is plausible that an S100A10-ANXA2 complex may have a role in HPV infection or the viral lifecycle. S100A10 expression was high and consistent in the Normal and CIN III libraries, whereas ANXA2 decreased in the CIN III libraries.
The above real-time PCR results were normalized to the widely accepted housekeeping gene ACTB (Figure 2.3A). For comparison we also normalized the genes to a second housekeeping gene, 18S (Figure 2.3B). Briefly, results for GJA1 and LGALS7 were in agreement with those obtained when normalizing to ACTB; however, ANXA2 was shown to be increased in CIN III as expected by the SAGE data but this does not concur with the QPCR results when normalized to ACTB. One possible explanation is that, on average, the ACTB cycle threshold was >1.3 Ct lower in the CIN III cases, indicating an increase in ACTB expression in CIN III lesions. Any Ct decrease smaller than this in the genes investigated would appear as a decrease in gene expression in CIN III when normalized to ACTB but as an increase when normalized to 18S.

The real-time PCR results for AQP3 and RPL37 did not concur with the L-SAGE data, which may reflect interindividual differences rather than changes present in CIN lesions or cancer. It is interesting to note that RPL37 over-expression has been reported in prostate cancer, colon cancer cell lines and clinical specimens [38, 39]. Our L-SAGE results also suggest a similar pattern in cervical neoplasia. Results for these genes when normalized to 18S showed an increase in expression only in CIN III A (see Additional file 3).

Wong et al. investigated gene expression in invasive cervical carcinoma by DNA microarray [40]. We examined this publicly available data through NCBI GEO [41]. Briefly, in this data, GJA1 showed a small decrease in expression in invasive disease, while LGALS7 was detected in only four of the 26 specimens, three of which were normal tissue. Moderate AQP3 expression was detected in the majority of cases, including the control group. Expression of ANXA2 and RPL37 was not assessed in the microarray study.

2.4.4. Human L-SAGE Tags Map to the HPV Genome

HPV is an established etiological factor in cervical cancer [2].
There are over 100 known strains of HPV; however, HPV 16 and HPV 18 are considered to be the most frequent high risk types owing to higher rates of persistent infection, higher rates of progression to cervical neoplasia, and shorter median progression times than other HPV strains [2]. HPV 16 is the most common strain and can be detected in approximately 60% of cervical cancers, while HPV 18 infection occurs in approximately 10-20%. Uncontrolled expression of the E6 and E7 genes from strains 16 or 18 is considered to be essential for oncogenic transformation and functions through inhibition of the host cell tumour suppressors p53 and the retinoblastoma protein (Rb) [2].

This is the first study to mine human SAGE libraries for viral transcripts. Overall, CIN III library C2 (2,548 tpm) possessed a greater number of tag counts from the more prevalent HPV 16 strain than C1 (320 tpm) (Table 2.3). HPV 18 tags were not found in any library and no viral tags were detected in the normal libraries. With the exception of two tags, the viral tags expressed in cervical SAGE did not map to known human genes or expressed sequences. The exceptions, CATGCACGCTTTTTAATTACA and CATGTGTATGTATTAAAAATA, mapped to a single human EST isolated from a uterine tumour (EST BF909200). The full length EST sequence is 97% identical to the HPV 16 genome, more specifically the E5 gene, and was therefore likely amplified from HPV sequences in the original lesion.

Tags mapping to the E5 gene accounted for the greatest proportion of HPV tags mapping to known transcripts in both C1 and C2 (93% and 52%, respectively). E5 is considered to be one of three HPV 16 oncogenes (E5, E6 and E7) and is highly expressed in basal cells of premalignant cervical lesions [42]. This expression declines as cells differentiate and move toward the apical face of the epithelium, whereas E6 and E7 expression increases [42]. E5 is detected throughout all epithelium layers in high grade lesions such as CIN III.
In contrast, expression is restricted to layers closest to the basal cells in low grade lesions, implying that E5 expression may be limited to undifferentiated basal cells [43, 44]. The high expression of E5 we observe in the CIN III libraries and the absence of HPV 16 genes in the normal libraries are in concordance with such studies.

An increase in sample size and the inclusion of mild and moderate stages of cervical intraepithelial neoplasia will aid in quantifying the relationship between viral gene expression and disease. This will also assist in further elucidating genes important in early lesion transcriptome events. A comparison of such events with those seen in later stages will help to identify genes important in the molecular pathogenesis of the disease.

2.5. Conclusions

In this study we have described the transcriptome of normal cervical tissues and compared it against that of CIN III lesions. This was achieved by constructing four L-SAGE libraries and sequencing them to a depth of 172,848 tags per library on average. We highlighted that the Long SAGE technique provides a comprehensive profile of the transcriptome without focusing only on known genes. Potent tumour suppressors (e.g. PTEN), cell cycle mediators (e.g. CCND1), and cellular respiration genes (e.g. NDUFA1) were found to be tightly regulated in the normal libraries. An expression signature of four highly expressed genes (KRT6A, CEACAM7, S100A7 and SPRR3) in normal cervical epithelium was identified and confirmed, and three abundantly expressed genes (ANXA2, GJA1 and LGALS7) were found to have altered expression in CIN III. Furthermore, this is the first study to have identified viral tags in human SAGE libraries, demonstrating the versatile nature of SAGE data, which allows for mining and re-mining according to newly posed questions. HPV 16 E5 transcripts were found to be the most highly expressed, while few E7 and no E6 transcripts were enumerated.
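The re-mining of SAGE libraries for viral tags described in Section 2.4.4 can be illustrated with a minimal sketch. The thesis does not specify how tags were matched to the HPV 16 genome, so the approach below is an assumption: it enumerates every CATG anchor followed by 17 bases on both strands of a reference sequence (matching the 21-character form of the two tags quoted above) and reports the tags-per-million value of any library tag found in that set. The function names `virtual_tags` and `viral_tpm` are hypothetical, and the sequence used in the usage lines is a toy string, not HPV 16.

```python
ANCHOR = "CATG"   # NlaIII site that precedes each tag, as in the tags quoted in the text
TAG_LENGTH = 17   # bases following the anchor (assumed from the quoted 21-character tags)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def virtual_tags(sequence):
    """Collect every full-length anchor+tag found on either strand.

    Simplification: SAGE normally samples the 3'-most anchor of each
    transcript; here all anchor positions are enumerated.
    """
    full_length = len(ANCHOR) + TAG_LENGTH
    tags = set()
    forward = sequence.upper()
    for strand in (forward, reverse_complement(forward)):
        start = strand.find(ANCHOR)
        while start != -1:
            tag = strand[start:start + full_length]
            if len(tag) == full_length:  # skip anchors too close to the end
                tags.add(tag)
            start = strand.find(ANCHOR, start + 1)
    return tags


def viral_tpm(library_counts, viral_sequence):
    """Return {tag: tags per million} for library tags that occur in the viral sequence."""
    total = sum(library_counts.values())
    candidates = virtual_tags(viral_sequence)
    return {
        tag: count * 1_000_000 / total
        for tag, count in library_counts.items()
        if tag in candidates
    }


# Toy usage: one "viral" tag among 1,000 sequenced tags.
toy_genome = "AA" + "CATG" + "ACGTACGTACGTACGTA" + "TT"
toy_library = {"CATGACGTACGTACGTACGTA": 5, "CATG" + "T" * 17: 995}
print(viral_tpm(toy_library, toy_genome))
```

Matching on exact strings keeps the sketch simple; a real analysis would also need to decide how to treat sequencing errors and tags shared with human transcripts, as the two EST-matching tags discussed in Section 2.4.4 show.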
The identification of expression changes associated with stages of disease progression will help further our understanding of cervical cancer development and potentially elucidate novel targets for diagnosis and treatment. Establishing a baseline from which to compare is essential to the identification of such aberrations, and the 20-fold increase in cervical gene expression data presented here is a significant contribution to this effort.

Table 2.2 Highly expressed genes with altered expression in CIN III

Gene Symbol  N1  N2  C1  C2  SAGE Tag  unknown  2510  3623  2655  873  TGATITCAC1TCCACTC  unknown  2517  3430  3307  1672  unknown  594  559  497  594  unknown  2033  1683  2280  unknown  807  852  1298  NI  N2  CI  --  --  RPLI3A  1490  716  1434  CTAAGACTTCACCAGTC  RPL2!  209!
1353  GTAGGGGTAAAAGGAGG  RI’L23  1196  1031  1239  CACCTAATTGOAAGCGC  RPL27A  1923  0  CAAGCATCCCCOTTCCA  RPL28  1996  C2  SAGE Tog  --____  1195  CTCCTCACCTGTA1TTT  1899  1554  GCATAATAGGTGTTAAA  1072  1114  ATTCTCCAGTATA1TFG  1310  989  1899  GA060AGTTTCATTAAA  1504  983  1833  OCAGCCATCCGCA606C  unknown  822  680  814  631  AGGTGGCAAGAAAT000  RPL29  1049  773  1104  814  060CT0000TCCTCCTG  unknown  3757  2714  4134  2757  ACTTTTTCAAAAAAAAA  RPL3  917  702  840  983  GGACCACTGAAGAAAGA  unknown  616  687  782  374  ACTAACACCCTTAATTC  RPL3O  1541  1017  259  961  CCAGAACAGACTGGTGA  unknown  1519  1525  2222  2126  AAAAAAAAAAAAAAAAA  RPL3I  609  573  568  491  AAGGAOAT000AACTCC  unknown  565  780  588  308  TTTAACGGCCGCGGTAC  RPL32  1512  1024  550  1371  TGCACGUTTCTGTTTA  unknown  6289  6602  6510  4326  TTCATACACCTATCCCC  RPL35  638  551  665  572  CGCCGCCGOCTCAACAA  ACTB  2099  1031  1292  1459  GCTTTATTTGTTrrTTT  RPL36  1570  1275  1628  1173  AGGAAAGCTGCTGCCAA  ACTOI  1424  1955  1195  598  CTAGCCTCACGAAACTG  RPL37  5063  3272  6362  6064  CAATAAATJrTCTO0TT  ANXAI  3713  6480  3701  10162  AGAAAOATGTCTATGTA  RPL37A  1350  938  1259  953  AAGACAGTGGcTGGCGG  ANXA2  1644  1754  298  887  CTTCCAGCTAACAGGTC  RPL4  521  537  465  359  CGCCGGAACACCATrCT  AQP3  609  573  930  1136  TTTGC1TFTG1TTTGYT  RPL4I  2649  1812  2609  906  TTGGTCCTCTOCCCTGG  ATP5G3  543  773  743  851  GGAATGTACGITA1TfC  RPL7  1702  1160  1427  1180  ATTAT1TrTCTAAGCTG  82M  8718  2821  9443  5712  GTTGTGGTTAATCTGGT  RPLPO  815  1239  846  601  CTCAACATCTCCCCCTr  BZRP  550  960  794  675  OAATTTTATAAGCTGAA  RPLPI  3574  2735  3507  2398  TrCAATAAAAAGCTGAA  CCNBIIPI  580  924  988  565  CCACTGCACTCCAGCCT  RPLP2  2422  1461  2196  2376  GOATTTGGCCTTTTTGA  C1324  851  559  775  697  GGAACAAACAGATCGAA  RPSII  1365  802  917  770  TCTGTACACCTGTCCCC  CD74  2348  573  1789  1444  0TTCACATTAGAATAAA  RPSI2  690  695  601  396  GCCGAGGAAGGCATfGC  CEACAM7  793  866  89!  
594  AATCACAAATAAAAGCC  RPSI4  1159  859  1240  843  TAAAAAAAAAAAAAAAA  CFLI  1937  1031  1744  1679  GAAGCAGGACCAGTAAG  RPSI8  107!  1053  1085  858  TGGTGTTGAGGAAAGCA  COX4II  646  523  969  616  CCTATTTACTGGAAACC  RPSI9  2598  1604  2357  2618  CT060TTAATAAATTGC  CSTB  712  388!  1227  2676  ATGAGCTGACCTATTrC  RPS2O  903  695  1020  667  GCTTTTAAGGATACCGG  CTSD  2502  931  2894  1987  GAAATACAGTTGTTGGC  RPS23  96!  637  833  704  CTGTTGGTGATATTCCT  EEFIA!  2245  1353  1737  1496  TGTGTTGAGAGCTTCTC  RPS25  1042  988  924  660  AATAGGTCCAACCAGCT  EEFIB2  96!  1210  1072  1298  GCATTTAAATAAAAGAT  R1’S26  2576  002  1938  873  TAAGGAGCTOAGTrCTT  EEFIG  726  501  807  418  T000CAAAGCCTTCAAT  RPS27A  815  623  988  887  AACTAAAAAAAAAAAAA  FLJ2O7OI  550  50!  685  264  ATTTGAGAAGCCTTCGC  RPS28  998  902  1053  1034  GACGACACGAGCCGATC ATAATFCrrTGTATATA  Ffl-!l  2311  275  2325  5176  TT000GmCCTTTACC  RPS29  2245  1940  2325  2317  GAPD  1666  2363  2312  2816  TACCATCAATAAAGTAC  RPS3A  1218  823  1427  1232  GTGAAGGCAGTAGTTCT  GSA!  741  1310  504  513  TGTTCTGGAGAGTGTtC  RPS4X  1541  952  1615  924  TCAGATC1TrGTACGTA  GLUL  1658  1275  905  2845  TACAGTATGTTCAAAGT  RPS8  402!  3587  4282  3996  TAATAAAG0TG]TfA]7  161-101  14118  788  349  4715  GAAATAAAGCACCCACC  RPS9  587  530  407  521  CCAGTGGCCCGGAGCTG  101-161  18286  773  60!  6548  GAAATAAAGCACCCAGC  SIOOAIO  514  50!  543  506  AGCAGATCAGGACACTT  ITM2B  1203  1031  1369  865  TAAGTAGCAAACA000C  SIOOA2  2950  723  2235  3505  GATCTC1T006CCCAGG  K-ALPHA-I  118!  795  1783  1239  TGTACCTGTAATAT1TT  SIOOA7  2899  752  388  5990  GAGCAGCGCCCTGTrCC  KRTI3  1380  8206  815  2046  AAAGCG600CTGGAGAA  SIOOA8  690  103!  827  1466  GCT1TTTUGT000CTG  KRTIS  1152  3494  2713  2427  TAATAAAGAATTACTTT  SIOOA8  1880  25205  8855  24629  TACCTGCAGAATAATAA  KRT5  2216  1539  1892  1371  OCCCCTGCTGACACGAG  SIOOA9  1356!  
17078  10089  8660  GTGGCCACGGCCACAGC  KRT6A  1181  2657  975  1166  AAAGCACAAGTGACTAG  SERPINB3  1137  1454  704  917  CCTTTCTCTCTTrCTCT  LAMRI  2987  2370  3268  2911  OAAAAATGGTTGATGGA  SF94  LGALS7  682  2220  446  425  TAAACCTOCTGTGCGGG  SMAD2  L0C44051  719  702  588  719  000TTGGCTTGAAACCA  SPINKS  712  4189  0  1305  TCCACCAAGTCTGAGCC  LY6D  424  2248  796  3072  GAGATAAATGAITrAAA  SPRR3  1600  11757  3888  5968  TTTCCTGCTCTGCCCTC  255  1976  505  1569  mCCTCTCAATAAAGT  2935  234!  3378  1987  CCCATCOTCCTAGAArr  MGC7 1999  998  1353  924  565  TCACAAGCAAATGTGTC  SPRR3  636  13068  412!  8329  TrrCCTGCTCTrCCCrC  MYL6  1115  559  995  1063  OTGCTGAATGGCTGAGG  STX3A  646  1905  1318  312  TAAAATrnTrATGATAA  NET-S  1783  780  298  997  CACAAACGGTAGT1TFG  SUII  1446  1239  1259  2009  CAATAAACTGAAAAGAG  PER!’  306  2828  1828  2236  CCACAGGAGAATrC000  TAOLN2  660  616  769  543  GTCT0066CTTGAGGAA TTGGTGAAGGAAOAAGT  PLCB4  866  752  1072  733  AAAACATTCTCCTCCGC  TMSB4X  2245  616  1175  917  PPIA  1504  1697  1951  1452  CCTAGCTOGATTGCAGA  TPII  947  1382  1589  1584  TGA006AATAAACCTGG  PTMA  1049  1017  1518  697  TTCATTATAATCTCAAA  TPTI  4726  3659  4915  4568  TAGO]TGTCTAAAAATA TTAGCAATAAATGATGT  RPLIO  2223  1797  1944  1283  A000CTTCCAATGTGCT  TXNL5  660  838  258  924  RPLIOA  998  652  1414  821  GGCAAGCCCCAGCGCCT  UQCRH  646  508  930  726  GGTTTGGCTTAGGCTGG  RPLII  1130  773  1124  829  CGCTGGTTCCAGCAGAA  VAMP8  52!  795  459  49!  TGGCTGGGAAACTGTTG  RPLI2  932  580  865  726  ACATCATCGATGACATC  VPSI3B  763  594  911  0  CACTACTCACCAGACGC  RPL!3A  2818  1998  2855  2383  AGGCTACGGAAACAGGC  YWHAZ  587  702  517  814  TAAGTGGAATAAAAG1T  38  —  —  CJ  Ji  t.)  I’J  r’i  — —  C)  S  zt’J  S  z  U  —  1’J  t’J —  I  C)  —a  +++  —  ?? 
Figure 2.1 Flow diagram of SAGE analysis and tag-to-gene mapping. A. Sequence tags yielded from the four SAGE libraries were categorized. Useful tags indicate all sequenced tags less duplicate ditags. B. The abundance and classification of unique tags in the SAGE libraries of normal cervix tissue (N1, N2) are summarized.

[Figure 2.1, panel A, all four cervical tissue L-SAGE libraries: 691,390 tags sequenced; 124,244 duplicate ditags; 567,146 useful tags.]
[Figure 2.1, panel B, normal cervical tissue derived libraries (N1 and N2): 24,058 unique tags, 15,438 with UniGene identifiers and 8,620 without UniGene identifiers. 118 tags >500 tpm in N1 and N2; 107 map to UniGene identifiers; 103 unique genes. 431 tags with concordant expression in N1 and N2; 358 map to UniGene identifiers; 351 unique genes. 4 genes offer a gene signature unique to normal cervical tissue; 5 genes show altered expression in C1 and C2.]

Figure 2.2 Validation of tissue specificity of gene expression. Reverse transcriptase PCR of four genes (KRT6A, CEACAM7, S100A7 and SPRR3) in 20 tissue types and three normal cervical specimens. Heart (Ht), breast (Br), placenta (Pl), lung (Lg), liver (Lv), skeletal muscle (Sk), kidney (Kd), pancreas (Pn), spleen (Sp), thymus (Ty), prostate (Pr), testis (Ts), ovary (Ov), small intestine (Sm), colon (Co), peripheral leukocytes (Lk), tongue (Tg), larynx (Lx), stomach (St), skin (Sn). Ca, Cb and Cc are three individual normal cervical tissue specimens.

Figure 2.3 Summary of test panel quantitative PCR results of genes with altered expression in CIN III L-SAGE libraries.
A panel of three new CIN III cases (CIN III A, CIN III B, CIN III C) was investigated for expression and compared to three new normal specimens. Gene expression was normalized to ACTB and 18S (Figure 2.3A and 2.3B, respectively). Zero on the Y-axis denotes mean expression levels of the respective genes in normal cervical tissue. All five genes investigated showed decreased expression.

Figure 2.4 Functional groupings of tags highly expressed (>500 tpm) in normal libraries. Categories are as described. The Other group consists of tags which map to known genes but are not encompassed by any of the ten categories. The Unknown function group consists of tags mapping to no known genes. A) Tag counts in the normal libraries are categorized by functional group. B) Tags found in A were quantified in both CIN III libraries and categorized according to the same functional grouping scheme. In both groups, ribosomal genes accounted for the greatest number of tags, while only the keratins changed in expression or were decreased in the CIN III libraries.

[Figure 2.4 panels: A, Normal Total Tag Counts; B, CIN III Total Tag Counts. Legend: Calcium Binding; Cell Cycle/Death; Cytoskeleton; Immune; Keratinization; Membrane Protein; Mitochondrial; Protein Processing; Ribosomal Proteins (translation); Translation (non-ribosomal); Other; Unknown function.]

2.6. References

1. Parkin DM, Bray F, Ferlay J, Pisani P: Global cancer statistics, 2002. CA Cancer J Clin 2005, 55(2):74-108.
2. Scheurer ME, Tortolero-Luna G, Adler-Storthz K: Human papillomavirus infection: biology, epidemiology, and prevention. Int J Gynecol Cancer 2005, 15(5):727-746.
3. Perez-Plasencia C, Riggins G, Vazquez-Ortiz G, Moreno J, Arreola H, Hidalgo A, Pina-Sanchez P, Salcedo M: Characterization of the global profile of genes expressed in cervical epithelium by Serial Analysis of Gene Expression (SAGE). BMC Genomics 2005, 6:130.
4. Steinau M, Lee D, Rajeevan M, Vernon S, Ruffin M, Unger E: Gene expression profile of cervical tissue compared to exfoliated cells: Impact on biomarker discovery. BMC Genomics 2005, 6(1):64.
5. Siddiqui AS, Delaney AD, Schnerch A, Griffith OL, Jones SJ, Marra MA: Sequence biases in large scale gene expression profiling data. Nucleic Acids Res 2006, 34(12):e83.
6. Saha S, Sparks AB, Rago C, Akmaev V, Wang CJ, Vogelstein B, Kinzler KW, Velculescu VE: Using the transcriptome to annotate the genome. Nat Biotechnol 2002, 20(5):508-512.
7. Velculescu VE, Zhang L, Vogelstein B, Kinzler KW: Serial analysis of gene expression. Science 1995, 270(5235):484-487.
8. Boon K, Osorio EC, Greenhut SF, Schaefer CF, Shoemaker J, Polyak K, Morin PJ, Buetow KH, Strausberg RL, De Souza SJ et al: An anatomy of normal and malignant gene expression. Proc Natl Acad Sci USA 2002, 99(17):11287-11292.
9. Kal AJ, van Zonneveld AJ, Benes V, van den Berg M, Koerkamp MG, Albermann K, Strack N, Ruijter JM, Richter A, Dujon B et al: Dynamics of gene expression revealed by comparison of serial analysis of gene expression transcript profiles from yeast grown on two different carbon sources. Mol Biol Cell 1999, 10(6):1859-1872.
10. Cutts FT, Franceschi S, Goldie S, Castellsague X, de Sanjose S, Garnett G, Edmunds WJ, Claeys P, Goldenthal KL, Harper DM et al: Human papillomavirus and HPV vaccines: a review. Bull World Health Organ 2007, 85(9):719-726.
11. Shadeo A, Chari R, Vatcher G, Campbell J, Lonergan KM, Matisic J, van Niekerk D, Ehlen T, Miller D, Follen M et al: Comprehensive serial analysis of gene expression of the cervical transcriptome. BMC Genomics 2007, 8(1):142.
12. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol 1990, 215(3):403-410.
13. Eckert RL, Broome AM, Ruse M, Robinson N, Ryan D, Lee K: S100 proteins in the epidermis. J Invest Dermatol 2004, 123(1):23-33.
14. Draghici S, Khatri P, Bhavsar P, Shah A, Krawetz SA, Tainsky MA: Onto-Tools, the toolkit of the modern biologist: Onto-Express, Onto-Compare, Onto-Design and Onto-Translate. Nucleic Acids Res 2003, 31(13):3775-3781.
15. Wheeler DL, Church DM, Federhen S, Lash AE, Madden TL, Pontius JU, Schuler GD, Schriml LM, Sequeira E, Tatusova TA et al: Database resources of the National Center for Biotechnology. Nucleic Acids Res 2003, 31(1):28-33.
16. Strausberg RL, Buetow KH, Emmert-Buck MR, Klausner RD: The Cancer Genome Anatomy Project: building an annotated gene index. Trends in Genetics 2000, 16(3):103-106.
17. Scholzel S, Zimmermann W, Schwarzkopf G, Grunert F, Rogaczewski B, Thompson J: Carcinoembryonic antigen family members CEACAM6 and CEACAM7 are differentially expressed in normal tissues and oppositely deregulated in hyperplastic colorectal polyps and early adenomas. Am J Pathol 2000, 156(2):595-605.
18. Albarran-Somoza B, Franco-Topete R, Delgado-Rizo V, Cerda-Camacho F, Acosta-Jimenez L, Lopez-Botet M, Daneri-Navarro A: CEACAM1 in cervical cancer and precursor lesions: association with human papillomavirus infection. J Histochem Cytochem 2006, 54(12):1393-1399.
19. Cury PM, Butcher DN, Fisher C, Corrin B, Nicholson AG: Value of the mesothelium-associated antibodies thrombomodulin, cytokeratin 5/6, calretinin, and CD44H in distinguishing epithelioid pleural mesothelioma from adenocarcinoma metastatic to the pleura. Mod Pathol 2000, 13(2):107-112.
20. Emberley ED, Alowami S, Snell L, Murphy LC, Watson PH: S100A7 (psoriasin) expression is associated with aggressive features and alteration of Jab1 in ductal carcinoma in situ of the breast. Breast Cancer Res 2004, 6(4):R308-315.
21. Emberley ED, Niu Y, Njue C, Kliewer EV, Murphy LC, Watson PH: Psoriasin (S100A7) expression is associated with poor outcome in estrogen receptor-negative invasive breast cancer. Clin Cancer Res 2003, 9(7):2627-2631.
22. Ordonez NG: Value of cytokeratin 5/6 immunostaining in distinguishing epithelial mesothelioma of the pleura from lung adenocarcinoma. Am J Surg Pathol 1998, 22(10):1215-1221.
23. Tsuda H, Birrer MJ, Ito YM, Ohashi Y, Lin M, Lee C, Wong WH, Rao PH, Lau CC, Berkowitz RS et al: Identification of DNA copy number changes in microdissected serous ovarian cancer tissue using a cDNA microarray platform. Cancer Genet Cytogenet 2004, 155(2):97-107.
24. Gibbs S, Fijneman R, Wiegant J, van Kessel AG, van De Putte P, Backendorf C: Molecular characterization and evolution of the SPRR family of keratinocyte differentiation markers encoding small proline-rich proteins. Genomics 1993, 16(3):630-637.
25. King TJ, Fukushima LH, Hieber AD, Shimabukuro KA, Sakr WA, Bertram JS: Reduced levels of connexin43 in cervical dysplasia: inducible expression in a cervical carcinoma cell line decreases neoplastic potential with implications for tumor progression. Carcinogenesis 2000, 21(6):1097-1109.
26. Aasen T, Hodgins MB, Edward M, Graham SV: The relationship between connexins, gap junctions, tissue architecture and tumour invasion, as studied in a novel in vitro model of HPV-16-associated cervical cancer progression. Oncogene 2003, 22(39):7969-7980.
27. Steinhoff I, Leykauf K, Bleyl U, Durst M, Alonso A: Phosphorylation of the gap junction protein Connexin43 in CIN III lesions and cervical carcinomas. Cancer Lett 2006, 235(2):291-297.
28. Chovanec M, Smetana K Jr, Plzak J, Betka J, Plzakova Z, Stork J, Hrdlickova E, Kuwabara I, Dvorankova B, Liu FT et al: Detection of new diagnostic markers in pathology by focus on growth-regulatory endogenous lectins. The case study of galectin-7 in squamous epithelia. Prague Med Rep 2005, 106(2):209-216.
29. Magnaldo T, Fowlis D, Darmon M: Galectin-7, a marker of all types of stratified epithelia. Differentiation 1998, 63(3):159-168.
30. Saussez S, Kiss R: Galectin-7. Cell Mol Life Sci 2006, 63(6):686-697.
31. Ueda S, Kuwabara I, Liu FT: Suppression of tumor growth by galectin-7 gene transfer. Cancer Res 2004, 64(16):5672-5676.
32. Dumic J, Dabelic S, Flogel M: Galectin-3: An open-ended story. Biochimica et Biophysica Acta (BBA) - General Subjects 2006, 1760(4):616-635.
33. Bernerd F, Sarasin A, Magnaldo T: Galectin-7 overexpression is associated with the apoptotic process in UVB-induced sunburn keratinocytes. Proc Natl Acad Sci USA 1999, 96(20):11329-11334.
34. Kuwabara I, Kuwabara Y, Yang RY, Schuler M, Green DR, Zuraw BL, Hsu DK, Liu FT: Galectin-7 (PIG1) exhibits pro-apoptotic function through JNK activation and mitochondrial cytochrome c release. J Biol Chem 2002, 277(5):3487-3497.
35. Demers M, Magnaldo T, St-Pierre Y: A novel function for galectin-7: promoting tumorigenesis by up-regulating MMP-9 gene expression. Cancer Res 2005, 65(12):5205-5210.
36. Rescher U, Gerke V: Annexins--unique membrane binding proteins with diverse functions. J Cell Sci 2004, 117(Pt 13):2631-2639.
37. Choi J, Chang JS, Song MS, Ahn BY, Park Y, Lim DS, Han YS: Association of hepatitis B virus polymerase with promyelocytic leukemia nuclear bodies mediated by the S100 family protein p11. Biochem Biophys Res Commun 2003, 305(4):1049-1056.
38. Vaarala MH, Porvari KS, Kyllonen AP, Mustonen MV, Lukkarinen O, Vihko PT: Several genes encoding ribosomal proteins are over-expressed in prostate-cancer cell lines: confirmation of L7a and L37 over-expression in prostate-cancer tissue samples. Int J Cancer 1998, 78(1):27-32.
39. Zhang L, Zhou W, Velculescu VE, Kern SE, Hruban RH, Hamilton SR, Vogelstein B, Kinzler KW: Gene expression profiles in normal and cancer cells. Science 1997, 276(5316):1268-1272.
40. Wong YF, Selvanayagam ZE, Wei N, Porter J, Vittal R, Hu R, Lin Y, Liao J, Shih JW, Cheung TH et al: Expression Genomics of Cervical Cancer: Molecular Classification and Prediction of Radiotherapy Response by DNA Microarray. Clin Cancer Res 2003, 9(15):5486-5492.
41. Barrett T, Troup DB, Wilhite SE, Ledoux P, Rudnev D, Evangelista C, Kim IF, Soboleva A, Tomashevsky M, Edgar R: NCBI GEO: mining tens of millions of expression profiles--database and tools update. Nucleic Acids Res 2007, 35(Database issue):D760-765.
42. Disbrow GL, Hanover JA, Schlegel R: Endoplasmic reticulum-localized human papillomavirus type 16 E5 protein alters endosomal pH but not trans-Golgi pH. J Virol 2005, 79(9):5839-5846.
43. Chang CH, Tsai LC, Chen ST, Yuan CC, Hung MW, Hsieh BT, Chao PL, Tsai TH, Lee TW: Radioimmunotherapy and apoptotic induction on CK19-overexpressing human cervical carcinoma cells with Re-188-mAbCx-99. Anticancer Res 2005, 25(4):2719-2728.
44. Kell B, Jewers RJ, Cason J, Best JM: Cellular proteins associated with the E5 oncoprotein of human papillomavirus type 16. Biochem Soc Trans 1994, 22(3):333S.

3. TRANSCRIPTOMIC ABERRATIONS IN CHROMATIN REMODELLING PATHWAY IN CERVICAL INTRAEPITHELIAL NEOPLASIA

3.1. Introduction

Cervical cancer affects approximately 500,000 women worldwide each year, with the highest rates in developing countries [1]. Cervical Intraepithelial Neoplasia (CIN) is a precursor lesion to cervical cancer and can be further subdivided into CIN I, CIN II and CIN III (mild, moderate and severe dysplasia, respectively). Most CIN I lesions spontaneously regress to normal; however, CIN III lesions are much more likely to progress to cervical cancer if left untreated [1]. The CIN I to CIN II junction may be critical in disease development.

Human Papillomavirus (HPV) is the recognized etiologic agent for cervical cancer; however, alone it is not sufficient for invasive disease. HPV is detected in nearly all cervical cancers, 94% of CIN lesions and up to 46% of normal cervical epithelium [1]. Over 100 strains of HPV exist; however, HPV 16 and HPV 18 are considered highly virulent strains and account for the majority of cervical cancers [1, 2].
The study of cervical cancer prevention has progressed impressively in the recent past. Widely implemented screening programs have resulted in an 80% reduction of cervical cancer rates in North America within the past fifty years [3]. Cytological assessment is currently the frontline method for identifying precancerous cervical lesions; however, repeat evaluations are frequently required due to low sensitivity [3]. Although vaccines against the most virulent strains of HPV have recently become available, vaccination is not an easily utilized resource for the countries with the highest cervical cancer rates [3]. This is largely due to cost and to the fact that the vaccine is administered in three doses over six months to a pre-adolescent female population. Few countries have established programmes targeting healthcare to this population [4]. In addition, perceived social implications in developed countries regarding vaccinating girls at an early age are hindering widespread administration. Together, these social, clinical and genetic factors indicate that frontline monitoring will continue to play an important role in cervical cancer prevention and that improved methods and markers for detection are needed.

A version of this chapter has been published: Shadeo A, Chari R, Lonergan KM, Pusic A, Miller D, Ehlen T, van Niekerk D, Matisic J, Richards-Kortum R, Follen M, Guillaud M, Lam WL, Macaulay C: Up regulation in gene expression of chromatin remodelling factors in cervical intraepithelial neoplasia. BMC Genomics, 2008 Feb 4;9(1):64.

It is unclear whether HPV alone is responsible for disease progression. A thorough understanding of genetic events in precancerous cervical intraepithelial neoplasia is required both to delineate important causal events in cervical cancer and to identify informative candidate biological markers. Gene expression in cervical tissue and changes in expression pattern have been the focus of several recent publications.
Studies by Pérez-Plasencia et al. and Shadeo et al. both characterized the transcriptome of normal cervical epithelium using serial analysis of gene expression (SAGE) [5, 6]. Additionally, Gius et al. reported changes in proproliferative/immunosuppression gene expression in CIN I lesions, as well as proangiogenic and proinvasive expression signatures that coincide with CIN II and CIN III, respectively [7].

In this study, we build upon our previous work in defining the normal cervical epithelial transcriptome and aim to identify genes differentially expressed between normal cervical epithelium and those precancerous lesions which are more apt to progress to cervical cancer if left without treatment (CIN III). We have distinguished gene expression aberrations across mild/moderate dysplasia (CIN I, CIN II), in addition to CIN III and non cancerous (NC) cervical epithelium, using an unbiased long serial analysis of gene expression (L-SAGE) method that simultaneously allows for the discovery of tags which map to HPV 16. In total, sixteen L-SAGE libraries were sequenced for a total of 2,481,387 tags, establishing the largest SAGE data collection for cervical tissue worldwide. Upon evaluation of expression differences between NC cervical epithelium and CIN lesions, we have identified two gene networks, directly or indirectly involved in chromatin remodelling, that are altered in expression in CIN III.

3.2. Methods

3.2.1. Sample Selection and Collection

Specimens were collected as previously described [6], immediately prior to the LEEP (loop electrosurgical excision procedure), using a bite biopsy targeting a small portion of the affected epithelium. These specimens were collected with patient consent and University of British Columbia (UBC) British Columbia Cancer Agency (BCCA) Research Ethics Board approval at the Vancouver General Hospital Women's Clinic at the Vancouver Hospital & Health Science Centre (VHHSC).
Anonymized cases were assessed by cervical cancer pathologists at BCCA and were selected without prior knowledge of HPV status. Specimens N1 to N4 in this study were observed to be non cancerous squamous epithelia whereas C1 to C6 were identified as high grade dysplasia or CIN III. Specimens M1 to M3 were assessed as CIN I and M4 to M6 as CIN II. Detailed descriptions of the LEEP cone specimens are listed in Additional file 7. All samples were stored in an RNA preservation solution (RNAlater, Ambion) within 2 minutes of retrieval and then placed in storage at -80°C.

3.2.2. L-SAGE

The biopsies were homogenised in Lysis Binding buffer (100mM Tris-HCl, pH 7.5, 500mM LiCl, 10mM EDTA, pH 8.0, 1% LiDS, 5mM dithiothreitol). Long SAGE libraries were constructed using the L-SAGE kit (Invitrogen, Ontario, Canada) and sequenced to an average depth of 155,000 tags, allowing for deep sampling of the transcriptome without an initial restriction to known genes [8-10].

3.2.3. Analysis

Raw tag count data for each SAGE library were normalized to tags per million (TPM) to facilitate comparison between libraries. Each sequence tag was then mapped to genes using the August 1 2006 version of SAGEGenie [11]. CIN I and CIN II libraries were grouped in analysis (CIN I/CIN II or mild/moderate dysplasia). Subsequently, those tags detected at a level of 20 TPM in at least one of the NC libraries, CIN I/CIN II, and CIN III groups were retained for differential expression analysis. To determine differential expression between the three groups, a combination of a two-fold difference in means and a permutation score ≥ 1.96 (corresponding to an unadjusted p-value of 0.05) was used [12]. Briefly, the permutation test used in this analysis examines the statistical significance of the observed difference in means between two groups in comparison to the average and standard deviation of the differences obtained from 10,000 random permutations of the data.

3.2.4. cDNA Synthesis

A validation panel of 20 cervical specimens (independent from those used in the SAGE analysis) was assembled: normal (NCa, NCb, NCc), CIN I (A-C), CIN I/II (D), CIN II (E-J) and CIN III (K-Q). Specimens were collected and stored as described above. RNA was isolated as previously described [6]. In addition, a cervix squamous cell carcinoma (T1) and adjacent normal tissue (N1) total RNA sample set was purchased (Ambion, #7276). All cDNA was generated from 200ng of total RNA using the High Capacity TaqMan Reverse Transcription Reagents (Applied Biosystems, Foster City, CA).

3.2.5. Real Time PCR Analysis

The expression levels of individual genes were analyzed by real-time PCR using TaqMan® Gene Expression Assays in triplicate in a new panel of 22 cases using the ABI 7500 Real-Time PCR System (Applied Biosystems). TaqMan probes (Assay IDs) include 18S (Hs99999901_s1), MORF4L2 (Hs002021 imi), MRFAP1 (Hs00738144_g1), SMARCC1 (Hs00268265_m1), NCOR1 (Hs00196920_m1), DHFR (Hs0075822_s1), PTEN (Hs00829813_s1), RBL2 (Hs00180562_m1) and CDKN2B (Hs00793225_m1). The relative quantification of the target genes in CIN I, CIN II and CIN III samples compared to the average Ct of NC samples was performed using the established 2^-ΔΔCt method (Applied Biosystems, Relative Quantitation Of Gene Expression, ABI PRISM 7700 Sequence Detection System: User Bulletin #2). Gene expression is normalized to 18S. The relative quantification values were then plotted, a ratio of one indicating no change with respect to NC cervical tissue. A single squamous cervical carcinoma sample and adjacent normal tissue were also compared using this method (T1 and N1, respectively).

3.3. Results

3.3.1. L-SAGE Libraries from Cervical Tissue

In this study 16 L-SAGE libraries were constructed and analyzed (Figure 3.3.1). Libraries N1 to N4 were made from NC cervical tissue samples, M1 to M3 from CIN I samples, M4 to M6 from CIN II samples and C1 to C6 from CIN III samples.
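
As a rough illustration of the selection criteria in Section 3.2.3, the sketch below normalizes raw tag counts to TPM, computes a signed fold change between group means, and scores the difference in means against random permutations of the libraries. It is a minimal sketch under stated assumptions, not the analysis code used in this study: the function names, the use of the absolute value of the score, the reading of the abundance filter as "at least 20 TPM in at least one library", and the example TPM values are all hypothetical.

```python
import math
import random
from statistics import mean, pstdev


def to_tpm(raw_counts):
    """Scale one library's raw tag counts to tags per million (TPM)."""
    total = sum(raw_counts.values())
    return {tag: count * 1_000_000 / total for tag, count in raw_counts.items()}


def passes_abundance_filter(tpm_values, threshold=20.0):
    """Assumed reading of the filter: >= threshold TPM in at least one library."""
    return any(value >= threshold for value in tpm_values)


def signed_fold_change(baseline_mean, other_mean):
    """Ratio of group means; decreases are reported as negative reciprocals."""
    if baseline_mean == other_mean:
        return 1.0
    if other_mean > baseline_mean:
        return other_mean / baseline_mean if baseline_mean > 0 else math.inf
    return -(baseline_mean / other_mean) if other_mean > 0 else -math.inf


def permutation_score(group_a, group_b, n_permutations=10_000, seed=0):
    """(observed difference - mean permuted difference) / SD of permuted differences."""
    rng = random.Random(seed)
    observed = mean(group_b) - mean(group_a)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    permuted = []
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random reassignment of libraries to the two groups
        permuted.append(mean(pooled[n_a:]) - mean(pooled[:n_a]))
    spread = pstdev(permuted)
    return 0.0 if spread == 0 else (observed - mean(permuted)) / spread


def is_differential(group_a, group_b, min_fold=2.0, min_score=1.96):
    """Apply the two-fold and permutation-score criteria together."""
    fold = signed_fold_change(mean(group_a), mean(group_b))
    score = permutation_score(group_a, group_b)
    return abs(fold) >= min_fold and abs(score) >= min_score


# Hypothetical TPM values for one tag in four NC and six CIN III libraries.
nc_tpm = [1.8, 1.9, 1.7, 1.9]
cin3_tpm = [20.0, 22.0, 19.0, 21.0, 20.0, 19.5]
print(is_differential(nc_tpm, cin3_tpm))
```

The signed fold change follows the convention visible in Table 3.2, where decreases in CIN III are listed as negative values (for example, group means of 23.95 and 3.67 give -6.53).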
N1, N2, C1 and C2 were mined in a preliminary study characterizing normal cervical tissue [6]. Collectively, 2,481,387 useful tags were sequenced (Figure 3.3.1). This data collection has been made publicly available at Gene Expression Omnibus, series accession number GSE7433 [13].

3.3.2. Early Stage Changes

The mean tag counts were compared between NC tissue samples (libraries N1-4) and mild/moderate stage (CIN I/II) samples (libraries M1-6) in order to identify tags differentially expressed in the early stages of neoplasia. One hundred and sixty-nine tags were identified as differentially expressed according to the selection criteria described in Methods [see Additional file 1]. Increased and decreased expression were observed at comparable frequencies (75 tags increased, 94 tags decreased in CIN I/II) and 138 of these tags mapped to known genes, including loci encoding hypothetical proteins. The most commonly affected biological processes in the early events of disease are DNA dependent regulation of transcription and transcription (11% and 8% of differentially expressed tags, respectively, as assessed by Onto-Express) (Figure 3.3.2A) [14, 15].

3.3.3. Late Stage Changes

The mean tag counts were compared between CIN I/II tissue samples (libraries M1-6) and CIN III libraries (C1-6). This identified 128 tags differentially expressed in later stages of neoplasia [see Additional file 2]. Seventy-three tags were increased in CIN III while 55 were decreased. Overall, 97 tags mapped to known genes. The major pathways affected by these genes were similar to those observed in early stage changes: DNA dependent regulation of transcription and transcription (8% and 7%, respectively, as assessed by Onto-Express) (Figure 3.3.2B) [14, 15]. Twelve tags that could not be mapped to the human genome or transcripts mapped to the HPV 16 genome using methods previously described [6] [see Additional file 3].
In addition, 42 tags showed concurrent expression between all sixteen libraries [see Additional file 4].

3.3.4. Pathway Analysis of Overall Changes

The mean score of tags in the four NC libraries was compared to the mean score of tags in the six severe neoplasia libraries (CIN III). We identified 108 tags increased in frequency in CIN III and 138 tags decreased, giving 246 tags differentially expressed between NC tissue and CIN III overall [see Additional file 5]. Two hundred and forty tags mapped to 222 unique genes. Genes differentially expressed between normal and CIN III were evaluated for gene network associations using Ingenuity Pathway Analysis [16]. Biological functions most influenced by these genes include cell death, cell growth/proliferation and cellular movement (Figure 3.3.3). The gene network with the largest number of differentially expressed genes influenced cellular processes such as cell cycle and cell morphology and involved 17 of the differentially expressed genes (Table 3.1, Figure 3.3.4). The network with the higher magnitude gene expression changes influenced cell cycle and gene expression (Figure 3.3.5).

Eight genes from these networks were selected for validation in an independent panel of cervical samples (Table 3.2). Cyclin-dependent kinase inhibitor 2B (CDKN2B), retinoblastoma-like 2 (p130 or RBL2) and phosphatase and tensin homolog (PTEN) had lower counts in the CIN III derived libraries. Conversely, Mof4 family associated protein 1 (MRFAP1), mortality factor 4 like 2 (MORF4L2), nuclear receptor co-repressor 1 (NCOR1), dihydrofolate reductase (DHFR) and SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily c, member 1 (SMARCC1) showed higher tag counts in CIN III relative to NC.

The validation panel consists of 22 samples representing NC, CIN I, CIN II and CIN III, including a tumour sample with paired normal tissue.
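
The relative quantification described in Section 3.2.5 (target gene Ct normalized to 18S and compared against the average of the NC samples) can be sketched as below, assuming the comparative Ct (2^-ΔΔCt) calculation and roughly equal amplification efficiency for target and reference. The function name and all Ct values are hypothetical and serve only to illustrate the arithmetic; a returned ratio of one corresponds to no change with respect to NC tissue.

```python
from statistics import mean


def relative_quantity(target_cts, reference_cts, nc_target_cts, nc_reference_cts):
    """Return 2^-ddCt for one sample relative to the mean of the NC samples.

    target_cts / reference_cts: replicate Ct values for the gene of interest
    and for the reference gene (18S here) in the test sample.
    nc_target_cts / nc_reference_cts: Ct values pooled over the NC samples.
    """
    delta_ct_sample = mean(target_cts) - mean(reference_cts)
    delta_ct_nc = mean(nc_target_cts) - mean(nc_reference_cts)
    delta_delta_ct = delta_ct_sample - delta_ct_nc
    return 2.0 ** (-delta_delta_ct)


# Hypothetical triplicate Ct values, chosen only to illustrate the arithmetic.
sample_target = [24.0, 24.1, 23.9]
sample_18s = [10.0, 10.0, 10.0]
nc_target = [25.0, 25.2, 24.8]
nc_18s = [10.0, 10.1, 9.9]

ratio = relative_quantity(sample_target, sample_18s, nc_target, nc_18s)
print(f"relative quantity vs NC: {ratio:.2f}")
```

In this example the target reaches threshold about one cycle earlier than in NC after 18S normalization, which corresponds to roughly a two fold increase.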
Quantitative analysis of expression was performed for each gene on this panel by real-time PCR using TaqMan® Gene Expression Assays (Figure 3.3.6). NCOR1 expression increased in nine of ten CIN I/II samples, ranging from a 1.3 to 6.0 fold increase relative to NC, with a statistically significant overall trend (p < 0.05). The SMARCC1 expression increase was confirmed in six CIN I/II samples and four CIN III cases (Figure 3.3.6). Similarly, we confirmed the increased expression of DHFR and MRFAP1 in the validation panel, including in nine of ten CIN I/II cases and six of seven CIN III cases, and the overall trend of increase for this gene was statistically significant (p < 0.05). The MORF4L2 expression increase was confirmed in eight of ten CIN I/II and six of seven CIN III cases, with a statistically significant overall trend of increase (p < 0.05) in the validation panel. Although we were able to confirm decreased CDKN2B expression in the tumour, we were unable to confirm this in the earlier staged cases in the validation panel. Reduced expression of phosphatase and tensin homolog (PTEN) was confirmed in two CIN II and two CIN III cases [see Additional file 6].

3.4. Discussion

The study of precancer lesions is essential in understanding the initiating events in cancer development and is of particular importance in cervical cancer, as moderate and high grade lesions (CIN II and CIN III) are more likely to progress to cancer than those of low grade (CIN I). In this study, we have comprehensively evaluated gene expression changes in the precancerous stages of cervical cancer by comparing 16 cervical intraepithelial neoplasia and NC cervical specimens using the L-SAGE method of gene expression profile determination. We have sequenced a total of 2,481,387 tags, establishing the largest SAGE data collection for cervical tissue worldwide. CIN I, CIN II and CIN III as well as normal tissues are represented in this collection.
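
The reported total can be cross-checked against the per-group averages given in the caption of Figure 3.3.1. The short calculation below is only a consistency check on figures quoted in this chapter; the variable names are illustrative.

```python
# (number of libraries, average tags per library), as given in the caption of Figure 3.3.1
groups = {
    "NC": (4, 130_560),
    "CIN I/CIN II": (6, 155_807),
    "CIN III": (6, 170_718),
}
reported_total = 2_481_387

n_libraries = sum(n for n, _ in groups.values())
reconstructed_total = sum(n * average for n, average in groups.values())
mean_depth = reported_total / n_libraries

print(n_libraries, reconstructed_total, round(mean_depth))
```

The reconstructed total (2,481,390) differs from the reported 2,481,387 by three tags, which is within what rounding of the per-group averages would produce, and the mean depth of about 155,087 tags per library agrees with the average depth of 155,000 tags stated in Methods.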
In a preliminary study, we had identified keratin 6A (KRT6A), carcinoembryonic antigen-related cell adhesion molecule 7 (CEACAM7), S100 calcium-binding protein A7 (S100A7) and small proline-rich protein 3 (SPRR3) to be highly expressed (>500 TPM) in NC cervical tissue in the N1 and N2 libraries [6]. Tags mapping to these genes are present at high levels in the additional normal libraries (N3 and N4) unique to this study, with the exception of CEACAM7, which shows slightly reduced expression of 363 and 397 TPM (N3 and N4, respectively). The decreasing trend in galectin 7 (LGALS7) and gap junction protein A1 (GJA1) previously observed in CIN III when compared to normal mean values continues to hold in this study (3.0 and 1.7 fold decrease in CIN III, respectively; p value not significant).

3.4.1. Early and Late Events

Comparing the tags differentially expressed between NC and mild/moderate dysplasia (169 tags) with those differentially expressed between mild/moderate and severe dysplasia (128 tags), approximately 25% fewer tags were altered in later stage disease. These differentially expressed tags most frequently mapped to genes involved in DNA dependent regulation of transcription. Interestingly, tags mapping to genes with biological functions unique to late stage changes include cell cycle (5% of tags), cell division (3%), immune (4%) and inflammatory (3%) response.

3.4.2. Biological Characteristics of Differentially Expressed Tags

Of the 246 tags that are altered in expression in CIN III relative to NC, the top five biological functions are cell death, cell growth and proliferation, cell movement, cell cycle, and DNA replication, recombination and repair. These functions encompass characteristics frequently described as the hallmarks of cancer [17].

Tags mapping to seven unique genes showed a greater than ten fold increase in expression in CIN III libraries. A tag mapping to SEC13 homolog (SEC13L1) showed the greatest change as it was not present in any one of the four NC libraries.
However, an average of 20 TPM was observed in CIN III libraries. Additional tags with greater than ten fold change mapped to MORF4L2, MRFAP1, WD repeat domain 18 (WDR18), SMARCC1, eukaryotic translation elongation factor 1 gamma (EEF1G) and G protein-coupled receptor 180 (GPR180). SMARCC1 is a component of the chromatin remodelling complex SWI/SNF, while both MRFAP1 and MORF4L2 belong to the MRG/MORF family of transcription factors involved in growth and cell senescence and are implicated indirectly in chromatin remodelling [18-23]. WDR18 is a member of the WD repeat protein family. Members of this gene family are involved in a variety of cellular processes including cell cycle progression, signal transduction, apoptosis, and gene regulation [24]. Counts of tags mapping to this gene were more than twelve fold higher in CIN III than in NC. EEF1G is a subunit of the elongation factor-1 complex involved in translation, and GPR180 contains transmembrane domains and may play a role in vascular remodelling [25]. It is interesting to note that both homeobox B7 (HOXB7) and BH3 interacting domain death agonist (BID) are also increased in CIN III, although to a lesser magnitude (seven and five fold, respectively). Many of the described genes may have the potential to influence processes characteristic of epithelial cancers such as apoptosis (BID), angiogenesis (GPR180), proliferation (WDR18), and transcription influencing events (SMARCC1, MRFAP1, MORF4L2). Functional assays are required to delineate their biological function in CIN and cervical cancer.

3.4.3. Network Analysis

Of the 246 tags differentially expressed between NC and CIN III, 240 mapped to 222 unique genes. When analyzed for gene network relationships, many of the genes targeted pathways influencing properties such as cell cycle, cell morphology, cancer related events and gene expression.
In this study, we have identified a group of 28 genes that fall into two gene networks encompassing these properties. The first network (A) contains 15 of the differentially expressed genes while the second network (B) includes 13 of the 222 genes. Only cyclin-dependent kinase inhibitor 2B (CDKN2B) and Kruppel-like factor 6 (KLF6) overlap between Networks A and B. Network B includes three of the seven tags showing greater than 11 fold change and one of the five tags showing more than 20 fold decrease in NC when compared to CIN III. Genes from both networks were selected for validation in a new cervical tissue panel.

3.4.4. Network A

Cell cycle, cancer and cell morphology are the top functions influenced by genes in Network A. Sixteen genes differentially expressed between NC and CIN III are involved in this network, including RBL2/p130, SMARCC1, NCOR1, PTEN, DHFR and CDKN2B. RBL2/p130 is one of three members of the Retinoblastoma (Rb) gene family; the others are RBL1/p107 and RB1 [26]. This family of proteins regulates cell cycle progression through the G1/S phase by sequestering E2F transcriptional regulators. Loss of and mutations in RBL2/p130 have been linked to various cancers [27, 28]. E7 from high risk HPV strains targets all members of the Rb family, including RBL2/p130, for degradation, thus releasing sequestered E2F and allowing for passage through the G1/S phase [27, 29]. Zhang et al. found that low risk HPV 6 E7 also has the capacity to bind RBL2/p130, although not with as high affinity, and did not bind to the other members of the Rb family [29]. We observed a progressive reduction in RBL2/p130 transcripts from NC to mild/moderate dysplasia and severe dysplasia, suggesting that a reduction in RBL2/p130 may also be regulated at the transcriptional level in cervical preneoplasia and may be the first Rb family gene event in the development of cervical intraepithelial neoplasia.

SMARCC1 is a member of the SWI/SNF family of genes.
Members of this evolutionarily conserved gene family are proposed to regulate gene specific transcription with downstream effects in cell cycle progression [30]. SMARCC1 has recently been reported to interact directly with components of the ATP-dependent SWI/SNF chromatin remodelling complex, including SWI/SNF-related, matrix-associated, actin-dependent regulator of chromatin A4 (BRG1), BRG1-associated factor A (BAF60a) and SWI/SNF-related, matrix-associated, actin-dependent regulator of chromatin B1 (SNF5) [20]. The SWI/SNF chromatin remodelling complex consists of approximately ten components and is thought to regulate gene transcription by altering the surrounding chromatin structure. This complex can recruit both histone acetyltransferases (HATs) and histone deacetylases (HDACs) in a gene specific manner, thus influencing gene expression [31]. The SWI/SNF chromatin remodelling complex has also been implicated in hormone receptor signalling and growth [21]. SMARCC1 was characterized as a key regulator of core complex components (BRG1, BAF60a and SNF5), positively influencing the stabilization of the complex proteins as opposed to regulating their transcription [20]. In our study, there was no expression change in these core components. In contrast, we observed an 11 fold increase in tags mapping to SMARCC1 in CIN III libraries when compared to normal libraries, suggesting that SWI/SNF chromatin remodelling complex stability may have a role in severe dysplasia development.

DHFR is involved in folate metabolism in eukaryotes and is essential for purine and thymidylate biosynthesis and thus DNA replication [32]. We observed over a 250% increase in both mild/moderate and severe dysplasia as compared to normal, indicating that this event is present in very early stage lesions. It has recently been reported that DHFR is subjected to RB mediated repression via SWI/SNF chromatin remodelling activity [33].
The increase in DHFR observed is in concordance with the expected downstream effects of decreased RB gene family repression.

NCOR1 is involved in transcription repression via chromatin condensation. It has been found to physically interact with members of the SWI/SNF complex, including SMARCC1 and core component BRG1 [34]. We observed a greater than 300% increase in NCOR1 in severe and mild/moderate dysplasia relative to normal, suggesting that an increase in NCOR1 occurs very early in disease development (CIN I/CIN II).

Phosphatase and tensin homolog (PTEN), an established tumour suppressor gene, functions through the AKT/PKB signalling pathway [35]. We found PTEN to decrease in expression in CIN I/CIN II libraries by >3 fold compared to normal, indicating that this may be an early event. CDKN2B negatively regulates CDK4 and CDK6 [36]. We found CDKN2B decreased in CIN III libraries by nearly 3.5 fold when compared to NC. Decreased expression of CDKN2B has not been found to be due to copy number loss or mutation in cervical cancer [36]. Genes from Network A were selected for validation in a new 14 sample set consisting of NC, CIN I, CIN II and CIN III cervix tissue and one tumour and normal pair. A similar pattern of fold change is present in this panel for DHFR, SMARCC1 and NCOR1 in all three subgroups. Specifically, increased expression is observed in all three CIN II and two CIN III cases in the new cervical tissue panel, validating the disruption of genes involved in chromatin remodelling.

For PTEN, we observed a decrease in one of the three CIN III cases investigated, which is consistent with Lee et al., who found PTEN to be reduced in protein expression in only 10% of CIN III cervical cases and 18% of cervical cancers [37]. We also found CDKN2B to be over expressed in all CIN I, two CIN II and one CIN III cases; however, a greater than four fold decrease in cervical cancer (T1) was observed.
It has been reported that cervical cancers and CIN III lesions have elevated levels of CDKN2B (84% and 79%, respectively) [38].

3.4.5. Network B

Fourteen genes differentially expressed between normal and CIN III targeted Network B, including MRFAP1, MORF4L2 and CDKN2B. Top functions influenced by Network B include cancer related factors, cell cycle and gene expression. The previously discussed gene CDKN2B overlaps with this network.

MRFAP1 binds with strong affinity to mortality factor 4 like 1 (MRG15) [22]. MRG15 is a member of the MRG/MORF family of genes. All members contain an MRG domain, which has the capacity to bind multiple transcriptional regulators [23]. Members of this gene family are suggested to be involved in embryonic development, cell proliferation and senescence [29]. MRG15 is a component of the MRG associated factors 1 and 2 complexes (MAF1 and MAF2, respectively) and thus has a putative role in chromatin remodelling [39]. MRG15 has been shown to specifically associate with MRFAP1 in MAF1, tumour suppressor RB and the mammalian co-repressor complex mSin3A [19, 40]. We observed a 12.9 fold increase of MRFAP1 in CIN III libraries when compared to NC libraries. Notably, the majority of this increase occurs between mild/moderate and severe dysplasia. Although the biological mechanisms of the MRFAP1 and MRG15 interaction have yet to be described, it is plausible that the relationship may influence the essential role MRG15 plays in chromatin remodelling. We observed a 20 fold increase of MORF4L2 in CIN III libraries. Like MRG15, MORF4L2 also belongs to the MRG/MORF transcriptional regulator family of genes involved in senescence and cell growth [41]. Although the two genes share nearly 90% similarity, MORF4L2 is unique to vertebrates whereas MRG15 is evolutionarily conserved [19, 39].
Similarly to MRG15, MORF4L2 directly interacts with MRFAP1 and the mSin3A co-repressor complex, components of which include HDAC1 and HDAC2 and are involved in chromatin remodelling [18, 19]. MORF4L2 is reported to have both repressive and stimulatory activity in transcription regulation of the well known cell cycle regulator v-myb myeloblastosis viral oncogene homolog (avian)-like 2 (MYBL2) [19, 42].

MRFAP1 and MORF4L2 were selected from Network B for validation in 22 new samples as described above. In both cases, six of seven CIN III samples showed the greatest increase in expression (1.8 to 6.6 fold increase) and the overall increasing trend was found to be statistically significant (p value .033 and .017, respectively). We observed a 2.0-2.5 fold increase in expression in cervical cancer (T1). Together, these results validate the differential expression, again supporting a potential role of chromatin remodelling in cervical cancer progression. This differential expression in the chromatin remodelling genes could result in changes in the organization of the DNA. If these changes are large enough they could be detectable. Previously, we quantified changes in nuclear texture with increasing grade of CIN [43]. These measured phenotype changes, as seen in Figure 3.4 (an updated graph of the data), begin with CIN I and are more prominent in the high grades of CIN (II, III) and cancer [43]. This appears to correlate with the novel observations presented in the present study with respect to the increased expression of chromatin remodelling complex components, indicating that these expression changes result in an alteration of the cell/nuclear phenotype.

3.5. Conclusion

Events in expression change involving genes in Network A and Network B (DHFR, MORF4L2, MRFAP1, NCOR1, and SMARCC1) occur at or before the stage of moderate dysplasia (CIN II).
These events are maintained, although at a reduced frequency, in CIN III, suggesting a role in intermediate events prior to severe dysplasia that may be an important stepping stone in disease progression. Genomic instability is characteristic of CIN III lesions and cervical cancer; as such, the non-maintenance of such aberrations in later stages may be due to the masking of the initial events by additional changes acquired in severe dysplasia [3, 44, 45].

Deregulation of chromatin remodelling functions has analogous effects to mutation in DNA repair components in that the repercussions can be genome wide. It is interesting that evidently two deregulated genes from each of the two networks investigated are associated with chromatin remodelling (SMARCC1, NCOR1, MRFAP1, MORF4L2) and that such disruptions are targeted to the critical stage of moderate dysplasia. The increase in the SWI/SNF stabilizing molecule SMARCC1 and other novel genes has not previously been shown as an event in the early stages of dysplasia development, thus providing not only novel candidate markers for screening but also a biological function for targeting treatment. Together, our results suggest that altered expression events in chromatin remodelling complex components and influencing factors occur in precancerous cervical intraepithelial neoplasia. Future investigation of protein-DNA interactions will be necessary to further elucidate the precise role of chromatin remodelling in cervical cancer progression.

Table 3.1 Genes differentially expressed in Networks A and B.
IPA Network Score Increased in CIN III Decreased in CIN III *  A  B  22  17  BH3 interacting domain death agonist  cyclin-dependent kinase inhibitor 2B  cytochrome b-561  Cbp/p300-interacting transactivator, with  dihydrofolate reductase  Glu/Asp-rich carboxy-terminal domain, 2  nuclear receptor co-repressor 1  endothelin 3  sialidase I  interferon-related developmental regulator I  peptidyl-prolyl cis/trans isomerase, NIMA-  Kruppel-like factor 6  interacting  phosphatase and tensin homolog RAB4A  SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily c, member 1  retinoblastoma-like 2 (p130)  UDP-Gal:betaGlcNAc beta 1,4galactosyltransferase, polypeptide 5  adaptor-related protein complex 3, mu 2 subunit  B double prime I, subunit of RNA polymerase III transcription initiation factor IIIB  cyclin-dependent kinase inhibitor 2B  creatine kinase, brain  Kruppel-like factor 6  mortality factor 4 like 2  keratin 17  Mof4 family associated protein 1  tropomodulin 3  cellular repressor ofElA-stimulated genes I  ribophorin I stomatin (EPB72)-like 2 * The score is a numerical value used to rank networks according to how relevant they are to the genes in your input dataset. Calculations are based on the hypergeometric distribution calculated via the computationally efficient Fishers Exact Test for 2x2 contingency tables . Gene symbols in bold denote those selected for validation in a new 8 panel of fourteen cases.  61  Table 3.2 Candidate genes in Networks A and B. 

Gene                                        P Score         NC Mean   CIN I/II   CIN III   Fold
                                            (NC-CIN III)              Mean       Mean      Change
SWI/SNF related, matrix associated,
  actin dependent regulator of chromatin,
  subfamily c, member 1                     2.78            1.83      7.49       20.21     11.02
Mof4 family associated protein 1            2.14            1.83      3.64       23.67     12.90
mortality factor 4 like 2                   3.36            1.78      18.65      35.20     19.78
nuclear receptor co-repressor 1             2.36            12.65     38.27      42.02     3.32
dihydrofolate reductase                     2.68            10.07     28.04      27.47     2.73
cyclin-dependent kinase inhibitor 2B        2.56            23.95     9.32       3.67      -6.53
retinoblastoma-like 2 (p130)                2.43            31.05     15.54      5.70      -5.45
phosphatase and tensin homolog              2.32            20.42     4.95       5.89      -3.47

Figure 3.3.1 Summary of L-SAGE library analysis. Sixteen L-SAGE libraries were created and a total of 2,481,387 useful tags were sequenced. An average of 130,560 tags were sequenced per NC library (4), 155,807 per CIN I/CIN II library (6) and 170,718 per CIN III library (6). One hundred and sixty-nine tags were differentially expressed when comparing NC to CIN I/CIN II, 246 between NC and CIN III, and 128 between CIN I/CIN II and CIN III. Eight candidates were identified for greatest amplitude of change between NC and CIN III and gene network most affected.

[Figure 3.3.2 panels: (A) Early Biological Processes Affected; (B) Late Biological Processes Affected.]

Figure 3.3.2 Functional groupings of tags differentially expressed in SAGE libraries. (A) Genes differentially expressed between NC and CIN I/CIN II. (B) Genes differentially expressed between CIN I/CIN II and CIN III. Categories are as described.
The Unknown function group consists of tags mapping to genes with no known function. DNA Dependent Regulation of Transcription was the function most targeted in both groups of differentially expressed tags.

Figure 3.3.3 Biological functions targeted by gene expression changes between NC and CIN III. Functional categories deemed significant by Ingenuity Pathway Analysis software are displayed along the x-axis. The y-axis displays the -(log) significance and the orange horizontal line denotes the cut off for significance (p-value of 0.05). The functional categories most significantly influenced include cell death, cell growth and proliferation and cell movement.

Figure 3.3.4 Gene Network A. Green denotes loss in expression in CIN III while red indicates gain. Intensity of colour signifies magnitude of change. Solid lines indicate direct interactions while dashed lines indicate indirect interactions as described by Ingenuity Pathway Systems. Functions influenced by Network A include cell cycle, cancer and cell morphology.

Figure 3.3.5 Gene Network B. Green denotes loss in expression in CIN III while red indicates gain. Intensity of colour signifies magnitude of change. Solid lines indicate direct interactions while dashed lines indicate indirect interactions as described by Ingenuity Pathway Systems. Functions influenced by Network B include cell cycle, gene expression and cancer specific function.
-10  --  Mt4PQR  -  A  -  2  0  F  G  H  I  -  J  t(  L  4  14  0  P  a  R  Figure 3.3.6 Summary of validation panel quantitative PCR results of genes with altered expression in CIN III L-SAGE libraries. A panel of 22 new cases was investigated in triplicate for expression changes for eight genes. Samples are indicated as follows: CIN I (A-C), CIN I/TI (D), CIN II (E-J), CIN III (K-Q) and tumour (R). Zero on the Y-axis denotes mean expression levels of the respective genes in three NC cervical tissues. Expression in tumour is relative to matched normal.  68  5 154  4  118  3 153  LJ  [ .1  CIN II  CIN III  2 489 0  I ‘1  294  1890  U) 4)  S  77 I  -2 -3  -4 Negative  Atypia  HPV Ac  CIN  Carcinoma  Histopathology Diagnosis  Figure 3.4 An updated box plot from Guillaud et al. showing the correlation between a linear discriminant function score (Texture Score) based entirely on only texture phenotype features measured of the nuclei in formalin fixed quantitatively stained sections of biopsied tissue [43]. The error bars represent the  th 5  and  thi 95  percentiles; the box represents the central  th 50  percentile  and the solid square the mean of the distribution of the scores for measured sections with the noted histopathological diagnosis. The numbers over the boxes are the number of samples measured for the specific diagnosis.  69  3.6. References  1. 2. 3.  4.  5.  6.  7.  8. 9.  10.  11.  12.  13. 14.  15.  Scheurer ME, Tortolero-Luna G, Adler-Storthz K: Human papillomavirus infection: biology, epidemiology, and prevention. IntJ Gynecol Cancer 2005, 15(5):727-746. Woodman CB, Collins SI, Young LS: The natural history of cervical HPV infection: unresolved issues. Nat Rev Cancer 2007, 7(1):11-22. Wheeler CM: Advances in primary and secondary interventions for cervical cancer: human papillomavirus prophylactic vaccines and testing. Nat Clin Pract Oncol 2007, 4(4):224-235. 
Cutts FT, Franceschi 5, Goldie S, Castellsague X, de Sanjose S, Garnett 0, Edmunds WJ, Claeys P, Goldenthal KL, Harper DM et al: Human papillomavirus and HPV vaccines: a review. Bull World Health Organ 2007, 85(9):719-726. Perez-Plasencia C, Riggins G, Vazquez-Ortiz G, Moreno J, Arreola H, Hidalgo A, Pina Sanchez P, Salcedo M: Characterization of the global profile of genes expressed in cervical epithelium by Serial Analysis of Gene Expression (SAGE). BMC Genomics 2005, 6:130. Shadeo A, Chari R, Vatcher G, Campbell J, Lonergan KM, Matisic J, van Niekerk D, Ehlen T, Miller D, Follen M et al: Comprehensive serial analysis of gene expression of the cervical transcriptome. BMC Genomics 2007, 8(1):142. Gius D, Funk MC, Chuang EY, Feng 5, Huettner PC, Nguyen L, Bradbury CM, Mishra M, Gao S, Buttin BM et al: Profiling microdissected epithelium and stroma to model genomic signatures for cervical carcinogenesis accommodating for covariates. Cancer Res 2007, 67(15):7113-7123. Velculescu VE, Zhang L, Vogelstein B, Kinzler KW: Serial analysis of gene expression. Science 1995, 270(5235):484-487. Albarran-Somoza B, Franco-Topete R, Delgado-Rizo V, Cerda-Camacho F, Acosta Jimenez L, Lopez-Botet M, Daneri-Navarro A: CEACAMI in cervical cancer and precursor lesions: association with human papillomavirus infection. JHistochem Cytochem 2006, 54(12):1393-1399. Chen J, Sun M, Lee 5, Zhou G, Rowley JD, Wang SM: Identifying novel transcripts and novel genes in the human genome by using novel SAGE tags. Proc Nail Acad Sci USA 2002, 99(19):12257-12262. Boon K, Osorio EC, Greenhut SF, Schaefer CF, Shoemaker J, Polyak K, Morin PJ, Buetow KH, Strausberg RL, De Souza SJ et al: An anatomy of normal and malignant gene expression. Proc Natl Acad Sci USA 2002, 99(17): 11287-11292. 
12. Forozan F, Mahlamaki EH, Monni O, Chen Y, Veldman R, Jiang Y, Gooden GC, Ethier SP, Kallioniemi A, Kallioniemi OP: Comparative genomic hybridization analysis of 38 breast cancer cell lines: a basis for interpreting complementary DNA microarray data. Cancer Res 2000, 60(16):4519-4525.
13. Edgar R, Domrachev M, Lash AE: Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res 2002, 30(1):207-210.
14. Draghici S, Khatri P, Bhavsar P, Shah A, Krawetz SA, Tainsky MA: Onto-Tools, the toolkit of the modern biologist: Onto-Express, Onto-Compare, Onto-Design and Onto-Translate. Nucleic Acids Res 2003, 31(13):3775-3781.
15. Draghici S, Khatri P, Martins RP, Ostermeier GC, Krawetz SA: Global functional profiling of gene expression. Genomics 2003, 81(2):98-104.
16. Lacroix M, Leclercq G: Relevance of breast cancer cell lines as models for breast tumours: an update. Breast Cancer Res Treat 2004, 83(3):249-289.
17. Hanahan D, Weinberg RA: The hallmarks of cancer. Cell 2000, 100(1):57-70.
18. Yochum GS, Ayer DE: Role for the mortality factors MORF4, MRGX, and MRG15 in transcriptional repression via associations with Pf1, mSin3A, and Transducin-Like Enhancer of Split. Mol Cell Biol 2002, 22(22):7868-7876.
19. Tominaga K, Leung JK, Rookard P, Echigo J, Smith JR, Pereira-Smith OM: MRGX is a novel transcriptional regulator that exhibits activation or repression of the B-myb promoter in a cell type-dependent manner. J Biol Chem 2003, 278(49):49618-49624.
20. Sohn DH, Lee KY, Lee C, Oh J, Chung H, Jeon SH, Seong RH: SRG3 interacts directly with the major components of the SWI/SNF chromatin remodeling complex and protects them from proteasomal degradation. J Biol Chem 2007, 282(14):10614-10624.
21. Shanahan F, Seghezzi W, Parry D, Mahony D, Lees E: Cyclin E associates with BAF155 and BRG1, components of the mammalian SWI-SNF complex, and alters the ability of BRG1 to induce growth arrest.
Mol Cell Biol 1999, 19(2):1460-1469.
22. Leung JK, Berube N, Venable S, Ahmed S, Timchenko N, Pereira-Smith OM: MRG15 activates the B-myb promoter through formation of a nuclear complex with the retinoblastoma protein and the novel protein PAM14. J Biol Chem 2001, 276(42):39171-39178.
23. Bowman BR, Moure CM, Kirtane BM, Welschhans RL, Tominaga K, Pereira-Smith OM, Quiocho FA: Multipurpose MRG domain involved in cell senescence and proliferation exhibits structural homology to a DNA-interacting domain. Structure 2006, 14(1):151-158.
24. Li D, Roberts R: WD-repeat proteins: structure characteristics, biological function, and their involvement in human diseases. Cell Mol Life Sci 2001, 58(14):2085-2097.
25. Tsukada S, Iwai M, Nishiu J, Itoh M, Tomoike H, Horiuchi M, Nakamura Y, Tanaka T: Inhibition of Experimental Intimal Thickening in Mice Lacking a Novel G-Protein Coupled Receptor. Circulation 2003, 107(2):313-319.
26. Giacinti C, Giordano A: RB and cell cycle progression. Oncogene 2006, 25(38):5220-5227.
27. Cinti C, Claudio PP, Howard CM, Neri LM, Fu Y, Leoncini L, Tosi GM, Maraldi NM, Giordano A: Genetic alterations disrupting the nuclear localization of the retinoblastoma related gene RB2/p130 in human tumor cell lines and primary tumors. Cancer Res 2000, 60(2):383-389.
28. Zamparelli A, Masciullo V, Bovicelli A, Santini D, Ferrandina G, Minimo C, Terzano P, Costa S, Cinti C, Ceccarelli C et al: Expression of cell-cycle-associated proteins pRB2/p130 and p27kip in vulvar squamous cell carcinomas. Hum Pathol 2001, 32(1):4-9.
29. Zhang B, Chen W, Roman A: The E7 proteins of low- and high-risk human papillomaviruses share the ability to target the pRB family member p130 for degradation. Proc Natl Acad Sci USA 2006, 103(2):437-442.
30. Cristiano S: SWI/SNF: The crossroads where extracellular signaling pathways meet chromatin. Journal of Cellular Physiology 2006, 207(2):309-314.
31. Cho KS, Elizondo LI, Boerkoel CF: Advances in chromatin remodeling and human disease. Curr Opin Genet Dev 2004, 14(3):308-315.
32. Assaraf YG: Molecular basis of antifolate resistance. Cancer Metastasis Rev 2007, 26(1):153-181.
33. Gunawardena RW, Fox SR, Siddiqui H, Knudsen ES: SWI/SNF activity is required for the repression of dNTP metabolic enzymes via the recruitment of mSin3B. J Biol Chem 2007.
34. Underhill C, Qutob MS, Yee SP, Torchia J: A novel nuclear receptor corepressor complex, N-CoR, contains components of the mammalian SWI/SNF complex and the corepressor KAP-1. J Biol Chem 2000, 275(51):40463-40470.
35. Dillon RL, White DE, Muller WJ: The phosphatidyl inositol 3-kinase signaling network: implications for human breast cancer. Oncogene, 26(9):1338-1345.
36. Kim JW, Namkoong SE, Ryu SW, Kim HS, Shin JW, Lee JM, Kim DH, Kim IK: Absence of p15INK4B and p16INK4A gene alterations in primary cervical carcinoma tissues and cell lines with human papillomavirus infection. Gynecol Oncol 1998, 70(1):75-79.
37. Chao A, Wang TH, Lee YS, Hsueh S, Chao AS, Chang TC, Kung WH, Huang SL, Chao FY, Wei ML et al: Molecular characterization of adenocarcinoma and squamous carcinoma of the uterine cervix using microarray analysis of gene expression. Int J Cancer 2006, 119(1):91-98.
38. Brooks LA, Sullivan A, O'Nions J, Bell A, Dunne B, Tidy JA, Evans DJ, Osin P, Vousden KH, Gusterson B et al: E7 proteins from oncogenic human papillomavirus types transactivate p73: role in cervical intraepithelial neoplasia. Br J Cancer 2002, 86(2):263-268.
39. Tominaga K, Kirtane B, Jackson JG, Ikeno Y, Ikeda T, Hawks C, Smith JR, Matzuk MM, Pereira-Smith OM: MRG15 regulates embryonic development and cell proliferation. Mol Cell Biol 2005, 25(8):2924-2937.
40. Zhang P, Zhao J, Wang B, Du J, Lu Y, Chen J, Ding J: The MRG domain of human MRG15 uses a shallow hydrophobic pocket to interact with the N-terminal region of PAM14. Protein Sci 2006, 15(10):2423-2434.
41. Tominaga K, Matzuk MM, Pereira-Smith OM: MrgX is not essential for cell growth and development in the mouse.
Mol Cell Biol 2005, 25(12):4873-4880.
42. Joaquin M, Watson RJ: Cell cycle regulation by the B-Myb transcription factor. Cell Mol Life Sci 2003, 60(11):2389-2401.
43. Guillaud M, Cox D, Adler-Storthz K, Malpica A, Staerkel G, Matisic J, Van Niekerk D, Poulin N, Follen M, MacAulay C: Exploratory analysis of quantitative histopathology of cervical intraepithelial neoplasia: objectivity, reproducibility, malignancy-associated changes, and human papillomavirus. Cytometry A 2004, 60(1):81-89.
44. Lockwood WW, Coe BP, Williams AC, MacAulay C, Lam WL: Whole genome tiling path array CGH analysis of segmental copy number alterations in cervical cancer cell lines. Int J Cancer 2007, 120(2):436-443.
45. Gopeshwar Narayan VBSCHA-PSVNPHRLGMDASBPMMKBRS: Gene dosage alterations revealed by cDNA microarray analysis in cervical cancer: Identification of candidate amplified and overexpressed genes. Genes, Chromosomes and Cancer 2007, 46(4):373-384.

4. COMPREHENSIVE COPY NUMBER PROFILES OF BREAST CANCER MODEL GENOMES

4.1 Introduction

Breast cancer is the most prevalent cancer worldwide and is the second leading cause of cancer related deaths in women in North America [1, 2]. It is a complex disease in which multiple genetic factors can combine to drive pathogenesis [3-5]. Copy number changes of genes such as ERBB2 and C-MYC have been extensively documented in breast cancer and are present in model cell lines [6-9]. Amplified (and overexpressed) genes are prime therapeutic targets; for example, the drug trastuzumab, directed against ERBB2, has been shown to improve breast cancer survival rates alone or in combination with other treatments [10-12].

Strategies to detect gene copy number alterations will facilitate the identification of novel molecular targets.
Previous studies using 10 Mb resolution conventional metaphase comparative genomic hybridization (CGH) have identified gross regions of recurrent chromosomal aberrations in multiple breast cancer cell lines, including loci within chromosomes 1q, 8q, 11q13, 17q and 20q13. Many of these alterations proved to be relevant as they were also present in the primary tumours investigated [13-15]. Recent advances in array CGH (aCGH) have greatly improved the resolution of this technology, enabling detection of segmental copy losses and gains [16, 17]. Regional genomic arrays, providing contiguous or tiling coverage of a locus of interest, have been constructed to fine map commonly altered regions in breast cancer (e.g. 20q13) [18-20]. Whole chromosome arrays have been used to provide information at 500 kb intervals. For example, a chromosome 17 array was used to identify 13 regions of change present in breast cancer cell line models and primary breast cancers [21]. Similarly, a genome wide array containing nearly 2,500 bacterial artificial chromosome (BAC) clones with a resolution of 1.4 Mb was used to illustrate detection of copy number alterations (CNA) in various breast cancer cell lines [22]. Recently, a separate study using a 422 genomic loci array detected frequent alterations at 1, 6, 7p, 9, 11q, 12q, 17, 20q and 22q in archival breast cancer specimens [23]. cDNA arrays have also detected DNA copy changes of amplicons containing ERBB2 on 17q [24-27]. More recently, a cDNA array containing 6,691 mapped human genes was used to explore the relationship between copy number alteration and gene expression changes in breast tumors and cell lines [28].

A version of this chapter has been published. Shadeo A and Lam WL. Comprehensive Copy Number Profiles of Breast Cancer Cell Model Genomes. Breast Cancer Res. 2006;8(1):R9 (1-14).
While large insert clone megabase interval CGH arrays and cDNA arrays provide a robust platform for rapid survey of tumor genomes, valuable information could be overlooked due to their limited resolution. It is clear that a more detailed description of breast tumor genomes would require re-examination using a higher resolution array platform.

Genetic, biochemical and pharmacologic studies of breast cancer have been greatly dependent on a number of commonly used model breast cancer cell lines: MCF-7, BT-474, SK-BR-3, T-47D, UACC-893, MDA-MB-231 and ZR-75-30. A Medline search for studies involving at least one of these seven cell lines returns over 13,500 hits. These cells are known to harbor gross chromosomal aberrations; measuring the precise segmental copy number status across their entire genome may uncover novel discrete changes.

In the current study, we expanded the use of array CGH to survey the genomes of these breast cancer cells in unprecedented detail using a recently developed whole genome tiling path array that covers the genome with 32,433 overlapping BAC clones [29]. Analysis at this resolution has led to the identification of novel features in these genomes and the delineation of segmental genetic alterations that have escaped detection by conventional molecular cytogenetic techniques and previous marker based or interval array CGH analysis.

4.2. Materials and Methods

4.2.1. Cell Line DNA

DNA from a panel of seven breast cancer derived cell lines was obtained from the American Type Culture Collection (ATCC): MCF-7, T-47D, SK-BR-3, MDA-MB-231, BT-474, UACC-893 and ZR-75-30. Pooled normal female DNA was used as reference for all array CGH experiments (Novagen, Mississauga, ON). DNA was quantified using a NanoDrop ND-1000 spectrophotometer (NanoDrop Technologies, Wilmington, DE).

4.2.2. Array CGH

The seven cell lines were assayed for genetic alterations using a whole genome tiling path BAC array in comparative genomic hybridization experiments. The submegabase resolution tiling set (SMRT) array contains 32,433 overlapping BAC derived DNA segments that provide tiling coverage over the human physical genome map. All clones were spotted in triplicate, resulting in 97,299 elements over two slides [29-31]. A detailed protocol has been included as supplemental material (S-1).

4.2.3. Imaging and Analysis

Hybridizations were scanned using a CCD based imaging system (Arrayworx eAuto, Applied Precision) and analyzed using SoftWoRx Tracker Spot Analysis software. Stringent criteria were applied to filter spot intensity data. A standard deviation greater than 0.075 between triplicate spots was deemed unreliable, and such data were excluded from our analysis [29]. Only data points with a signal intensity to background noise ratio exceeding 15 were used in the analysis.

Custom software (SeeGH) was used to visualize log2 ratios of clones with respect to location in the genome [32]. SeeGH is publicly available at http://www.arraycgh.ca. Due to the complexity of the genomes of these cell lines with respect to ploidy, we set thresholds for high level gains and losses at +0.8 and -0.7, respectively, in order to limit the number of regions for discussion. These thresholds encompass high level or multi-copy changes previously reported in the literature while excluding the abundant low level or single copy changes common to these cell lines. The complete data set has been made publicly available for further inquiry. In addition, only those loci containing two or more altered overlapping clones were included in the analysis to reduce false positives, while breakpoints were confirmed using the publicly available aCGH-Smooth software (http://www.few.vu.nl/vumarray/) [33].

4.2.4. Fluorescence in situ Hybridization

For fluorescence in situ hybridization (FISH) probe synthesis, DNA samples from BAC clones RP11-118L18, RP11-419H8, RP11-813P3 and RP11-790I13 were amplified using a modified ligation mediated PCR protocol as previously described [31]. Imaging and analysis were performed as previously described [34].

4.3. Results and Discussion

4.3.1. Whole Genome Tiling Path Analysis of Segmental Alterations

SMRT array CGH technology provides a tool to comprehensively assess genomic aberrations in great detail. Comprehensive genomic profiles of segmental gains and losses for seven commonly used breast cancer model cell lines were revealed using this technology. Due to the large amount of data generated, we present the complete genomic profiles and frequency analysis as supplemental information (Figure 4.4.1 and karyograms: S2-BT474, S3-MCF7, S4-T47D, S5-SKBR3, S6-MDAMB231, S7-ZR7530, S8-frequency plot of 7 lines). The raw data of the 97,299 spot signal intensity ratios for each array CGH experiment have been made publicly available at www.arrayCGH.ca and also deposited at the GEO database at NCBI, series accession number GSE3106.

Figure 4.4.1 demonstrates the details of a tiling path SeeGH karyogram, summarizing SMRT array CGH results for cell line UACC-893. Whole chromosomal arm gains can be seen at 1q, 5p, 7p, 8q and 10p while arm losses are evident at 3p, 4p, 5q, 8p, 13q, 17p, 19p, 20p and Xp. Smaller segmental changes such as the telomeric gained region of 6p or the loss at 10q are readily detected. Complex alterations indicating multiple levels of change are denoted by higher level peaks embedded within a region of change, for example the central region of the 2p arm. The magnified display of 17q demonstrates the identification of a discrete CNA. Beginning at the centromere, we can see two regions of segmental loss separated by a high copy number amplicon containing the ERBB2 gene.
The centromeric breakpoint of this amplicon is located between the overlapping regions of clones RP11-25P3 and RP11-592L16 while the telomeric breakpoint is located between clones RP11-686E5 and RP11-259G21. The second region of segmental loss at 17q21.1-q21.31 is followed by a large segmental gain and a second discrete multiple copy amplification at 17q25.1.

To establish detection sensitivity, we first examined previously reported regions of CNA. Our data indicated high level gains at the C-MYC locus in SK-BR-3 and MCF-7 (+2.84 and +1.19 log2 ratios respectively), corresponding to previously reported copy number changes [7, 35, 36]. Similarly, BT-474, ZR-75-30, UACC-893 and SK-BR-3 are known to harbor a high level amplification of the ERBB2 locus. SMRT aCGH, in addition to detecting the ERBB2 amplification, revealed several additional discrete changes on the 17q arm in these cell lines. In another example, a previously reported homozygous deletion at 3q13.31, detected by a 10K Single Nucleotide Polymorphism (SNP) array in MCF-7, yielded a log2 ratio of -1.2 in our SMRT aCGH analysis [37]. Further comparison of SNP data and SMRT aCGH for cell line BT-474 showed that many of the alterations detected by SMRT aCGH were not clearly delineated or not detected by the SNP platform. While SNP arrays offer the advantage of genotype data, they are only suited to detect large scale copy number changes. However, the two technologies are clearly complementary as each is designed to address a different question.

Six of the seven cell lines (not MDA-MB-231) were previously profiled for genomic alterations using a 6,691 gene cDNA microarray [28]. Pollack et al. showed numerous genomic alterations, both gains and losses, which were correlated with expression patterns on the same array platform. All the CNAs reported were detected by SMRT array CGH, along with numerous novel alterations discovered when the genomes were re-evaluated at tiling path resolution.
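The filtering and calling rules of Section 4.2.3, which underlie the sensitivity checks above, can be illustrated with a short Python sketch. It is not the SeeGH or aCGH-Smooth implementation. It assumes that each clone carries triplicate log2 ratios and per-spot signal to noise values, that the +0.8 and -0.7 thresholds apply (inclusively) to the mean log2 ratio of a clone, that consecutive clones in tiling order overlap, and that a clone failing quality control interrupts a region. All function and clone names are hypothetical.

```python
from itertools import groupby
from statistics import mean, stdev

SD_MAX = 0.075    # triplicates with a larger standard deviation are discarded
SNR_MIN = 15.0    # every spot must exceed this signal to noise ratio (assumed per spot)
GAIN_T = 0.8      # high level gain threshold (assumed to be a log2 ratio)
LOSS_T = -0.7     # high level loss threshold (assumed to be a log2 ratio)
MIN_CLONES = 2    # a region needs at least two adjacent altered clones


def clone_log2(replicates, snr):
    """Return the mean log2 ratio of a clone, or None if it fails quality control."""
    if len(replicates) < 2 or min(snr) <= SNR_MIN:
        return None
    if stdev(replicates) > SD_MAX:
        return None
    return mean(replicates)


def call_state(log2_ratio):
    """Classify a clone as 'gain', 'loss' or 'neutral'; None stays None."""
    if log2_ratio is None:
        return None
    if log2_ratio >= GAIN_T:
        return "gain"
    if log2_ratio <= LOSS_T:
        return "loss"
    return "neutral"


def call_regions(clones):
    """Group consecutive clones with the same call into regions.

    clones is a list of (name, log2_ratio_or_None) in tiling order.
    Returns (state, first_clone, last_clone) for runs of at least MIN_CLONES.
    """
    regions = []
    for state, group in groupby(clones, key=lambda c: call_state(c[1])):
        names = [name for name, _ in group]
        if state in ("gain", "loss") and len(names) >= MIN_CLONES:
            regions.append((state, names[0], names[-1]))
    return regions


def fold_change(log2_ratio):
    """Convert a log2 ratio into a test to reference signal ratio."""
    return 2.0 ** log2_ratio


# Hypothetical clones: (name, triplicate log2 ratios, triplicate signal to noise values)
raw = [
    ("cloneA", [0.02, 0.05, 0.01], [40, 38, 42]),
    ("cloneB", [1.10, 1.15, 1.12], [55, 60, 58]),
    ("cloneC", [1.20, 1.18, 1.25], [50, 52, 49]),
    ("cloneD", [0.95, 0.40, 1.30], [45, 47, 44]),     # fails the SD filter
    ("cloneE", [-0.90, -0.88, -0.93], [30, 12, 33]),  # fails the signal to noise filter
    ("cloneF", [-0.85, -0.80, -0.82], [35, 36, 34]),  # altered, but a single clone
]
clones = [(name, clone_log2(reps, snr)) for name, reps, snr in raw]
print(call_regions(clones))
```

On this reading, the +2.84 log2 ratio quoted for C-MYC in SK-BR-3 corresponds to a test to reference signal ratio of about 7, although the ploidy of these cell lines means that such a ratio should not be read directly as a copy number.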
Known and novel CNAs for the seven genomes are summarized in Table 4.1. Interestingly, not all CNAs contain annotated genes, consistent with the fact that the annotation of coding and non-coding transcripts within the human genome sequence is an on-going process.

4.3.2. Novel Features of the Genome of Model Cell Lines

Amongst the seven cell lines, 75 regions of high level (multi-copy) segmental gains and 48 regions of multi-copy loss were identified. Since these cell lines serve as model systems for investigating breast cancer biology, a detailed understanding of their genetic alterations is essential to the interpretation of studies using these cell lines. We will first describe noteworthy features of the individual genomes and then compare across multiple profiles in order to identify common alterations.

4.3.3. MCF-7 Genome

The MCF-7 genome harbors 21 high level CNAs, summarized in Table 4.1. Remarkably, many of the previously reported regions of genetic alteration split into multiple segments upon tiling resolution analysis. The 1p13 amplification previously described [38] in fact divides into three distinct segments of high level amplification: a 1300 kb segment at 1p13.3, containing only two genes, arginine N-methyltransferase-6 (PRMT6) and netrin G1 (NTNG1); a 300 kb segment at 1p13.2, encompassing a single gene, potassium voltage-gated channel subfamily D member 3 (KCND3); and a 1300 kb region at the centromeric end of 1p13.2, containing 20 genes including BCAS2, which has been shown to be amplified and overexpressed in breast cancer cell lines and tumours (Figure 4.4.2) [38-40]. Although a loss at 4p15-qter has been reported [14], we observed a 7 Mb loss at 4q34.3-q35.2. The same group also reported an 11p loss; however, our data show that this alteration represents a large 45 Mb segment at 11p15.5-p11.2 and an adjacent but distinct 2 Mb loss at 11p11.2.
Likewise, amplifications at the distal end of 15q [13, 14] were fine mapped to reveal a 4.9 Mb high level gain at 15q21.1-q21.3 encompassed by clones RP11-416B20 and 664B9 containing FGF7, CYP19A1 and MAPK6. A lower level gain was also observed at 15q22.2-qter.

4.3.4. BT-474 Genome

BT-474 possesses the greatest number of high level gains and complex alterations and has previously been profiled using the SMRT aCGH platform [29]. Briefly, the 1q arm showed multiple rearrangements. A complex aberration at 1q21.2-q25.1 is highlighted by three peaks of high level gain: 1q21.2-q21.3 (350 kb), 1q22-q23.1 (500 kb) and 1q24.2 (550 kb). In addition, two previously undocumented, distinct regions of gain were identified at 1q31.3 (1650 kb) and 1q32.1 (950 kb). Figure 4.4.3A shows FISH verification of the 1q32.1 amplicon. Although a 1q42-qter gain has been previously reported for BT-474 [14], we observed four separate regions of high level gain: 1q42.12-q42.13 (500 kb), 1q43 (450 kb), 1q44-q43 (850 kb) and 1q44 (1700 kb). An 11q13-q14 gain was redefined by SMRT array CGH as a complex high level amplification at 11q13.1-13.5 (19.8 Mb) containing two distinct and localized high level peaks at 11q13.1 (700 kb) and 11q13.4 (1050 kb).

In addition to fine mapping of regions previously reported, several prominent novel alterations were detected: high level gains at 4q21.1 (2700 kb), 9p13.3 (2050 kb), 11q22.1-q22.2 (3600 kb), 14q11.2-q21.1 (21 Mb) and 14q31.3-q32.12 (3100 kb). Gains of 20q have been well documented in breast cancer [13, 20, 23, 41]. In BT-474 we observed four distinct segments with increased copy numbers: 20q11.22 (1.3 Mb), 20q13.11-q13.32 (14.8 Mb), 20q13.33 (300 kb) and 20q13.33-tel (1.4 Mb). Prefoldin 4 (PFDN4), located within 20q13.11-13.32, has been shown to be overexpressed in those cell lines in which it is amplified, including BT-474 [18].
This chromosome arm also harbors regions of loss at 20q11.22 (650 kb) and 20q11.23-13.11 (7150 kb) that have not previously been reported.

4.3.5. ZR-75-30 Genome

In total, 11 high level losses and 13 high level gains were identified in ZR-75-30. Multiple discrete alterations were observed on chromosome arms frequently implicated in breast cancer, including 1p (four deletions), 8q (eight amplicons) and 17q (seven amplicons and four deletions). Novel segmental losses of varying sizes were detected at 4q21.1 (150 kb), 11q13.5-qter (57.6 Mb) and 21q11.2-q22.11 (16.3 Mb). The discrete high level amplifications on 8q at 8q11.21 (700 kb), 8q13.3 (500 kb) and 8q22.1 (700 kb) encompassed interesting gene loci such as Protein Kinase DNA-activated Catalytic Subunit (PRKDC), which may have a role in DNA repair and non-homologous DNA end joining; Transient Receptor Potential Cation Channel A1 (ANKTM1), which, when overexpressed, affects normal eukaryotic cell growth; and Cadherin 17 (CDH17), which shares structural features with the cadherin superfamily of calcium-dependent cell-cell adhesion proteins [42-45].

4.3.6. UACC 893 Genome

High level gains at 11q13-q14 have been documented in UACC 893 [14]. We also observed this alteration (11q13.3-q14.3, 24.7 Mb); however, an additional discrete high level gain at 11q22.1 (600 kb) was discovered, which interrupts a portion of the gene locus Contactin 5 (CNTN5), a neural adhesion molecule. A novel gain at 7p21.1 (700 kb) was also detected that encompasses several gene loci including Anterior Gradient 2 (AGR2) and Breast Cancer Membrane Protein (BCMP1). AGR2 has been shown to correlate positively with estrogen receptor expression and negatively with epidermal growth factor receptor expression in breast cancer tissue [46]. A loss at 16p12.1 (1400 kb) was also observed.

4.3.7. SK-BR-3 Genome

3p22-pter amplifications in SK-BR-3 have previously been reported [13, 14].
We observed a 400 kb amplification at 3p22.2 as well as two novel regions of high level amplification at 3q25.1 (700 kb) and 3q22.3-q23 (2000 kb). Figure 4.4.3B shows FISH confirmation of this amplification. Genetic alterations of 8q appear to be complex in SK-BR-3. We observed the three previously reported regions of gain at 8q13.2-q21.13 (10.6 Mb), 8q21.2-q21.3 (6 Mb) and 8q23.2-q24.21 (17 Mb). However, we also identified three distinct amplicons within the 6 Mb region: 8q21.2 (300 kb), 8q21.3 (550 kb) and 8q21.3 (500 kb), as well as four distinct high level peaks within the 17 Mb gain described above: 8q23.3 (750 kb), 8q23.3 (350 kb), 8q24.12 (800 kb) and 8q24.21 (700 kb, contains C-MYC). We also observed four regions of deletion not previously reported on 8q: 8q21.3-q22.1 (6 Mb), 8q22.3-q23.1 (4.9 Mb), 8q24.22 (1.6 Mb) and 8q24.23-q24.3 (3.8 Mb). In addition to chromosome 3 and 8 losses, our analysis has also identified novel regions of loss at 12q23.3-q24.11 (1.4 Mb) and 12q24.21-q24.31 (5.4 Mb) and further delineated a 17q12 gain into two distinct high level gains at 17q11.1-11.2 (3.2 Mb) and 17q12-21.2 (3.4 Mb). The previously reported gain of 17q24-qter was also fine mapped to a 1550 kb amplicon at 17q25.3 [13, 14].

4.3.8. MDA-MB-231 Genome

MDA-MB-231 possessed the fewest high level alterations. 6p gains have previously been reported [14, 47]; however, two distinct regions of high level gain were observed within this arm in our analysis at 6p21.31-21.2 (3.5 Mb) and 6p21.2-21.1 (3.3 Mb). We also observed a novel 670 kb gain at 7q35. 9p loss has also been reported; however, we were able to discern two distinct segmental losses, each containing an amplicon [47-49].

4.3.9. T-47D Genome

T-47D was unique in that it possessed three times as many genomic losses as gains. We observed gains at 18p11.32-p11.32 (350 kb) and 18q21.1 (300 kb) that have not previously been reported [14, 36, 47, 49].
Only five genes reside within the 18q21.1 region: Protein Inhibitor of Activated STAT2 (PIAS2), Elongin genes TCEB3L2 and TCEB3L and hypothetical genes DKFZP564D1378 and HSPC039.

4.3.10. Common Regions of Copy Number Alteration

Gains at 8q, 17q and 20q are among the most frequently documented alterations in breast cancer. All cell lines investigated showed high level gains at one or more of these chromosome arms with the exception of MDA-MB-231. Multiple alignment of genomic profiles delineated novel minimum altered regions (MARs) common to these cell lines.

8q gains are arguably the most frequently documented alteration in a variety of cancers including breast and prostate cancer [5]. We have highlighted four that were common to multiple cell lines (S10). First, a discrete 500 kb amplicon at 8q13.3 in ZR-75-30 is also included within the larger alteration at 8q13.33-q21.13 in SK-BR-3. Only one gene resides within this MAR: Transient Receptor Potential Cation Channel Subfamily A, Member 1 (TRPA1). Hyman et al. investigated 14 breast cancer cell lines including BT-474, MCF7, SK-BR-3, T47D and ZR-75-30 using a 13K cDNA array, identifying four independent genomic amplicons at 8q: 8q21.11-q21.13, 8q21.3, 8q23.3-q24.14 and 8q24.22. However, the distinct amplicon at 8q13.3 in ZR-75-30 detected by SMRT aCGH was missed in that study. We observed a second, larger MAR at 8q21.2-q21.3 common to alterations in MCF7 and SK-BR-3. Approximately 20 genes reside in this 5 Mb region including E2F Transcription Factor, exonuclease GOR and Matrix Metalloproteinase 16. A third MAR is located at 8q24.12-q24.21 and is common to MCF-7, ZR-75-30 and SK-BR-3 while lower level gains are apparent in BT-474, UACC-893 and MDA-MB-231.
Although Zinc Finger Transcription Factor (TRPS1) and Eukaryotic Translation Initiation Factor 3 (EIF3S3) are excluded from this MAR (C-MYC is included), some of the cell lines possess highly complex gains that extend through a much larger region of the arm and can include the TRPS1 and EIF3S3 loci. Savinainen et al. reported 41 copies of TRPS1 and 21 copies of EIF3S3 and MYC in SK-BR-3. The fourth and most telomeric MAR, 8q24.3, has boundaries defined by a peak of high level change within the large complex alteration 8q22.2-q24.3 found in ZR-75-30. MCF-7, BT-474 and UACC-893 share low level gains within this region of approximately ten genes.

Chromosome 17q gains have been well documented in both breast cancer cell lines and clinical cases [14, 15, 21, 37, 48, 50]. Re-examination of this chromosome arm at tiling resolution suggests that the 17q amplification is complex and involves multiple but distinct regions (Figure 4.4). First, we identified a common high level gain at 17q25.1 containing a narrow MAR of 760 kb bounded by BAC clones RP11-76G4 and RP11-552F3. RECQ Protein-like 5 (RECQL5), H3 Histone Family 3B (H3F3B) and Growth Factor Receptor-bound Protein 2 (GRB2) reside within this gene-rich region, with GRB2 shown to interact with EGFR [51]. Secondly, at 17q23, two separate amplicons in MCF-7 and one large amplicon in BT-474 have been described previously, though it is unclear whether these amplicons overlap and harbor the same candidate oncogene [25, 52]. Our data show the presence of a large complex alteration in MCF-7 at 17q21.32-q24.3 with a high level amplification at 17q23.2. BT-474 contained two regions of complex alteration: one at 17q21.32-q23.2 comprising three distinct high level peaks, and one at 17q24.1-q24.3 with a single peak. Similarly, three regions of high level gains were observed in ZR-75-30 and one large region of lower level gain in UACC-893.
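The minimum altered regions (MARs) described in this section can be thought of as the interval shared by the altered segments of every cell line being compared. A minimal Python sketch of that idea follows, assuming one altered segment per cell line on a common coordinate scale. The function name and the coordinates are hypothetical and are not taken from the thesis data.

```python
def minimum_altered_region(segments):
    """Return the interval common to all segments, or None if there is none.

    segments maps a cell line name to a (start_kb, end_kb) tuple, with all
    tuples lying on the same chromosome arm.
    """
    if not segments:
        return None
    start = max(s for s, _ in segments.values())  # latest start among the lines
    end = min(e for _, e in segments.values())    # earliest end among the lines
    return (start, end) if start < end else None


# Hypothetical high level gains on one chromosome arm (positions in kb).
gains = {
    "line_A": (54_000, 61_500),
    "line_B": (56_200, 57_900),
    "line_C": (55_100, 57_000),
}
mar = minimum_altered_region(gains)
if mar is not None:
    print(f"MAR: {mar[0]}-{mar[1]} kb ({mar[1] - mar[0]} kb wide)")
```

In practice the boundaries would be reported as the flanking BAC clones rather than raw coordinates, and a cell line with several segments on the arm would need each segment tested in turn.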
Interestingly, our alignment revealed that the high level peaks involving the 17q23.2 region in MCF-7, BT-474 and UACC-893 do overlap, defining an 800 kb MAR from RP11-50F16 to RP11-653P10 containing candidate genes RPS6KB1, LOC51136, FLJ22087, CA4, NY-REN-60, APPBP2 and PPM1D.

Another striking feature identified through our tiling resolution scan of 17q is the overlapping amplicons at 17q21.32-q21.33 present in BT-474 and ZR-75-30. The 600 kb MAR from RP11-71G24 to RP11-60007 harbors the HOXB family (HOXB1-9). Previously described by Hyman et al., this amplicon is present in 10.2% of primary breast cancers, suggesting the involvement of developmental genes in breast cancer pathogenesis (Figure 4.4) [26].

Chromosome arm 20q has been shown to be frequently amplified in breast cancer, and amplification of 20q13 is associated with aggressive tumor phenotype, disease recurrence and reduced survival duration [20]. We identified multiple copy number alterations within this cytoband and defined distinct minimal regions of alteration (S11). The detection of a 1.5 Mb MAR at 20q13.2 in MCF-7, BT-474 and SK-BR-3 (RP11-20J8 to RP11-346B3) containing the genes Zinc Finger Protein 217 (ZNF217), Breast Cancer Amplified Sequence 1 (BCAS1), Cytochrome P450 24A1 (CYP24A1), Prefoldin 4 (PFDN4) and Docking Protein 5 (DOK5) is consistent with previous CGH studies which identified amplification of this region in breast cancer [18, 19]. Likewise, we identified a MAR at 20q13.31 from RP11-44A6 to RP11-671P16, containing Bone Morphogenetic Protein 7 (BMP7), SPO11 and RNA Export 1 (RAE1), corresponding to a previous report in MCF-7 and BT-474 [41]. A large 1.5 Mb amplification at 20q13.12 has also been reported in MCF-7 and BT-474 [41]. Our analysis identified an amplification at 20q13.12-q13.13 common to MCF-7, BT-474 and SK-BR-3.
This spanned from BAC clones RP11-702E3 to RP11-637D22, defining a narrow 680 kb MAR implicating Protein Kinase C Binding Protein (PRKCBP1) and Nuclear Receptor Coactivator (NCOA3) as potential oncogenes relevant to breast cancer.

4.3.11. EGFR (ERBB1) and Associated Pathways

The EGFR and associated pathways play an important role in several aspects of mammalian cell growth such as cell survival, proliferation and differentiation [53, 54]. The receptor family is composed of four type-1 tyrosine kinases (ERBB1-4) that dimerize upon stimulation by ligand and initiate downstream signaling. Receptor ligand recognition is redundant to some extent and affinity varies. Although ERBB2 has no known ligand, it becomes activated upon heterodimerization with other ERBB family members, the most preferred and potent combination being with ERBB1, while the ERBB3 homodimer remains inactive [55].

The redundancy of this pathway suggests its importance, as cells have invested in making this regulatory pathway failsafe. We have investigated genomic loci for approximately 60 genes implicated in this pathway (Table 4.2) [54]. Overall, gains were 2.4 times more frequent than losses and all cell lines contained at least three loci of change. Our data revealed that, as anticipated, the ERBB2 locus is highly amplified in four cell lines (UACC-893, ZR-75-30, BT-474 and SK-BR-3), and overexpression has been demonstrated in two of them [9]. While amplification of EGFR interacting genes RECQL5, H3F3B and GRB2 has been described above, other frequently altered loci include C-MYC, LIMK1, PRKCA, CHN2, ERBB2, PYK2, MAP2K3 and PLG1. Interestingly, T-47D and the two ERBB2 overexpressing lines, BT-474 and SK-BR-3, share amplifications at five gene loci: MAP2K6, CHN2, PRKCA, LIMK1 and C-MYC.

4.4. Conclusion

We examined the genomes of seven commonly used breast cancer cell models in unprecedented detail for segmental copy number status, cataloguing the boundaries of gains and losses throughout these genomes. In addition, we demonstrated that copy number alteration of multiple genetic loci involved in the EGF family of pathways is common in these cell lines, which suggests that disruption of this frequently dysregulated pathway in breast cancer may occur at several points in the signalling cascade and may occur in combination.

Furthermore, since these cell lines serve as models for studying breast cancer molecular biology, it is essential to take into account the potential influence of genetic alterations when interpreting biological data. For example, when these lines are used to study the EGF family of pathways, multiple endogenous genetic alterations may play a role in biochemical and biological observations. Our work provides a comprehensive list of high level segmental gains and losses for each genome, providing a database of copy number alterations as a resource for breast cancer research using these cell lines.

Table 4.1.
High Level Alterations Detected by Array CGH Cell Lines MCF7  Locus  Start Clone  End Clone  Size (kb)  N0451114 N0099M15 N0626F04 N0669F02 N0133G02 N04 I 6B20 N0716B04  N0228E23 N0795009 N0517B05 N0589G04 N0315E09 N0664B09 N0203A19  1,290 288 1,330 3,180 43,300 4,930 113,200  Locus  Amplifications  1p13.3 1p13.2 1pl3.2 3pl4.2-14.l 8q21.2-q24.21 I 5q21 .1-21.3 .3 4 17q23.2-2 17q23.2  peak:  peaks:  N0760B22  N0433B24  4,900  N0552F03 N0476P15  766 1,790  N0272C13  N0476115  17500  N0702E03  N0730020  1,790  20q13.13B  N0711M06  F0592G15  309  20q13.2C 20q1331D  N0020J08 iv’0044A06  N0346B03 N0671P16  1,450 411  N0094J12  N0457P07  8450  20:13 13A  21q22.13-  22.3(tel)  End Clone  N0747H24 N0442N05 M2007C03 p N0412M16  N0362H1 1 N0746B09 M2258B24 p M2326E01  Size (kb)  Deletions  N0076G04 N0385G02  17q25.l 20q12 20q13.1220q13.33(tel)  Start Clone  3q13.31 4q34.3-35.2 6q25.2-27 8p arm lIpl5.5-II.2 I 11 .2 1 IqI 1-q12.1 1 1q14.2-23.3 11q23.3-25(tel) 13q14.2-34(tel)  N0070A09  M2326E0 I  N0010E21 N0282G16 N0196E01 N0155D15  F0627109 N0004N09 N0715D10 M2323L19  700 7,100 14,400 p 45,300 I 4,300 30,000 15,400 66,900  N0607H20 N0552D03 N0484P15  N0208F21 M2200G17 N0710L06  5030 949 624  N0196P12 N0771D19  N0278E15 M2014K24  4090 13800  N0071G24  N0607J-113  2500  N0515J20  N0473G17  2180  N0399018 N0583F02  N0767P09 N0693H1 1  1170 5300  BT474 Amplifications  peaks:  1q21.2-q25.1 1:21.2-q21.3A 2 1 2 9 3.JB 224.2C 9 1 1q31.3  N0035F14 N0647N20 N0137J06 N0662E13  1’10714N02 N0740J19 N0616K15 N0141E20  1q32.l  N0783D13  N0617D19  1q42.12-  1q43 1q44-q43  peaks:  Amplifications  366 oeaks: 510 540 1661 936  17q21.32-23.2  17cj2132 21.33A 9  M2185P06  M2016D17  500  17q22-23,2B  N0236L13 N0440F10  N0614N14 N0794A13  449 865  17q23.2C  1q44  N0778E23  N0071K05  1720  33 p 4 . 6 1-lS. 1 4q21.1 9pl33  N0270103 N0598G02 N0069E18  N0652B07 N0772N01 N0795P12  1210 2700 2060  9q33.l-34.13  M2248M11  N0738114  12600  11q13.1-13.5 11q13IA  N0813P09 N0029K11  1:l3.4 B  11q22.l-2 . 
2 14q11.2-q21.1 14q31.3-32.12  oeaks:  15q1 1.2 -q12 15:11.2 A 15q11.2B 17q12-21.2  N0360N22  19800  N0093M11  N0804F01 N0598G03  704 1030  N0795D03 N0597A11 N0771008  N0347H03 N0254B15 N0325L17  3560 21,230 3080  17q24.1-24.3 peak:  17:24.1  N0583F02  N0394K10  482  N0295M02 N0134G22 N0601G07  N0441D06 N0022El5 N0552G16  6310 800 1300  N0809G24  N0261P09  14800  N0648D07 N0305P22  N0694110 N0134L13  305 1380  N0129E04 N0709J21 N0171G22  N0291024 M22024J17 N0774C15  5390 3200 670  N0712N14  N0464F07  7140  17q22-23.2  N0349F01  N0153J08  5600  1723.2.4 17:23.2 B 17q23.3-24.1 Deletions 1p36.33-35.3 1p35.2-35.1 3 . 34 ‘p 1p21.2-I3.3 4q21.1  N0159D12 N0381A05 N0614F05  M2023J07 N0634F05 N0712A10  298 3560 481  N0206L10 N0629A12 N0452M14 N0293L15 N0184F15  N0758C04 N0068H10 N0020P17 N0813H10 N0184J11  26700 775 1510 10800 151  N0149B15  M2270L17  57600  N0082D14 N0278E15 N0368A16  N0I04J23 N0046122 N0379H09  6020 7700 994  19p13.2-l3.12 2Opl2.I 20q11.22  20q13.33 20q13.33-tel  Deletions 5q14.l-14.3 6q24.1 20q11.22 Opll. 
3 2 -  ZR 75 30 Amplifications  peaks:  Amplifications  8q11.21  N0569108  N0513013  8q13.3 8q21.2 8q21.3 8q22.l 8q22.2-24.3  N0367C12 N0319A24 N0565H20 N0381107 N0281D17  N0634L17 N0317JI0 NOIO2DO3 N0103P22 N0620H01  8q23.JA 24.12B 9 8 8q24.22C  N0078J05 N0760H22 N0100J23  N0357L07 N0088J18 N0422120  8cj24.3D  N0662P06  F0530P15  440 2740  17q1I.I-II.2  N0260A09  N0I47N18  3050  I7q12 I7q12 17q12-21.I  N0600K04 N0722D15 N0689B15  N0560P04 N0062P03 N0032H06  980 1360  682 538 oeaks: 1350 264 729 45600 907  1440  490  85  I7qII.2-12 I7q21.I-21.32 I7q23.33-22  17q21.3221.33  N0771D19  M2190C10  3910  17q22  N0466D20  N0515J20  21q11.2q22.11  N0615H23  N0694N16  16300  N0118A14 N0104F06 p  N0614L12 N0674B07 p  995 1380 p  N0062P03 N0305C04  N0606M07 N0781F24  3390 1550  p  p  881  UACC  893 Amplifications 7p2l.l 11q133-q14.3 11q22.1 17q12-21.2  N0116D07 M2011L13 N0743115 N0600J16  N0746H13 N0613M07 N0659E10 N0686E05  I 7q2 1.33-24.2 17g25.1  N0095M07 N0076004  N0068K09 N0449J21  1490 17700 999  3p22.2 3q25.1 3q22.3-q23  N0091E04 N0739J07 N0657M13  N0325M12 N0118L18 N0718H02  432 678 2000  3q26.2-q26.31  N0190105  N0590E04  3300  7q31.1-q32.3 8q13.3-21.13 8q21.2-21.3  M2023N18 N0746L20 N0509F16  N0019B03 N0125J17 N0561A10  17800 10600 5950  N0694L21 N0129P07 N0627A06  N0606L]6 V0196C06 1 V0529J09 1  309 559 499  Deletions llplS.l l6pl2.I 17p arm  733  24700 619  SKBR3 Amplifications  peaks:  8q21.2A 8q21.3B 8q21.3C  8q23.2-24.21  Amplifications  17q12-2L2 17q25.3 2Op arm 20q11.22-  p  N0454F1 I  N0254N13  4510  N0434N22  N0290D09  16200  8q21.3-22.1 8q22.3-23.1  N0230C03 N0739L19  N0804A07 NOO2SPII  6040 4920  8q24.22 8q24.23-24.3  N0015L05 N0467B24  N1147M08 N0613F12  1570 3810  10q22.3-25.2  N0635P19 N0061L24  N0257E05 M2155D19  30900 1380  20q12-13.32 Deletions  NOl 14008  N0294P07  17100  8q23.3A  N0058003  N0531C21  746  8q23.3B  N0164M09  /v12118B16  365  N0412G23  N0138116  5360  8q24.12 C 8q24.21D  N0389M07 N0288B17  N0047A23 N0294P07  821 716  17q11.2-12 
17q21.2-23.2  N0634A23 N0400F19  N0607B02 N0767P09  6460 18100  10q21.1-22.3 1 q 4 31.3-32.12 2 17g11.l-ll.  N0195P24 N0046B20 N0458L21  N0506E07 N0386D20 N0193M11  25500 7020 3170  18q12.2-21.2 19pl3.Il  N0645123 N0715L15  N0664P08 N0723M22  20400 4710  Amplifications 6p21.3l-2l.2  N0479F12  N0450J18  3510  Deletions 8q24.13  N0346M14  N0391A17  1020  6p2l.2-2I.l  N0259H15  N0769C16  3290  9p24.2-22.2  N0654D08  N0460F23  14500  7q35  N0703N05  N0340G20  670  peaks:  12q23.3-24.11  12q24.21-  MDA  MB 231  peak:  N0141K07  N0460F23  2000  N0066P03  N0486D12  5210  N03151]4  N0730N17  280  Deletions l2pl3.32-l2.3 1.32  N0312A03  N0056F05  11700  N0619C21  N0193E15  349  18q21.1 Xq  N0699023 q  N0093N16 q  311  p 9 2 3 . 2 2  I. p 9 1 2 3 . peak:  9p21.3  T47D Amplifications 3q26.2-29(tel)  N0415B12  M2110L16  28700  11q233-25(tel)  N0081113  M2013A02  15200  Deletions 8p23.1  N0485105  M2185F10  8p23.l  N0367E11  N0801121  1000  704

High level peaks within complex alterations are denoted by italics. Alterations not previously characterized are in bold. Clones beginning with N0 or F0 belong to the Roswell Park Cancer Institute library-11 (RP11) and RP13 respectively, while those beginning with M belong to the Caltech-D (CTD) library.

Table 4.2.
Components of the EGFR pathway affected by copy number change

Gene  Locus  EGF  4q25  TGFA  2p13.3  NRG2  5q31.2  NRG1  8p12  EGFR  7p12.3-p12.1  ErBB2  17q12  ErBB3  12q13  ErBB4  2q33.3-q34  SH3KBP1  Xp22.12  RASA2  3q23  VAV2  9q34.2  GRB2  17q25.1  PLCG1  20q12  PLCG2  16q23.2  RNTRE  10p14  SSH3BP  10p12.1  SOS1  2p22.1  PTK2B  8p21.2  SRC  20q11.23  NRAS  1p13.2  CDC42  1p36.12  RAC1  7p22.1  MCF-7  BT-474  ZR-75-30  UACC-893  SK-BR-3  MDA-MB-231  T-47D  + -  -  + --  --  -  + +++  + +++  +++  +++  --  + +  +  + -  ++ +++  +  -  -  + +  -  + + -  -  -  --  --  +++  --  +++  ++  -  -  +  RIN1  11q13.2  RAF1  3p25.2  MAP3K4  6q26  MAP3K11  11q13.1  PAK1  11q13.5  +  ADAM9  8p11.23  +  ADAM12  10q26.2  ADAM17  2p25.1  RAB5a  3p24.3  + + --  + -  +  ---  -  + +  MAP2K1  15q21  MAP2K7  19p13.2  MAP2K4  17q11.2  MAP2K3  17p11.2  MAP2K6  17q24.3  +++  PDPK1  16p13.3  +  CHN2  7p15.1  +  PRKCA  17q24.2  +++  ERK1  16p11.2  +  ERK2  22q11.21  -  -  10q11.22 5q35.3  MAPK14  6p21.31  AKT1  14q32.33  BAD  11q13.1  +++  LIMK1  7q11.23  +  PLD1  3q26.31  RPS6KA3  Xp22.12  FOS  14q24.3  JUN  1p32-p31  TP53  17p13.1  MYC  8q24  ELK1  Xp11.23 1 . 32 ‘p 4q25  + -  = =  -  -  -  +  +  +  ++  +  +  +  -  MAPK9  EGF  --  ++  MAPK8  JUN  --  + -  -  ++ +  +  +  +++  +++  +  +  ++  +++  +  -  + -  +  --  +++  ++  +++  -  -  -  + +  -  +

+ = log2 ratio +0.4 to +0.7, ++ = log2 ratio +0.7 to +0.99, +++ = log2 ratio > +1.0
- = log2 ratio -0.4 to -0.7, -- = log2 ratio -0.7 to -0.99, --- = log2 ratio < -1.0

Figure 4.4.1 Comprehensive SMRT aCGH profile of cell line UACC-893. Whole genome SeeGH karyogram of UACC-893.
Individual data points denote log2 ratios plotted to corresponding chromosomal location. Log2 ± 0.5 scale bars are included for reference. Displacement of data points to the right and left of the centre line represents gain and loss respectively. Inset shows a magnified view of the complex alteration on chromosome 17q.

Figure 4.4.2 Magnified SMRT aCGH profile of the 1p21.1-p11.1 region in MCF-7. ±1 scale bars denote log2 ratio scale. Blue highlighted regions indicate locations of independent amplicons.

Figure 4.4.3 Fluorescence in situ hybridization (FISH) analysis in SK-BR-3 and BT-474 cells. A. BT-474 interphase FISH. Clone RP11-813P03 labeled in spectrum red is located at the peak of a 940 kb amplicon at 1q32.1 while clone RP11-790113 labeled in spectrum green is located within an adjacent unchanged region. B. SK-BR-3 interphase FISH. Clone RP11-118L18 labeled in spectrum green is located within a 680 kb amplicon at 3q25.1 while clone RP11-419H08 labeled in spectrum red denotes an unchanged site at 3q11.2.

Figure 4.4.4. 17q SMRT aCGH profile of five cell lines (BT-474, MCF-7, ZR-75-30, SK-BR-3 and UACC-893) sharing multiple sites of minimum altered regions (MAR). ±1 scale bars denote log2 ratio scale. Blue highlighted regions indicate location of MARs.

4.5. References

1. Parkin DM, Bray F, Ferlay J, Pisani P: Global cancer statistics, 2002. CA Cancer J Clin 2005, 55(2):74-108.
2. Bray F, McCarron P, Parkin DM: The changing global patterns of female breast cancer incidence and mortality. Breast Cancer Res 2004, 6(6):229-239.
3. Simpson PT, Reis-Filho JS, Gale T, Lakhani SR: Molecular evolution of breast cancer. J Pathol 2005, 205(2):248-254.
4. Hanahan D, Weinberg RA: The hallmarks of cancer.
Cell 2000, 100(1):57-70.
5. Garnis C, Buys TP, Lam WL: Genetic alteration and gene expression modulation during cancer progression. Mol Cancer 2004, 3(1):9.
6. Kallioniemi OP, Kallioniemi A, Kurisu W, Thor A, Chen LC, Smith HS, Waldman FM, Pinkel D, Gray JW: ERBB2 amplification in breast cancer analyzed by fluorescence in situ hybridization. Proc Natl Acad Sci USA 1992, 89(12):5321-5325.
7. Shimada M, Imura J, Kozaki T, Fujimori T, Asakawa S, Shimizu N, Kawaguchi R: Detection of Her2/neu, c-MYC and ZNF217 gene amplification during breast cancer progression using fluorescence in situ hybridization. Oncol Rep 2005, 13(4):633-641.
8. Jarvinen TA, Tanner M, Rantanen V, Barlund M, Borg A, Grenman S, Isola J: Amplification and deletion of topoisomerase IIalpha associate with ErbB-2 amplification and affect sensitivity to topoisomerase II inhibitor doxorubicin in breast cancer. Am J Pathol 2000, 156(3):839-847.
9. Lacroix M, Leclercq G: Relevance of breast cancer cell lines as models for breast tumours: an update. Breast Cancer Res Treat 2004, 83(3):249-289.
10. Emens LA, Davidson NE: Trastuzumab in breast cancer. Oncology (Huntingt) 2004, 18(9):1117-1128; discussion 1131-1112, 1137-1118.
11. Baselga J: Herceptin alone or in combination with chemotherapy in the treatment of HER2-positive metastatic breast cancer: pivotal trials. Oncology 2001, 61 Suppl 2:14-21.
12. Vogel CL, Cobleigh MA, Tripathy D, Gutheil JC, Harris LN, Fehrenbacher L, Slamon DJ, Murphy M, Novotny WF, Burchmore M et al: First-line Herceptin monotherapy in metastatic breast cancer. Oncology 2001, 61 Suppl 2:37-42.
13. Kallioniemi A, Kallioniemi OP, Piper J, Tanner M, Stokke T, Chen L, Smith HS, Pinkel D, Gray JW, Waldman FM: Detection and mapping of amplified DNA sequences in breast cancer by comparative genomic hybridization. Proc Natl Acad Sci USA 1994, 91(6):2156-2160.
14. Kytola S, Rummukainen J, Nordgren A, Karhu R, Farnebo F, Isola J, Larsson C: Chromosomal alterations in 15 breast cancer cell lines by comparative genomic hybridization and spectral karyotyping. Genes Chromosomes Cancer 2000, 28(3):308-317.
15. Forozan F, Mahlamaki EH, Monni O, Chen Y, Veldman R, Jiang Y, Gooden GC, Ethier SP, Kallioniemi A, Kallioniemi OP: Comparative genomic hybridization analysis of 38 breast cancer cell lines: a basis for interpreting complementary DNA microarray data. Cancer Res 2000, 60(16):4519-4525.
16. Davies JJ, Wilson IM, Lam WL: Array CGH technologies and their applications to cancer genomes. Chromosome Res 2005, 13(3):237-248.
17. Pinkel D, Albertson DG: Array comparative genomic hybridization and its applications in cancer. Nat Genet 2005, 37 Suppl:S11-17.
18. Collins C, Volik S, Kowbel D, Ginzinger D, Ylstra B, Cloutier T, Hawkins T, Predki P, Martin C, Wernick M et al: Comprehensive genome sequence analysis of a breast cancer amplicon. Genome Res 2001, 11(6):1034-1042.
19. Albertson DG, Ylstra B, Segraves R, Collins C, Dairkee SH, Kowbel D, Kuo WL, Gray JW, Pinkel D: Quantitative mapping of amplicon structure by array CGH identifies CYP24 as a candidate oncogene. Nat Genet 2000, 25(2):144-146.
20. Hodgson JG, Chin K, Collins C, Gray JW: Genome amplification of chromosome 20 in breast cancer. Breast Cancer Res Treat 2003, 78(3):337-345.
21. Orsetti B, Nugoli M, Cervera N, Lasorsa L, Chuchana P, Ursule L, Nguyen C, Redon R, du Manoir S, Rodriguez C et al: Genomic and expression profiling of chromosome 17 in breast cancer reveals complex patterns of alterations and novel candidate genes. Cancer Res 2004, 64(18):6453-6460.
22. Albertson DG: Profiling breast cancer by array CGH. Breast Cancer Res Treat 2003, 78(3):289-298.
23. Nessling M, Richter K, Schwaenen C, Roerig P, Wrobel G, Wessendorf S, Fritz B, Bentz M, Sinn HP, Radlwimmer B et al: Candidate genes in breast cancer revealed by microarray-based comparative genomic hybridization of archived tissue. Cancer Res 2005, 65(2):439-447.
24. Kauraniemi P, Barlund M, Monni O, Kallioniemi A: New amplified and highly expressed genes discovered in the ERBB2 amplicon in breast cancer by cDNA microarrays. Cancer Res 2001, 61(22):8235-8240.
25. Monni O, Barlund M, Mousses S, Kononen J, Sauter G, Heiskanen M, Paavola P, Avela K, Chen Y, Bittner ML et al: Comprehensive copy number and gene expression profiling of the 17q23 amplicon in human breast cancer. Proc Natl Acad Sci USA 2001, 98(10):5711-5716.
26. Hyman E, Kauraniemi P, Hautaniemi S, Wolf M, Mousses S, Rozenblum E, Ringner M, Sauter G, Monni O, Elkahloun A et al: Impact of DNA amplification on gene expression patterns in breast cancer. Cancer Res 2002, 62(21):6240-6245.
27. Clark J, Edwards S, John M, Flohr P, Gordon T, Maillard K, Giddings I, Brown C, Bagherzadeh A, Campbell C et al: Identification of amplified and expressed genes in breast cancer by comparative hybridization onto microarrays of randomly selected cDNA clones. Genes Chromosomes Cancer 2002, 34(1):104-114.
28. Pollack JR, Sorlie T, Perou CM, Rees CA, Jeffrey SS, Lonning PE, Tibshirani R, Botstein D, Borresen-Dale AL, Brown PO: Microarray analysis reveals a major direct role of DNA copy number alteration in the transcriptional program of human breast tumors. Proc Natl Acad Sci USA 2002, 99(20):12963-12968.
29. Ishkanian AS, Malloff CA, Watson SK, DeLeeuw RJ, Chi B, Coe BP, Snijders A, Albertson DG, Pinkel D, Marra MA et al: A tiling resolution DNA microarray with complete coverage of the human genome. Nat Genet 2004, 36(3):299-303.
30. de Leeuw RJ, Davies JJ, Rosenwald A, Bebb G, Gascoyne RD, Dyer MJ, Staudt LM, Martinez-Climent JA, Lam WL: Comprehensive whole genome array CGH profiling of mantle cell lymphoma model genomes. Hum Mol Genet 2004, 13(17):1827-1837.
31. Watson SK, deLeeuw RJ, Ishkanian AS, Malloff CA, Lam WL: Methods for high throughput validation of amplified fragment pools of BAC DNA for constructing high resolution CGH arrays. BMC Genomics 2004, 5(1):6.
32. Chi B, DeLeeuw RJ, Coe BP, MacAulay C, Lam WL: SeeGH--a software tool for visualization of whole genome array comparative genomic hybridization data. BMC Bioinformatics 2004, 5(1):13.
33. Jong K, Marchiori E, Meijer G, Vaart AV, Ylstra B: Breakpoint identification and smoothing of array comparative genomic hybridization data. Bioinformatics 2004, 20(18):3636-3637.
34. Henderson LJ, Lestou VS, Ludkovski O, Robichaud M, Chhanabhai M, Gascoyne RD, Klasa RJ, Connors JM, Marra MA, Horsman DE, Lam WL: Delineation of a minimal region of deletion at 6q16.3 in follicular lymphoma and construction of a bacterial artificial chromosome contig spanning a 6-megabase region of 6q16-q21. Genes Chromosomes Cancer 2004, 40(1):60-65.
35. Savinainen KJ, Linja MJ, Saramaki OR, Tammela TL, Chang GT, Brinkmann AO, Visakorpi T: Expression and copy number analysis of TRPS1, EIF3S3 and MYC genes in breast and prostate cancer. Br J Cancer 2004, 90(5):1041-1046.
36. Rummukainen J, Kytola S, Karhu R, Farnebo F, Larsson C, Isola JJ: Aberrations of chromosome 8 in 16 breast cancer cell lines by comparative genomic hybridization, fluorescence in situ hybridization, and spectral karyotyping. Cancer Genet Cytogenet 2001, 126(1):1-7.
37. Zhao X, Li C, Paez JG, Chin K, Janne PA, Chen TH, Girard L, Minna J, Christiani D, Leo C et al: An integrated view of copy number and allelic alterations in the cancer genome using single nucleotide polymorphism arrays. Cancer Res 2004, 64(9):3060-3071.
38. Maass N, Rosel F, Schem C, Hitomi J, Jonat W, Nagasaki K: Amplification of the BCAS2 gene at chromosome 1p13.3-21 in human primary breast cancer. Cancer Lett 2002, 185(2):219-223.
39. Qi C, Zhu YT, Chang J, Yeldandi AV, Rao MS, Zhu YJ: Potentiation of estrogen receptor transcriptional activity by breast cancer amplified sequence 2. Biochem Biophys Res Commun 2005, 328(2):393-398.
40. Nagasaki K, Maass N, Manabe T, Hanzawa H, Tsukada T, Kikuchi K, Yamaguchi K: Identification of a novel gene, DAM1, amplified at chromosome 1p13.3-21 region in human breast cancer cell lines. Cancer Lett 1999, 140(1-2):219-226.
41. Lapuk A, Volik S, Vincent R, Chin K, Kuo WL, de Jong P, Collins C, Gray JW: Computational BAC clone contig assembly for comprehensive genome analysis. Genes Chromosomes Cancer 2004, 40(1):66-71.
42. Falck J, Coates J, Jackson SP: Conserved modes of recruitment of ATM, ATR and DNA PKcs to sites of DNA damage. Nature 2005, 434(7033):605-611.
43. Jaquemar D, Schenker T, Trueb B: An ankyrin-like protein with transmembrane domains is specifically lost after oncogenic transformation of human fibroblasts. J Biol Chem 1999, 274(11):7325-7333.
44. Dantzig AH, Hoskins JA, Tabas LB, Bright S, Shepard RL, Jenkins IL, Duckworth DC, Sportsman JR, Mackensen D, Rosteck PR, Jr. et al: Association of intestinal peptide transport with a protein related to the cadherin superfamily. Science 1994, 264(5157):430-433.
45. Ma Y, Pannicke U, Schwarz K, Lieber MR: Hairpin opening and overhang processing by an Artemis/DNA-dependent protein kinase complex in nonhomologous end joining and V(D)J recombination. Cell 2002, 108(6):781-794.
46. Fletcher GC, Patel S, Tyson K, Adam PJ, Schenker M, Loader JA, Daviet L, Legrain P, Parekh R, Harris AL et al: hAG-2 and hAG-3, human homologues of genes involved in differentiation, are associated with oestrogen receptor-positive breast tumours and interact with metastasis gene C4.4a and dystroglycan.
Br J Cancer 2003, 88(4):579-585.
47. Snijders AM, Nowak N, Segraves R, Blackwood S, Brown N, Conroy J, Hamilton G, Hindle AK, Huey B, Kimura K et al: Assembly of microarrays for genome-wide measurement of DNA copy number. Nat Genet 2001, 29(3):263-264.
48. Xie D, Jauch A, Miller CW, Bartram CR, Koeffler HP: Discovery of over-expressed genes and genetic alterations in breast cancer cells using a combination of suppression subtractive hybridization, multiplex FISH and comparative genomic hybridization. Int J Oncol 2002, 21(3):499-507.
49. Watson MB, Bahia H, Ashman TN, Berrieman HK, Drew P, Lind MJ, Greenman J, Cawkwell L: Chromosomal alterations in breast cancer revealed by multicolour fluorescence in situ hybridization. Int J Oncol 2004, 25(2):277-283.
50. Barlund M, Tirkkonen M, Forozan F, Tanner MM, Kallioniemi O, Kallioniemi A: Increased copy number at 17q22-q24 by CGH in breast cancer is due to high-level amplification of two separate regions. Genes Chromosomes Cancer 1997, 20(4):372-376.
51. Lowenstein EJ, Daly RJ, Batzer AG, Li W, Margolis B, Lammers R, Ullrich A, Skolnik EY, Bar-Sagi D, Schlessinger J: The SH2 and SH3 domain-containing protein GRB2 links receptor tyrosine kinases to ras signaling. Cell 1992, 70(3):431-442.
52. Wu GJ, Sinclair CS, Paape J, Ingle JN, Roche PC, James CD, Couch FJ: 17q23 amplifications in breast cancer involve the PAT1, RAD51C, PS6K, and SIGma1B genes. Cancer Res 2000, 60(19):5371-5375.
53. Bhargava R, Gerald WL, Li AR, Pan Q, Lal P, Ladanyi M, Chen B: EGFR gene amplification in breast cancer: correlation with epidermal growth factor receptor mRNA and protein expression and HER-2 status and absence of EGFR-activating mutations. Mod Pathol 2005, 18(8):1027-1033.
54. Oda K, Matsuoka Y, Funahashi A, Kitano H: A comprehensive pathway map of epidermal growth factor receptor signaling. Molecular Systems Biology 2005.
55. Graus-Porta D, Beerli RR, Daly JM, Hynes NE: ErbB-2, the preferred heterodimerization partner of all ErbB receptors, is a mediator of lateral signaling. EMBO J 1997, 16(7):1647-1655.

5. NRG1 GENE REARRANGEMENTS IN CLINICAL BREAST CANCER: IDENTIFICATION OF AN ADJACENT NOVEL AMPLICON ASSOCIATED WITH POOR PROGNOSIS

5.1. Introduction

Chromosomal translocations are a common feature of sarcoma, leukemia and lymphoma, but little is known about their role in common epithelial malignancies. A translocation t(12;15) leading to ETV6-NTRK3 gene fusion is a specific event in secretory carcinoma, a rare form of breast cancer [1, 2]. This was the first report of a recurrent and subtype-specific translocation in breast cancer and raises the possibility that translocations could define other clinically relevant subtypes of breast cancer. The short arm of chromosome 8 is a site of frequent chromosomal abnormalities in breast cancers and breast cancer cell lines, including gene amplification events and translocations. In breast cancers, the NRG1 and FGFR1 genes have frequently been involved in cytogenetic abnormalities of 8p11-12 [3-8]. NRG1 codes for at least 15 different isoforms through alternative splicing, including four heregulin (HRG) isoforms, which are ligands for members of the ERBB/human epidermal growth factor receptor (HER) family [3, 9]. These HRG isoforms start transcription at exon 2, completely bypassing exon 1, which is located 955 kb away at the telomeric end of the gene. Binding of HRG ligands to HER3 facilitates heterodimerization of HER3 with HER2 and stimulates a signaling cascade that affects proliferation, survival, and differentiation (as reviewed by Yarden and Sliwkowski, 2001). HER2 is overexpressed in 15-25% of ductal breast cancers and this overexpression correlates with poor prognosis [10]. HER1 and HER3 overexpression are also associated with poor prognosis in breast cancer [11].
Furthermore, HRG has been shown to promote tumor growth in vivo and induce metastasis [12]. The known oncogenic role of HRG has provided the rationale for considering NRG1 as a potential target of 8p12 rearrangements in breast carcinoma.

A version of this chapter has been published: Prentice LM, Shadeo A, Lestou VS, Miller MA, deLeeuw RJ, Makretsov N, Turbin D, Brown LA, Macpherson N, Yorida E, Cheang MC, Bentley J, Chia S, Nielsen TO, Gilks CB, Lam W, Huntsman DG. (2005) NRG1 gene rearrangements in clinical breast cancer: identification of an adjacent novel amplicon associated with poor prognosis. Oncogene. Nov 10; 24(49):7281-9.

In breast cancer cell lines, there have been seven breakpoints described within or around NRG1, and a translocation with DOC4 located on chromosome 11 has been described in the MDA-MB-175 cell line [3, 6, 13]. This translocation gives rise to a secreted fusion protein that has been demonstrated to have an autocrine growth effect [14]. Huang et al. (2004) identified NRG1 breaks in 6% of breast cancers. The cases with NRG1 breaks tended to be HER2 negative and estrogen receptor (ER) positive [15]. They also reported a significant correlation with FGFR1 overexpression. They hypothesized that these breakpoints result in translocation events that lead to breast cancer development through abnormal expression of the 3’-end of NRG1. They have shown translocation events in four breast cancer cell lines (ZR-75-1, HCC1937, UACC-812, and MDA-MB-175) and demonstrated gain of the 8p11-12 region in three cell lines (SUM-52, ZR-75-1, and MDA-MB-134) [3]. It is, however, possible that chromosomal rearrangements within intron 1 of the NRG1 gene do not represent gene specific translocations and that these events may be part of some other cytogenetic process.
Factors that make the gene-specific hypothesis less likely include: (1) the lack of common fusion partners or a clearly oncogenic fusion transcript, (2) the lack of a consistent association between NRG1 breaks and NRG1 mRNA or protein levels, (3) the lack of any correlation between NRG1 abnormalities and expression of the epidermal growth factor receptors (HER1, HER2, HER3, and HER4), and (4) the lack of strong and specific clinical associations with NRG1 rearrangements, as has been seen for the t(12;15) translocation and secretory carcinomas.

To determine the frequency and further delineate the clinical significance of chromosomal rearrangements within and surrounding the NRG1 gene, we analysed 438 clinically annotated breast cancers using fluorescent in situ hybridization (FISH) and immunohistochemistry (IHC). In five cases with NRG1 rearrangements, bacterial artificial chromosome (BAC) array comparative genomic hybridization (CGH) was performed. A second series of cases with matched frozen tissue samples was used to determine the association between gene copy number and expression levels of specific genes identified as being amplified through BAC array CGH.

5.2. Materials and Methods

5.2.1. Breast Cancer Case Series and TMA Construction

The initial, larger TMA was constructed using formalin-fixed, paraffin-embedded tissue blocks received from the Department of Pathology at Vancouver General Hospital during the period 1974-1995. Case selection and TMA construction have been described previously [1, 16, 17]. Breast tumor samples used for the second, smaller TMA were obtained from women undergoing lumpectomy or mastectomy for breast cancer in Victoria, British Columbia (Royal Jubilee Hospital and Victoria General Hospital), in collaboration with the Department of General Surgery and the Department of Anatomic Pathology from June 1998 to June 2000.
Briefly, representative areas of carcinoma were selected and marked on the hematoxylin and eosin slides and corresponding tissue blocks for TMA construction. The TMAs were assembled using a tissue arraying instrument (Beecher Instruments, Silver Springs, MD, USA), with two 0.6 mm cores per case. Outcome data were available for all 495 patients on the larger TMA, with median follow-up of 15.4 years (range 6.3-26.6 years). For statistical analysis, only the 438 cases with invasive breast carcinoma and known tumor grade were analysed. Ethical approval to perform this study was obtained from the Clinical Research Ethics Board of the University of British Columbia.

5.2.2. RNA Isolation

Total cellular RNA was extracted from samples by the acid-phenol guanidinium method [18] with TRIzol as recommended by the manufacturer (www.lifetech.com). After isolation, the RNA was treated with 1 U DNase (RQ1, Promega, Wisconsin, USA) in the buffer supplied to remove contaminating DNA. The quality of RNA was assessed by confirming the presence of intact 18S and 28S ribosomal RNA bands by agarose gel electrophoresis.

5.2.3. Fluorescent in situ Hybridization

Sections (6 µm thick) of the TMA slides were pretreated for FISH as described elsewhere [1]. Locus specific FISH analysis for all genes, including both ends of NRG1, FGFR1, and the novel amplicon, was performed using the following BACs from the Human BAC Library RPCI-11 (BACPAC Resources Centre, Children’s Hospital Oakland Research Institute) (listed telomeric to centromeric): 566H8, 10L15, 97N12, 478B14, 15H14, 1002K11, 11N9, 692P18, 451018 (for the NRG1 gene), 863K10 (novel amplicon), 350N15, and 675F6 (FGFR1). BACs 566H8, 10L15, 350N15, 675F6, 97N12, and 478B14 were directly labelled with SpectrumGreen and BACs 11N9, 692P18, 451018, 863K10, 1002K11, and 15H14 were directly labelled with SpectrumOrange. The chromosomal locations of all BACs were validated using normal metaphases (results not shown).
The well-characterized breast cancer cell line MDA-MB-175, which showed t(8;11) by multicolor karyotyping (MFISH), was also used to verify the flanking NRG1 probes. The aqua-labelled α-satellite control DNA probe for chromosome 8 (CEP8) was purchased from Vysis (Downers Grove, IL, USA). Probe labelling and FISH were performed using Vysis reagents according to the manufacturer’s protocols (Vysis, IL, USA). Slides were counterstained with 4’,6-diamidino-2-phenylindole for microscopy. For each of the TMA slides, FISH signals and patterns were identified on a Zeiss Axioplan epifluorescent microscope, scored either manually (oil immersion, x100) or using Metasystems Metafer software (MetaSystems Group Inc., Belmont, MA, USA), and enumerated in approximately 40 morphologically intact and nonoverlapping nuclei. MFISH of the cell line MDA-MB-175 was performed as published in detail elsewhere (Khoury et al., 2003). Amplification was defined as a ratio of locus copy number to control copy number of 1.5 or greater; loss was defined as a ratio of 0.6 or less.

5.2.4. Immunohistochemical Analysis of HRG, HER3, and HER4

Immunohistochemical analysis was performed on 4-µm-thick sections from the TMA after heat-induced epitope retrieval. Primary antibodies were applied as follows: HRG (clone 7D5, diluted 1:10), HER3 (rabbit polyclonal, diluted 1:400), and HER4 (clone HFR-1, diluted 1:160), all purchased from NeoMarkers, Lab Vision (Fremont, CA, USA). HRG immunoreactivity was scored according to the percentage of cells positive for the cytoplasmic reaction as follows: negative, <10%; weak, 10-40%; and strong, >40% of cells stained. HER3 and HER4 were scored as negative or positive [19]. Linkage of the genetic data and HRG staining with tumor size, nodal status, the immunohistochemical biomarkers ER, PR, p53, Ki67, HER1, and HER2, and clinical outcome data was performed as described previously [20].

5.2.5. BAC array CGH

A subset of five cases representing all types of NRG1 gene rearrangements, that is, case 77 (type A), cases 179 and 286 (type B), and cases 303 and 285 (type C), was analysed by BAC array CGH both to verify the FISH results and to provide high-resolution mapping (approximately 0.1 Mb) of the 8p11-12 region. We performed array CGH on microarrays containing 32,433 BAC-derived amplified fragment pools spotted in triplicate (97,299 elements) over two 18 mm x 54 mm arrays, as described previously [21].

5.2.6. BAC array CGH Imaging and Analysis

Cyanine-3 and cyanine-5 images (16 bit) were acquired using a CCD camera system (Applied Precision, Issaquah, WA, USA). Images were then analysed using SoftWoRx Tracker Spot Analysis software (Applied Precision). Genomic imbalances and their associated breakpoints were identified using the genetic local search algorithms within the software package aCGH-Smooth developed by Jong et al., as described previously [22, 23]. Custom software (SeeGH) was used to visualize all data and is available upon request, as published elsewhere [24, 25].

5.2.7. Quantitative RT-PCR

The probe sequences used for SPFH2 and FLJ14299 were CCAGAGGCAATCCG and AGACTAGCTTCAGCCTC, respectively. The forward and reverse primer sequences for SPFH2 were CGGGTAACAAAGCCCAACAT and TCACTTTCCATCAACTCGTAGTTTCT, and those for FLJ14299 were GCCATACGCGCTGTATGGA and GAGGAAGAGCTGTAGTTACTGGTATCC. PROSC and FGFR1 probes with primers included were obtained as a 20x target assay from Applied Biosystems (Foster City, CA, USA). All probes were used at 100 nM and all primers at 900 nM. Cycling was set at 95 °C for 10 min, then 40 cycles of 95 °C (15 s)/60 °C (1 min). Results were assessed using ABI software and normalized to Stratagene’s Universal Human Reference RNA mix, consisting of 10 different human cell lines (La Jolla, CA, USA). Over expression was defined as a 2.0-fold or greater level of mRNA compared to control.

5.2.8. Statistical Analysis

Statistical analysis was performed using SAS software version 8.02. The optimal amplification ratio cutoff value generated by X-tile (Camp et al., 2004) was 1.38. A more stringent cutoff value of 1.5 was used for all analyses. The prognostic significance of the novel amplicon, amplified FGFR1, and NRG1 aberrations was assessed using Kaplan-Meier survival estimates and the log-rank test. Fisher’s exact test was used to test for associations between the novel amplicon, amplified FGFR1, and NRG1 aberrations. After a Bonferroni adjustment for multiple comparisons, associations were deemed statistically significant if the P-value was less than the adjusted threshold of 0.05/14 = 0.00357. Multivariate analysis of the prognostic significance of the novel amplicon was performed using Cox’s proportional hazards model and a backwards stepwise method to remove variables from the model. The multivariate model included the novel amplicon, lymph node status, HER2, grade, and ER as variables. Fisher’s exact test was used to test for correlations between amplification and over expression.

5.3. Results

5.3.1. Three Distinct Types of NRG1 Aberrations

A three-color FISH assay consisting of a centromere 8 probe (aqua) together with probes flanking the 5’-end (green) and the 3’-end (orange) of the NRG1 gene was performed on the tissue microarray (TMA) series of 438 clinically annotated breast cancers. A total of 358 cases produced FISH signals of sufficient intensity and clarity for analysis (81.7% of all cases). The remaining cases were not assessable due to poor FISH hybridization, insufficient numbers of tumor cells, tissue loss, or tissue over/underdigestion. In all, 17 (4.7%) of the assessable cases showed abnormal NRG1 signals, excluding aneuploidy.
Three distinct types of abnormalities were seen: (A) low copy number of the 3’ (centromeric) end of the gene with respect to the 5’ (telomeric) end of the gene (two cases), (B) low copy number of the 5’-end of the gene with respect to the 3’-end of the gene (12 cases), and (C) amplification of both ends of the gene (three cases) (Figure 5.5.1a). The two cases with type A NRG1 aberrations showed either loss of the 3’-end of NRG1 (case number 152) or gain of the 5’-end of NRG1 (case number 77) (Figure 5.5.1b). Type B aberrations were the most frequently observed, with loss of the 5’ signal being present in 11 cases, two of which also had 3’ amplification events (cases 87 and 179). The remaining nine cases with type B abnormalities had a relative increase in copy number of the 3’-end of NRG1 due to loss of the 5’-end of the gene without 3’ amplification. A single case (case 286) with a type B abnormality showed amplification of the 3’-end of NRG1 only (Figure 5.5.1c). Three cases (285, 303, and 382) showed type C abnormalities (Figure 5.5.1d), which most likely indicate whole gene amplification.

A second set of FISH probes was applied to the 17 cases demonstrating NRG1 aberrations, one covering exon 1 and the other covering exons 2-17 of the gene. Exon 2 is located 955 kb away from exon 1 and specifically represents the start of the HRG coding region (Adelaide et al., 2003). This specific (internal) probe set showed fewer abnormalities than were seen with the flanking (external) probes. For example, both cases showing type A abnormalities (loss of the 3’-end) no longer showed a difference in copy number between the 5’- and 3’-ends of NRG1. The different locations of both specific and flanking probe sets are shown in Figure 5.5.2a, and complete results are listed in Figure 5.5.1a.

5.3.2. HRG Correlates Significantly with Tumour Grade, p53, and HER1

There was no statistically significant correlation between NRG1 aberrations, either individually or taken as a group, and the biomarkers ER, p53, Ki67, HER1, HER2, or HER3, nor with tumor grade, nodal status, or histological subtype. NRG1 abnormalities were not associated with poor survival. There was no significant correlation between HRG IHC and NRG1 aberrations. HRG immunostaining was seen in 60% of the tumors, with 19% demonstrating strong immunostaining. HRG immunostaining intensity correlated positively with tumor grade (P = 0.007), p53 (P = 0.001), and Ki67 (P = 0.01), and correlated negatively with ER (P = 0.016) and PR (P = 0.017). A positive and statistically significant correlation between HRG and HER1 (P = 0.015) was observed, but not with the other members of the epidermal growth factor receptor family, HER2 (P = 0.5), HER3 (P = 0.4), and HER4 (P = 0.2).

5.3.3. BAC array CGH Confirms NRG1 Aberrations and Identifies an Amplicon Centromeric to NRG1

Five cases showing a variety of NRG1 aberrations by FISH (cases 77, 179, 303, 285, and 286) were analysed by BAC array CGH, focusing on chromosome 8p to gain more detail of the NRG1 region. In case 77, FISH with the flanking probe set showed 5’ amplification, while the specific 5’ probe did not show amplification (type A); the CGH results confirmed both findings (Figure 5.5.2b). In case 179, array CGH showed neither 5’ deletion nor 3’ amplification, in contrast to FISH, which showed type B abnormalities. Case 303 showed slight amplification of both 5’ and 3’ NRG1 by array CGH; type C abnormalities were demonstrated by FISH. Case 285 showed higher levels of amplification of both ends of NRG1 by array CGH, which is also consistent with the type C FISH results. For case 286, array CGH results also correlated with type B FISH and clearly demonstrated high-level amplification of 3’ NRG1 without gain of the 5’-end.
Discrepancies can be accounted for by the different BACs used for FISH and CGH. All five cases showed significant levels of amplification by array CGH, validated by FISH, for an amplicon centromeric to the NRG1 gene and telomeric to FGFR1, while FGFR1 showed gain of copy number in only two of five cases by array CGH (Figure 5.5.2b) and in four of five cases by FISH (Figure 5.5.2c). For all five cases, the minimal common region of amplification within the novel amplicon contains two RefSeq-annotated genes, SPFH2 and FLJ14299. SPFH2, formerly known as C8orf2, is the approved symbol as determined by the HUGO Gene Nomenclature Committee (HGNC). The minimal region of amplification was determined using SMRT aCGH and SeeGH filtered raw data (Figure 5.5.3). Three of these five cases (285, 286, and 303) also included PROSC as part of the amplicon; the FISH probe encompassed all three genes due to their proximity. The average amplification ratios for the BACs representing the four different regions of 8p, as determined by FISH, are shown in Figure 5.5.2c. The amplification ratios relative to centromere copy number for the novel amplicon range from 3.8 to 5.9, compared to FGFR1 amplification ratios that range from no amplification (ratio 1.1) to a ratio of 4.0.

5.3.4. The Novel Amplicon is Present in 24% of Clinical Breast Cancer Cases and Shows a Positive Correlation with Poor Survival

To clarify the prevalence of this amplicon in breast cancer, the same 438 case TMA was hybridized with a three-color probe set specific for the region of amplification (orange), FGFR1 (green), and centromere 8 (aqua). Results were obtained for 262 of 495 cases, with 63/262 (24%) cases showing an amplification ratio of 1.5 or greater for the novel amplicon. By comparison, only 39/262 (15%) cases had an amplification ratio of 1.5 or greater for FGFR1. Of the 63 cases with the novel amplicon, 36 also demonstrated FGFR1 amplification.
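
The FISH calling rule from section 5.2.3 (amplification at a locus-to-control ratio of 1.5 or greater, loss at 0.6 or less) and the prevalence figures quoted above can be illustrated with a minimal sketch. The function names `classify_copy_number` and `prevalence` are hypothetical and are not part of the software used in this study; the label "normal" for ratios between the two cutoffs is likewise an assumption made for illustration.

```python
def classify_copy_number(ratio, amp_cutoff=1.5, loss_cutoff=0.6):
    """Classify a FISH locus-to-centromere copy number ratio.

    Cutoffs follow section 5.2.3; the 'normal' label is an illustrative
    assumption for ratios that meet neither cutoff.
    """
    if ratio >= amp_cutoff:
        return "amplified"
    if ratio <= loss_cutoff:
        return "loss"
    return "normal"


def prevalence(n_positive, n_scored):
    """Percentage of scored cases that are positive, to the nearest whole percent."""
    return round(100 * n_positive / n_scored)


# Lowest ratios quoted in the text for the five array CGH cases
print(classify_copy_number(3.8))  # novel amplicon
print(classify_copy_number(1.1))  # FGFR1
# Case counts quoted in section 5.3.4
print(prevalence(63, 262), prevalence(39, 262))
```

With the counts quoted in the text, the two prevalence calls give the 24% and 15% figures reported for the novel amplicon and FGFR1, respectively.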
The novel amplicon shows a positive correlation with poor survival using the Kaplan-Meier method and a cut-off amplification ratio of 1.5 (P = 0.0101) (Figure 5.5.4a and b). Significance of the novel amplicon was maintained in multivariate analysis (P = 0.0007) (Figure 5.5.4c). In contrast, FGFR1 does not show a significant correlation with patient survival (P = 0.0953). Amplification of neither the novel amplicon nor FGFR1 showed correlation with any of the biomarkers (ER, p53, Ki67, C-MYC, Cyclin-D1, HRG, HER1, HER2, and HER3) or with the clinicopathological markers (tumor grade, nodal status, or histological subtype) after a Bonferroni adjustment for multiple comparisons. There is, however, a highly significant correlation between the novel amplicon and FGFR1 (P < 0.001), and both the novel amplicon and FGFR1 show significant correlation with NRG1 aberrations (P < 0.001 and P = 0.002, respectively). NRG1 aberrations do not correlate with amplification events in other chromosomal regions: C-MYC (P = 0.241) and CCND1 (P = 0.140) (Table 5.1).

5.3.5. SPFH2 Most Significantly Correlates Over Expression with Amplification as Determined by Real-time PCR

A second, independent breast cancer array was prepared from 40 cases, each with paired RNA samples isolated from snap-frozen malignant and surrounding benign breast tissue. The array was hybridized with a combination of three probes derived from the new 8p amplicon, FGFR1, and centromere 8. Using this array, 29 cases had interpretable FISH results and nine of these had the novel amplicon. Real-time PCR was used to determine gene expression levels for FLJ14299, SPFH2, PROSC, and FGFR1 in order to correlate expression with amplification. Using a twofold cutoff value as indicative of over expression, four cases were found to overexpress SPFH2, and over expression correlated significantly with gene amplification as determined by Fisher’s exact test (P = 0.005) (Table 5.2).
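
The Fisher’s exact tests summarized in Table 5.2 can be sketched in a few lines of standard-library Python. This is an illustration only, not the SAS procedure used for the analysis: the function name `fisher_exact_two_sided` is hypothetical, the two-sided P-value is assumed to be the sum of the probabilities of all tables no more likely than the observed one, and the FGFR1 cell "overexpressed, not amplified" (5) is inferred from the row total (N=9) because it is not legible in the table as extracted.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins whose probability does not exceed that of the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # x is the top-left cell; the margins fix the other three cells
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # A small relative tolerance guards against floating-point ties
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Rows: overexpressed / not overexpressed; columns: amplified / not amplified.
# Counts are read from Table 5.2 (the FGFR1 value 5 is inferred, see above).
tables = {
    "SPFH2": (4, 0, 5, 20),
    "FLJ14299": (4, 2, 5, 18),
    "PROSC": (3, 2, 6, 18),
    "FGFR1": (4, 5, 1, 19),
}
for gene, cells in tables.items():
    print(gene, round(fisher_exact_two_sided(*cells), 3))
```

Under these assumptions, the rounded values agree with the P-values reported in Table 5.2 (0.005, 0.056, 0.287, and 0.022).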
The four SPFH2-overexpressing cases had the highest amplification ratios (8.4, 2.4, 2.1, and 1.9). Six cases showed over expression of FLJ14299, with a borderline significant correlation with gene amplification (P = 0.056). Of these six cases, two did not exhibit gene amplification. PROSC over expression occurred in five cases and showed no significant correlation with amplification (P = 0.287). Two of these five cases did not show gene amplification. FGFR1 was overexpressed in nine cases, with a significant correlation with gene amplification (P = 0.022), but more than half of these cases (5/9) did not have gene amplification.

5.4. Discussion

We report the presence of NRG1 gene rearrangements in 4.7% of breast cancers. Our findings are similar to those of Huang et al. (2004), who found NRG1 gene rearrangements in 6% of breast cancer cases [15]. These are the first two reports of NRG1 gene aberrations in clinical tumor samples, although these rearrangements have been previously characterized in breast cancer cell lines [3]. Unlike the current study, Huang et al. do not report any cases of amplification with their NRG1 probes, only loss. Neither Huang et al. nor our study shows correlation of NRG1 gene aberrations with the expression of HRG, the HRG receptors (HER1, HER2, HER3, or HER4), or with any specific molecularly or clinically defined subtype of breast cancer. These findings suggest that NRG1 rearrangements are not part of a gene-specific oncogenic translocation event. Although NRG1 translocations in clinical breast cancers could have a role in breast cancer development, a more likely scenario is that these breakpoints result in break-fusion-bridge (BFB) cycles that create amplified regions centromeric to the NRG1 breakpoint, leading to over expression of novel oncogenes, a possibility suggested by Birnbaum et al. (2003) [5]. This hypothesis is supported by the discovery of the novel amplified region centromeric to NRG1 and its correlation with poor patient survival.
This region is amplified in 24% of breast cancer cases and is prognostically significant. FISH data from three breast cancer cell lines (T-47D, MDA-MB-361, ZR-75-1) demonstrated that translocations involving 8p12 are accompanied not only by amplification of various genes within the region but also by additional rearrangements such as deletion, duplication, and inversion [26]. These breakpoints and those found by Adelaide et al. (2003) could be the initiating events that lead to the BFB mechanism for gene amplification, as has been suggested for Her2 and Topo2A [27]. The BFB cycle is characterized by breakpoints surrounding the amplicon and loss of genes telomeric to the amplified region [28]. Consistent with the BFB model, loss of the 5’-end of the NRG1 gene, which is telomeric to the site of amplification, is seen much more frequently than loss of the 3’-end of NRG1 (11 cases versus one case). In order for the BFB cycle to lead to successful amplification, a second breakpoint region centromeric to the amplicon is required (Figure 5.5.5). There have been several reports of entire 8p arm loss in breast cancer cell lines, and this may reflect a second breakpoint region centromeric to the amplified region [29]. Two breakpoint regions centromeric to the novel amplicon, with one located telomeric to FGFR1, would explain the difference in amplification frequencies between the novel amplicon (24%) and FGFR1 (15%). In a region such as 8p11-12, the BFB-based mechanism through which multiple amplicons can develop is likely very complex and requires multiple discrete breakpoints. The underlying genetic defects that permit the development of multiple amplicons within breast cancer genomes are not yet characterized.

The novel 8p12-derived amplicon described herein contains two genes, FLJ14299 and SPFH2.
When the expression levels of these genes, as assessed by quantitative real-time PCR, are compared with gene copy number, SPFH2 expression most significantly correlates with gene amplification. Although the function of the translated product of SPFH2 has yet to be elucidated, its gene product has been isolated from caveolae and lipid raft-enriched fractions of the human endothelial membrane [30]. By dissociating lipid rafts and HER2 clusters, Nagy et al. (2002) demonstrated that HER2 association with HER3 decreased, as did EGF- and HRG-induced tyrosine phosphorylation of Shc [31]. The location of SPFH2 within a clinically significant amplicon in breast cancers, the association between amplification and over expression, and its presence in lipid rafts all suggest an oncogenic role for SPFH2 in breast cancer. FLJ14299 amplification has been described in the cell lines SUM-44 and SUM-225 and is associated with over expression at the mRNA level [7]. The FLJ14299 gene contains the C2H2 zinc-finger domain present in several other tumor-related genes such as BCL6 (lymphoma), ZNF217 (breast carcinoma), and GLI (sarcoma), which could indicate a DNA-binding region [7]. Our study shows that FLJ14299 is the most frequently overexpressed of the three genes and shows a borderline significant correlation with amplification. As PROSC was part of the amplicon in only three of five BAC array CGH cases and does not show correlation between amplification and over expression in the smaller series, it is likely not an oncogenically important element in the amplicon. The adjacent FGFR1 gene has been shown to be amplified in 9-15% of breast cancers, which is consistent with our findings (15%), and has been implicated as the driver gene for the 8p11-12 amplicon [8]. Similar to our study, Huang et al. (2004) found a significant association of FGFR1 expression with NRG1 aberrations.
FGFR, like other single transmembrane growth factor receptors, transduces signals across the cell membrane after binding extracellular ligands [32]. However, although high-level FGFR1 oncogene amplification was demonstrated in three breast cancer cell lines (SUM-44, SUM-52, and SUM-225), FGFR inhibition failed to slow growth, suggesting that other genes may be responsible for this amplicon and/or breast cancer progression [7]. FGFR1 was also found to be coamplified with CCND1 (11q13) in the breast cancer cell line MDA-MB-134 and in 12 of 225 breast cancer specimens [33]. FGFR1 and CCND1 gene coamplification was associated with a worse prognosis in a cohort of 640 breast cancer cases [34]. A link between FGFR and D cyclins has been found within the G1 phase of the cell cycle. When FGFR activity is inhibited, cyclins D1 and D2 are downregulated, resulting in cell cycle arrest at G1 mediated by the RB phosphorylation pathway [35]. Chromosome 11q was assessed by BAC array CGH in our five select cases to determine whether coamplification events existed. Besides a minor amplification at 11q14.1 (results not shown), which did not include the genes DOC4 and CCND1 or other regions previously implicated in NRG1 translocations [3, 33], our study did not find any coamplification events in the five select cases analysed. Our study also showed no correlation between FGFR1 and CCND1 amplifications as determined by FISH. While FGFR1 amplification may prove to be significant in breast cancer, it appears to be distinct from the novel amplicon. Compared to the novel amplicon, FGFR1 is less commonly amplified as determined by array CGH and FISH, and it is not a significant prognostic indicator.

Our study has demonstrated a novel amplification event located within the 8p11-12 region that is significantly associated with poor survival and maintains significance as an independent prognostic indicator in multivariate analysis.
This novel amplicon also correlates with NRG1 aberrations, supporting the hypothesis that breakpoints within the NRG1 gene lead to BFB cycles that result in amplification of a novel oncogene. SPFH2 is a potential candidate for this oncogene.

Table 5.1. Correlations between 8p abnormalities and clinical/pathological variables

Marker                 NRG1 aberrations   FGFR1     Novel amplicon
HER1                   0.565              0.309     0.667
HER2                   0.175              1.000     0.350
HER3                   0.537              0.736     1.000
HRG                    1.000              1.000     0.738
ER                     0.598              0.330     0.688
Ki67                   0.806              1.000     0.503
p53                    0.792              0.194     0.021
Tumor grade            0.446              0.613     0.593
Nodal status           0.807              0.398     0.171
Histological subtype   0.250              0.366     0.170
C-MYC                  0.241              0.110     0.099
Cyclin-D               0.140              0.304     0.520
NRG1 aberrations       -                  0.002*    <0.001*
FGFR1                                     -         <0.001*

* Indicates significance after a Bonferroni adjustment for multiple comparisons (P = 0.05/14 = 0.00357)

Table 5.2. Correlations between expression and amplification status of genes from chromosome 8p

Novel amplicon
Gene (P-value)        Expression                  Amplified (N=9)   Not amplified (N=20)
SPFH2 (P=0.005*)      Overexpressed (N=4)         4                 0
                      Not overexpressed (N=25)    5                 20
FLJ14299 (P=0.056)    Overexpressed (N=6)         4                 2
                      Not overexpressed (N=23)    5                 18
PROSC (P=0.287)       Overexpressed (N=5)         3                 2
                      Not overexpressed (N=24)    6                 18

FGFR1 amplicon
Gene (P-value)        Expression                  Amplified (N=5)   Not amplified (N=24)
FGFR1 (P=0.022*)      Overexpressed (N=9)         4                 5
                      Not overexpressed (N=20)    1                 19

* Indicates statistical significance

a
Type   Aberration                    Flanking probes (N=17)   Specific probes (N=8)
A      More copies of 5’ than 3’
         Del 3’                      1                        0
         Amp 5’                      1                        0
B      More copies of 3’ than 5’
         Del 5’                      9                        2
         Del 5’ and Amp 3’           2                        2
         Amp 3’                      1                        1
C      Both 5’ and 3’ Amp            3                        3

Figure 5.5.1 (a) Summary of NRG1 aberrations as determined by FISH. Two probe sets were used: one probe set flanks the 5’- and 3’-ends of the NRG1 gene, and the second probe set specifically covers the 5’- and 3’-ends of the gene and is internal to the first probe set.
In all, 17 cases showed NRG1 abnormalities using the flanking probe set, while eight cases showed abnormalities with the internal, specific probes. (b-d) Microscopic images demonstrating examples of the three types of NRG1 aberrations as determined by FISH. Green spots represent the 5’-end of the NRG1 gene, red spots represent the 3’-end of the NRG1 gene, and blue spots are the chromosome 8 centromere. Increased spot count as compared to centromere signal indicates amplification; loss of signal as compared to centromere spot count indicates loss of genomic material. (b) Amplification of the 5’-end of NRG1 (type A aberration) as shown by multiple green signals. (c) Amplification of the 3’-end of NRG1 (type B aberration). (d) Amplification of both the 5’- and 3’-ends of the NRG1 gene as compared to the centromere (type C aberration).

[Figure 5.5.2, panels a-c: image not reproduced. Recoverable labels: Telomere, NRG1 (5’, 3’), FGFR1, Centromere; Case # 77, 179, 303, 285, 286; ratio axis 0.0 to 6.5.]

Figure 5.5.2 (a) A schematic representation of the region of interest on the 8p arm, showing the location of BACs used in this study with corresponding BAC identification numbers. BACs located on the right side of the schematic are those used in BAC array CGH, while those on the left side are BACs used for FISH. (b) BAC array CGH results. For each individual case, there are three vertical lines representing the amplification scale. The vertical red line on the left represents a log2 ratio between the sample and reference channels of -0.5, the vertical purple line in the middle represents a log2 ratio of 0, and the vertical green line on the right represents a log2 ratio of +0.5. Each individual black spot represents one BAC clone and shows the amplification ratio as compared to normal DNA, with ratios to the left and right of the purple line representing losses and gains, respectively.
There is consistent amplification of the novel amplicon in all five samples. (c) A graphical representation of amplification ratios obtained from FISH for all BACs located on the left side of the schematic, grouped according to color. Graph bars from left to right: 5’ NRG1 (blue), 3’ NRG1 (tan), novel amplicon (purple), and FGFR1 (burgundy). Only the novel amplicon consistently shows increased copy number in these five cases.

[Figure 5.5.3: image not reproduced. Recoverable labels: PROSC; cases 77, 179, 303, 285, 286; log2 ratio axis from -0.5 to +0.5.]

Figure 5.5.3 Minimum common region of amplification (MAR) at 8p12 using unsmoothed array CGH. Log2 ratios of the data points obtained from the five cases are shown in multiple alignment in a SeeGH karyogram. Vertical line segments illustrate the individual BACs covering the corresponding chromosomal regions. The centerline is designated as 0; movement towards the right line (+0.5) indicates a gain in genomic material of the corresponding chromosomal region, while movement to the left (-0.5) indicates a loss. Horizontal lines specify the boundaries of the MAR. Case 77 determined the centromeric boundary and case 303 the telomeric boundary. FLJ14299 is entirely within the MAR, while SPFH2 is mostly within the boundaries. PROSC is excluded.

[Figure 5.5.4, panel a: Kaplan-Meier plot not reproduced. Recoverable axis label: Total Follow-up (Years).]

b
Group                          No. of interpretable cases   p-value
NRG1 aberrations (A, B, C)     358                          0.7695
NRG1 aberrations (B & C)       355                          0.3957
FGFR1                          238                          0.0953
Novel amplicon                 238                          0.0101*

c
Marker                         p-value    Relative risk (99% CI)
8p (1.5 AR threshold)          0.0007*    3.0 (1.3-7.0)
Lymph node status              0.0006*    2.9 (1.3-6.6)

Figure 5.5.4 (a) Kaplan-Meier survival curve demonstrating the prognostic significance of the novel amplicon. (b) Summary of univariate survival analysis for NRG1 aberrations, FGFR1, and the novel amplicon. Only the novel amplicon is prognostically significant.
(c) Multivariate analysis results using Cox’s regression in a backwards stepwise elimination using the following variables: HER2, grade, ER status, lymph node status, and the novel amplicon. Only the novel amplicon and lymph node status are of independent prognostic significance. Asterisk indicates a significant P-value.

[Figure 5.5.5: image not reproduced. Recoverable labels: NRG1, FGFR1, DS break, S-phase replication, fusion of broken ends, bridge, mitosis (M-phase), repeat BFB cycle several times results in amplification.]

Figure 5.5.5 Schematic representation of the BFB mechanism. The BFB cycle is initiated by a double-stranded break of sister chromatids after S-phase replication. The breakpoint within the NRG1 gene is represented by a broken line. Broken ends become fused and form a chromosomal bridge during mitosis with spindle attachment to the centromeres. The bridged chromosome breaks during anaphase, leaving duplicate copies of the novel amplicon and FGFR1 on the same chromosome. The cycle repeats with a second breakpoint that creates specific amplification of the novel amplicon as compared to FGFR1 or NRG1.

5.5. References

1. Makretsov N, He M, Hayes M, Chia S, Horsman DE, Sorensen PH, Huntsman DG: A fluorescence in situ hybridization study of ETV6-NTRK3 fusion gene in secretory breast carcinoma. Genes Chromosomes Cancer 2004, 40(2):152-157.
2. Tognon C, Knezevich SR, Huntsman D, Roskelley CD, Melnyk N, Mathers JA, Becker L, Carneiro F, MacPherson N, Horsman D et al: Expression of the ETV6-NTRK3 gene fusion as a primary event in human secretory breast carcinoma. Cancer Cell 2002, 2(5):367-376.
3. Adelaide J, Huang HE, Murati A, Alsop AE, Orsetti B, Mozziconacci MJ, Popovici C, Ginestier C, Letessier A, Basset C et al: A recurrent chromosome translocation breakpoint in breast and pancreatic cancer cell lines targets the neuregulin/NRG1 gene. Genes Chromosomes Cancer 2003, 37(4):333-345.
4. Bernardino J, Apiou F, Gerbault-Seureau M, Malfoy B, Dutrillaux B: Characterization of recurrent homogeneously staining regions in 72 breast carcinomas. Genes Chromosomes Cancer 1998, 23(2):100-108.
5. Birnbaum D, Adelaide J, Popovici C, Charafe-Jauffret E, Mozziconacci MJ, Chaffanet M: Chromosome arm 8p and cancer: a fragile hypothesis. Lancet Oncol 2003, 4(10):639-642.
6. Liu X, Baker E, Eyre HJ, Sutherland GR, Zhou M: Gamma-heregulin: a fusion gene of DOC-4 and neuregulin-1 derived from a chromosome translocation. Oncogene 1999, 18(50):7110-7114.
7. Ray ME, Yang ZQ, Albertson D, Kleer CG, Washburn JG, Macoska JA, Ethier SP: Genomic and expression analysis of the 8p11-12 amplicon in human breast cancer cell lines. Cancer Res 2004, 64(1):40-47.
8. Ugolini F, Adelaide J, Charafe-Jauffret E, Nguyen C, Jacquemier J, Jordan B, Birnbaum D, Pebusque MJ: Differential expression assay of chromosome arm 8p genes identifies Frizzled-related (FRP1/FRZB) and Fibroblast Growth Factor Receptor 1 (FGFR1) as candidate breast cancer genes. Oncogene 1999, 18(10):1903-1910.
9. Falls DL: Neuregulins: functions, forms, and signaling strategies. Exp Cell Res 2003, 284(1):14-30.
10. Yarden Y, Sliwkowski MX: Untangling the ErbB signalling network. Nat Rev Mol Cell Biol 2001, 2(2):127-137.
11. Wiseman BS, Werb Z: Stromal effects on mammary gland development and breast cancer. Science 2002, 296:1046-1049.
12. Atlas E, Cardillo M, Mehmi I, Zahedkargaran H, Tang C, Lupu R: Heregulin is sufficient for the promotion of tumorigenicity and metastasis of breast cancer cells in vivo. Mol Cancer Res 2003, 1(3):165-175.
13. Wang XZ, Jolicoeur EM, Conte N, Chaffanet M, Zhang Y, Mozziconacci MJ, Feiner H, Birnbaum D, Pebusque MJ, Ron D: Gamma-heregulin is the product of a chromosomal translocation fusing the DOC4 and HGL/NRG1 genes in the MDA-MB-175 breast cancer cell line. Oncogene 1999, 18(41):5718-5721.
14. Schaefer G, Fitzpatrick VD, Sliwkowski MX: Gamma-heregulin: a novel heregulin isoform that is an autocrine growth factor for the human breast cancer cell line, MDA-MB-175. Oncogene 1997, 15(12):1385-1394.
15. Huang HE, Chin SF, Ginestier C, Bardou VJ, Adelaide J, Iyer NG, Garcia MJ, Pole JC, Callagy GM, Hewitt SM et al: A recurrent chromosome breakpoint in breast cancer at the NRG1/neuregulin 1/heregulin gene. Cancer Res 2004, 64(19):6840-6844.
16. Makretsov N, Gilks CB, Coldman AJ, Hayes M, Huntsman D: Tissue microarray analysis of neuroendocrine differentiation and its prognostic significance in breast cancer. Hum Pathol 2003, 34(10):1001-1008.
17. Parker RL, Huntsman DG, Lesack DW, Cupples JB, Grant DR, Akbari M, Gilks CB: Assessment of interlaboratory variation in the immunohistochemical determination of estrogen receptor status using a breast cancer tissue microarray. Am J Clin Pathol 2002, 117(5):723-728.
18. Chomczynski P, Sacchi N: Single-step method of RNA isolation by acid guanidinium thiocyanate-phenol-chloroform extraction. Anal Biochem 1987, 162(1):156-159.
19. Wiseman SM, Makretsov N, Nielsen TO, Gilks B, Yorida E, Cheang M, Turbin D, Gelmon K, Huntsman DG: Coexpression of the type 1 growth factor receptor family members HER-1, HER-2, and HER-3 has a synergistic negative prognostic effect on breast carcinoma survival. Cancer 2005, 103(9):1770-1777.
20. Liu CL, Prapong W, Natkunam Y, Alizadeh A, Montgomery K, Gilks CB, van de Rijn M: Software tools for high-throughput analysis and archiving of immunohistochemistry staining data obtained with tissue microarrays. Am J Pathol 2002, 161(5):1557-1565.
21. Ishkanian AS, Malloff CA, Watson SK, DeLeeuw RJ, Chi B, Coe BP, Snijders A, Albertson DG, Pinkel D, Marra MA et al: A tiling resolution DNA microarray with complete coverage of the human genome. Nat Genet 2004.
22. de Leeuw RJ, Davies JJ, Rosenwald A, Bebb G, Gascoyne RD, Dyer MJ, Staudt LM, Martinez-Climent JA, Lam WL: Comprehensive whole genome array CGH profiling of mantle cell lymphoma model genomes. Hum Mol Genet 2004, 13(17):1827-1837.
23. Jong K, Marchiori E, Meijer G, Vaart AV, Ylstra B: Breakpoint identification and smoothing of array comparative genomic hybridization data. Bioinformatics 2004, 20(18):3636-3637.
24. Chi B, deLeeuw RJ, Coe BP, MacAulay C, Lam WL: SeeGH: a software tool for visualization of whole genome array comparative genomic hybridization data. BMC Bioinformatics 2004, 5(13).
25. Watson SK, deLeeuw RJ, Ishkanian AS, Malloff CA, Lam WL: Methods for high throughput validation of amplified fragment pools of BAC DNA for constructing high resolution CGH arrays. BMC Genomics 2004, 5(1):6.
26. Courtay-Cahen C, Morris JS, Edwards PA: Chromosome translocations in breast cancer with breakpoints at 8p12. Genomics 2000, 66(1):15-25.
27. Jacobson KK, Morrison LE, Henderson BT, Blondin BA, Wilber KA, Legator MS, O’Hare A, Van Stedum SC, Proffitt JH, Seelig SA et al: Gene copy mapping of the ERBB2/TOP2A region in breast cancer. Genes Chromosomes Cancer 2004, 40(1):19-31.
28. Coquelle A, Pipiras E, Toledo F, Buttin G, Debatisse M: Expression of fragile sites triggers intrachromosomal mammalian gene amplification and sets boundaries to early amplicons. Cell 1997, 89(2):215-225.
29. Rummukainen J, Kytola S, Karhu R, Farnebo F, Larsson C, Isola JJ: Aberrations of chromosome 8 in 16 breast cancer cell lines by comparative genomic hybridization, fluorescence in situ hybridization, and spectral karyotyping. Cancer Genet Cytogenet 2001, 126(1):1-7.
30. Sprenger RR, Speijer D, Back JW, De Koster CG, Pannekoek H, Horrevoets AJ: Comparative proteomics of human endothelial cell caveolae and rafts using two-dimensional gel electrophoresis and mass spectrometry. Electrophoresis 2004, 25(1):156-172.
Nagy P, Vereb G, Sebestyen Z, Horvath G, Lockett SJ, Damjanovich S, Park JW, Jovin TM, Szollosi J: Lipid rafts and the local density of ErbB proteins influence the biological role of homo- and heteroassociations of ErbB2. J Cell Sci 2002, 115(Pt 22):4251-4262.
Powers CJ, McLeskey SW, Wellstein A: Fibroblast growth factors, their receptors and signaling. Endocr Relat Cancer 2000, 7(3):165-197.
Bautista S, Theillet C: CCND1 and FGFR1 coamplification results in the colocalization of 11q13 and 8p12 sequences in breast tumor nuclei. Genes Chromosomes Cancer 1998, 22(4):268-277.
Cuny M, Kramar A, Courjal F, Johannsdottir V, Iacopetta B, Fontaine H, Grenier J, Culine S, Theillet C: Relating genotype and phenotype in breast cancer: an analysis of the prognostic significance of amplification at eight different genes or loci and of p53 mutations. Cancer Res 2000, 60(4):1077-1083.
Koziczak M, Holbro T, Hynes NE: Blocking of FGFR signaling inhibits breast cancer cell proliferation through downregulation of D-type cyclins. Oncogene 2004, 23(20):3501-3508.

6. GENOMIC ALTERATIONS IN LOBULAR NEOPLASIA: A MICROARRAY COMPARATIVE GENOMIC HYBRIDIZATION SIGNATURE FOR EARLY NEOPLASTIC PROLIFERATION IN THE BREAST

6.1. Introduction

Neoplastic breast lesions represent phenotypic consequences of the accumulation of genetic events. The growth of a neoplastic lesion would suggest that the first events attributable to cancer development and progression might be found in the cells that comprise hyperplasia and in situ carcinoma. Studying such lesions provides the opportunity for identification of early genetic events that promote breast cancer progression.

Lobular neoplasia includes lesions of atypical lobular hyperplasia (ALH) and lobular carcinoma in situ (LCIS) [1-3]. Studies have suggested that lobular neoplastic lesions, when adjacent to an invasive carcinoma, possess some of the characteristics of precursors [4, 5].
However, lobular neoplasia without associated invasive carcinoma has not been sufficiently investigated and therefore the label of precursor lesion may be premature. Epidemiological studies confirm these lesions as risk factors in the development of invasive carcinoma [6-8]; however, there have been few molecular genetic studies because of the size and rarity of these lesions. Furthermore, solitary lobular neoplastic breast lesions are difficult to collect prospectively since they are not efficiently detected by modern imaging capabilities. Thus, archival tissue offers an opportunity to study these lesions.

Conventional chromosomal comparative genomic hybridization (CGH) demonstrated that alterations occur at chromosomes 6, 16, 17 and 22 in lobular neoplastic lesions, with certain cases developing subsequent invasive breast cancer [4]. Based on the frequency of alterations, Lu et al. [4] concluded that ALH and LCIS may represent similar genetic stages of progression. Two studies analyzed LCIS with adjacent invasive carcinoma using microarray-CGH. Hwang et al. [9] evaluated 24 synchronous in situ and invasive lobular lesions and found alterations including 16q loss, 1q gain, 11q14-11qter loss, and 11q11-q13 loss. Moreover, Nyante et al. [10] published a case report that evaluated a patient with three synchronous lesions including ductal carcinoma in situ (DCIS), LCIS and invasive lobular carcinoma (ILC); two different genomic signatures were detected. The LCIS and ILC lesions were found to harbor chromosomal loss at 1p, 16q and 17p/q whereas the DCIS lesion showed loss at 6q and 11q, and gain at 10p, 11q, and 15q.

A version of this chapter has been published: Mastracci T L, Shadeo A, Colby S M, Tuck A B, O’Malley F P, Bull B B, Lam W L, Andrulis I L. (2006) Genomic Alterations in Neoplastic Lesions of the Breast. Genes Chromosomes and Cancer. Nov;45(11):1007-17.
Advances in microarray-CGH technology [11] have facilitated high-resolution examination of tumor genomes. In this study we used the recently developed tiling path resolution BAC-array technology [12] in the analysis of archival solitary ALH and LCIS tissue at the whole genome level, in order to identify early events in the development of lobular breast cancer.

6.2. Materials and Methods

6.2.1. Lobular Neoplastic Cases

Twenty-one formalin-fixed, paraffin-embedded archived cases were studied, including 12 ALH lesions (A3, A4, A7-A9, A11-A14, A18, A21, A25) and 13 LCIS lesions (L1, L2, L3, L5-L13, L28). Four of these cases contained paired ALH and LCIS lesions (A3/L3, A4/L5, A11/L12, A14/L13), while only two cases (A14/L13, L28) were subsequently found to contain not only lobular neoplastic lesions but also an adjacent focus of invasive carcinoma (the invasive lesions were not studied). All cases were accrued through the Mount Sinai Hospital Pathology Department and were either in-house cases accessioned at Mount Sinai Hospital (1988-2003) or were received as consultation cases (submitted to FPO’M). Hematoxylin and eosin stained sections from the cases were reviewed by the study pathologists (FPO’M and ABT) using previously described histological characteristics [2] to classify the lobular lesions (Table 6.1). The majority of these cases were previously studied by our group for E-cadherin, beta-catenin, alpha-catenin and p120-catenin protein expression as well as for mutations in the E-cadherin gene (CDH1) and loss of heterozygosity (LOH) at 16q [13]. Previously described microdissection and DNA extraction methods [13] were used to obtain DNA from each lesion. In addition, regions of adjacent normal breast tissue were identified by a pathologist (FPO’M) in four of the cases in the collection (A8, A13, L8, L9).
Similar to the lesions, these regions were microdissected, and the DNA extracted and assessed by microarray-CGH for chromosomal gains and losses.

6.2.2. Microarray Comparative Genomic Hybridization

DNA template was prepared for submegabase resolution tiling array (SMRT) CGH hybridization as previously described [12, 14-16]. Briefly, test DNA (ALH or LCIS) and reference DNA (female genomic DNA; Novagen) were labeled with Cy3- and Cy5-labelled dCTP, respectively. The amount of test and reference DNA labeled ranged from 120 ng to 300 ng depending on the amount of available starting material. Labeled DNA probe was hybridized to the microarray and incubated at 45°C for approximately 36-40 hours. Following hybridization, the microarrays were washed to remove any excess probe solution and dried by centrifugation.

6.2.3. Imaging and Data Visualization

Each microarray was scanned using a charge-coupled device based imaging system (Arrayworx eAuto, Applied Precision) and analyzed using SoftWoRx Tracker Spot Analysis software. The raw data were exported from this program and subsequently processed using SeeGH software [17]. To view the data using SeeGH, the raw data were stringently filtered to remove all clones with a standard deviation greater than 0.075 and a signal intensity to background intensity noise ratio exceeding fifteen. Areas of gain or loss were visualized chromosome by chromosome on log ratio plots. Regions of alteration were noted where overlapping clones showed a copy number ratio >1.5 (gain) or <0.5 (loss) consistently across multiple cases.

6.2.4. Microarray-CGH Data Pre-processing and Normalization

For each microarray, the raw log2 copy number ratios exported from the image analysis software were normalized by subarray-specific median subtraction [18]. Clones were spotted in duplicate on the microarray and these spot-pairs were averaged. Clones mapping to the X or Y chromosome, as well as clones without a genomic position, were excluded.
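For illustration only, the pre-processing steps of Section 6.2.4 (subarray-specific median subtraction, averaging of duplicate spots, exclusion of X/Y and unmapped clones, ordering by map position) could be sketched in Python as below. This is not the code used for the analyses in this chapter; the record layout (keys clone, subarray, chrom, pos, log2ratio) and the function name preprocess are assumptions made for the example.

```python
from collections import defaultdict
from statistics import median


def preprocess(spots):
    """Normalize, average and order spot-level log2 ratios.

    spots: list of dicts with the (hypothetical) keys
           'clone', 'subarray', 'chrom', 'pos', 'log2ratio'.
    Returns a list of (chrom, pos, clone, mean_log2ratio) tuples.
    """
    # Subarray-specific median subtraction: one median per subarray.
    by_subarray = defaultdict(list)
    for s in spots:
        by_subarray[s["subarray"]].append(s["log2ratio"])
    centre = {k: median(v) for k, v in by_subarray.items()}

    # Collect the normalized values of the duplicate spots of each clone.
    per_clone = defaultdict(list)
    location = {}
    for s in spots:
        per_clone[s["clone"]].append(s["log2ratio"] - centre[s["subarray"]])
        location[s["clone"]] = (s["chrom"], s["pos"])

    out = []
    for clone, values in per_clone.items():
        chrom, pos = location[clone]
        # Exclude X/Y clones and clones without a genomic position.
        if chrom is None or pos is None or chrom in ("X", "Y"):
            continue
        out.append((chrom, pos, clone, sum(values) / len(values)))

    # Order by chromosome number, then by position on the chromosome.
    out.sort(key=lambda r: (int(r[0]), r[1]))
    return out
```

Averaging after normalization, rather than before, is a choice made here so that duplicate spots printed on different subarrays would each be corrected by their own subarray median; the source does not state the order used.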
The log2 copy number ratios were then ordered by their genomic map positions [17].

6.2.5. Statistical Tabulation of Gains and Losses

Each microarray was analyzed chromosome by chromosome, using a two-step procedure. First, the log2 ratios were partitioned into regions of equal copy number. Second, each region was defined as either a region of gain, loss, or normal copy number. Region delineation was automated using the Olshen et al. [19] circular binary segmentation algorithm implemented in the Bioconductor [20] package DNAcopy for the analysis of microarray-CGH data. Circular binary segmentation is a recursive change-point method; the statistical significance of the change-points is assessed by permutation. As recommended in the software documentation, outliers were smoothed prior to segmentation (default settings used). Default settings were also used for segmentation, except a more stringent cut-off of α=0.001 was chosen for the significance level. In addition, change-points detected due to local trends in the data were removed using sum of squares criteria (Software documentation: http://www.mskcc.org/mskcc/shared/graphics/epidemiology/AdamOlshen/DNAcopy.pdf). Following segmentation, each identified region was evaluated for chromosomal gain or loss. Specifically, we defined a gain (loss) as a region in which (a) median copy number ratio was >2 (<0.5), or, (b) one-sided t-test H0: μ=1 versus H1: μ>1 (H1: μ<1) was significant at p<0.00005, and >80% of the clones had a copy number ratio >1 (<1). These criteria were determined by calibrating the procedure against known alterations in the breast cancer cell line BT474 [21]. All analyses were carried out in R, a language and environment for statistical computing [22] (http://www.R-project.org).

After regions of chromosomal gain and loss were identified for each lesion, the alterations were tabulated within and across the lesions.
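The gain and loss criteria defined in this section can be written compactly. The following Python fragment is a minimal sketch of that decision rule, not the original R code; the function name call_region and its argument names are hypothetical, and the one-sided t-test p-values are assumed to have been computed elsewhere and are simply passed in.

```python
from statistics import median


def call_region(ratios, p_gain, p_loss, p_cutoff=0.00005, frac_cutoff=0.80):
    """Classify one segmented region as 'gain', 'loss' or 'normal'.

    ratios : copy number ratios (not log2) of the clones in the region
    p_gain : one-sided t-test p-value for mean ratio > 1 (assumed precomputed)
    p_loss : one-sided t-test p-value for mean ratio < 1 (assumed precomputed)
    """
    n = len(ratios)
    med = median(ratios)
    frac_above = sum(r > 1 for r in ratios) / n
    frac_below = sum(r < 1 for r in ratios) / n

    # Criterion (a): extreme median ratio; criterion (b): significant
    # t-test together with more than 80% of clones on the same side of 1.
    if med > 2 or (p_gain < p_cutoff and frac_above > frac_cutoff):
        return "gain"
    if med < 0.5 or (p_loss < p_cutoff and frac_below > frac_cutoff):
        return "loss"
    return "normal"
```

Keeping the t-test outside the function is a design choice for the sketch: it lets the rule be checked independently of whichever statistical library supplies the p-values.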
To facilitate this tabulation, we defined a minimum common region for any set of regions sharing at least one clone. That is, cases with a given alteration were aligned by clone and the smallest contiguous subset of clones was determined. For each such region, a two-sided Fisher’s exact test was used to compare the frequency counts in the ALH and LCIS lesions. In addition, to test for differences between the ALH and LCIS groups in the mean number of alterations, we calculated exact permutation linear rank test statistics for the number of gains, the number of losses, and the number of alterations detected in each lesion, using the observed number (of gains, losses, or alterations) as the weighting score (StatXact 5 Software). Independent of the ALH and LCIS cases, the data from the four adjacent normals were tabulated for gains and losses (data not shown). The regions identified as altered in the adjacent normal were compared with those found for ALH and LCIS.

6.2.6. ALH and LCIS Class Discrimination

To investigate the ability of this genomic copy number region signature to discriminate between ALH and LCIS lesions, a supervised analysis of differential copy number was performed using the mean log2 copy number ratios of the clones within each of the 33 regions deemed relevant to lobular neoplasia. The random variance model implemented in the software BRB-ArrayTools (developed by Dr. Richard Simon and Amy Peng Lam) was used for all univariate tests [23]. For visual display of the lesion-specific copy number signatures, signed log10-transformed p-values from a two-sided t-test comparing the mean log2 ratios of the clones within each of the 33 regions of alteration to zero were used to cluster regions of gain and loss as well as to cluster lesions. Multivariate class prediction models for ALH versus LCIS were developed and evaluated using regional mean log2 ratios for all 33 regions, including diagonal linear discriminant analysis and support vector machines.
Misclassification rates were estimated using leave-one-out cross validation. Global p-values for testing whether the estimated misclassification rates were lower than would be expected by chance were obtained by permutation (2,000 permutations). BRB-ArrayTools software was used for all class prediction analyses [23].

6.2.7. Real-Time PCR

A probe/primer set was used specific for a gene of interest within an identified region of gain (selected for validation based on probe availability). The gene-specific probe/primer sets (Assay mix) used for validation include COPEB (10p15.2-p15.1), HOXB6 (17q21.32), and CEBPB (20q13.13). PCR was performed according to the manufacturer’s instructions and contained 2X TaqMan Universal PCR Master Mix (Applied Biosystems), 20X Assay mix (Assay-on-Demand, Applied Biosystems), and 10-20 ng of microdissected DNA template aliquoted into 96-well optical plates (MicroAmp, Applied Biosystems). Real-Time PCR was executed using the ABI Prism 7900HT (Applied Biosystems) and the resulting data were analyzed with SDS 2.0 software (Applied Biosystems).

For each case, DNA was also extracted from adjacent areas of normal tissue and included as a control for the Real-Time PCR. As well, DNA from the cell line BT474 was used as a control. For all reactions, female genomic DNA was used to construct the standard curve and an Assay mix specific to the asparagine synthetase gene was used as the control. PCR reactions were completed in duplicate for each probe. Copy number gain was calculated from the Real-Time PCR raw data using the ratio of the quantity mean for the gene of interest versus the quantity mean for the control gene. By Real-Time PCR, a mean ratio greater than 1.5 was considered to be a copy number gain.

6.2.8. Fluorescence in situ Hybridization (FISH)

Sections (4 µm) were cut from the archival tissue blocks, placed on glass slides, and incubated at 60°C overnight.
Each section was then deparaffinized using xylene and graded ethanol, and pretreated in citrate buffer. Following two washes in SSC, the slides were protease treated in a pepsin/HCl solution. The probe mixture for hybridization contained an equal amount of each FISH probe (test and control) and Cot-1 DNA, combined in hybridization buffer (Vysis). This solution was denatured and applied to the prepared sections. In the HyBrite system (Vysis), slides were denatured and then incubated at 37°C for 16-18 hours. Following hybridization, slides were washed, counterstained with DAPI, and then visualized using a triple filter fluorescent microscope. Probe hybridization was observed in greater than 80% of the lobular neoplastic cells and an average of 100 non-overlapping nuclei was scored for each case. Images were captured using Cytovision (Applied Imaging, San Jose, California). A BAC clone of interest (selected for validation based on size, with a minimum DNA insert of 150kb) was chosen from each region of loss to be validated (test probe). Control probes were chosen to match the chromosome containing the loss and were located on the same chromosome as the gene of interest. The test and control probes were labeled with spectrum orange and spectrum green, respectively (Hospital for Sick Children, Toronto, Canada; http://www.tcag.ca/).

6.3. Results

6.3.1. Regions of Gain and Loss in ALH and LCIS

In total, 25 lobular neoplastic lesions (Table 6.1) including 12 ALH lesions and 13 LCIS lesions were arrayed using the SMRT-array platform [12]. Within the ALH group, an average of 13.8 regions of copy number loss and 13.7 regions of copy number gain were identified per lesion. In contrast, the LCIS group carried an average of 5.8 losses and 4.5 gains per lesion (exact two-sided p-values of 0.02 and 0.03 respectively for differences between the two groups).
For all lobular neoplastic lesions combined, the statistical analysis revealed 97 regions of gain and 68 regions of loss. Based on location and frequency of occurrence, a subset of 33 alterations was chosen to define the genomic signature of lobular neoplasia. Table 6.2 summarizes the regions of gain and loss deemed relevant to lobular neoplasia, the locations (including BAC clone boundaries) and the affected lesions. As described earlier, regions were identified statistically using the circular binary segmentation algorithm as well as visually using the SeeGH software. An example of the pictorial output from these programs demonstrating a region of chromosomal gain is illustrated in Figure 6.1.

For several regions of alteration, the frequency of occurrence differed significantly (p<0.05) between the ALH and LCIS lesions (Table 6.2). Alterations more common to ALH include a gain at 2p11.2 (6/12 ALH, 0/13 LCIS; p=0.0052) and losses at 7p11.2-p11.1 (10/12 ALH, 3/13 LCIS; p=0.0048) and 22q11.1 (6/12 ALH, 1/13 LCIS; p=0.03). Alterations more common to LCIS include a gain at 20q13.13 (2/12 ALH, 8/13 LCIS; p=0.04) and a loss at 19q13.2-q13.31 (2/12 ALH, 8/13 LCIS; p=0.04). While one in 20 tests is expected to be significant at the 5% level by chance alone, corresponding to 1.65 comparisons out of 33, we observed five such regions. As previously reported by other groups, we observed a loss of the chromosomal region of 16q (Figure 6.2), specifically the region of 16q21-q23.1 within which the E-cadherin gene (CDH1) is located, in both ALH and LCIS in equal frequency.

The signed log10-transformed p-values from a two-sided t-test comparing the regional mean log2 ratio to zero in each of the 33 regions of alteration for all the lesions are illustrated in the heatmap with accompanying cluster dendrograms (Figure 6.3).
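The ALH versus LCIS frequency comparisons reported in this section rest on the two-sided Fisher’s exact test named in Section 6.2.5. The sketch below, in Python, shows one way such a p-value can be computed from a 2x2 table of lesion counts. It is an illustration, not the software used in the study; the function name is hypothetical, and the two-sided p-value is assumed to be the sum of the probabilities of all tables that are no more probable than the observed one, which is a common convention that the source does not explicitly confirm.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are the two groups (e.g. ALH, LCIS); columns are
    altered / not altered lesion counts.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k altered lesions in group 1.
        return comb(row1, k) * comb(row2, col1 - k) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables at least as extreme (no more probable) as observed;
    # the small tolerance guards against floating-point ties.
    return sum(prob(k) for k in range(lo, hi + 1)
               if prob(k) <= p_obs * (1 + 1e-9))


# Gain at 2p11.2: 6 of 12 ALH and 0 of 13 LCIS lesions affected.
print(round(fisher_exact_two_sided(6, 6, 0, 13), 4))   # 0.0052
# Loss at 7p11.2-p11.1: 10 of 12 ALH and 3 of 13 LCIS lesions affected.
print(round(fisher_exact_two_sided(10, 2, 3, 10), 4))  # 0.0048
```

Under this convention the sketch reproduces the p-values quoted above for the 2p11.2 gain and the 7p11.2-p11.1 loss.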
Based on the lesion-specific regional mean log2 copy number ratios for the 33 regions, the diagonal linear discriminant analysis and support vector machine class predictors each performed the best of the class predictors implemented in BRB-ArrayTools [23]. Each had a leave-one-out cross-validated misclassification rate of 32% with global p-values 0.078 and 0.046 respectively, indicating modest evidence to discriminate between classes. Although few such cases were available, the copy number profiles outlined in Table 6.2 for the cases containing paired ALH and LCIS lesions (A3/L3, A4/L5, A11/L12, A14/L13) show no one region as common to all four pairs. Furthermore, Figure 6.3 illustrates that, with the exception of case A3/L3, the genomic signatures for individual lesions of each pair do not show great similarity.

DNA extracted from regions of adjacent normal tissue in four cases (A8, A13, L8, L9) was assessed for chromosomal gains and losses and the microarray-CGH data were processed by the same statistical method as applied to the data from the lesions. Consistent with recent reports of large-scale normal variation [24, 25], we detected regions of alteration in the DNA from adjacent normal tissue (data not shown); however, none of the areas documented as common to ALH or LCIS were found to be altered in the corresponding adjacent normal.

6.3.2. Validation of Microarray-CGH Alterations

Using Real-Time PCR, we validated three regions of gain (10p15.2, COPEB; 17q21.32, HOXB6; 20q13.13, CEBPB) using gene-specific probe/primer sets. All 25 lesions studied by microarray-CGH were examined by Real-Time PCR. For the alteration at 10p15.2-p15.1, six of seven lesions identified with a gain by microarray-CGH were confirmed as positive for the alteration. Similar results were obtained for the copy number gains at 17q21.32 and 20q13.13 (Table 6.3).
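The Real-Time PCR gain call defined in Section 6.2.7 (ratio of the mean quantity for the gene of interest to the mean quantity for the control gene, with a ratio above 1.5 taken as a gain) amounts to the small calculation sketched below in Python. The function names and the quantities in the example are made up for illustration and are not data from this study.

```python
def copy_number_ratio(gene_quantities, control_quantities):
    """Mean quantity of the gene of interest over mean quantity of the control gene."""
    gene_mean = sum(gene_quantities) / len(gene_quantities)
    control_mean = sum(control_quantities) / len(control_quantities)
    return gene_mean / control_mean


def is_copy_number_gain(ratio, threshold=1.5):
    """A mean ratio greater than the threshold (1.5 in Section 6.2.7) is called a gain."""
    return ratio > threshold


# Hypothetical duplicate quantities read off the standard curve.
ratio = copy_number_ratio([3.0, 3.4], [1.5, 1.7])
print(round(ratio, 2), is_copy_number_gain(ratio))
```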
Real-Time PCR was also used to assess DNA from the normal regions adjacent to each lobular neoplastic lesion evaluated by microarray-CGH and all non-lesion areas were found to have normal copy number (data not shown).

FISH was used to confirm regions of loss at 14q32.33 and 19p13.11-p12. Optimal signal and lack of cross hybridization were verified for each set of probes using both normal metaphase spreads and a formalin-fixed, paraffin-embedded cell line MDA-MB-231 (data not shown). Probe hybridization was observed in greater than 80% of the lobular neoplastic cells. Two cases examined for the copy number loss detected at 14q32.33 by microarray-CGH were validated by FISH (Figure 6.4). In addition, the loss observed at 19p13.11-p12 was confirmed in the one case studied.

6.4. Discussion

This study represents the first whole genome investigation of early breast lesions using clinical archival specimens. We have examined solitary lobular neoplasia by microarray-CGH and have defined a global genomic signature (applicable to both ALH and LCIS), which includes several novel genomic events. Our validation of a number of regions of gain and loss, by Real-Time PCR and FISH, lends support to the genomic alterations we have described for lobular neoplasia.

Histologically there is a gradation in the extent of neoplastic involvement from ALH to LCIS, which is also reflected in the relative risk to the patient in the development of invasive disease. However, this trend does not follow for the frequency of genomic alterations. A greater number of alterations were observed in the ALH lesions compared to the LCIS lesions. For our group of LCIS, the frequency of copy number alterations is similar to what has been previously reported in microarray-CGH and chromosomal-CGH studies of synchronous lobular in situ and invasive carcinoma [4, 9, 10, 26-28].
However, the frequency of alteration in the ALH lesions is far greater than previously reported for any type of lobular lesion as well as greater than the frequency of alterations found in adjacent normal DNA. This would suggest that ALH is far more amenable to sporadic genetic alteration, which could explain why these lesions have a decreased likelihood for progression to invasive disease.

The observation that LCIS lesions carry fewer genomic alterations compared to ALH lesions also points to these in situ lesions having experienced genomic selection and the likelihood that the alterations found in LCIS may have functional impact. In particular, the region of gain at 20q13.13 was identified as more common to LCIS and is proximal to a region of gain commonly reported in breast cancer [29]. The region we have identified contains only two known genes; CCAAT/enhancer binding protein beta (CEBPB), one of the two genes in this altered region, was previously reported to play a role in mammary gland development. The balance of the different isoforms of CEBPB function to control both proliferation and differentiation (reviewed in [30]). The specific expression of CEBPB-2 has been reported in both normal mammary and neoplastic cells and was shown to have the ability to activate genes that will cause these cells to divide [31]. Further studies are required to conclusively determine if CEBPB plays a functional role in the development of LCIS.

There have been few genomic studies investigating lobular neoplasia and therefore only a handful of genomic alterations have been reported for these lesions. One of the most frequent genomic events found in lobular carcinoma is loss of the chromosome region 16q, specifically the area containing CDH1. We have identified this region of loss (16q21-q23.1) in both ALH and LCIS in equal frequency but not in the DNA from adjacent normal.
Given the frequency of this alteration in invasive breast cancer, the presence of a loss of 16q in early lobular lesions points to the possible progression towards an invasive phenotype in these cases.

Altered regions previously reported in studies of synchronous in situ lobular and invasive breast cancer [9, 10] have described the chromosomal areas that contribute to and maintain the invasive phenotype i.e. loss at 1p, 11q, 16q, 17p/q, and gain at 1q. In addition, loss at 16q and gain at 1q were specifically identified as highly prevalent in cases containing synchronous LCIS and ILC [9]. With the exception of loss at 16q, these particular regions were not frequently altered in our solitary lobular neoplastic lesions. As only a fraction of lobular neoplastic lesions progress to invasive disease, we speculate that the regions we identified that were originally reported in lesions with adjacent invasive carcinoma may harbor genes that are important to the progression of lobular lesions.

Further evidence of this involves the regions of alteration found by Lu et al. [4] in ALH and LCIS with associated invasive carcinoma using chromosomal-CGH. These alterations were also detected in our cases and can be refined to the specific regions of 16p11.2-p11.1 (loss), 16q21-q23.1 (loss), and 22q11.1 (loss). Given these alterations were originally reported in lobular neoplastic lesions with adjacent invasive carcinoma, it is possible these regions identify those solitary lobular neoplastic lesions more likely to progress.

The majority of alterations we have identified occur in both ALH and LCIS. Alterations more common to either ALH or LCIS were identified, but even these regions did not occur exclusively in one group. The statistical class prediction analysis of ALH versus LCIS using copy number signatures was equivocal, although the power of this assessment was likely limited by the sample size available.
Therefore, although described histologically as distinct entities, we speculate that the events responsible for lobular neoplastic development can be found in both ALH and LCIS.

Altered developmental pathways have been previously implicated in the tumorigenesis of many cancers and we have identified a number of regions of alteration in lobular neoplastic lesions that contain genes involved in development. These regions include 1p32.1 (covering 15 clones including JUN), 16p13.3 (covering 25 clones including AXIN1), 17q21.32 (covering four clones including almost exclusively the HOXB cluster), and 17q25.3 (covering 23 clones including RAC3). Lobular neoplastic lesions are discohesive proliferations within the lumen of mammary acini and alterations impacting on both development/differentiation and proliferation could contribute to the lobular neoplastic phenotype. As DNA copy number changes influence gene expression [32, 33], the functional effect of these particular genes on the pathogenesis of lobular neoplasia is worthy of further investigation.

In addition, alterations that affect the normal mammary acinar structure, within which lobular neoplasia initiates, could contribute to this neoplastic development. During luminal morphogenesis, the cells located in the center of the luminal space lack AKT pathway survival signals, which allows for their selective cell death, ultimately contributing to the formation of the lumen [34]. From our microarray-CGH study we observed a copy number gain in the chromosomal region of 14q32.33 in both ALH and LCIS. This alteration includes 13 clones and nine known genes including AKT1. An increase in AKT1 would provide pro-proliferative signals that contribute to the survival and proliferation of cells located within the luminal space. Moreover, we observed a copy number gain at 5q32-33.1 in both ALH and LCIS, a region covered by 23 clones and 25 known genes.
Located in this region and of interest to lobular neoplasia is the gene encoding colony stimulating factor 1 receptor (CSF-1R). CSF-1R protein expression has been reported in lactating epithelial cells whereas normal human mammary epithelium does not express the antigen [35]. More recently, constitutive CSF-1R signalling has been reported to cause uncontrolled proliferation as well as disruption of E-cadherin mediated cell adhesion in a three-dimensional in vitro acinar model [36]. Both ALH and LCIS are known to lack expression of the E-cadherin adhesion complex at the membrane [4, 13, 37] and previous work by our group has shown that this loss of E-cadherin expression in solitary lobular neoplastic lesions is not due to mutation or LOH [13]. A gain at 5q32-33.1 including CSF-1R could not only contribute to the discohesive nature and proliferation of lobular neoplastic cells but could also explain the loss of cell-cell adhesion previously observed in these early lobular lesions.

Using microarray-CGH we have completed the first whole genome investigation of early breast lesions using clinical archival specimens. The genomic signature, reported for solitary lobular neoplasia, identifies copy number alterations that have been previously reported as well as several that are novel. The alterations common to invasive breast cancer may identify those solitary lobular lesions that are likely to progress. However, those alterations not reported in studies investigating invasive disease are also of particular interest. The genomic signature we have identified as common to both ALH and LCIS suggests a role for the acquisition of these novel genomic alterations in the aberrant cellular proliferation that defines lobular neoplasia.
Table 6.1 Pathological classification of lobular neoplastic cases evaluated by microarray-CGH

Case   Lesion   Lesion description
A3     ALH      ALH paired with LCIS (L3)
A4     ALH      ALH paired with LCIS (L5)
A7     ALH      Solitary ALH
A8     ALH      Solitary ALH
A9     ALH      Solitary ALH
A11    ALH      ALH paired with LCIS (L12)
A12    ALH      Solitary ALH
A13    ALH      Solitary ALH
A14    ALH      ALH paired with LCIS (L13); adjacent invasive lesion
A18    ALH      Solitary ALH
A21    ALH      Solitary ALH
A25    ALH      Solitary ALH
L1     LCIS     Solitary LCIS
L2     LCIS     Solitary LCIS
L3     LCIS     LCIS paired with ALH (A3)
L5     LCIS     LCIS paired with ALH (A4)
L6     LCIS     Solitary LCIS
L7     LCIS     Solitary LCIS
L8     LCIS     Solitary LCIS
L9     LCIS     Solitary LCIS
L10    LCIS     Solitary LCIS
L11    LCIS     Solitary LCIS
L12    LCIS     LCIS paired with ALH (A11)
L13    LCIS     LCIS paired with ALH (A14); adjacent invasive lesion
L28    LCIS     Solitary LCIS; adjacent invasive lesion

ALH, atypical lobular hyperplasia; LCIS, lobular carcinoma in situ.

Table 6.2 Genomic signature for lobular neoplasia

[Table body not recoverable from the source text. Recoverable column headings: Chr, Cytoband, Gain/Loss, Start Clone, End Clone, Solitary ALH, Paired Lesions, Solitary LCIS.]

Clones beginning with N belong to the Roswell Park Cancer Institute library-11 (RP11) and those beginning with M belong to the Caltech-D (CTD) library. Region was identified visually using SeeGH software and found to be consistently altered across cases.

Table 6.3 Comparison of alterations found by microarray-CGH and subsequently validated by Real-Time PCR

Region         Gene    Lesions positive by   Lesions positive by microarray
                       microarray-CGH        confirmed by Real-Time PCR
10p15.2-15.1   COPEB   7/25                  6/7
17q21.32       HOXB6   4/25                  3/4
20q13.13       CEBPB   10/25                 8/10

COPEB, core promoter element binding protein; HOXB6, homeobox B6; CEBPB, CCAAT/enhancer binding protein beta.

Figure 6.1 Region of copy number gain at chromosome 10p15.2-p15.1. Genomic alterations were identified statistically using the circular binary segmentation algorithm and visually using SeeGH software. (A) Output from the circular binary segmentation algorithm (a graph plotting log2 copy number ratio vs. genomic position) illustrating regions of copy number gain (in green) on chromosome 10 from one case of lobular neoplasia. The region of gain at 10p15.2-p15.1 is denoted with an arrow and the centromere region is indicated with a hatched bar.
(B) The identical region of gain at lOp 15 .2-p 15.1 as depicted using SeeGH and demonstrating the overlapping BAC clones (area denoted with a bracket) involved in the altered region. Green bars on the right denote log2 ratios + 1 and +0.5; those on the left (red) represent log2 ratios —1 and 0.5.  —  135  Figure 6.2 Region of copy number loss at chromosome 1 6q2 1 -q23.l. Loss of 1 6q is one of the most frequent genomic events found in lobular carcinoma and we have identified this region of loss in both ALH and LCIS in equal frequency. (A) Output from the circular binary segmentation algorithm (a graph plotting log2 copy number ratio vs. genomic position) illustrating the large region of loss (in red) on chromosome 16 in one case of lobular neoplasia. (B) The common region of 1 6q2 1 -q23.1 lost in eight cases of lobular neoplasia, depicted using SeeGil and demonstrating the overlapping BAC clones involved in the altered region. Green bars on the right denote log2 ratios +1 and +0.5; those on the left (red) represent log2 ratios —l and —0.5.  136  __  ______  usIo.1s  L2 A# 14  L7  LI  I LII ‘.13 LII L ‘.11 ,$ii .%2 lID .4 L13 ‘.4  1  414  LI  S I  I 9p1 3.I-I2  I Sql 1.1-I 1.2 14q32J3  _____.J  pU1-qfl p 2 1I.l 1I. 7pI I .2-11.1  Iql2 1q21l 22q1 I I IqL3213JI 17q15i IlpIS.5  21q2.2-3 q32.fl 4 1  rJ j  3 p 6 I 1 ’ I6q2I-23I tópII2 12q24.22-24.2 I2 q J l 3 4 ll 237.2.fl3 16qt2.I I7q2IJ2 12q12.I.I 1.23 1p32 I q3243. I 2p11.2  I  L5q26J 6q21 %qlI.2 12 p 1 2 —II.  20q12 20q1313 LI 52-I 5.1 -1  -  -0  •Z  1?  0  ¶..  12  I,  •1  Figure 6.3 Heatmap with accompanying dendrograms illustrating the clustering of lobular neoplastic lesions using the identified 33 region genomic signature for lobular neoplasia. Hierarchical clustering with average linkage and centered Pearson correlation distance was used to cluster regions of gain and loss as well as to cluster lesions. 
Rows represent the regions of alteration in the genomic signature and columns represent the ALH (denoted with 'A') and LCIS (denoted with 'L') lesions. The gradient of chromosomal alterations is illustrated in the legend; orange depicts copy number loss and blue depicts copy number gain based on the signed log10-transformed p-values from two-sided t-tests comparing the mean log2 ratios of the clones within each of the 33 regions of alteration to zero. A class prediction analysis using the lesion-specific regional mean log2 copy number ratios for the 33 regions provides modest evidence of differences in the copy number signature between the classes.

Figure 6.4 Region of copy number loss at chromosome 14q32.33 validated using fluorescence in situ hybridization (FISH). (A) The region of copy number loss at 14q32.33 as depicted using SeeGH (specific area of loss denoted by a bracket). Green bars on the right denote log2 ratios +1 and +0.5; those on the left (red) represent log2 ratios -1 and -0.5. (B) Lobular neoplastic cells illustrating the loss at chromosome 14 with hybridized FISH probes for the region of interest (orange) and the control probe (green), both on chromosome 14, in a ratio of 1:2. (C) An area within the same section illustrating cells with normal copy number (cells showing two signals from each probe).

6.5. References

1. Foote F, Stewart F: Lobular carcinoma in situ: a rare form of mammary carcinoma. American Journal of Pathology 1941, 17:491-496.
2. Dupont WD, Page DL: Risk factors for breast cancer in women with proliferative breast disease. N Engl J Med 1985, 312(3):146-151.
3. Simpson PT, Reis-Filho JS, Gale T, Lakhani SR: Molecular evolution of breast cancer. J Pathol 2005, 205(2):248-254.
4. De Leeuw WJ, Berx G, Vos CB, Peterse JL, Van de Vijver MJ, Litvinov S, Van Roy F, Cornelisse CJ, Cleton-Jansen AM: Simultaneous loss of E-cadherin and catenins in invasive lobular breast cancer and lobular carcinoma in situ. J Pathol 1997, 183(4):404-411.
5. Wheeler JE, Enterline HT, Roseman JM, Tomasulo JP, McIlvaine CH, Fitts WT, Jr., Kirshenbaum J: Lobular carcinoma in situ of the breast. Long-term followup. Cancer 1974, 34(3):554-563.
6. Page DL, Schuyler PA, Dupont WD, Jensen RA, Plummer WD, Jr., Simpson JF: Atypical lobular hyperplasia as a unilateral predictor of breast cancer risk: a retrospective cohort study. Lancet 2003, 361(9352):125-129.
7. London SJ, Connolly JL, Schnitt SJ, Colditz GA: A prospective study of benign breast disease and the risk of breast cancer. Jama 1992, 267(7):941-944.
8. Page DL, Kidd TE, Jr., Dupont WD, Simpson JF, Rogers LW: Lobular neoplasia of the breast: higher risk for subsequent invasive cancer predicted by more extensive disease. Hum Pathol 1991, 22(12):1232-1239.
9. Shelley Hwang E, Nyante SJ, Yi Chen Y, Moore D, DeVries S, Korkola JE, Esserman LJ, Waldman FM: Clonality of lobular carcinoma in situ and synchronous invasive lobular carcinoma. Cancer 2004, 100(12):2562-2572.
10. Nyante SJ, Devries S, Chen YY, Hwang ES: Array-based comparative genomic hybridization of ductal carcinoma in situ and synchronous invasive lobular cancer. Hum Pathol 2004, 35(6):759-763.
11. Pinkel D, Segraves R, Sudar D, Clark S, Poole I, Kowbel D, Collins C, Kuo WL, Chen C, Zhai Y et al: High resolution analysis of DNA copy number variation using comparative genomic hybridization to microarrays. Nat Genet 1998, 20(2):207-211.
12. Ishkanian AS, Malloff CA, Watson SK, DeLeeuw RJ, Chi B, Coe BP, Snijders A, Albertson DG, Pinkel D, Marra MA et al: A tiling resolution DNA microarray with complete coverage of the human genome. Nat Genet 2004, 36(3):299-303.
13. Mastracci TL, Tjan S, Bane AL, O'Malley FP, Andrulis IL: E-cadherin alterations in atypical lobular hyperplasia and lobular carcinoma in situ of the breast. Mod Pathol 2005, 18(6):741-751.
14. Garnis C, Coe BP, Lam SL, MacAulay C, Lam WL: High-resolution array CGH increases heterogeneity tolerance in the analysis of clinical samples. Genomics 2005, 85(6):790-793.
15. Prentice LM, Shadeo A, Lestou VS, Miller MA, deLeeuw RJ, Makretsov N, Turbin D, Brown LA, Macpherson N, Yorida E et al: NRG1 gene rearrangements in clinical breast cancer: identification of an adjacent novel amplicon associated with poor prognosis. Oncogene 2005, 24(49):7281-7289.
16. Shadeo A, Lam WL: Comprehensive copy number profiles of breast cancer cell model genomes. Breast Cancer Res 2006, 8(1):R9.
17. Chi B, DeLeeuw RJ, Coe BP, MacAulay C, Lam WL: SeeGH--a software tool for visualization of whole genome array comparative genomic hybridization data. BMC Bioinformatics 2004, 5:13.
18. Yang YH, Dudoit S, Luu P, Lin DM, Peng V, Ngai J, Speed TP: Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Res 2002, 30(4):e15.
19. Olshen AB, Venkatraman ES, Lucito R, Wigler M: Circular binary segmentation for the analysis of array-based DNA copy number data. Biostatistics 2004, 5(4):557-572.
20. Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J et al: Bioconductor: open software development for computational biology and bioinformatics. Genome Biol 2004, 5(10):R80.
21. Snijders AM, Nowak N, Segraves R, Blackwood S, Brown N, Conroy J, Hamilton G, Hindle AK, Huey B, Kimura K et al: Assembly of microarrays for genome-wide measurement of DNA copy number. Nat Genet 2001, 29(3):263-264.
22. Ihaka R, Gentleman R: R: A language for data analysis and graphics. Journal of Computational and Graphical Statistics 1996, 5(3):299-314.
23. Simon R, Lam A: Technical report 028. In: BRB Array-Tools Users Guide (version 3.3). Division of Cancer Treatment and Diagnosis, Biometric Research Branch, National Cancer Institute; 2005.
24. Iafrate AJ, Feuk L, Rivera MN, Listewnik ML, Donahoe PK, Qi Y, Scherer SW, Lee C: Detection of large-scale variation in the human genome. Nat Genet 2004, 36(9):949-951.
25. Sebat J, Lakshmi B, Troge J, Alexander J, Young J, Lundin P, Maner S, Massa H, Walker M, Chi M et al: Large-scale copy number polymorphism in the human genome. Science 2004, 305(5683):525-528.
26. Buerger H, Simon R, Schafer KL, Diallo R, Littmann R, Poremba C, van Diest PJ, Dockhorn-Dworniczak B, Bocker W: Genetic relation of lobular carcinoma in situ, ductal carcinoma in situ, and associated invasive carcinoma of the breast. Mol Pathol 2000, 53(3):118-121.
27. Gunther K, Merkelbach-Bruse S, Amo-Takyi BK, Handt S, Schroder W, Tietze L: Differences in genetic alterations between primary lobular and ductal breast cancers detected by comparative genomic hybridization. J Pathol 2001, 193(1):40-47.
28. Weber-Mangal S, Sinn HP, Popp S, Klaes R, Emig R, Bentz M, Mansmann U, Bastert G, Bartram CR, Jauch A: Breast cancer in young women (< or = 35 years): Genomic aberrations detected by comparative genomic hybridization. Int J Cancer 2003, 107(4):583-592.
29. Hodgson JG, Chin K, Collins C, Gray JW: Genome amplification of chromosome 20 in breast cancer. Breast Cancer Res Treat 2003, 78(3):337-345.
30. Grimm SL, Rosen JM: The role of C/EBPbeta in mammary gland development and breast cancer. J Mammary Gland Biol Neoplasia 2003, 8(2):191-204.
31. Eaton EM, Hanlon M, Bundy L, Sealy L: Characterization of C/EBPbeta isoforms in normal versus neoplastic mammary epithelial cells. J Cell Physiol 2001, 189(1):91-105.
32. Hyman E, Kauraniemi P, Hautaniemi S, Wolf M, Mousses S, Rozenblum E, Ringner M, Sauter G, Monni O, Elkahloun A et al: Impact of DNA amplification on gene expression patterns in breast cancer. Cancer Res 2002, 62(21):6240-6245.
33. Pollack JR, Sorlie T, Perou CM, Rees CA, Jeffrey SS, Lonning PE, Tibshirani R, Botstein D, Borresen-Dale AL, Brown PO: Microarray analysis reveals a major direct role of DNA copy number alteration in the transcriptional program of human breast tumors. Proc Natl Acad Sci USA 2002, 99(20):12963-12968.
34. Debnath J, Mills KR, Collins NL, Reginato MJ, Muthuswamy SK, Brugge JS: The role of apoptosis in creating and maintaining luminal space within normal and oncogene expressing mammary acini. Cell 2002, 111(1):29-40.
35. Sapi E, Flick MB, Rodov S, Carter D, Kacinski BM: Expression of CSF-I and CSF-I receptor by normal lactating mammary epithelial cells. J Soc Gynecol Investig 1998, 5(2):94-101.
36. Wrobel CN, Debnath J, Lin E, Beausoleil S, Roussel MF, Brugge JS: Autocrine CSF-1R activation promotes Src-dependent disruption of mammary epithelial architecture. J Cell Biol 2004, 165(2):263-273.
37. Sarrio D, Perez-Mies B, Hardisson D, Moreno-Bueno G, Suarez A, Cano A, Martin-Perez J, Gamallo C, Palacios J: Cytoplasmic localization of p120ctn and E-cadherin loss characterize lobular breast carcinoma from preinvasive to metastatic lesions. Oncogene 2004, 23(19):3272-3283.

7. CONCLUSION

7.1. Discussion

The highest rates of cervical cancer are found in developing countries. Frontline monitoring has reduced these rates in developed countries and present day screening programs primarily identify precancerous lesions termed cervical intraepithelial neoplasia (CIN). Investigation of genes expressed in pre cancer lesions compared to those expressed in normal cervical epithelium will yield insight into the early stages of disease. As such, establishing a baseline for comparison is critical in elucidating the abnormal biology of disease.
Thoughtful consideration of gene expression changes paralleling progressive pre invasive neoplastic development yields insight into the key causal events involved in cervical cancer development and provides the potential for novel biomarkers for improved detection and therapeutic intervention. In this work I have accomplished Aim 1 by performing an unbiased global gene expression analysis that identified multiple genes influencing biologically relevant functions not previously implicated in cervical intraepithelial neoplasia. My results indicate that deregulation of chromatin remodelling complex components and their influencing factors occurs in the development of CIN lesions. The increase in the SWI/SNF stabilizing molecule SMARCC1 and other novel genes has not previously been described as an event in the early stages of dysplasia development, and thus provides novel candidate markers for screening and an interesting avenue for investigating the biological processes in neoplasia development. This is discussed further in section 7.1.1. Advances in genome-wide approaches have contributed to molecular classification with respect to genomic changes and their subsequent effects on gene expression. The identification of genomic alterations occurring in preneoplastic lesions provides insight into potential causal events in disease occurrence and progression. My second aim in this work was to perform high resolution global copy number analysis of pre cancer lesions of the breast to identify novel alterations. This was achieved by first establishing the methodology in breast cancer cell lines and archival specimens, which led to the identification of novel alterations and delineated alteration boundaries, and is discussed further in section 7.1.2.
Significantly, the investigation of archival breast cancer specimens yielded the discovery of a novel discrete region of amplification on 8p that was of prognostic significance independent of traditional parameters such as hormone receptor status and provides a novel target for future investigation. Evaluation of atypical lobular hyperplasia and lobular carcinoma in situ by the established methodology revealed not only novel regions of alteration but also a 33 region signature for the identification of lobular carcinoma in situ (section 7.1.3).

With the information acquired through aims one and two, I have addressed the following hypothesis: Global transcriptome and high resolution genome analysis will identify altered genes that are shared in pre cancer lesions of both the cervix and the breast. I have found that CIN III and LCIS share aberrations in the transcription factor HoxB7. These findings are discussed in section 7.1.4.

Collectively, this work demonstrates that through a whole genome approach to the assessment of genomic copy number and expression we can identify novel genes and gene networks that are targeted during the development of pre cancer lesions.

7.1.1. Global Gene Expression Analysis by SAGE in Cervical Intraepithelial Neoplasia

In chapter two, I examined the normal cervical tissue transcriptome and investigated the similarities and differences in relation to CIN III. One hundred and eighteen unique tags were highly expressed in normal cervical tissue and 107 of them mapped to unique genes, most of which belong to the ribosomal, calcium-binding and keratinizing gene families. I assessed these genes for aberrant expression in CIN III and five genes showed altered expression. In addition, twelve unique HPV 16 SAGE tags were identified in the CIN III libraries that were absent in the normal libraries. Establishing a baseline of gene expression in normal cervical tissue is key in identifying changes in cancer.
The utility of this baseline data is demonstrated through the identification of genes with aberrant expression in CIN III when compared to normal tissue. In chapter three, I identified gene expression changes across 16 cervical cases (CIN I, CIN II, CIN III and normal cervical epithelium) using the unbiased long serial analysis of gene expression (L-SAGE) method. I identified 222 genes differentially expressed between normal cervical tissue and CIN III. Many of these genes influence biological functions characteristic of cancer, such as cell death, cell growth/proliferation and cellular movement. Evaluation of these genes through network interactions identified multiple candidates that influence regulation of cellular transcription through chromatin remodelling (SMARCC1, NCOR1, MRFAP1 and MORF4L2). Chromatin remodelling is a mechanism that regulates gene transcription by altering the surrounding chromatin structure [1]. Further, these expression events are focused at the critical junction in disease development of moderate dysplasia (CIN II), indicating a role for chromatin remodelling as part of cervical cancer development.

Recent investigation of SMARCC1 in prostate cancer revealed that increased expression is associated with dedifferentiation, progression, metastasis and time to recurrence [2]. Conversely, an increased level of SMARCC1 in colorectal adenocarcinoma in combination with core-binding factor, beta subunit (CBFB) correlated with improved survival rates [3]. The work presented in chapter two describes an increase in SMARCC1 expression in CIN III. In light of these findings it is plausible that SMARCC1 plays different roles in the development of squamous cell carcinoma and adenocarcinoma. Further investigation into the contribution of SMARCC1 over expression to cervical cancer is needed.
Specifically, it would be valuable to investigate SMARCC1 expression in cervical adenocarcinoma in comparison to the findings in chapter two of this work.

Fluorouracil is a chemotherapeutic agent currently used in combination therapy for the treatment of gastrointestinal neoplasms and was investigated in the treatment of locally advanced cervical cancer [4]. In one study by the Radiation Therapy Oncology Group (trial RTOG-9001), patients showed improved survival rates when treated with cisplatin and fluorouracil in addition to radiotherapy compared with radiotherapy alone (73% versus 58% survival rates) [5]. Additional trials investigated the efficacy of fluorouracil and found that patients with locally advanced cervical cancer did not benefit from fluorouracil treatment in combination with cisplatin and radiation or in combination with radiation therapy alone [6, 7]. Further, a recent study found that increased expression of MORF4L2 predicted decreased sensitivity to the chemotherapeutic agent fluorouracil in colorectal cancer [8]. Together, these results lend support to our findings of increased expression of MORF4L2 in cervical carcinoma in situ and suggest not only a marker for fluorouracil sensitivity in cervical cancer but also a putative mechanism of resistance through chromatin remodelling which may be established at a pre cancerous stage.

A limitation of this work is that gene expression changes may be due to a variety of mechanisms including changes in gene copy number, epigenetic silencing, miRNA mediated gene silencing, upstream gene expression or protein functioning aberrations. As a result, studies of the transcriptome landscape by high throughput technologies such as SAGE capture changes in all genes, whether due to a primary causal gene change or a secondary change in response to an upstream causal event affecting another gene.
Genomic instability is characteristic of neoplastic transformation and results in copy number gain or loss at loci harbouring oncogenes or tumour suppressor genes, respectively. This motivated the investigation of genome wide copy number change in pre cancer to search for causal events in the early stages of tumourigenesis. The methodology was first established in breast cancer cell lines and archival breast cancer specimens, as described in section 7.1.2.

7.1.2. Efficacy of Whole Genome Tiling Resolution aCGH in Breast Cancer

Cell lines have provided a renewable resource that is readily used as a model system for cancer cell biology. A thorough characterization of their genomes to identify regions of segmental DNA loss (potential tumor-suppressor-containing loci) and gain (potential oncogenic loci) would greatly facilitate the interpretation of biological data derived from such cells. In chapter four, I describe the first whole-genome investigation of seven of the most commonly used breast cancer model cell lines using a whole-genome tiling path genomic DNA array. Briefly, breast cancer model cell lines MCF-7, BT-474, MDA-MB-231, T47D, SK-BR-3, UACC-893 and ZR-75-30 were investigated for genomic alterations with the submegabase-resolution tiling array comparative genomic hybridization platform, which offers tiling coverage of the human genome and break-point detection at about 80 kilobases resolution. I identified novel high-level alterations and fine-mapped previously reported regions, yielding candidate genes. Specifically, 75 high-level gains and 48 losses were observed and their respective boundaries were documented. Complex alterations involving multiple levels of change were observed on chromosome arms 1p, 8q, 9p, 11q, 15q, 17q and 20q.
The described cell lines serve as frequently used models in the investigation of the molecular biology of breast cancer and, as a result, it is important to account for the potential influence of genetic alterations inherent in these lines when interpreting experimental results. This published study provides a resource for investigators that has been utilized and cited in approximately 40 articles to date [9-18]. This information is commonly used in a comparative sense for the advancement of methodologies in cancer genome investigations. This was most recently demonstrated by Hampton et al. in a study that involved sequencing the MCF7 cell line genome using next generation sequencing technology in order to assess copy number (454 Life Sciences) and further utilized this information to identify novel breakpoints and fusion genes [11]. With the advancement of sequencing technologies, such applications to breast cancer evaluation will be important stepping stones in the quest for tailored treatment. In another study, T47D was used to demonstrate a method for aCGH copy number detection in single cells, and our cell line profile of T47D was used to validate the findings [13]. Staaf et al. commented on our chosen threshold method of calling alterations and suggested that this would lead to missed aberrations when applied to a large sample set [16]. In our report, we stipulate that the threshold was set quite high in order to identify only high-level changes, since these changes are associated with expression changes as demonstrated in HER2 positive tumours. Although potentially important, lower level changes were not included in our analysis due to the complex ploidy of these long established cell lines.

Furthermore, alignment of the whole-genome profiles enabled simultaneous assessment of copy number status of multiple components of the same biological pathway.
Investigation of approximately 60 loci containing genes associated with the epidermal growth factor family (epidermal growth factor receptor, HER2, HER3 and HER4) revealed that all seven cell lines harbour copy number changes to multiple genes in these pathways. Overall, I have confirmed the effectiveness of aCGH in delineating known alteration boundaries in addition to demonstrating its usefulness in identifying novel and shared alterations.

Chapter five demonstrates the efficacy and usefulness of applying aCGH analysis to the study of archival breast cancer specimens. Rearrangements of NRG1 have been implicated in breast carcinoma oncogenesis. To determine the frequency and clinical significance of NRG1 aberrations in clinical breast tumours, a breast cancer tissue microarray was screened for NRG1 aberrations by fluorescent in situ hybridization (FISH) using a two-color split-apart probe combination flanking the NRG1 gene. Rearrangements of NRG1 were identified in 17/382 cases by FISH, and bacterial artificial chromosome array comparative genomic hybridization was applied to five of these cases to further map the chromosome 8p abnormalities. I identified a novel amplicon centromeric to NRG1 with a minimum common region of amplification encompassing two genes, SPFH2 and FLJ14299, in the five cases investigated. Our results concur with those of other groups who also identify amplification of the 8p11-p12 region as a frequent event in sporadic and familial breast cancer, although additional adjacent candidate genes were also suggested for oncogenic potential in these studies, including BRF2 and RAB11FIP [19, 20]. Subsequent FISH analysis for the novel amplicon revealed that it was present in 63/262 cases.
Abnormalities of NRG1 did not correlate with patient outcome, but the novel amplicon was associated with poor prognosis in univariate analysis, and in multivariate analysis was of prognostic significance independent of nodal status, tumour grade, estrogen receptor status, and human epidermal growth factor receptor 2 (HER2) over expression. Of the two genes in the novel amplicon, expression of SPFH2 correlated most significantly with amplification. Although the function of SPFH2 is not established, SPFH2 has recently been suggested to play a role in the endoplasmic reticulum associated degradation of inositol 1,4,5-trisphosphate receptors (IP3Rs), which are involved in a range of cellular processes including development, secretion and apoptosis [21]. This poses an interesting role for SPFH2 in negatively regulating potentially pivotal cell pathways in breast cancer. This work contributes to the overall objective of my thesis by demonstrating the efficacy of tiling resolution aCGH analysis in archival breast samples.

7.1.3. Whole Genome Tiling Resolution aCGH Analysis in Precancerous Lesions

In chapter six, aCGH was utilized to investigate genetic changes in atypical lobular hyperplasia and lobular carcinoma in situ, as the presence of these lobular neoplastic lesions is an indicator of risk in the development of invasive breast cancer. In this study, twelve microdissected archival atypical lobular hyperplasia specimens and thirteen microdissected archival lobular carcinoma in situ lesions lacking adjacent invasive carcinoma were subjected to whole-genome tiling path aCGH. Copy number alterations were identified using statistical criteria and validated with real-time PCR and fluorescence in situ hybridization.
Surprisingly, a greater number of alterations were observed in atypical lobular hyperplasia compared to lobular carcinoma in situ, suggesting an increased likelihood of sporadic genetic alteration in these lesions, which may explain why they are unlikely to progress to invasive disease. Alterations more common in atypical lobular hyperplasia include gain at 2p11.2 and loss at 7p11-p11.1 and 22q11.1. Alterations more common to lobular carcinoma in situ include gains at 17q21.32 and 20q13.13, which are also frequently reported in invasive cancer, and loss at 19q13.2-q13.31. In both atypical lobular hyperplasia and lobular carcinoma in situ, loss of 16q21-q23.1, an altered region previously identified in lobular neoplasia, invasive lobular carcinoma and low grade ductal carcinoma, was observed [22-26]. Recently, additional putative tumour suppressor genes have been identified on 16q in addition to CDH1 in lobular carcinoma in situ (CTCF and DPEP1), suggesting that 16q may play a more complex role in tumour development than simply harbouring the frequently implicated CDH1 locus that is involved in cell-cell adhesion [27]. The identified 33 loci genomic signature includes copy number alterations not previously identified for lobular neoplasia. This genomic signature, common to atypical lobular hyperplasia and lobular carcinoma in situ, suggests a role for the acquisition of novel genomic alterations in the aberrant cellular proliferation that defines lobular neoplasia.

This study represents the first whole-genome investigation of lobular neoplastic breast lesions using clinical archival specimens. Due to the relative rarity and size of these lesions, this study was limited by the number of cases. It would be useful to expand this analysis to include a larger sample size of atypical lobular hyperplasia and lobular carcinoma in situ to evaluate the robustness of these findings.
In a recent study, Aulmann et al. presented genetic evidence of a clonal relationship supporting that some lobular carcinoma in situ do in fact progress to invasive lobular carcinoma (ILC) [28]. Although genomic DNA was not analysed in that study, it would be valuable to evaluate the DNA copy number status of the cases identified to progress to invasive disease in comparison to our reported signature. Such data would focus the search for candidate molecular markers for early staged lesions that are more likely to progress and would further implicate these genes in early tumourigenic transformation events. Further, it would be interesting to include an investigation of low grade ductal carcinoma in situ to identify any similarities in copy number change (in addition to the reported 1q gain and 16q loss), as a clonal genetic relationship of highly differentiated ductal carcinoma in situ progression to low grade invasive ductal carcinoma may share a mechanistic process with lobular carcinoma in situ progression to invasive disease [29, 30].

Finally, a recent study investigating lobular carcinoma in situ gene expression by SAGE found that a tight junction protein (Claudin 4) is decreased in expression in lobular carcinoma in situ while matrix metalloproteinase 9 (MMP9) is over expressed in lobular carcinoma in situ. Although we did not observe copy number loss at the Claudin 4 locus in our study, we did observe gain at the MMP9 locus, suggesting that copy number gain may be a causal event in over expression of this gene [31].

7.1.4. Alteration of HoxB7 Shared in Precancerous Lesions of the Cervix and Breast

HoxB7 is a transcription factor that belongs to the HOX family of 39 genes that have been shown to regulate cellular differentiation [32]. All members contain a highly conserved 189 bp domain responsible for encoding amino acids that bind to target DNA. There are four clusters of Hox genes, A, B, C and D, that are located on chromosomes 7p15.3, 17q21.3, 12q13.3 and 2q31, respectively.
In this work, I observed a seven fold increase in expression of HoxB7 in CIN III, and four of thirteen lobular carcinoma in situ cases showed gains at the HoxB cluster locus encompassing HoxB7. Lopez et al. investigated HoxB gene cluster expression by RT-PCR in normal cervical epithelium and squamous cervical carcinoma. Their findings indicate that HoxB1, HoxB3, HoxB5, HoxB6, HoxB7, HoxB8 and HoxB9 are expressed in both normal and neoplastic cells of the cervix [33]. Interestingly, they also found that HoxB2, HoxB4 and HoxB13 are only expressed in the cervical cancer specimens, suggesting that these genes may have a role in disease. In contrast, the work in chapter two describes an increase in HoxB7 expression in CIN III compared to normal cervical epithelium. This discrepancy may be a result of differences in methodologies. For example, Lopez et al. chose GAPDH as the internal control in their RT-PCR experiments. While GAPDH is a housekeeping gene commonly used as an internal control, like other commonly used housekeeping genes it can vary in expression between tissue types, experimental conditions and disease states [34, 35]. If GAPDH did vary in normal cervical epithelium versus cervical carcinoma, then the results of the RT-PCR would also vary accordingly. In the case of SAGE data, all raw tag counts are scaled to 'tags per million' in order to compare expression between libraries, rather than being normalized to a housekeeping gene. It is also plausible that HoxB7 expression could escalate in CIN progression but then reduce to levels close to those seen in normal cervical epithelium at the invasive cancer stage, which may suggest an increased effort towards the preservation of normal epithelium in severe dysplasia but loss of that effort in invasive disease. It would be valuable to investigate HoxB7 expression in a larger cohort of CIN III and invasive cervical cancer using a single method for direct comparison.
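
The 'tags per million' scaling referred to above can be sketched as follows. This is a minimal illustration only, assuming each SAGE library is available as a simple mapping from tag to raw count; the function name, tag names and counts are hypothetical and are not data from this thesis.

```python
def tags_per_million(raw_counts):
    """Scale raw SAGE tag counts so that the library total equals one million."""
    total = sum(raw_counts.values())
    if total == 0:
        raise ValueError("library contains no tags")
    return {tag: count * 1_000_000 / total for tag, count in raw_counts.items()}


# Hypothetical libraries sequenced to different depths (illustrative values only).
normal_library = {"TAG_A": 10, "TAG_B": 990}    # 1,000 tags in total
cin3_library = {"TAG_A": 60, "TAG_B": 1940}     # 2,000 tags in total

normal_tpm = tags_per_million(normal_library)
cin3_tpm = tags_per_million(cin3_library)

# The raw counts differ six fold (60 vs 10), but half of that difference
# reflects sequencing depth; the depth-corrected fold change is smaller.
fold_change = cin3_tpm["TAG_A"] / normal_tpm["TAG_A"]
print(f"TAG_A: {normal_tpm['TAG_A']:.0f} vs {cin3_tpm['TAG_A']:.0f} tags per million")
print(f"fold change after scaling: {fold_change:.1f}")
```

Because each library is scaled by its own total tag count, the comparison does not depend on the stability of any single reference gene, which is the contrast with GAPDH-normalized RT-PCR drawn in the paragraph above.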
HoxB7 was first identified as over expressed due to genomic amplification in breast cancer in 2002 [36]. Subsequently, over expression of HoxB7 in primary breast cancer and breast cancer metastasis was associated with the epithelial-mesenchymal transition (EMT) [37]. Epithelial-mesenchymal transition is a proposed mechanism for the conversion of pre cancer to invasive disease in which fully differentiated cells lose epithelial cell properties and instead acquire characteristics that facilitate cell migration and invasion. Wu et al. found HoxB7 to be 18 fold over expressed in bone metastasis of breast cancer compared to normal breast epithelium, and three fold higher in bone metastasis compared to primary breast cancer [37]. They further investigated the effects of over expression of HoxB7 in normal epithelial cells in mice and found that HoxB7 induced EMT, which was characterised by loss of expression of the epithelial proteins CDH1 and Claudin-4, gain of the mesenchymal proteins vimentin and α-smooth muscle actin, and gain of migration and invasion characteristics.

Evidence is accumulating in support of a role for HOX genes in adult cells and in cancer development. The described studies support our hypothesis, suggesting an oncogenic role for HoxB7 in breast cancer, and lend support to our findings of gain at this locus in pre cancer lesions of the breast.

7.2. Conclusions and Significance of Findings

Breast and cervical cancer are the most common cancers in women worldwide. An improved understanding of why some pre cancer lesions of the cervix progress and others do not is an important step towards selecting women who should be treated and sparing treatment for those who would not benefit. Analogously, an improved understanding and characterization of pre cancer breast lesions will create valuable opportunities for precise marker based detection and intervention.
The overall objective of this collective work was to identify genes in biological pathways or networks that are altered in pre cancer lesions of the breast and cervix using global analysis tools. In response to this, and specifically to the first aim of this study, I have established a global gene expression analysis of cervical intraepithelial neoplasia that has led to the discovery of genes not previously implicated in CIN III; further analysis revealed that several of these genes participate in chromatin remodelling. In addition, this work has provided a valuable publicly available resource for the study of gene expression in precancerous cervical lesions. I have addressed the second aim through an aCGH analysis in which we have identified novel aberrations in atypical lobular hyperplasia and lobular carcinoma in situ and have established a genomic aberration signature for lobular carcinoma in situ. The hypothesis, “Global transcriptome and high resolution genome analysis will identify altered genes that are shared in pre cancer lesions of both the cervix and breast”, was found to be true with respect to the HoxB7 gene, which was increased in expression in CIN III and gained in copy number in lobular carcinoma in situ.

7.3. Future Directions

7.3.1. Histological Variants of Precursor Lesions of Interest

Investigation of breast cancer subtypes is a flourishing field that is hoped to pave the way to truly tailored therapy. A novel subtype of interest is pleomorphic lobular carcinoma in situ (PLCIS). PLCIS is a subtype of lobular carcinoma in situ that possesses molecular characteristics which imply a high-grade phenotype (ER/PR negative, Her2 positive, raised Ki67 index and p53 immunoreactivity) [38]. A recent study investigated copy number changes in the pleomorphic subtype of lobular carcinoma, which predictably appears to have an aggressive clinical course [39, 40]. There were six regions of overlap between our study and that of Simpson et al (11p15.5, 16p13.3, 16q12.1, 16q21-q23.1, 17q25.3, 20q13.13). This suggests that the additional regions identified by Simpson et al may be responsible for the aggressive biology of this subgroup and would be of interest in follow-up studies.

CIN, and specifically CIN III, is fairly rigorously established as a true precursor to cervical cancer; however, this stepwise relationship is less clear with respect to atypical lobular hyperplasia, lobular carcinoma in situ, and invasive breast cancer. An investigation of true precursor lesions is crucial for the elucidation of the altered biology in disease development. It has been suggested that atypical lobular hyperplasia and lobular carcinoma in situ may not be the earliest precursors. Columnar cell lesions (CCL) consist of columnar shaped epithelial cells found in enlarged terminal duct lobular units (TDLU) and are frequently identified in conjunction with lobular carcinoma in situ and ductal carcinoma in situ [41]. This is a common abnormality in the breast and has been suggested to be the earliest non-obligate precursor of breast cancer and, more specifically, of low nuclear grade breast cancer [41, 42]. In this instance, atypical lobular hyperplasia and lobular carcinoma in situ may already be too far along to reveal the absolute initial changes. To date there is no published study on copy number assessment of columnar cell lesions of the breast; as a result, I am unable to compare our findings to aberrations in CCL. Perhaps with the continued advancement of copy number detection technology, hurdles such as very low DNA quantity and poor DNA quality may be overcome, and lesions such as CCL can be investigated for their contribution to cancer development.

7.3.2. Integrative Studies

Gene expression changes can be caused by several mechanisms, including direct events such as changes in gene dosage (genomic copy number) and epigenetic silencing, or indirect events such as expression changes in genes upstream in a cascade that lead to disrupted expression of downstream genes. It is estimated that 34-62% of amplified genes are also overexpressed; however, nearly 90% of overexpressed genes are not amplified [36, 43-45]. Investigation of DNA alterations in conjunction with gene expression should distinguish key direct events from indirect changes. For example, loss of the 16q22.1 region (encompassing CDH1 in addition to other genes) is the most common event in ILC and LCIS. Recently, Green et al investigated expression in LCIS of CCCTC-binding factor (CTCF), which is also located within the 16q22.1 locus. CTCF was found to be significantly reduced in expression at both the mRNA and protein level, further supporting our findings of copy number loss of this region in LCIS [27]. The integration of gene copy number, mRNA expression and protein expression provides a comprehensive tool for understanding the relationship between gene dosage and gene expression in disease. In the future, integrating the analysis of additional mechanisms, such as microRNA expression and epigenetic silencing, with copy number and gene expression will facilitate gene discovery in the context of pathways key to disease development.

7.3.3. HPV Screening

Several advances have been made in HPV screening within the past year. The PapilloCheck is a PCR based assay that detects 24 HPV strains and performs comparably to the FDA approved HC2 test [46]. The liquid bead microarray assay is another novel HPV genotyping assay that has recently become available [47]. It can detect low copy numbers of the virus (>50 copies) and is highly reproducible.
Support for liquid-based cytology continues and was most recently demonstrated in a study that identified a greater proportion of atypical cases by liquid-based cytology than by conventional cytology (2.97% versus 1.64%) [48]. Liquid-based cytology further offers an opportunity for HPV type and molecular marker testing from the same sample. Despite these advancements, an important issue that remains to be resolved is the detection of infection with multiple HPV types in a single sample, which challenges most if not all HPV assays. Another hurdle was described in a recent report in which HPV18 positive CIN III lesions were found to be diagnosed later than HPV16 positive CIN III lesions [49]. It was proposed that this delay may be due to poor detection of HPV18 by cytology, which points to a need for improved molecular markers for the diagnosis of precancerous lesions irrespective of HPV type, even with the recent advances in cervical cancer screening.

In the future, as second generation sequencing technologies become more cost effective and sequencing reads become longer, thus alleviating challenges in bioinformatics, suspicious lesions could be investigated thoroughly for causal events beyond copy number and gene expression changes, such as novel fusion genes and mutations. In addition, deep sequencing may be able to quantitatively detect specific virus strains within the same sample. In the case of HPV, specific strains would be identified through the sequence of the L1 capsid gene. Further, since this technology can already identify fusion gene events, it is plausible that the details of viral integration into the host genome may also be investigated through this method.

7.3.4. HoxB7

Although I have identified HoxB7 as increased in expression sevenfold in CIN III lesions in comparison to normal cervical epithelium and gained in copy number in lobular carcinoma in situ, its precise role in cancer development and progression remains to be described. It is interesting to note that Wu et al described an epithelial-mesenchymal transition induced by overexpression of HoxB7 in normal epithelial cells that was characterised by loss of Claudin 4, a tight junction protein [37]. Recently, Cao et al also described loss of Claudin 4 in lobular carcinoma in situ [31]. It would be valuable to further investigate the relationship between HoxB7, Claudin 4 and cancer development. One avenue might be the retrospective evaluation, by immunohistochemical analysis of the two gene products, of a large cohort of in situ disease with adjacent invasive disease versus solitary in situ disease. Overexpression of HoxB7 together with loss of Claudin 4 in in situ lesions with adjacent invasive disease, but not in solitary in situ lesions, may suggest that these genes are important in the progression to invasive cancer. Additional investigation into the potential of HoxB7 to directly negatively regulate Claudin 4 would follow. Finally, it would be valuable to investigate the role of HoxB7 in other epithelial cancers, given that if it is altered in both cervical intraepithelial neoplasia and lobular carcinoma in situ then its reach may extend to additional cancer types.

7.4. References

1. Bassett A, Cooper S, Wu C, Travers A: The folding and unfolding of eukaryotic chromatin. Current Opinion in Genetics & Development, In Press, Corrected Proof.
2. Heeboll S, Borre M, Ottosen PD, Andersen CL, Mansilla F, Dyrskjot L, Orntoft TF, Torring N: SMARCC1 expression is upregulated in prostate cancer and positively correlated with tumour recurrence and dedifferentiation. Histol Histopathol 2008, 23(9):1069-1076.
3. Andersen CL, Christensen LL, Thorsen K, Schepeler T, Sorensen FB, Verspaget HW, Simon R, Kruhoffer M, Aaltonen LA, Laurberg S et al: Dysregulation of the transcription factors SOX4, CBFB and SMARCC1 correlates with outcome of colorectal cancer. Br J Cancer 2009, 100(3):511-523.
4. González-Cortijo L, Carballo N, Gonzalez-Martin A, Corraliza V, Chiva de Agustin L, Lapuente Sastre F, Garcia Garcia JF, Rojo Sebastian A, Hornedo J, Colomer R: Novel chemotherapy approaches in chemoradiation protocols. Gynecologic Oncology 2008, 110(3, Supplement 2):S45-S48.
5. Morris M, Eifel PJ, Lu J, Grigsby PW, Levenback C, Stevens RE, Rotman M, Gershenson DM, Mutch DG: Pelvic radiation with concurrent chemotherapy compared with pelvic and para-aortic radiation for high-risk cervical cancer. N Engl J Med 1999, 340(15):1137-1143.
6. Lanciano R, Calkins A, Bundy BN, Parham G, Lucci JA, 3rd, Moore DH, Monk BJ, O’Connor DM: Randomized comparison of weekly cisplatin or protracted venous infusion of fluorouracil in combination with pelvic radiation in advanced cervix cancer: a gynecologic oncology group study. J Clin Oncol 2005, 23(33):8289-8295.
7. Thomas G, Dembo A, Ackerman I, Franssen E, Balogh J, Fyles A, Levin W: A randomized trial of standard versus partially hyperfractionated radiation with or without concurrent 5-fluorouracil in locally advanced cervical cancer. Gynecol Oncol 1998, 69(2):137-145.
8. Pezo RC, Gandhi SJ, Shirley LA, Pestell RG, Augenlicht LH, Singer RH: Single-Cell Transcription Site Activation Predicts Chemotherapy Response in Human Colorectal Tumors. Cancer Res 2008, 68(13):4977-4982.
9. Cooke SL, Pole JC, Chin SF, Ellis IO, Caldas C, Edwards PA: High-resolution array CGH clarifies events occurring on 8p in carcinogenesis. BMC Cancer 2008, 8:288.
10. Habibi G, Leung S, Law JH, Gelmon K, Masoudi H, Turbin D, Pollak M, Nielsen TO, Huntsman D, Dunn SE: Redefining prognostic factors for breast cancer: YB-1 is a stronger predictor of relapse and disease-specific survival than estrogen receptor or HER-2 across all tumor subtypes. Breast Cancer Res 2008, 10(5):R86.
11. Hampton OA, Den Hollander P, Miller CA, Delgado DA, Li J, Coarfa C, Harris RA, Richards S, Scherer SE, Muzny DM et al: A sequence-level map of chromosomal breakpoints in the MCF-7 breast cancer cell line yields insights into the evolution of a cancer genome. Genome Res 2009, 19(2):167-177.
12. Chari R, Lockwood WW, Coe BP, Chu A, Macey D, Thomson A, Davies JJ, MacAulay C, Lam WL: SIGMA: a system for integrative genomic microarray analysis of cancer genomes. BMC Genomics 2006, 7:324.
13. Fuhrmann C, Schmidt-Kittler O, Stoecklein NH, Petat-Dutter K, Vay C, Bockler K, Reinhardt R, Ragg T, Klein CA: High-resolution array comparative genomic hybridization of single micrometastatic tumor cells. Nucleic Acids Res 2008, 36(7):e39.
14. Joosse SA, van Beers EH, Nederlof PM: Automated array-CGH optimized for archival formalin fixed, paraffin-embedded tumor material. BMC Cancer 2007, 7:43.
15. Pandita A, Balasubramaniam A, Perrin R, Shannon P, Guha A: Malignant and benign ganglioglioma: a pathological and molecular study. Neuro Oncol 2007, 9(2):124-134.
16. Staaf J, Jonsson G, Ringner M, Vallon-Christersson J: Normalization of array-CGH data: influence of copy number imbalances. BMC Genomics 2007, 8:382.
17. Urquidi V, Goodison S: Genomic signatures of breast cancer metastasis. Cytogenet Genome Res 2007, 118(2-4):116-129.
18. van Beers EH, Nederlof PM: Array-CGH and breast cancer. Breast Cancer Res 2006, 8(3):210.
19. Melchor L, Garcia MJ, Honrado E, Pole JCM, Alvarez S, Edwards PAW, Caldas C, Brenton JD, Benitez J: Genomic analysis of the 8p11-12 amplicon in familial breast cancer. International Journal of Cancer 2007, 120(3):714-717.
20. Garcia MJ, Pole JCM, Chin S-F, Teschendorff A, Naderi A, Ozdag H, Vias M, Kranjac T, Subkhankulova T, Paish C et al: A 1 Mb minimal amplicon at 8p11-12 in breast cancer identifies new candidate oncogenes. Oncogene 2005, 24(33):5235-5245.
21. Pearce MMP, Wormer DB, Wilkens S, Wojcikiewicz RJH: An Endoplasmic Reticulum (ER) Membrane Complex Composed of SPFH1 and SPFH2 Mediates the ER-associated Degradation of Inositol 1,4,5-Trisphosphate Receptors. J Biol Chem 2009, 284(16):10433-10445.
22. Buerger H, Simon R, Schafer KL, Diallo R, Littmann R, Poremba C, van Diest PJ, Dockhorn-Dworniczak B, Bocker W: Genetic relation of lobular carcinoma in situ, ductal carcinoma in situ, and associated invasive carcinoma of the breast. Mol Pathol 2000, 53:118-121.
23. Gunther K, Merkelbach-Bruse S, Amo-Takyi BK, Handt S, Schroder W, Tietze L: Differences in genetic alterations between primary lobular and ductal breast cancers detected by comparative genomic hybridization. J Pathol 2001, 193:40-47.
24. Nishizaki T, Chew K, Chu L, Isola J, Kallioniemi A, Weidner N, Waldman FM: Genetic alterations in lobular breast cancer by comparative genomic hybridization. Int J Cancer 1997, 74:513-517.
25. Loveday RL, Greenman J, Simcox DL, Speirs V, Drew PJ, Monson JR, Kerin MJ: Genetic changes in breast cancer detected by comparative genomic hybridisation. Int J Cancer 2000, 86(4):494-500.
26. Richard F, Pacyna-Gengelbach M, Schluns K, Fleige B, Winzer KJ, Szymas J, Dietel M, Petersen I, Schwendel A: Patterns of chromosomal imbalances in invasive breast cancer. Int J Cancer 2000, 89(3):305-310.
27. Green AR, Krivinskas S, Young P, Rakha EA, Paish EC, Powe DG, Ellis IO: Loss of expression of chromosome 16q genes DPEP1 and CTCF in lobular carcinoma in situ of the breast. Breast Cancer Res Treat 2009, 113(1):59-66.
28. Aulmann S, Penzel R, Longerich T, Funke B, Schirmacher P, Sinn HP: Clonality of lobular carcinoma in situ (LCIS) and metachronous invasive breast cancer. Breast Cancer Res Treat 2008, 107(3):331-335.
29. Boecker W, Buerger H, Schmitz K, Ellis IA, van Diest PJ, Sinn HP, Geradts J, Diallo R, Poremba C, Herbst H: Ductal epithelial proliferations of the breast: a biological continuum? Comparative genomic hybridization and high-molecular-weight cytokeratin expression patterns. J Pathol 2001, 195(4):415-421.
30. Moore E, Magee H, Coyne J, Gorey T, Dervan PA: Widespread chromosomal abnormalities in high-grade ductal carcinoma in situ of the breast. Comparative genomic hybridization study of pure high-grade DCIS. J Pathol 1999, 187(4):403-409.
31. Cao D, Polyak K, Halushka M, Nassar H, Kouprina N, Iacobuzio-Donahue C, Wu X, Sukumar S, Hicks J, De Marzo A et al: Serial analysis of gene expression of lobular carcinoma in situ identifies down regulation of claudin 4 and overexpression of matrix metalloproteinase 9. Breast Cancer Research 2008, 10(5):R91.
32. Gehring WJ, Hiromi Y: Homeotic genes and the homeobox. Annu Rev Genet 1986, 20:147-173.
33. Lopez R, Garrido E, Pina P, Hidalgo A, Lazos M, Ochoa R, Salcedo M: HOXB homeobox gene expression in cervical carcinoma. Int J Gynecol Cancer 2006, 16(1):329-335.
34. de Jonge HJ, Fehrmann RS, de Bont ES, Hofstra RM, Gerbens F, Kamps WA, de Vries EG, van der Zee AG, te Meerman GJ, ter Elst A: Evidence based selection of housekeeping genes. PLoS ONE 2007, 2(9):e898.
35. Lee S, Jo M, Lee J, Koh SS, Kim S: Identification of novel universal housekeeping genes by statistical analysis of microarray data. J Biochem Mol Biol 2007, 40(2):226-231.
36. Hyman E, Kauraniemi P, Hautaniemi S, Wolf M, Mousses S, Rozenblum E, Ringner M, Sauter G, Monni O, Elkahloun A et al: Impact of DNA amplification on gene expression patterns in breast cancer. Cancer Res 2002, 62(21):6240-6245.
37. Wu X, Chen H, Parker B, Rubin E, Zhu T, Lee JS, Argani P, Sukumar S: HOXB7, a Homeodomain Protein, Is Overexpressed in Breast Cancer and Confers Epithelial-Mesenchymal Transition. Cancer Res 2006, 66(19):9527-9534.
38. Hanby AM, Hughes TA: In situ and invasive lobular neoplasia of the breast. Histopathology 2008, 52(1):58-66.
39. Vargas A-C, Lakhani SR, Simpson PT: Pleomorphic lobular carcinoma of the breast: molecular pathology and clinical impact. Future Oncology 2009, 5(2):233-243.
40. Simpson PT, Reis-Filho JS, Lambros MB, Jones C, Steele D, Mackay A, Iravani M, Fenwick K, Dexter T, Jones A et al: Molecular profiling pleomorphic lobular carcinomas of the breast: evidence for a common molecular genetic pathway with classic lobular carcinomas. J Pathol 2008, 215(3):231-244.
41. Turashvili G, Hayes M, Gilks B, Watson P, Aparicio S: Are columnar cell lesions the earliest histologically detectable non-obligate precursor of breast cancer? Virchows Arch 2008, 452(6):589-598.
42. Abdel-Fatah TM, Powe DG, Hodi Z, Reis-Filho JS, Lee AH, Ellis IO: Morphologic and molecular evolutionary pathways of low nuclear grade invasive breast cancers and their putative precursor lesions: further evidence to support the concept of low nuclear grade breast neoplasia family. Am J Surg Pathol 2008, 32(4):513-523.
43. Chin K, DeVries S, Fridlyand J, Spellman PT, Roydasgupta R, Kuo WL, Lapuk A, Neve RM, Qian Z, Ryder T et al: Genomic and transcriptional aberrations linked to breast cancer pathophysiologies. Cancer Cell 2006, 10(6):529-541.
44. Chin SF, Wang Y, Thorne NP, Teschendorff AE, Pinder SE, Vias M, Naderi A, Roberts I, Barbosa-Morais NL, Garcia MJ et al: Using array-comparative genomic hybridization to define molecular portraits of primary breast cancers. Oncogene 2007, 26(13):1959-1970.
45. Pollack JR, Sorlie T, Perou CM, Rees CA, Jeffrey SS, Lonning PE, Tibshirani R, Botstein D, Borresen-Dale AL, Brown PO: Microarray analysis reveals a major direct role of DNA copy number alteration in the transcriptional program of human breast tumors. Proc Natl Acad Sci U S A 2002, 99(20):12963-12968.
46. Dalstein V, Merlin S, Bali C, Saunier M, Dachez R, Ronsin C: Analytical evaluation of the PapilloCheck test, a new commercial DNA chip for detection and genotyping of human papillomavirus. Journal of Virological Methods 2009, 156(1-2):77-83.
47. Feng Q, Cherne S, Winer RL, Balasubramanian A, Lee S-K, Hawes SE, Kiviat NB, Koutsky LA: Development and Evaluation of a Liquid Bead Microarray Assay for Genotyping Genital Human Papillomaviruses. J Clin Microbiol 2009, 47(3):547-553.
48. Beerman H, van Dorst EBL, Kuenen-Boumeester V, Hogendoorn PCW: Superior performance of liquid-based versus conventional cytology in a population-based cervical cancer screening program. Gynecologic Oncology 2009, 112(3):572-576.
49. Safaeian M, Schiffman M, Gage J, Solomon D, Wheeler CM, Castle PE: Detection of Precancerous Cervical Lesions Is Differential by Human Papillomavirus Type. Cancer Res 2009, 69(8):3262-3266.
